Date: | Thu, 1 Jun 2006 08:19:59 -0400 |
Steve,
A core dump will help.
Core dumps are enabled by default. All core dumps go to /tmp/corefiles/* on
the node where the job ran. Use ls -lt there to see the ownership and time.
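If it helps, here is a rough sketch of that check, meant to be run on the
compute node itself. It assumes the /tmp/corefiles location mentioned above;
the function names core_limit and list_cores are made up for illustration,
and ulimit -c is a shell builtin whose exact behaviour can vary by shell.

```shell
# Print the core file size limit for this shell.
# "0" means no core file would be written; "unlimited" means no cap.
core_limit() {
    ulimit -c
}

# List one user's files under a core directory, newest first.
#   $1: directory to search (defaults to /tmp/corefiles, as described above)
#   $2: owner to filter on  (defaults to the current user)
list_cores() {
    dir="${1:-/tmp/corefiles}"
    user="${2:-$(id -un)}"
    # -exec ... + batches the matches into one ls call so -t sorts them together.
    find "$dir" -type f -user "$user" -exec ls -lt {} +
}

# Example (commented out so that sourcing this file has no side effects):
# core_limit
# list_cores
```

Calling list_cores with no arguments should show only your own core files,
which saves reading the owner column by eye.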
Core-dumping onto a network file system (e.g. NFS or IBRIX) overflows the
kernel stack, so the core cannot be written to your home dir.
The downside of this setup is that you need to know which node your code
ran on.
Send us an email when the code exits with a bus error. I'll search
through all the compute nodes to find the core dump for you.
If you happen to know which node it is on, you can ssh into that compute
node yourself.
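For anyone who wants to look on their own, the search could be sketched
roughly as below. This is only an illustration, not what we actually run:
the function name find_cores is invented, the node names have to be supplied
by the caller, and it assumes you can ssh to the compute nodes without a
password prompt.

```shell
# Search a list of nodes for core files owned by a given user.
#   $1:      user name to look for
#   $2...:   node host names (supplied by the caller; none are assumed here)
find_cores() {
    user="$1"
    shift
    for node in "$@"; do
        # -n keeps ssh from reading stdin; BatchMode makes it fail rather than
        # prompt for a password; ConnectTimeout skips nodes that are down.
        out=$(ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$node" \
            "find /tmp/corefiles -type f -user '$user' -exec ls -l {} + 2>/dev/null") \
            || continue
        # Report only the nodes that actually hold a matching file.
        if [ -n "$out" ]; then
            printf '== %s ==\n%s\n' "$node" "$out"
        fi
    done
}

# Example with placeholder names (replace with a real user and real nodes):
# find_cores someuser node01 node02 node03
```

Nodes that are unreachable or hold no matching files are skipped silently,
so the output lists just the places worth logging in to.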
Many advanced users will probably want to know this. We ought to
come up with a better way to find out where the core dump files are,
in case users are interested.
Thanks,
Robin
On May 31, 2006, at 9:33 PM, Stephen Wright wrote:
> I compiled it with icc.
>
> I've resubmitted the job to see if it happens again overnight or if it was
> a fluke. So far, I haven't had trouble with other jobs running the same
> code on different data. I'll go over the source tomorrow and make sure I
> have checks on possible out-of-bounds array references.
>
> A job that runs for 8 hours before puking probably isn't a good candidate
> for "icc -g". Would it be helpful to have a core dump anyway? If so, how
> do we enable core dumping?
>
> SW
>
>> For more info please look at:
>>
>> http://web.mit.edu/answers/unix/unix_bus_or_seg.html
>>
>>
>>> It's in C.
>>>
>>>> Steve,
>>>>
>>>> Is it C or C++ code?
>>>>
>>>> Thanks,
>>>> Robin
>>>>
>>>>
>>>> On May 31, 2006, at 5:46 PM, Stephen Wright wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I just had a PBS batch job quit after 8 of 48 hours with the
>>>>> following error:
>>>>>
>>>>> /var/spool/PBS/mom_priv/jobs/4578.mulnx31.SC: line 12: 18368 Bus error
>>>>>
>>>>> Does this indicate a PBS problem or something I should be looking for in
>>>>> my code?
>>>>>
>>>>> Steve
>>>>
>>>
>>
>>
>> --
>> Jaime E. Combariza
>> Assistant Director Research Computing
>> Academic Technology Services
>> (513) 529-5080
>> Miami University
>> Oxford, Ohio 45056
>>