foundry27 : Post

Forum Topic - problem debugging core file: (23 Items)


Mario Charest

04/20/2009 4:03 PM

post27531

I'm using IDE 4.6.0 (April), set up to use the 6.3.2 tools. I noticed the debugger has problems identifying threads in the core file.
I tried updating to 6.8.u1 by copying the executable over the 6.3.2 file.

When I launch the debug session I get a "Problem Occured" dialog with the following message:

Error creating session
Cannot access memory at address 0x8bd231e5
  Cannot access memory at address 0x8bd231e5
  Cannot access memory at address 0x8bd231e5

And the session terminates.

If I manually launch gdb from outside the IDE, I do see the error message once, but gdb is usable.

Aleksandar Ristovski(deleted)

04/20/2009 4:05 PM

post27532

Re: problem debugging core file

What is the target architecture?

Mario Charest

04/20/2009 4:06 PM

post27533

RE: problem debugging core file

Target is x86.


Aleksandar Ristovski(deleted)

04/20/2009 4:16 PM

post27534

Re: problem debugging core file

How do you run dumper? What options?

Note that gdb works the best with full cores. However, 
threads should be read from a core's note. Do you see 
anything for

"info threads"?

Also, could you try command line session, but first set 
solib-search-path? Or simply copy all the libraries that 
were loaded when core was generated to a directory where 
your executable is, then change cur. dir. to that directory 
and run gdb.

Does info shared look correct? Do all the symbols of the 
shared libraries get found and loaded?


Thanks,

Aleksandar

Mario Charest

04/20/2009 4:41 PM

post27537

RE: problem debugging core file


> How do you run dumper? What options?

dumper -w -z9

> 
> Note that gdb works the best with full cores.

I do not understand ?

> However,
> threads should be read from a core's note. Do you see
> anything for
> 
> "info threads"?

Nothing shows up, but coreinfo shows the info about the 3 threads.

> 
> Also, could you try command line session, but first set
> solib-search-path? 

That gives: Cannot access memory at address 0x8bd231e5

> Or simply copy all the libraries that
> were loaded when core was generated to a directory where
> your executable is, then change cur. dir. to that directory
> and run gdb.

All libraries are system libraries
	- libc.so.2
	- libc.so.3 (target is running 6.4.0 but the binary is 6.3.2)
	- libcpp.so.4 
	- libz.so.2
	- libm.so.2
	- libsocket.so.2

> 
> Does info shared look correct? Do all the symbols of the
> shared libraries get found and loaded?

When I run info shared I get the error message:
Cannot access memory at address 0x8bd231e5


Aleksandar Ristovski(deleted)

04/20/2009 4:47 PM

post27540

Re: problem debugging core file

Mario Charest wrote:
> 
>> Note that gdb works the best with full cores.
> 
> I do not understand ?

dumper has options that create smaller dump files (like -m 
for example). Typically you don't want to use any of them 
(-w and -z are fine).

> 
>> However,
>> threads should be read from a core's note. Do you see
>> anything for
>>
>> "info threads"?
> 
> Nothing shows up , but coreinfo shows the info about the 3 threads.

Odd.

> 
>> Also, could you try command line session, but first set
>> solib-search-path? 
> 
> Give: Cannot access memory at address 0x8bd231e5
> 
>> Or simply copy all the libraries that
>> were loaded when core was generated to a directory where
>> your executable is, then change cur. dir. to that directory
>> and run gdb.
> 
> All libraries are system libraries
> 	- libc.so.2
> 	- libc.so.3 (target is running 6.4.0 but the binary is 6.3.2)
> 	- libcpp.so.4 
> 	- libz.so.2
> 	- libm.so.2
> 	- libsocket.so.2
> 
>> Does info shared look correct? Do all the symbols of the
>> shared libraries get found and loaded?
> 
> When I run info shared I get the error message:
> Cannot access memory at address 0x8bd231e5

That is odd; if you can, attach the executable and the core 
so I can take a look.

How are you running gdb? What commands are you using?

---
Aleksandar

Mario Charest

04/20/2009 4:51 PM

post27543

Re: problem debugging core file

> Mario Charest wrote:
> > 

> >> However,
> >> threads should be read from a core's note. Do you see
> >> anything for
> >>
> >> "info threads"?
> > 
> > Nothing shows up , but coreinfo shows the info about the 3 threads.
> 
> Odd.
> 
> > 
> >> Does info shared look correct? Do all the symbols of the
> >> shared libraries get found and loaded?
> > 
> > When I run info shared I get the error message:
> > Cannot access memory at address 0x8bd231e5
> 
> That is odd; if you can, attach the executable and the core 
> so I can take a look.

Attached and compressed with 7zip.

> 
> How are you running gdb? What commands are you using?

/opt/qnx632/.../gdb file_server.exe file_server.exe.gdb (Linux host)


> 
> ---
> Aleksandar

Attachment:

file_server.7z 3.73 MB

Aleksandar Ristovski(deleted)

04/20/2009 5:00 PM

post27544

Re: problem debugging core file

Mario Charest wrote:
> 
> Attached and compressed with 7zip.
> 


I reproduced it here. I will take a look and get back to you 
soon.

Thanks,

Aleksandar

Mario Charest

04/20/2009 5:01 PM

post27545

RE: problem debugging core file


> Mario Charest wrote:
> >
> > Attached and compressed with 7zip.
> >
> 
> 
> I reproduced it here. I will take a look and get back to you
> soon.

Awesome, thanks Aleksandar!


Colin Burgess(deleted)

04/20/2009 5:15 PM

post27546

Re: problem debugging core file

The core file has a corrupted PT_DYNAMIC segment.

Try the latest head dumper on your system Mario, see if it's any better.

Aleks - could this be PR63001?

-- 
cburgess@qnx.com

Mario Charest

04/20/2009 5:50 PM

post27548

RE: problem debugging core file


> The core file has a corrupted PT_DYNAMIC segment.
> 
Could this be an indication of something not quite right in my setup? I have a couple of programs that get a SIGBUS when a machine on the network is turned off, quite a head scratcher.

> Try the latest head dumper on your system Mario, see if it's any
> better.
> 

A head dumper? That sounds like it's going to hurt, a lot.


Aleksandar Ristovski(deleted)

04/20/2009 6:42 PM

post27552

Re: problem debugging core file

Mario Charest wrote:
> 
>> The core file has a corrupted PT_DYNAMIC segment.
>>
> Could this be an indication of something not quite right in my setup? I have a couple of programs that get a SIGBUS when a machine on the network is turned off, quite a head scratcher.

If you use the core file only, without the executable, thread 
information is loaded, as are the shared libraries. I 
would think the core is good, but this suggests it doesn't 
match the executable.

From the core, I get _r_debug to be at 0xb0873380 (and the 
fact that shared libraries are correctly identified confirms 
this is correct). However, looking at the libc.so.3 versions I 
have on my host, I could not find a matching one.

What exactly do you have on your target system? How did you 
build the image? Yourself, or from the CD?

Thanks,

Aleksandar

Mario Charest

04/21/2009 8:57 AM

post27592

RE: problem debugging core file

> If you use core file only, without executable, thread
> information is being loaded as well as shared libraries. I
> would think the core is good, but this suggests it doesn't
> match the executable.
> 
>  From the core, I get _r_debug to be at 0xb0873380 (and the
> fact that shared libraries are correctly identified confirms
> this is correct). However, looking at libc.so.3 versions I
> have on my host, I could not find a matching one.

It's a libc.so.3 given to us by QSS that fixes some C++ issues when a 6.3.2 program is run on 6.4.0.

> 
> What exactly do you have on your target system? How did you
> build the image? Yourself, or from the CD?

It's a custom image with a bunch of unofficial patches for 6.4.0, a hybrid between 6.4.0 and the unreleased 6.4.1 ;-)


Aleksandar Ristovski(deleted)

04/21/2009 9:08 AM

post27595

Re: problem debugging core file

Mario Charest wrote:
>> If you use core file only, without executable, thread
>> information is being loaded as well as shared libraries. I
>> would think the core is good, but this suggests it doesn't
>> match the executable.

What about executable/core mismatch? You need exactly the 
same binary that cored, but ideally with debug info (I see 
the one you provided has debug info, but it's not the one 
that generated the core).

>>
>>  From the core, I get _r_debug to be at 0xb0873380 (and the
>> fact that shared libraries are correctly identified confirms
>> this is correct). However, looking at libc.so.3 versions I
>> have on my host, I could not find a matching one.
> 
> It's a libc.so.3 given to us by QSS that fixes some C++ issues when a 6.3.2 program is run on 6.4.0.
> 
>> What exactly do you have on your target system? How did you
>> build the image? Yourself, or from the CD?
> 
> It's a custom image with a bunch of unofficial patch for 6.4.0 and hybrid between 6.4.0 and unreleased 6.4.1 ;-)

Ok, just make sure you have matching libraries in your image 
and your host-side momentics.


Thanks,

Aleksandar

Mario Charest

04/21/2009 9:11 AM

post27596

RE: problem debugging core file


> Mario Charest wrote:
> >> If you use core file only, without executable, thread
> >> information is being loaded as well as shared libraries. I
> >> would think the core is good, but this suggests it doesn't
> >> match the executable.
> 
> What about executable/core mismatch? You need exactly the
> same binary that cored, but ideally with debug info (I see
> the one you provided has debug info, but it's not the one
> that generated the core).
> 
Indeed, my bad. However, with a matching core and executable the result is the same. Note that the libc.so.3 on my host machine doesn't match the target.

> 
> Ok, just make sure you have matching libraries in your image
> and your host-side momentics.
> 


I copied all the .so files from the target into the same directory as the executable and core file. Is that good enough?


Aleksandar Ristovski(deleted)

04/21/2009 9:43 AM

post27612

Re: problem debugging core file

Mario Charest wrote:
> 
>> Mario Charest wrote:
>>>> If you use core file only, without executable, thread
>>>> information is being loaded as well as shared libraries. I
>>>> would think the core is good, but this suggests it doesn't
>>>> match the executable.
>> What about executable/core mismatch? You need exactly the
>> same binary that cored, but ideally with debug info (I see
>> the one you provided has debug info, but it's not the one
>> that generated the core).
>>
> Indeed, my bad. However, with a matching core and executable the result is the same. Note that the libc.so.3 on my host machine doesn't match the target.
> 
>> Ok, just make sure you have matching libraries in your image
>> and your host-side momentics.
>>
> 
> 
> I copied all the .so files from the target in the same directory as the executable and core file is that good enough.

That should work. Change your current dir to that directory, 
then start gdb as you did before.

Unfortunately, we still don't have warnings for mismatched 
libraries when examining a core, but it's on my TODO list. For 
now, you have to make sure yourself that you get the right shared 
libraries.

Note, however, that libraries you fetched from your target 
(from your /proc/boot) will be stripped of some info that 
gdb really likes (like most of the section headers) and it 
would be ideal if you found the libraries that were used by 
mkifs to create the image.

Another approach is to put exe/core in a directory (and 
nothing else there) then start gdb:

$ gdb
...
(gdb) set solib-search-path thisdoesnotexist
(gdb) set solib-absolute-prefix thisdoesnotexist

This will "force" gdb not to find any symbols from shared 
libraries. This should give you a sane debugging session, 
without, of course, the comfort of symbol names printed 
for the shared libraries. You should, however, 
see threads and backtraces (instruction pointers) printed 
correctly, symbols from your executable in the backtrace, and 
a correct info shared. Also, you should see that your 
process died due to SIGBUS.

Hope this helps,

Aleksandar

Colin Burgess(deleted)

04/20/2009 7:40 PM

post27553

RE: problem debugging core file

If the program is being launched from the remote node (or has mmap()ed files from that node), then a SIGBUS is expected 
when the next pagefault happens.
 
Dumper hasn't changed a lot since 6.4.0, except for a stack bug.
 
I wonder if it's freaking out trying to dump your dying process... I have a theory but will get back to you later...


Mario Charest

04/21/2009 8:56 AM

post27591

RE: problem debugging core file


> If the program is being launched from the remote node (or has mmap()ed
> files from that node), then a SIGBUS is expected when the next
> pagefault happens.

Yes, the file is loaded from another node. What I do not understand is that only the C++ programs that generate an exception when they cannot access a file on that same node get a SIGBUS. The other programs (all loaded from that node) keep on running.

Does a C++ exception somehow trigger a pagefault?


Colin Burgess(deleted)

04/21/2009 9:19 AM

post27600

Re: problem debugging core file

I imagine it's simply touching a page of data/code that hasn't been loaded in yet.  A pagefault that cannot
be satisfied (since the node has gone away) will cause a SIGBUS.

It looks like dumper will then try to read the pages from the remote node which will of course fail, but
dumper just writes out garbage as a result.

I'll raise a new PR.


Mario Charest

04/21/2009 9:23 AM

post27605

RE: problem debugging core file


> I imagine it's simply touching a page of data/code that hasn't been
> loaded in yet.  A pagefault that cannot
> be satisfied (since the node has gone away) will cause a SIGBUS.

Is there a way to force the code/data to be loaded at startup? I always assumed the code/data was loaded at startup. This seems extremely un-realtime to me.


Colin Burgess(deleted)

04/21/2009 9:37 AM

post27610

Re: problem debugging core file

Mario Charest wrote:
> 
>> I imagine it's simply touching a page of data/code that hasn't been
>> loaded in yet.  A pagefault that cannot
>> be satisfied (since the node has gone away) will cause a SIGBUS.
> 
> Is there a way to force the code/data to be loaded at startup? I always assumed the code/data was loaded at startup. This seems extremely un-realtime to me.

You can turn off lazy page faulting by passing -mL to procnto or by calling mlockall() in your program.

>> It looks like dumper will then try to read the pages from the remote
>> node which will of course fail, but
>> dumper just writes out garbage as a result.
>>
>> I'll raise a new PR.

Scratch that - I just analysed the core file further and it is sane - it's the executable mismatch that appears
to be the problem.


Aleksandar Ristovski(deleted)

04/20/2009 5:58 PM

post27549

Re: problem debugging core file

Colin Burgess wrote:
> The core file has a corrupted PT_DYNAMIC segment.

How did you determine that?

> 
> Try the latest head dumper on your system Mario, see if it's any better.
> 
> Aleks - could this be PR63001?

I am not sure; I don't see this core being corrupt - what am 
I missing?

Colin Burgess(deleted)

04/21/2009 9:13 AM

post27597

Re: problem debugging core file

cburgess@titirangi100:~$ ntox86-gdb file_server.exe
GNU gdb 6.7 qnx-nto
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i686-pc-linux-gnu --target=i386-pc-nto-qnx6.3.0"...
(gdb) info files
Symbols from "/home/cburgess/file_server.exe".
Local exec file:
     `/home/cburgess/file_server.exe', file type elf32-i386.
     Entry point: 0x804eed4
     0x080480f4 - 0x08048108 is .interp
     0x08048108 - 0x08048c88 is .hash
     0x08048c88 - 0x0804a9f8 is .dynsym
     0x0804a9f8 - 0x0804d294 is .dynstr
     0x0804d294 - 0x0804d642 is .gnu.version
     0x0804d644 - 0x0804d664 is .gnu.version_r
     0x0804d664 - 0x0804d6ac is .rel.got
     0x0804d6ac - 0x0804d7e4 is .rel.bss
     0x0804d7e4 - 0x0804df7c is .rel.plt
     0x0804df7c - 0x0804df89 is .init
     0x0804df8c - 0x0804eecc is .plt
     0x0804eecc - 0x08174864 is .text
     0x08174864 - 0x0817486c is .fini
     0x08174880 - 0x0818bd20 is .rodata
     0x0818cd20 - 0x0818d488 is .data
     0x0818d488 - 0x081b87c8 is .eh_frame
     0x081b87c8 - 0x081d069c is .gcc_except_table
     0x081d069c - 0x081d0784 is .dynamic
     0x081d0784 - 0x081d0878 is .ctors
     0x081d0878 - 0x081d0954 is .dtors
     0x081d0954 - 0x081d0d50 is .got
     0x081d0d60 - 0x081e9a60 is .bss
(gdb) x/32w 0x081d069c
0x81d069c <__FRAME_END__+98008>:    0x00000001  0x00000001  0x00000001  0x00000072
0x81d06ac <__FRAME_END__+98024>:    0x00000001  0x000000ef  0x00000001  0x000002f4
0x81d06bc <__FRAME_END__+98040>:    0x00000001  0x000022a1  0x0000000c  0x0804df7c
0x81d06cc <__FRAME_END__+98056>:    0x0000000d  0x08174864  0x00000004  0x08048108
0x81d06dc <__FRAME_END__+98072>:    0x00000005  0x0804a9f8  0x00000006  0x08048c88
0x81d06ec <__FRAME_END__+98088>:    0x0000000a  0x0000289c  0x0000000b  0x00000010
0x81d06fc <__FRAME_END__+98104>:    0x00000015  0x00000000  0x00000003  0x081d0954
0x81d070c <__FRAME_END__+98120>:    0x00000002  0x00000798  0x00000014  0x00000011
(gdb) target core file_server.exe.core
warning: exec file is newer than core file.
Cannot access memory at address 0x8b14ec83
(gdb) target core file_server.exe.core
(gdb) x/32w 0x081d069c
0x81d069c <__FRAME_END__+98008>:    0xb0319790  0xb0319070  0xb031f6f0  0xb035a780
0x81d06ac <__FRAME_END__+98024>:    0xb03208a0  0xb035a7d0  0xb031a180  0xb0367080
0x81d06bc <__FRAME_END__+98040>:    0xb0342280  0xb0354f00  0xb031e5b0  0xb03594f0
0x81d06cc <__FRAME_END__+98056>:    0xb0321250  0xb0354e90  0xb03293f0  0xb82c1f90
0x81d06dc <__FRAME_END__+98072>:    0xb031f840  0xb035abe0  0xb03195c0  0xb03598d0
0x81d06ec <__FRAME_END__+98088>:    0xb031e600  0xb033a5ac  0xb034fae0  0xb828c4e0
0x81d06fc <__FRAME_END__+98104>:    0xb0329e60  0xb0319490  0xb828e280  0xb034e3f0
0x81d070c <__FRAME_END__+98120>:    0xb033af38  0xb035a440  0xb82850a8  0xb03429f0
(gdb) quit
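
For anyone reading the two x/32w dumps above: the .dynamic section of an ELF32 file is an array of {d_tag, d_val} word pairs, so the words can be decoded by hand. Below is a minimal host-side sketch in Python of that decoding. The helper names (decode_dynamic, looks_like_dynamic) are hypothetical and not part of gdb or the QNX tools. The tag table covers only the generic DT_* values from the ELF specification, so the check is a rough heuristic; a complete .dynamic section may also carry OS-specific tags that this table does not know about.

```python
# Hypothetical helper, not part of gdb or the QNX tools: interpret words
# copied from an `x/32w` dump of the .dynamic address as Elf32_Dyn pairs.

# Generic d_tag values from the ELF specification (OS-specific tags omitted).
DT_NAMES = {
    0: "DT_NULL", 1: "DT_NEEDED", 2: "DT_PLTRELSZ", 3: "DT_PLTGOT",
    4: "DT_HASH", 5: "DT_STRTAB", 6: "DT_SYMTAB", 7: "DT_RELA",
    8: "DT_RELASZ", 9: "DT_RELAENT", 10: "DT_STRSZ", 11: "DT_SYMENT",
    12: "DT_INIT", 13: "DT_FINI", 14: "DT_SONAME", 15: "DT_RPATH",
    16: "DT_SYMBOLIC", 17: "DT_REL", 18: "DT_RELSZ", 19: "DT_RELENT",
    20: "DT_PLTREL", 21: "DT_DEBUG", 22: "DT_TEXTREL", 23: "DT_JMPREL",
}


def decode_dynamic(words):
    """Pair consecutive 32-bit words into (d_tag, d_val) tuples."""
    return [(words[i], words[i + 1]) for i in range(0, len(words) - 1, 2)]


def looks_like_dynamic(words):
    """Heuristic: every tag in the sample is a known generic DT_* value."""
    return all(tag in DT_NAMES for tag, _ in decode_dynamic(words))


# Words read from the executable alone (first x/32w dump).
EXEC_WORDS = [
    0x00000001, 0x00000001, 0x00000001, 0x00000072,
    0x00000001, 0x000000ef, 0x00000001, 0x000002f4,
    0x00000001, 0x000022a1, 0x0000000c, 0x0804df7c,
    0x0000000d, 0x08174864, 0x00000004, 0x08048108,
    0x00000005, 0x0804a9f8, 0x00000006, 0x08048c88,
    0x0000000a, 0x0000289c, 0x0000000b, 0x00000010,
    0x00000015, 0x00000000, 0x00000003, 0x081d0954,
    0x00000002, 0x00000798, 0x00000014, 0x00000011,
]

# Words read at the same address after `target core` (second x/32w dump).
CORE_WORDS = [
    0xb0319790, 0xb0319070, 0xb031f6f0, 0xb035a780,
    0xb03208a0, 0xb035a7d0, 0xb031a180, 0xb0367080,
    0xb0342280, 0xb0354f00, 0xb031e5b0, 0xb03594f0,
    0xb0321250, 0xb0354e90, 0xb03293f0, 0xb82c1f90,
    0xb031f840, 0xb035abe0, 0xb03195c0, 0xb03598d0,
    0xb031e600, 0xb033a5ac, 0xb034fae0, 0xb828c4e0,
    0xb0329e60, 0xb0319490, 0xb828e280, 0xb034e3f0,
    0xb033af38, 0xb035a440, 0xb82850a8, 0xb03429f0,
]

if __name__ == "__main__":
    # Print the executable's entries with symbolic tag names.
    for tag, val in decode_dynamic(EXEC_WORDS):
        print(f"{DT_NAMES.get(tag, hex(tag)):<12} {val:#010x}")
    print("exec looks like .dynamic:", looks_like_dynamic(EXEC_WORDS))
    print("core looks like .dynamic:", looks_like_dynamic(CORE_WORDS))
```

Decoded this way, the executable's words line up with the info files listing (DT_INIT at 0x0804df7c is .init, DT_STRTAB at 0x0804a9f8 is .dynstr, DT_PLTGOT at 0x081d0954 is .got), while the words in the core at the same address are all pointers into the 0xb03xxxxx/0xb82xxxxx range and do not decode as tag/value pairs. That fits either a damaged segment or a core produced by a different build of the executable; note that gdb also warned that the exec file is newer than the core file.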


-- 
cburgess@qnx.com
