Santosh Kumar
08/22/2008 4:28 PM
post12299
|
Hi Sean/Robert,
I am guessing that you have heard from Yao regarding the io-pkt crash issue. I would like to give you some brief
background on the code and what's loaded on it.
1. All modifications that I made to io-pkt are commented with "Santosh Added/Modifed/removed" tags surrounding the code.
A simple grep will show you all the files that I have touched. I have a cleaner implementation in mind but haven't gotten
around to making that yet.
2. Following modules are loaded when io-pkt is running:
a. lsm-qnet
b. pq2fads-ppc8260 driver (QNX provided)
c. Broadcom BCM5691Driver
d. lsm-tipc (ported TIPC software)
3. io-pkt doesn't crash right away. It usually crashes after around 2-3 hours of operation. While it is running, CPU utilization
remains normal (I used hogs to check).
I have provided the source code of io-pkt but haven't provided it for TIPC or Broadcom. Let me know what I can do.
Thanks,
Santosh
|
|
|
Andrew Boyd(deleted)
08/25/2008 9:13 AM
post12341
|
From: Santosh Kumar
> io-pkt ... crashes after around 2-3 hours
Ensure that dumper is running. Examine the
core dump with gdb, which will tell you what
code caused the crash.
If you're not sure how to do this, no problem.
Create a tar file with your io-pkt executable
and core file and all loaded dll .so files,
which are listed by:
# pidin -p io-pkt-v4 mem
Once you figure out which dll is causing the
crash, re-compile it with "-g -O0", get another
core dump, and gdb will tell you exactly which
line of code you crashed on, which variable was
being accessed, etc. At this point the problem
should be pretty self-evident.
Easily reproducible core dumps are generally
pretty low-hanging fruit.
--
aboyd
|
|
|
Santosh Kumar
08/25/2008 10:19 AM
post12356
|
Hi Andrew,
I've already done the first part, and the code consistently seems to crash in one specific area. I have passed the
core, binary and source on to support and got a response that some thread is crashing the stack itself.
With so many .so files loaded, how do I figure out which specific dll caused this (assuming it is a dll that is causing it)?
I've been advised to run io-pkt with the dlls built with all symbols, let things run until the crash occurs, and then
hopefully the stack trace will tell me what's happening. I will be trying this today.
If there is an easier method to find out which dll is causing the problem, let me know. Maybe I should run the test with only
one specific dll loaded at a time, assuming the culprit is one of our internally developed drivers/modules. I could do
that simultaneously.
Thank you for your advice.
Regards,
Santosh
|
|
|
Andrew Boyd(deleted)
08/25/2008 10:24 AM
post12359
|
From: Santosh Kumar
> the code consistently seems to crash in one specific area
Where?
> With so many .so loaded, how do I figure out which specific dll
The backtrace in gdb (of the core dump) will tell you in which
function (of which dll) that the crash occurred.
If you then compile that dll debug (-g -O0) then the next
core file you get, gdb will tell you which exact line and
variable access caused the crash.
--
aboyd
|
|
|
Robert Craig
08/25/2008 10:33 AM
post12361
|
Hi Santosh:
We're in a difficult position because the core dump that you
sent us doesn't seem to be of much use for some reason. We also don't
have the capability of reproducing your set up given the driver / stack
mods / new protocol that are part of your setup. While the stack is in
the section of the code that is related to io-pkt directly, without a
backtrace, there isn't really much that we can do to diagnose the root
cause of the issue. My recommendation is to build a full debug version
of the stack, drivers and protocols and set things up to run everything
in GDB. As long as you've got multiple interfaces, you can have one
instantiation of the stack set up to interface with your host and GDB
which then runs the second instance of the stack with all of the
protocols to reproduce the crash.
I can walk you through this if you like...
Robert.
|
|
|
Santosh Kumar
08/25/2008 12:38 PM
post12374
|
Hi Robert,
I got what needs to be done. I'll set it up, and when I get the crash I'll report back.
Thanks,
Santosh
|
|
|
Robert Craig
08/25/2008 10:47 AM
post12363
|
As a side note, io-pkt has a SIGSEGV handler in it that may be causing
the confusing backtraces that we're seeing.
The SIGSEGV handler is needed to gracefully bring down drivers that may
try to DMA data into areas of memory that are no longer "owned" by the
stack after the process has died.
To make it easy, you can just comment out the lines in trunk/sys/main.c
(around line 300... sigdelset(&signals, SIGSEGV) / signal(SIGSEGV,
segv_handler)) so that the SIGSEGV comes directly from the code instead
of the signal handler.
Robert.
|
|
|
Andrew Boyd(deleted)
08/25/2008 10:51 AM
post12364
|
From: Robert Craig
> io-pkt has a SIGSEGV handler
If you update your gdb, it will
do the "right thing" and backtrace
through the SIGSEGV handler to the
real culprit. 6.7 seems to do the
trick.
I am a real fan of the sigsegv
handler, because the hardware is
squelched, and doesn't continue
writing into memory that the OS
thinks is free.
--
aboyd
|
|
|
Santosh Kumar
08/25/2008 1:24 PM
post12377
|
I typically debug using the Momentics IDE. You mentioned I need to update gdb. Do you mean ntoppc-gdb needs to be
updated? What version do you recommend, and where can I find binaries for it, if any?
Thanks,
Santosh
|
|
|
Andrew Boyd(deleted)
08/25/2008 1:30 PM
post12378
|
From: Santosh Kumar
> You mean ntoppc-gdb needs to be updated?
right - that's what I used, to examine a
core dump.
> What version do you recommend and
6.7
> where can I find the binaries for it if any?
I can give you a URL to an internal server, but that's
not going to help you any! I can email you an x86
binary of ntoppc-gdb if that helps.
--
aboyd
|
|
|
Santosh Kumar
08/25/2008 1:39 PM
post12380
|
Re: RE: RE: RE: io-pkt Crash
> I can give you a URL to an internal server, but that's
> not going to help you any! I can email you an x86
> binary of ntoppc-gdb if that helps.
Yes please can you email it to me? I seem to have version 5.2. Thanks!
-Santosh
|
|
|
Colin Burgess(deleted)
08/25/2008 1:41 PM
post12381
|
The command-line tools project has downloads...
http://community.qnx.com/sf/projects/toolchain
--
cburgess@qnx.com
|
|
|
Robert Craig
08/25/2008 1:37 PM
post12379
|
Hi Santosh:
That's a tricky one. I THINK that you'll be OK as long as you don't
try to backtrace through the signal handler (which is why I suggested
removing the signal handler registration from your debug version). I'm
not sure how you'd go about updating the GDB binaries on your host and
still have everything play nicely together with the IDE. The other
possibility is to try using the 6.4.0 pre-release since it has updated
versions of everything in it.
Robert.
|
|
|
Santosh Kumar
08/25/2008 1:44 PM
post12382
|
Re: RE: RE: RE: io-pkt Crash
> I'm not sure how you'd go about updating the GDB binaries on your host and
> still have everything play nicely together with the IDE. The other
> possibility is to try using the 6.4.0 pre-release since it's got updated
> everything in it.
I intend to take a backup of my existing version and then use the x86 binary Andrew would email me. Since the IDE invokes
ntoppc-gdb internally, it should pick up the new one unless it depends on some dynamic libraries that I
wouldn't have with my 6.3.2 installation. Upgrading to the 6.4.0 pre-release would take a lot of time, wouldn't it? I'm doing
several things in parallel. Anyway, let me know. Otherwise, I would simply comment out the lines you recommended and use
the existing gdb version 5.2.
Thanks,
Santosh
|
|
|
Robert Craig
08/25/2008 1:58 PM
post12383
|
RE: RE: RE: RE: io-pkt Crash
Let's give Andrew's binary a try. If that doesn't work, comment out the
SIGSEGV handler and if THAT doesn't work, trying the 6.4.0 pre-release
would be the last option.
R.
|
|
|
Santosh Kumar
08/25/2008 3:30 PM
post12398
|
Re: RE: RE: RE: RE: io-pkt Crash
> Let's give Andrew's binary a try. If that doesn't work, comment out the
> SIGSEGV handler and if THAT doesn't work, trying the 6.4.0 pre-release
> would be the last option.
>
> R.
Right now, I have set up a system with the debugger attached to it. I'm using gdb 5.2 with no signal handler for SIGSEGV.
If that doesn't help or I see something weird, then I'll try Andrew's binary, followed by the 6.4.0 pre-release option.
Will post the results as soon as I get the crash.
Thanks,
Santosh
|
|
|
Santosh Kumar
08/25/2008 6:45 PM
post12409
|
Re: RE: RE: RE: RE: io-pkt Crash
Hi,
I was unable to reproduce the crash. There was one difference between the current setup and the previous one.
In order to perform remote debugging, I started io-net with prefix=/home/alt and then started io-pkt. I loaded all modules as
usual except one Ethernet interface (using devn-pq2fads-ppc8260.so) and the lsm-qnet module. In the original setup that
crashed, I was bringing up io-pkt with the BSP and no io-net at all.
What can I do now to attach/start the debugger and try to reproduce the same problem? Do you guys have a native ppc gdb?
-Santosh
|
|
|
Colin Burgess(deleted)
08/25/2008 9:20 PM
post12411
|
If you have a serial port you can run pdebug over that...
(target)
# pdebug /dev/ser1,57600
(host)
(gdb) set remotebaudrate 57600
(gdb) target qnx /dev/ser1
Then you don't need the extra io-net
--
cburgess@qnx.com
|
|
|
Santosh Kumar
08/25/2008 11:05 PM
post12412
|
> If you have a serial port you can run pdebug over that...
>
> (target)
> # pdebug /dev/ser1,57600
>
>
> (host)
>
> (gdb) set remotebaudrate 57600
> (gdb) target qnx /dev/ser1
>
> Then you don't need the extra io-net
Thanks, I will try this. However, I am connected to the system via the serial port through a terminal server. Is it possible
to configure pdebug to use that? Let me know; if not, I will hook it up to my machine overnight today.
Thanks,
Santosh
|
|
|
Yao Zhao(deleted)
08/25/2008 11:31 PM
post12413
|
Or just let it crash and get the core, if you have the segv_handler commented out.
|
|
|
Colin Burgess(deleted)
08/25/2008 11:45 PM
post12414
|
As long as the terminal server is 8 bit clean I think it would work...
--
cburgess@qnx.com
|
|
|
Robert Craig
08/26/2008 10:29 AM
post12430
|
RE: RE: RE: RE: RE: io-pkt Crash
Hi Santosh:
The OTHER possibility is that by compiling everything with a
different optimization level, you've changed the behaviour such that
things no longer crash. Can you reproduce the crash using the new
binaries without changing anything else?
Robert.
|
|
|
Yao Zhao(deleted)
08/26/2008 11:32 AM
post12442
|
Re: RE: RE: RE: RE: RE: io-pkt Crash
I am still using Santosh's old core dump.
Now that I know we have a segv_handler in io-pkt, the disassembly in the core file makes sense. It looks like the executable is
relocated, which is why ntoppc-gdb can't figure out the correct backtrace with the correct function names.
My analysis is below:
r0 0x480864fc 1208509692
r1 0x47fbeb00 1207692032
r2 0x480d5610 1208833552
r3 0x0 0
r4 0xfe36dea0 4265008800
r5 0x4 4
r6 0x3ff 1023
r7 0x9 9
r8 0xfe377e38 4265049656
r9 0x0 0
r10 0xfe36e0b8 4265009336
r11 0x0 0
r12 0x48681038 1214779448
r13 0x480d9a48 1208851016
r14 0x0 0
r15 0x0 0
r16 0x0 0
r17 0x2 2
r18 0x79ef506e 2045726830
r19 0x110f6584 286221700
r20 0x0 0
r21 0x0 0
r22 0x14 20
r23 0x4862d900 1214437632
r24 0x10 16
r25 0x480d6d58 1208839512
r26 0x47fbee24 1207692836
r27 0xc0a80a5a 3232238170
r28 0xfffffffc 4294967292
r29 0x48681010 1214779408
r30 0x47fbeb20 1207692064
r31 0x47fbeb98 1207692184
pc 0x48086510 1208509712
cr 0x2000003f 536870975
lr 0x480864fc 1208509692
ctr 0xfe37bcb0 4265065648
xer 0x0 0
(gdb) x/128 0x47fbeb00-32
0x47fbeae0: 0x47fbeb00 0x48081b94 0x00000000 0x480f0e58
0x47fbeaf0: 0x47fbeb20 0x480f1d90 0xfe380aa8 0x47fbeb98
0x47fbeb00: 0x47fbeb10 0x480864fc 0x00000000 0x480f0e58
0x47fbeb10: 0x47fbeb20 0xfe31b540 0x48635c00 0x480e88c0
0x47fbeb20: 0x0000000b 0x00000001 0x00000000 0x0000000b
0x47fbeb30: 0x4807ed2c 0xc0a80a5a 0x00000000 0x00000000
0x47fbeb40: 0x00000000 0x00000000 0x480864e0 0x47fbeb80
0x47fbeb50: 0xffbffafd 0xffffffff 0x00000000 0x84000000
0x47fbeb60: 0x47fbeb90 0xfe37c194 0x00000000 0x00000000
0x47fbeb70: 0x47fbeb90 0x00000000 0x00000044 0x00000001
0x47fbeb80: 0x00000000 0x00000000 0x00000000 0x00000000
0x47fbeb90: 0x00000000 0x00000000 0x00000024 0x47fbed50
0x47fbeba0: 0x480d5610 0x48635c00 0x4862fc6c 0x47fbee24
0x47fbebb0: 0x00000400 0x00000000 0x4862d900 0x00000000
0x47fbebc0: 0x00000001 0x4862fc6c 0x48681038 0x480d9a48
0x47fbebd0: 0x00000000 0x00000000 0x00000000 0x00000002
0x47fbebe0: 0x79ef506e 0x110f6584 0x00000000 0x00000000
0x47fbebf0: 0x00000014 0x4862d900 0x00000010 0x480d6d58
0x47fbec00: 0x47fbee24 0xc0a80a5a 0xfffffffc 0x48681010
0x47fbec10: 0x00000000 0x48635c00 0x00000000 0x4807dd60
0x47fbec20: 0x800ab932 0x4807ed2c 0x84000034 0x00000000
0x47fbec30: 0x47fbec40 0xfe339f88 0x00000014 0x00000000
0x47fbec40: 0x000005dc 0x00000000 0x00000000 0x00000002
0x47fbec50: 0x00000044 0x480e88c0 0x00000000 0x480ef6c0
0x47fbec60: 0x47fbec90 0x48059c98 0x00000000 0x480f0e58
0x47fbec70: 0x47fbec90 0x00000000 0x48635c00 0x480e88c0
0x47fbec80: 0x47fbeca0 0x480ef6c0 0x480dd320 0x000000ff
0x47fbec90: 0x47fbecf0 0x4805cb78 0x48635c5c 0x486657bc
0x47fbeca0: 0x47fbecd0 0x480a6f14 0xfe380aa8 0x4816c8e0
0x47fbecb0: 0x00000020 0x00000014 0x4862da00 0x00000000
0x47fbecc0: 0x486657bc ...
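To make the dump above easier to follow, here is a small sketch that walks the stack back chain by hand. It assumes the 32-bit PowerPC SVR4 convention, in which the word at the stack pointer holds the address of the caller's frame; that convention is an assumption about this target, not something stated in the post. The first three rows of the memory dump starting at r1 (0x47fbeb00) are copied in as data, and walk_back_chain and read_word are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Words copied from the dump above, starting at r1 = 0x47fbeb00. */
#define SNAP_BASE 0x47fbeb00u
static const uint32_t snap[] = {
    0x47fbeb10u, 0x480864fcu, 0x00000000u, 0x480f0e58u, /* 0x47fbeb00 */
    0x47fbeb20u, 0xfe31b540u, 0x48635c00u, 0x480e88c0u, /* 0x47fbeb10 */
    0x0000000bu, 0x00000001u, 0x00000000u, 0x0000000bu, /* 0x47fbeb20 */
};
#define SNAP_WORDS (sizeof snap / sizeof snap[0])

/* Read one 32-bit word from the snapshot. Returns 0 on success, or -1
 * if addr is unaligned or falls outside the copied region. */
static int read_word(uint32_t addr, uint32_t *out)
{
    uint32_t index;

    if (addr < SNAP_BASE || (addr & 3u) != 0)
        return -1;
    index = (addr - SNAP_BASE) / 4u;
    if (index >= SNAP_WORDS)
        return -1;
    *out = snap[index];
    return 0;
}

/* Follow back-chain links starting at sp, recording each address that
 * was visited. The walk stops when a link does not point to a higher
 * address, or when it leaves the snapshot. Returns the number of
 * addresses recorded. */
size_t walk_back_chain(uint32_t sp, uint32_t *frames, size_t max_frames)
{
    size_t count = 0;
    uint32_t next;

    assert(frames != NULL);
    while (count < max_frames) {
        if (read_word(sp, &next) != 0)
            break;
        frames[count++] = sp;
        if (next <= sp)   /* stacks grow down, so callers sit higher */
            break;
        sp = next;
    }
    return count;
}
```

With the data above, the walk visits 0x47fbeb00, 0x47fbeb10 and 0x47fbeb20 and then stops, because the word at 0x47fbeb20 is 0x0000000b, which cannot be a frame address. A chain that ends this early would be consistent with a debugger failing to unwind past that point, though the truncated post does not confirm that reading.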
|
|
|
Santosh Kumar
08/26/2008 1:21 PM
post12446
|
Re: RE: RE: RE: RE: RE: io-pkt Crash
> Hi Santosh:
> The OTHER possibility is that by compiling everything with a
> different optimization level, you've changed the behaviour such that
> things no longer crash. Can you reproduce the crash using the new
> binaries without changing anything else?
>
> Robert.
Hi Robert,
I think you might be right. I ran the same binaries without the signal handler overnight and it still hasn't crashed. What is
the default optimization level of the non-debug version? Maybe I can simply set the CCOPTS flag to -O0 but without
the -g option and then run the whole thing again. I have seen strange behavior with optimization all the time, so I
wouldn't be surprised. What do you suggest?
Thanks,
Santosh
|
|
|
Robert Craig
08/26/2008 1:36 PM
post12448
|
Hi Santosh:
These problems can be pretty tricky to track down and suggest
either a compiler optimization bug or code assumptions that are
incorrect (usually something like multi-threading where a "volatile"
keyword has been forgotten). In any case, until you know for sure, you
might just be narrowing a timing window for the fault as opposed to
removing it altogether, so it can come back and bite you later.
This could take a bit of time, but I'd suggest trying to get to the root
cause. Compiling everything without optimizations is going to have a
significant performance penalty. Figure out which of the binaries
(driver, io-pkt, protocol) is causing the problem and then try and work
it down to the source file (that isn't going to be easy). I'd also
recommend keeping out the SIGSEGV handler so that you can get a
backtrace. That'll give you more information about which source files
you should try compiling un-optimized....
Robert.
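As an illustration of the "forgotten volatile" class of bug mentioned above, here is a minimal sketch using a flag that is set asynchronously by a signal handler. Without the volatile qualifier, an optimizing compiler is allowed to read the flag once and keep it in a register, so the polling loop may never see the update; at -O0 the same code tends to work, which matches the symptom of a crash or hang that disappears in a debug build. The names link_up, on_sigusr1, arm_link_signal and wait_for_link are hypothetical. For data shared between threads, volatile alone does not provide ordering guarantees, so atomics or a mutex would be the more robust choice there.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical flag written by a signal handler and polled elsewhere.
 * volatile forces a fresh read on every loop iteration. */
static volatile sig_atomic_t link_up = 0;

static void on_sigusr1(int signo)
{
    (void)signo;
    link_up = 1;
}

/* Register the handler. Returns 0 on success, -1 on failure. */
int arm_link_signal(void)
{
    return signal(SIGUSR1, on_sigusr1) == SIG_ERR ? -1 : 0;
}

/* Poll the flag up to max_spins times.
 * Returns 1 if the flag was seen set, 0 if the spin budget ran out. */
int wait_for_link(unsigned long max_spins)
{
    unsigned long spins;

    assert(max_spins > 0);
    for (spins = 0; spins < max_spins; spins++) {
        if (link_up)
            return 1;
    }
    return 0;
}
```

A quick way to test the theory on suspect code is to compare the generated assembly for the polling loop at the two optimization levels and check whether the load from the flag stays inside the loop.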
|
|
|
|