Forum Topic - io-pkt "losing" IRQs: (6 Items)
   
io-pkt "losing" IRQs  
We have an open ticket in the support system regarding network crashes that we are experiencing at a customer site. 

We have just upgraded to 6.4.1 in the hope that this would help, but it has failed to solve the problem (and has
actually made it worse on one system).

This is a standard x86 Quad Core PC system running current 6.4.1 with the e1000 driver and three NICs (two on IRQ7, one 
on IRQ11).

One thing that seems very curious is that after the crash, io-pkt has "lost" the IRQs that it had previously hooked. For
 example:

On a working system ("pidin irq" extract):

114707   1 sbin/io-pkt-v4-hc  
 114707   2 sbin/io-pkt-v4-hc  
		12                0    0 T-- @0x80b7334:0x812a060 
		13              0x7    0 T-- @0xb82225bb:0x81539c0 
		14              0xb    0 T-- @0xb82225bb:0x81c1180 
 114707   3 sbin/io-pkt-v4-hc  
 114707   4 sbin/io-pkt-v4-hc  
 114707   5 sbin/io-pkt-v4-hc  
 114707   6 sbin/io-pkt-v4-hc  
 114707   7 sbin/io-pkt-v4-hc  
 114707   8 sbin/io-pkt-v4-hc  

On the same system, after it has failed:

 114707   1 sbin/io-pkt-v4-hc  
 114707   2 sbin/io-pkt-v4-hc  
               11                0    0 T-- @0x80b7334:0x812a060 
 114707   3 sbin/io-pkt-v4-hc  
 114707   4 sbin/io-pkt-v4-hc  
 114707   5 sbin/io-pkt-v4-hc  
 114707   6 sbin/io-pkt-v4-hc  
 114707   7 sbin/io-pkt-v4-hc  
 114707   8 sbin/io-pkt-v4-hc  
 114707   9 sbin/io-pkt-v4-hc  

This seems very weird to me. How can io-pkt have un-hooked the IRQs? What does this mean? By the way, there is nothing 
in sloginfo.
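
For anyone who wants to compare two such extracts mechanically, here is a minimal Python sketch. It assumes that each indented line of "pidin irq" output is one interrupt attachment and that its second column is the interrupt vector; that column reading is inferred from the extracts above, not taken from QNX documentation. The name hooked_vectors and the shortened sample strings are made up for illustration.

```python
import re

# Assumed layout of an attachment line (inferred from the extracts):
#   <id> <vector> <count> <flags> @<handler>:<area>
ATTACH_RE = re.compile(
    r"^\s*(\d+)\s+(0x[0-9a-fA-F]+|\d+)\s+\d+\s+\S+\s+@\S+\s*$"
)

def hooked_vectors(pidin_irq_text):
    """Return the set of vectors seen on attachment lines."""
    vectors = set()
    for line in pidin_irq_text.splitlines():
        m = ATTACH_RE.match(line)
        if m:
            # Base 0 accepts both "0" and "0x7" style values.
            vectors.add(int(m.group(2), 0))
    return vectors

# Shortened copies of the two extracts above.
BEFORE = """\
 114707   1 sbin/io-pkt-v4-hc
 114707   2 sbin/io-pkt-v4-hc
\t\t12                0    0 T-- @0x80b7334:0x812a060
\t\t13              0x7    0 T-- @0xb82225bb:0x81539c0
\t\t14              0xb    0 T-- @0xb82225bb:0x81c1180
 114707   3 sbin/io-pkt-v4-hc
"""

AFTER = """\
 114707   1 sbin/io-pkt-v4-hc
 114707   2 sbin/io-pkt-v4-hc
               11                0    0 T-- @0x80b7334:0x812a060
 114707   3 sbin/io-pkt-v4-hc
"""

lost = sorted(hooked_vectors(BEFORE) - hooked_vectors(AFTER))
print("vectors no longer hooked:", [hex(v) for v in lost])
```

Run against the two extracts, this reports 0x7 and 0xb as no longer hooked, while the attachment on vector 0 is still present after the failure.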

I don't mean to circumvent the official support ticket (or imply that we're not getting a good response through that
mechanism; we are). But this seems very weird and I wanted to share it with a wider audience in case anyone else might
have a useful comment or suggestion. We are desperate, since this is killing us in the field.

Any ideas/comments certainly welcomed.

Thanks,
Rob Rutherford
Ruzz TV
Re: io-pkt "losing" IRQs  
It just looks like the Ethernet driver unregistered the interrupts. What does pidin -p io-pkt-v4-hc mem show?
What exactly crashes?

Also, running the driver with verbose=5 may show more in sloginfo.

yao


Re: io-pkt "losing" IRQs  
Please can you add the following lines to your build file, before the
diskboot line:

slogger -s128k
pci-bios -vvv
waitfor /dev/pci

Rebuild your boot image and reboot the machine. After the crash, please post
the output from 'pidin mem' as well as the sloginfo output.

Thanks, Hugh.


On 10-07-14 10:17 AM, "Robert Rutherford" <community-noreply@qnx.com> wrote:

> 
> We have an open ticket in the support system regarding network crashes that we
> are experiencing at a customer site.

-- 
Hugh Brown                      (613) 591-0931 ext. 2209 (voice)
QNX Software Systems Ltd.        (613) 591-3579           (fax)
175 Terence Matthews Cres.       email:  hsbrown@qnx.com
Kanata, Ontario, Canada.
K2M 1W8
 

Re: io-pkt "losing" IRQs  
> Please can you add the following lines to your build file, before the
> diskboot line:
> 
> slogger -s128k
> pci-bios -vvv
> waitfor /dev/pci
> 
> Rebuild your boot image and reboot the machine. After the crash, please post
> the output from 'pidin mem' as well as the sloginfo output.

We haven't had a chance yet to capture this info with the requested changes to the boot image.

But I have attached a set of data that was captured from the previous crash yesterday; this includes "pidin mem" and
"pci -v". I will post the same set of data again as soon as I can following the next crash (only this time with the
modified boot image).

Thanks.
Attachment: Text felixdb1_logs.tgz 13 KB
Re: io-pkt "losing" IRQs  
Looking at sloginfo, it seems qnet complains "No interfaces", and pidin irq no longer shows any e1000 interrupts.
Probably e1000 unregisters the interrupts under some conditions.
I guess that is a clue for debugging.




Re: io-pkt "losing" IRQs  
The only things that come to mind are:

- io-pkt crashed and got restarted.
- There's more than one instance running.
- The interface(s) got unloaded somehow.
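
A rough way to check the first two possibilities against saved pidin output is sketched below in Python. It assumes that a restarted io-pkt would normally come back under a different pid, and that thread lines have the form "pid tid name" as in the extracts from the first post. The names pids_for and summarize, and the shortened sample strings, are hypothetical. It says nothing about the third possibility.

```python
def pids_for(pidin_text, name_fragment="io-pkt"):
    """Collect the pids of thread lines whose name contains name_fragment."""
    pids = set()
    for line in pidin_text.splitlines():
        fields = line.split()
        # Thread lines look like "<pid> <tid> <name>"; other lines are skipped.
        if (len(fields) >= 3 and fields[0].isdigit()
                and fields[1].isdigit() and name_fragment in fields[2]):
            pids.add(int(fields[0]))
    return pids

def summarize(before_text, after_text):
    """Flag a changed pid set or several instances after the failure."""
    before = pids_for(before_text)
    after = pids_for(after_text)
    return {
        "pid_changed": before != after,
        "multiple_instances": len(after) > 1,
    }

# Shortened copies of the extracts from the first post.
BEFORE = """\
 114707   1 sbin/io-pkt-v4-hc
 114707   2 sbin/io-pkt-v4-hc
\t\t13              0x7    0 T-- @0xb82225bb:0x81539c0
 114707   8 sbin/io-pkt-v4-hc
"""

AFTER = """\
 114707   1 sbin/io-pkt-v4-hc
 114707   2 sbin/io-pkt-v4-hc
               11                0    0 T-- @0x80b7334:0x812a060
 114707   9 sbin/io-pkt-v4-hc
"""

print(summarize(BEFORE, AFTER))
```

With the extracts from the first post, both flags come out False, because 114707 is the only io-pkt pid in each listing. Under the assumption above, that would point away from a restart or a second instance, but it is only a hint.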

Regards,

-seanb

On Thu, Jul 15, 2010 at 09:26:42AM -0400, Yao Zhao wrote:
> Looking at sloginfo, it seems qnet complains "No interfaces", and pidin irq no longer shows any e1000 interrupts.
> Probably e1000 unregisters the interrupts under some conditions.
> I guess that is a clue for debugging.