Dave Bott(deleted)
|
devnp-e1000 RX overruns on *some* ports
|
Dave Bott(deleted)
08/20/2009 1:09 PM
post36378
|
devnp-e1000 RX overruns on *some* ports
Customer is using QNX 6.4.1 SMP (8-way) on a dual-Xeon board (each CPU is 4-way, hyperthreading disabled):
Intel S5520UR board
SATA SSD, 4GB RAM, DVD, Intel dual NIC (pro1000PT).
The attached hw-info file (get_hw_info is great!) shows the PCI setup.
There are 4 GigE ports on the board - 2 from an 82575EB chip, 2 from an 82571EB
The 82571EB NICs seem to work fine - can send/receive very well
The 82575 NICs see link status and see receive packets, but after ~200 packets, the received OK count stops and the SQE
error count starts incrementing. The source shows that SQE represents the RX dropped count.
The slog shows an Unexpected irq cause 0x40, which is RX overrun.
The pci config *looks* OK in terms of allocations. Interrupt assignments seem OK too.
Running *just* the 82575 ports did not make a difference.
Any suggestions / known issue ?
Thanks
Dave
|
|
|
Hugh Brown
|
RE: devnp-e1000 RX overruns on *some* ports
|
Hugh Brown
08/20/2009 1:19 PM
post36379
|
RE: devnp-e1000 RX overruns on *some* ports
Why is the driver being started with irq=11? Two of the devices are on
IRQ 10 and the other 2 are on 11. The driver should be started without
any IRQ on the command line.
Hugh.
|
|
|
Dave Bott(deleted)
|
RE: devnp-e1000 RX overruns on *some* ports
|
Dave Bott(deleted)
08/20/2009 1:26 PM
post36380
|
RE: devnp-e1000 RX overruns on *some* ports
Hi Hugh,
That was done just as a test - I guess we captured the info when that test was running. The issue occurred without
specifying any IRQs, and then the IRQs did indeed come up as expected. Sorry for the confusion there.
Dave
|
|
|
Hugh Brown
|
RE: devnp-e1000 RX overruns on *some* ports
|
Hugh Brown
08/20/2009 3:46 PM
post36383
|
RE: devnp-e1000 RX overruns on *some* ports
If you slay io-usb does it help? I see that some of the USB interfaces
are also using IRQ 11.
The devnp-e1000 driver has standard transmit and receive routines for
all the chipsets, so I'm not exactly sure what could be causing the
problem. Can you run tests on a bare-bones system (i.e., no graphics or
any other drivers running besides Ethernet) and see if the problem still
occurs?
Hugh.
|
|
|
Dave Bott(deleted)
|
Re: devnp-e1000 RX overruns on *some* ports
|
Dave Bott(deleted)
08/20/2009 5:06 PM
post36390
|
Re: devnp-e1000 RX overruns on *some* ports
Hi Hugh,
That's a very good point. I did not mention that this board has io-hid
running ready - seems to be a PS-2 issue.
Perhaps the issues are related - I'll have them try to resolve the PS-2
issue and see if the NICs start running... or rearrange the IRQs at
least, if the BIOS supports that.
Just killing the USB might not help, since that's the keyboard... ;-)
Thanks
Dave
--
Dave Bott (dbott@qnx.com) Field Applications Engineer
QNX Software Systems, Inc. Cell:408 391-3535
San Jose CA
Join Foundry27 <http://community.qnx.com> - the new QNX developer forum.
|
|
|
Dave Bott(deleted)
|
Re: devnp-e1000 RX overruns on *some* ports
|
Dave Bott(deleted)
09/01/2009 5:31 PM
post37082
|
Re: devnp-e1000 RX overruns on *some* ports
The customer *finally* tried the suggestions:
Telnet into the board via one of the 2 working ports.
Slay io-hid and io-usb.
Able to assign an address to one of the suspect ports using DHCP (was
able to do this before).
Try to use it; still see SQE errors == dropped RX packets.
So, it seems, sadly, that the shared interrupt is not the problem.
Any more suggestions for things to try ? These guys are not happy...
Thanks !
Dave
|
|
|
Hugh Brown
|
RE: devnp-e1000 RX overruns on *some* ports
|
Hugh Brown
09/02/2009 7:48 AM
post37098
|
RE: devnp-e1000 RX overruns on *some* ports
This driver has been tested on all the hardware that we have, so to be
able to investigate the customer's problem, we would need the hardware.
Hugh.
|
|
|
Andrew Boyd(deleted)
|
RE: devnp-e1000 RX overruns on *some* ports
|
Andrew Boyd(deleted)
09/03/2009 9:22 AM
post37247
|
RE: devnp-e1000 RX overruns on *some* ports
> still sees SQE errors == dropped RX packets
Horrible, shameful hack alert, Will Robinson! (arms flailing)
You really aren't seeing SQE errors - I re-used that
statistic so I could differentiate between the two
sources of lost packets:
1) running out of rx descriptors, and
2) rx fifo overrun
In case #1 above, the nic was able to successfully rx
the packet, but because it had no buffers in CPU memory,
down the drain it went.
In case #2 above, the nic was NOT able to successfully
rx the packets because its internal buffer (fifo)
ran out of space (overrun).
You are seeing door #2 - the SQE overload is the data
from the MPC register.
The normal cause of rx fifo overrun is excessive PCI
bus latency - some other master is hogging it - but
it is also worth looking at the link-level flow control
to make sure it is configured correctly - start with
"ifconfig -v", it prints it out.
--
aboyd
|
|
|
Dave Bott(deleted)
|
Re: devnp-e1000 RX overruns on *some* ports
|
Dave Bott(deleted)
09/03/2009 11:52 AM
post37281
|
Re: devnp-e1000 RX overruns on *some* ports
Thanks for the clarification !
This is a honking server board, so the customer does not want to send it
to us.
I still think Hugh may be right that the PS-2 process might have been
the issue, but they are adamant that they killed it and the problem
persisted.
Thanks
Dave
|
|
|
|