Forum Topic - i82544 dhcp and other problems: (6 Items)
   
i82544 dhcp and other problems  
Hi all,

We have a small fanless computer with two built-in network adapters and one on a PCI card. All three
are Intel PRO/1000 adapters (PT and PL).

With the driver from the milestone build released a few weeks ago, all three cards are detected, but
ifconfig shows the MAC address as ff:ff:ff:ff:ff:ff, so the driver apparently fails to read it. We then
downloaded an updated version from this forum; it detects all the MAC addresses and everything seemed
to work normally.

Each built-in network adapter has a Gigabit Ethernet camera connected, and the third adapter on the
extra card is used for some slow communication with the controlling operator's panel etc.

The problem is that some of the network packets from one of the cameras are corrupted. We swapped the
two camera cables and now we get corrupted data from the other camera, so our best guess is that the
problem is with that specific network adapter and not with the camera or the cables. We see this
problem on at least two different (but identical) computers.

When we use the extra network card instead of the bad built-in port, the camera communication seems to
work as it should, but we haven't tested it thoroughly. The problem now is that the bad built-in port
cannot communicate with the operator's panel on our 100 Mbit network. dhcp.client does not work, and
even if I set the IP address manually I still cannot ping the router on 192.168.1.1. The LED on this
port flashes when pinging, but very slowly, about once every 10 seconds.

The data from the cameras are UDP jumbo packets of around 4500 bytes. When data is corrupted, it is
usually in complete thirds of the packet: for example, bytes 0 to around 3000 are corrupted and
3000-4500 are good, or 0-1500 are corrupted and the rest is good.
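
To pin down where the corruption falls, one option is to compare each received payload against a known test pattern in 1500-byte steps. The following is only a minimal sketch: it assumes the camera (or a test sender) can transmit a predictable pattern, and the function name, the pattern and the 1500-byte step are all chosen here for illustration.

```python
def corrupted_ranges(expected: bytes, received: bytes, chunk: int = 1500):
    """Return merged (start, end) byte ranges, in chunk-sized steps, that differ."""
    ranges = []
    for start in range(0, len(expected), chunk):
        end = min(start + chunk, len(expected))
        if expected[start:end] != received[start:end]:
            # Merge with the previous range when the two are adjacent.
            if ranges and ranges[-1][1] == start:
                ranges[-1] = (ranges[-1][0], end)
            else:
                ranges.append((start, end))
    return ranges


# Hypothetical 4500-byte payload with a known repeating pattern.
expected = bytes(i % 251 for i in range(4500))
# Simulate the first two thirds arriving corrupted (zeroed out).
received = bytes(3000) + expected[3000:]
print(corrupted_ranges(expected, received))
```

Logging these ranges over many packets would show whether the damage always starts and ends on the same boundaries.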


Does anyone have any suggestion how we should continue?

Best regards
Bjorn Sundman
Optonova, Sweden


RE: i82544 dhcp and other problems  
Hi Bjorn:
   Do you have more information on the particular card that's failing? I'm
guessing it's probably the PL card that fails... We might have one here that
we can try out.  

The other possibility to try is specifying a MAC address on the command line
with the milestone driver (for example, the "mac=00072a010203" option).  That
might work.  Certainly using a driver that results in a MAC address of all f's
won't allow DHCP to work.

When I think about it, the OTHER possibility is that there's some sort of
internal buffering issue...
Are you starting the stack with the pagesize / mclbytes option set?

From the drivers Wiki:

"# io-pkt-v6-hc -d i82544 -p tcpip pagesize=8192,mclbytes=8192

If we pass the "pagesize" and "mclbytes" cmd line options to the stack, we
tell it to allocate contiguous 8K buffers (which may end up being two
adjacent 4K pages, which works fine) for each 8k cluster to use for packet
buffers. This will reduce packet processing overhead, which will improve
throughput and reduce CPU utilization."
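
As a rough illustration of why larger clusters reduce per-packet work, the sketch below counts how many cluster buffers a packet of about 4500 bytes would span. The 2048-byte figure is an assumption (the traditional NetBSD cluster size; this thread does not state the stack's default), and headers are ignored.

```python
import math


def clusters_needed(packet_bytes: int, cluster_bytes: int) -> int:
    """Number of cluster buffers required to hold one packet."""
    return math.ceil(packet_bytes / cluster_bytes)


PACKET_BYTES = 4500  # approximate camera packet size from the original post

# 2048 is an assumed default cluster size; 8192 is the mclbytes value above.
for cluster_bytes in (2048, 8192):
    count = clusters_needed(PACKET_BYTES, cluster_bytes)
    print(f"mclbytes={cluster_bytes}: {count} cluster(s) per packet")
```

Under that assumption, each packet spans three small clusters but fits in a single 8K cluster.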


If these options fix things, then there's likely a driver bug in there
somewhere...

	Robert.

-----Original Message-----
From: Björn Sundman [mailto:bjorn_sundman@yahoo.se] 
Sent: Wednesday, May 28, 2008 2:58 PM
To: drivers-networking
Subject: i82544 dhcp and other problems

_______________________________________________
Networking Drivers
http://community.qnx.com/sf/go/post8538
Re: RE: i82544 dhcp and other problems  
Hi Robert, thanks for your quick reply.

When we use the driver that we downloaded from this forum we get the MAC address from all three cards.
The MAC addresses were only ff:ff:ff... when we used the milestone build. DHCP works fine on the extra
card and on the good built-in adapter, but not on the other built-in adapter, even though the MAC
address is correct for this card. It might be the PL card; I cannot check right now since it's
installed in the factory. What is the difference between PL and PT?

Another thing that may be significant is that the failing port gets IRQ 15 and the other two get IRQ 5
(according to the 'pci' command). I don't understand IRQ numbers, so maybe this is meaningless.

The only option we pass to the driver is one that I think was something like "irq_threshold=9000". I
will try pagesize=8192,mclbytes=8192 and see if it makes a difference.

Bjorn
RE: RE: i82544 dhcp and other problems  
Hi Bjorn:

	Not sure of the difference between the PL and the PT, but I've got a
PT on my desktop and haven't tried a PL. Most of the time, there's very
little that has to be changed inside of the driver to support the different
card variants other than the EEPROM reading.  The milestone build driver
doesn't read the EEPROM properly for all cards.  Unfortunately, the driver
that we have internally has updated EEPROM code that we don't clearly have
the rights to post in source form (which I need to get sorted out).

With regards to the interrupt, given that interrupts are shared, it is
possible that something else is holding onto the interrupt for longer than
it should, resulting in different behaviour for the cards.  I wouldn't expect
packet corruption to be the result of this, but it's definitely another data
point for me.  If you do a "pidin irq" you can see if another process is
sharing the interrupt with the network card.  For high performance, it's
definitely in your best interests to try and adjust the BIOS to prevent
interrupt sharing from happening (although this often isn't possible with
the BIOS settings these days). 

So what we need to know for certain is:  Is it always the same card type
that causes the corruption problem (and what card type is it)?  Does it
always get the same interrupt?  Is the interrupt for the card shared?  

For DHCP not working, what does ifconfig report?  Does it negotiate to the
correct speed?  It sounds like, if it "mostly" works with the Gig network
and not 100BT, auto-negotiation might not be working.

You can also post the output from sloginfo to see if there are any error
messages showing up.


	Robert.

-----Original Message-----
From: Björn Sundman [mailto:bjorn_sundman@yahoo.se] 
Sent: Wednesday, May 28, 2008 5:11 PM
To: drivers-networking
Subject: Re: RE: i82544 dhcp and other problems

_______________________________________________
Networking Drivers
http://community.qnx.com/sf/go/post8549
RE: i82544 dhcp and other problems  
> some of network packages from one of the camera are corrupted. 

> there is problem with that specific network adapter
 
The ethernet CRC is supposed to catch these kinds of
errors - are there any CRC errors in the nicinfo output? - but
sometimes, what happens is that the packet is transferred
correctly out of the transmitting box, across the network,
and into the receiving box, but it is corrupted *inside*
the receiving box after reception of the correct packet
data.
 
This is annoying, and you might wonder how it is possible.

The reason why this can happen is rx fifo overflow.

See, there is a buffer on the PCI card, which can store
a certain amount of data.  But as the network speed increases
(this isn't a problem with 10mbit, can possibly be a problem
with 100mbit and certainly can be a problem with gige) the
PCI bus latency becomes more and more important.

If the timing of the PCI bus activity is such that some
other bus mastering device hogs the bus during a big, fast
burst of packets, well, the rx fifo buffer on the PCI
nic can overflow.
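
To put rough numbers on this, the sketch below estimates how long a completely stalled rx fifo would take to fill at each link speed. The 48 KB fifo size is purely an assumption for illustration; the real size depends on the chip variant and on how the buffer is carved between tx and rx.

```python
def fill_time_us(fifo_bytes: int, link_mbit: float) -> float:
    """Microseconds for a link running at link_mbit to fill fifo_bytes undrained."""
    bits = fifo_bytes * 8
    return bits / link_mbit  # 1 Mbit/s is exactly 1 bit per microsecond


FIFO_BYTES = 48 * 1024  # assumed rx fifo size, not taken from a data sheet

for speed in (10, 100, 1000):
    print(f"{speed:>4} Mbit/s: {fill_time_us(FIFO_BYTES, speed):9.1f} us")
```

With that assumed size the fifo survives a bus stall of roughly 39 ms at 10mbit, about 3.9 ms at 100mbit, and only about 0.39 ms at gige, which is why bus latency matters so much more at the higher speeds.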

Often nics allow you to configure various parameters of
the fifo - carving for tx/rx, hi/lo water marks - but
the most important thing that makes this design work
is PHY flow control aka backpressure.

See, hardware guys aren't morons, though sometimes they
do some things that make us software guys wonder  :-)

The hardware guys designed the system so that when the
rx fifo starts to fill up, and hits a "high water mark"
the nic is supposed to transmit a "pause" frame (of 
so many slot times) which flow controls the transmitter
until the rx fifo is drained to the "low water mark"
again.
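
The watermark mechanism can be sketched as a toy simulation. This is not driver code: the units are arbitrary, all the numbers are made up, and a pause is assumed to take effect instantly, which real pause frames do not.

```python
def simulate(ticks, fifo_size, high, low, arrive, drain, stall, flow_control):
    """Step a toy rx fifo; return (units_dropped, pause_frames_sent)."""
    level = dropped = pauses = 0
    paused = False
    for t in range(ticks):
        # The sender only transmits while it is not paused.
        if not paused:
            level += arrive
        # Anything beyond the fifo capacity is lost (overflow).
        if level > fifo_size:
            dropped += level - fifo_size
            level = fifo_size
        # The host drains the fifo except while the bus is stalled.
        if t not in stall:
            level = max(0, level - drain)
        if flow_control:
            if not paused and level >= high:
                paused = True  # send a pause frame at the high water mark
                pauses += 1
            elif paused and level <= low:
                paused = False  # resume once drained to the low water mark
    return dropped, pauses


# Hypothetical parameters: another bus master hogs the bus for ticks 20..39.
args = dict(ticks=100, fifo_size=48, high=32, low=16,
            arrive=4, drain=6, stall=range(20, 40))
print("without flow control:", simulate(flow_control=False, **args))
print("with flow control:   ", simulate(flow_control=True, **args))
```

In this toy run the fifo overflows during the stall when flow control is off, while with flow control on a single pause holds the sender off and nothing is dropped.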

Now, you say that some nics work and some nics don't.  

If I had to guess, I would say that there is some
subtle difference in the PHY or register definitions
(which the i82544 is NOTORIOUS for) which has resulted
in our driver not correctly configuring or enabling
the rx fifo watermarks or pause frame transmission.

Last time I checked there were SIXTY (60 - count 'em)
variants of the i82544 nic chipsets.  If we have one
of your troublesome ones here, generally we can test
it and try to reproduce and fix the problem.  Without
one, the best we can do is pore over the data sheets,
looking for differences in register definitions.

--
aboyd 
RE: i82544 dhcp and other problems  
As another side note on this one, you'll also find the NetBSD ported
devnp-wm driver in the milestone build as well.  If you give that one a try,
we can figure out if there's a driver problem or a hardware problem
happening...

	Robert.

-----Original Message-----
From: Andrew Boyd [mailto:aboyd@qnx.com] 
Sent: Thursday, May 29, 2008 10:08 AM
To: drivers-networking
Subject: RE: i82544 dhcp and other problems


_______________________________________________
Networking Drivers
http://community.qnx.com/sf/go/post8567