Mario Charest
10/21/2009 11:07 AM
post40436
|
Is there a more recent version of devnp-e1000 than the one that came with 6.4.1?
I'm having problems again with pulses sent over the network taking over 5 seconds to get there (thank God for the System Profiler).
On the machine that is receiving the pulse, /proc/qnetstat is full of
l4_rx_seq_insert(): out-of-order rx: seq ... flags 0...
I'm not sure if this is caused by the sender or the receiver.
All machines are started like this:
io-pkt-v4 -de1000 transmit=2048,receive=2048,priority=100 -pqnet no_slog=1,auto_add=10
I was hoping for a version that supports more than 2048.
|
|
|
Mario Charest
10/21/2009 11:15 AM
post40439
|
> -----Original Message-----
> From: Andrew Boyd [mailto:community-noreply@qnx.com]
> Sent: Wednesday, October 21, 2009 11:10 AM
> To: drivers-networking
> Subject: RE: latest e1000
>
>
> Yup, qnet says you're losing packets.
>
> IIRC Hugh checked in a new driver with 4096 rx descr
>
Binary easily available somewhere?
> PS Do your netstats on the rx node indicate
> any rx descr overruns or rx fifo overruns?
I assume you mean nicinfo. No errors reported, all zeros. However, a second NIC with no cable connected shows 1 error on transmit allocation.
>
> --
> aboyd
>
>
>
>
> _______________________________________________
>
> Networking Drivers
> http://community.qnx.com/sf/go/post40437
>
|
|
|
Andrew Boyd(deleted)
10/21/2009 11:17 AM
post40440
|
>> any rx descr overruns or rx fifo overruns?
>
> No error reported, all zero`s
Not sure 4096 rx descr instead of 2048 is going
to solve your problem, if the rx side is reporting
no errors.
Anyways, it can't hurt to try Hugh's new binary.
But I'd look at the transmit side, too - any nicinfo
non-zero counts there?
--
aboyd
|
|
|
Mario Charest
11/05/2009 2:10 PM
post41439
|
> >> any rx descr overruns or rx fifo overruns?
> >
> > No error reported, all zero`s
>
> Not sure 4096 rx descr instead of 2048 is going
> to solve your problem, if the rx side is reporting
> no errors.
>
> Anyways, it can't hurt to try Hugh's new binary.
>
> But I'd look at the transmit side, too - any nicinfo
> non-zero counts there?.
Nada, zeros all over.
I created the following setup: 3 PCs, A, B, C. A test program does a send of 1M and the receive program replies with 1M of data. Machine A is the sender, machine B the receiver. Throughput is 105 Mbytes a sec!!! Wires are hot!!! No errors anywhere, nothing in /proc/qnetstats.
Then on machine B I run cp /net/C/dev/hd0 /dev/null, and things get ugly.
Machine C is OK; machine B shows a huge number of out-of-order and some dup rx. Machine A goes into slow mode.
What is strange is that if I stop the cp, machine B keeps getting out-of-order errors (maybe related to slow mode).
Even stranger, from test to test I get different types of errors. Once I got the log below. What I can't make any sense of is why the hole is always at offset 14600. I mean, if packets get lost or screwed up, why always the same offset?
At this time I'm using tx and rx descriptors of 3000... The machines have 8 cores and run nothing other than the communication test program. And I repeat, nicinfo shows no errors in either rx or tx on all 3 machines.
Tomorrow I will set up a machine to capture network traffic to get to the bottom of this. It's killing us.
00145123 l4_rx_seq_last(): hole w off 14600 len 1460 for seq 1138167 conn 3 nd 101, will tx NACK
00145123 l4_rx_seq_insert(): out-of-order rx: seq 1138167 flags 0 toff 16060 poff 14600 prev B62E870 nd 101
00145123 l4_rx_seq_insert(): dup rx: seq 1138167 flags 2 offset 16060 nd 101
00145123 l4_rx_seq_last(): hole w off 14600 len 1460 for seq 1138168 conn 3 nd 101, will tx NACK
00145123 l4_rx_seq_insert(): out-of-order rx: seq 1138168 flags 0 toff 16060 poff 14600 prev B443250 nd 101
00145123 l4_rx_seq_insert(): dup rx: seq 1138168 flags 2 offset 16060 nd 101
00145123 l4_rx_seq_last(): hole w off 14600 len 1460 for seq 1138169 conn 3 nd 101, will tx NACK
00145123 l4_rx_seq_insert(): out-of-order rx: seq 1138169 flags 0 toff 16060 poff 14600 prev 80D94B8 nd 101
00145123 l4_rx_seq_insert(): dup rx: seq 1138169 flags 2 offset 16060 nd 101
00145123 l4_rx_seq_last(): hole w off 14600 len 1460 for seq 1138170 conn 3 nd 101, will tx NACK
00145123 l4_rx_seq_insert(): out-of-order rx: seq 1138170 flags 0 toff 16060 poff 14600 prev B62DB10 nd 101
00145123 l4_rx_seq_insert(): dup rx: seq 1138170 flags 2 offset 16060 nd 101
00145123 l4_rx_seq_last(): hole w off 14600 len 1460 for seq 1138171 conn 3 nd 101, will tx NACK
00145123 l4_rx_seq_insert(): out-of-order rx: seq 1138171 flags 0 toff 16060 poff 14600 prev B441BF0 nd 101
00145123 l4_rx_seq_insert(): dup rx: seq 1138171 flags 2 offset 16060 nd 101
00145123 l4_rx_seq_last(): hole w off 14600 len 1460 for seq 1138172 conn 3 nd 101, will tx NACK
00145123 l4_rx_seq_insert(): out-of-order rx: seq 1138172 flags 0 toff 16060 poff 14600 prev 80D9AA0 nd 101
00145123 l4_rx_seq_insert(): dup rx: seq 1138172 flags 2 offset 16060 nd 101
00145123 l4_rx_seq_last(): hole w off 14600 len 1460 for seq 1138173 conn 3 nd 101, will tx NACK
00145123 l4_rx_seq_insert(): out-of-order rx: seq 1138173 flags 0 toff 16060 poff 14600 prev B441B10 nd 101
00145123 l4_rx_seq_insert(): dup rx: seq 1138173 flags 2 offset 16060 nd 101
00145123 l4_rx_seq_last(): hole w off 14600 len 1460 for seq 1138174 conn 3 nd 101, will tx NACK
00145123 l4_rx_seq_insert(): out-of-order rx: seq 1138174 flags 0 toff 16060 poff 14600 prev B442C28 nd 101
00145123 l4_rx_seq_insert(): dup rx: seq 1138174 flags 2 offset 16060 nd 101
|
|
|
Andrew Boyd(deleted)
11/05/2009 2:23 PM
post41440
|
You are losing packets somewhere. If all the
nicinfo counts (tx and rx) are zero, I might suggest
directly connecting the machines, to get rid of
the switch in between them and eliminate that
possibility.
P.S. You don't need a crossover cable with MDIX
--
aboyd
|
|
|
Mario Charest
11/05/2009 4:05 PM
post41454
|
> -----Original Message-----
> From: Andrew Boyd [mailto:community-noreply@qnx.com]
> Sent: Thursday, November 05, 2009 2:23 PM
> To: drivers-networking
> Subject: RE: RE: latest e1000
>
>
> You are losing packets somewhere. If all the
> nicinfo counts (tx and rx) are zero, I might suggest
> directly-connecting the machines to get rid of
> the switch inbetween them to get rid of that
> possibility.
I could, but it's a high-end Cisco switch. I very much doubt that thing is losing packets at such a high rate. It's also not a hardware problem with that switch, because the problem shows up at different sites. Plus, as far as I know the switch doesn't know about the QNX protocol, so why would it always lose the same packet offset in the sequence?
And why "dup rx"? That means the same packet (same offset and sequence number) was received twice, right? Can't wrap my brain around that one.
>
> P.S. You don't need a crossover cable with MDIX
It seems I need 3 computers to create the problem. If I run the test program between A-B it's OK, if I run it between B-C it's OK, same for A-C. Each time, throughput is 105 Mbytes/sec (50 TX, 50 RX), but if another pair of computers does something as low as 10 Mbytes/sec, which is just 10% more, havoc ensues.
At any rate, I will sniff the network tomorrow. I'm hoping I'll be able to see the incoming packets received by one PC; that way I will be able to tell what is really going on.
|
|
|
Andrew Boyd(deleted)
11/05/2009 4:12 PM
post41455
|
dup rx likely means that an ACK packet got
lost.
Scenario: A transmits pkt #1 to B, and B
fires back ACK. ACK is lost. A times out,
re-transmits, B sees a dup rx. Happens all
the time.
Be sure that link-level flow control is enabled
all the way around - all PCs and on the switch.
This should be set via auto-negotiation.
Are you sure you aren't trying to jam more than
one gig's worth of traffic into a gig pipe? Like
a large lady trying to squeeze into a small dress,
that's just not going to work very well.
Checking the counters on the switch would be a
high priority.
--
aboyd
|
|
|
Mario Charest
11/06/2009 9:00 AM
post41490
|
> -----Original Message-----
> From: Andrew Boyd [mailto:community-noreply@qnx.com]
> Sent: Thursday, November 05, 2009 4:12 PM
> To: drivers-networking
> Subject: RE: RE: latest e1000
>
> dup rx likely means that an ACK packet got
> lost.
>
> Scenario: A transmits pkt #1 to B, and B
> fires back ACK. ACK is lost. A times out,
> re-transmits, B sees a dup rx. Happens all
> the time.
>
> Be sure that link-level flow control is enabled
> all the way around - all PC's and on the switch.
> This should be set via the auto-negotiation.
>
Input flow control was turned on; no difference.
To try to get an idea of what is going on, I set one port to 1G and the other to 100 Mbits in the switch configuration.
Now what I'm seeing is a rather steady transfer rate between two computers of 11 Mbytes/sec, but every 3 seconds it drops below 1 Mbyte/sec (that's the part that worries me), and this happens only if the data being sent goes from the 1G machine to the 100M machine.
It's a SendMsg of 1 Meg of data. Curiously, a SendMsg runs either at 11 Mbytes/sec or very slowly (under 1 Mbyte/sec); I have not seen anything in between.
On the 1G machine /proc/qnetstats contains lots of
00135519 l4_tx_timeout(): rxd nack: nd 58 sc 13 dc 3 ss 54113 nh 1
00135519 l4_tx_max_pkt_set(): nd 58 slow mode: passed 436 pkts, window 109 pkts
But when the slowdown occurs I see
00135519 l4_tx_timeout(): rxd nack: nd 58 sc 13 dc 3 ss 54115 nh 1
00135519 l4_tx_max_pkt_set(): nd 58 slow mode: passed 436 pkts, window 109 pkts
00135519 l4_tx_timeout(): rxd nack: nd 58 sc 13 dc 3 ss 54115 nh 1
00135519 l4_tx_max_pkt_set(): nd 58 slow mode: passed 545 pkts, window 113 pkts
00135519 l4_tx_timeout(): rxd nack: nd 58 sc 13 dc 3 ss 54115 nh 1
*00135520 l4_tx_timeout(): timeout: nd 58 sc 13 dc 3 ss 54115 tk 326483 ct 326485
*00135520 l4_tx_max_pkt_set(): nd 58 slow mode: passed -1 pkts, window 112 pkts
00135520 l4_tx_max_pkt_set(): nd 58 slow mode: passed 655 pkts, window 112 pkts
00135520 l4_tx_timeout(): rxd nack: nd 58 sc 13 dc 3 ss 54115 nh 1
|
|
|
Mario Charest
11/18/2009 10:38 AM
post42166
|
Any news?
|
|
|
Hugh Brown
11/18/2009 10:51 AM
post42174
|
Mario,
Please create a support ticket on this and it will be looked at in due
course.
Thanks, Hugh.
|
|
|
Mario Charest
11/18/2009 10:56 AM
post42175
|
There is a ticket open, TicketID90402. Tech support suggested I poke the thread ;-) I believe he assumed you or Andrew were already involved. When I opened the ticket about a week ago, I made reference to this thread.
|
|
|
Hugh Brown
11/18/2009 10:59 AM
post42177
|
Unfortunately we are both swamped with work right now.
|
|
|
Chris Foran
11/19/2009 1:09 PM
post42258
|
Mario,
Can you please check the stats on the switch? It may bring some additional information to light.
-Chris
|
|
|
Mario Charest
11/19/2009 1:51 PM
post42264
|
The switch (a Cisco 2960) shows no errors whatsoever. I had our Cisco guy double-check.
During the test, port 12 is at 1G and port 20 is at 100 Mbits.
[switch port statistics screenshots attached]
Here's the output of our test program. See how it periodically drops from 11 Mbytes/sec to 0.8 Mbytes/sec:
dt=0.172495 max=2.433851 avg=0.285511 size 1000000 repsize 1000000 MB/s:11.594557
dt=0.172367 max=2.433851 avg=0.285438 size 1000000 repsize 1000000 MB/s:11.603162
dt=0.172126 max=2.433851 avg=0.285365 size 1000000 repsize 1000000 MB/s:11.619415
dt=2.431994 max=2.433851 avg=0.285292 size 1000000 repsize 1000000 MB/s:0.822371
dt=0.172438 max=2.433851 avg=0.285218 size 1000000 repsize 1000000 MB/s:11.598352
dt=0.172250 max=2.433851 avg=0.285143 size 1000000 repsize 1000000 MB/s:11.611001
dt=0.172565 max=2.433851 avg=0.285069 size 1000000 repsize 1000000 MB/s:11.589857
dt=0.171953 max=2.433851 avg=0.284995 size 1000000 repsize 1000000 MB/s:11.631065
dt=0.171935 max=2.433851 avg=0.284920 size 1000000 repsize 1000000 MB/s:11.632280
dt=0.172139 max=2.433851 avg=0.284846 size 1000000 repsize 1000000 MB/s:11.618490
dt=0.172000 max=2.433851 avg=0.284772 size 1000000 repsize 1000000 MB/s:11.627908
dt=0.171998 max=2.433851 avg=0.284698 size 1000000 repsize 1000000 MB/s:11.628027
dt=0.171937 max=2.433851 avg=0.284624 size 1000000 repsize 1000000 MB/s:11.632138
dt=0.172483 max=2.433851 avg=0.284551 size 1000000 repsize 1000000 MB/s:11.595367
dt=0.172416 max=2.433851 avg=0.284477 size 1000000 repsize 1000000 MB/s:11.599870
dt=0.171904 max=2.433851 avg=0.284404 size 1000000 repsize 1000000 MB/s:11.634398
dt=0.172030 max=2.433851 avg=0.284330 size 1000000 repsize 1000000 MB/s:11.625900
dt=0.172048 max=2.433851 avg=0.284257 size 1000000 repsize 1000000 MB/s:11.624649
dt=0.172090 max=2.433851 avg=0.284184 size 1000000 repsize 1000000 MB/s:11.621825
dt=0.172129 max=2.433851 avg=0.284110 size 1000000 repsize 1000000 MB/s:11.619210
dt=2.429998 max=2.433851 avg=0.285510 size 1000000 repsize 1000000 MB/s:0.823046
dt=0.172010 max=2.433851 avg=0.285436 size 1000000 repsize 1000000 MB/s:11.627258
dt=0.172233 max=2.433851 avg=0.285362 size 1000000 repsize 1000000 MB/s:11.612152
dt=0.172056 max=2.433851 avg=0.285289 size 1000000 repsize 1000000 MB/s:11.624139
dt=0.172416 max=2.433851 avg=0.285215 size 1000000 repsize 1000000 MB/s:11.599882
dt=0.172054 max=2.433851 avg=0.285142 size 1000000 repsize 1000000 MB/s:11.624279
dt=0.172320 max=2.433851 avg=0.285068 size 1000000 repsize 1000000 MB/s:11.606286
dt=0.172068 max=2.433851 avg=0.284995 size 1000000 repsize 1000000 MB/s:11.623291
dt=0.172515 max=2.433851 avg=0.284922 size 1000000 repsize 1000000 MB/s:11.593164
dt=0.172352 max=2.433851 avg=0.284849 size 1000000 repsize 1000000 MB/s:11.604183
dt=0.172341 max=2.433851 avg=0.284776 size 1000000 repsize 1000000 MB/s:11.604907
dt=0.172569 max=2.433851 avg=0.284703 size 1000000 repsize 1000000 MB/s:11.589583
dt=0.172406 max=2.433851 avg=0.284631 size 1000000 repsize 1000000 MB/s:11.600505
dt=0.172454 max=2.433851 avg=0.284558 size 1000000 repsize 1000000 MB/s:11.597285
dt=0.172146 max=2.433851 avg=0.284486 size 1000000 repsize 1000000 MB/s:11.618019
dt=0.172525 max=2.433851 avg=0.284413 size 1000000 repsize 1000000 MB/s:11.592529
dt=0.172551 max=2.433851 avg=0.284341 size 1000000 repsize 1000000 MB/s:11.590784
dt=1.841849 max=2.433851 avg=0.285346 size 1000000 repsize 1000000 MB/s:1.085865
Would you care for a Wireshark capture session? I could arrange that.
|
|
|
Chris Foran
12/04/2009 10:46 AM
post43206
|
Hi Mario,
Hugh has requested that I attempt to reproduce the issue here. Can you please attach the source for your test app? I
will set up two machines connected through a gigabit switch. I will force one to 1000 and the other to 100 and run your
test. How does that sound? Am I likely to be able to reproduce this?
-Chris
|
|
|
jikui su
12/08/2009 11:07 AM
post43374
|
Hi Mario,
Hugh has an idea about what might be happening, and has put together this driver as an experiment. Can you please try it and see if it yields a better result? Please note that it is of "experimental" status and is not to be used in any release-quality system. If it does help the problem, let us know, and we'll go from there.
Thanks,
-Chris
|
|
|
Mario Charest
12/08/2009 12:23 PM
post43379
|
Two test scenarios. I'm using two programs: one does a SendMsg of 1 Meg and the other a ReplyMsg of 1 Meg.
If the two machines are set up at 1G it seems good: with only the test program running, the logs are empty, not showing any entries, and the transfer rate is 103M/sec. If from machine B I start a file copy from C, then most transfers drop to 98M/sec, but once in a while it will drop to 3M/sec! Logs are full of "out-of-order" and "dup rx".
If one machine is set up at 100M and the other at 1G, transfer goes at 11 Mbytes per sec, which is good, but drops to 0.8 to 3M per second every 5 seconds or so.
|
|
|
Hugh Brown
10/21/2009 11:14 AM
post40438
|
Here's one with a maximum of 4096 descriptors.
Hugh.
|
|
|
Mario Charest
10/21/2009 11:32 AM
post40442
|
Awesome thanks !!!
|
|
|
Mario Charest
10/21/2009 1:35 PM
post40449
|
If I specify transmit=4096,receive=4096 it creates havoc in the system, even killing devb-eide. Programs like ifconfig block forever on io-pkt-v4.
With 4095 it's all good !?!
|
|
|
Hugh Brown
10/21/2009 2:36 PM
post40454
|
I don't know why it's causing havoc in your system, but to be on the safe
side, I can always limit it to 4095.
|
|
|
Mario Charest
10/21/2009 2:38 PM
post40455
|
That's not an issue for me now that I know what not to do; just wanted to let you know about it.
|
|
|
David Villarosa(deleted)
01/23/2012 2:54 PM
post91080
|
Hugh,
I downloaded your [DATE=2010/04/06-13:29:21-EDT] version of devnp-e1000.so and can successfully communicate with my
client via my host's GUI using QNX 6.4. We are also successfully using devn-e1000.so.
How can I get the latest official version of these two drivers? Our 6.4 support has recently expired.
Thanks for your support,
Dave Villarosa
|
|
|
Hugh Brown
01/23/2012 3:00 PM
post91081
|
Dave,
Your best bet is to download the latest x86 BSP from Foundry27 as it
includes the latest e1000 driver as well as the source.
Hugh.
--
Hugh Brown
QNX Software Systems Limited
1001 Farrar Rd.,
Ottawa. ON. K2K 1Y5.
Telephone: 613-591-0931
|
|
|
|