Norbert Wassmer
12/17/2007 6:19 AM
post3815
|
Hi,
I hope this is the right forum for my question.
I'm using QNX4 to control many of our test benches on a 25-node network.
Now I plan to switch over to Neutrino.
To test some things (especially qnet), I have connected two Neutrino computers, called nto1 and nto2, with a crossover
cable.
I started qnet (with no options) and I can access nto2 from nto1.
So far everything is ok, but
- when I call sin -n nto2 from nto1 the program crashes with a core dump
- when I call pidin -n nto2 I get the list but sometimes (often) the output stops for about 10 seconds and continues
after that delay.
- I have started a small server program on nto2. Then I connect a client, which runs on nto1, to the server. The client
sends 1000 requests to the server (message length 4 bytes). To see what happens, the server prints out the number of the
current request. The output stops after a certain number of requests (again for about 10 seconds), continues, stops again,
continues....
What is the reason for these delays?
In the attachment you will find some info about my configuration.
Thanks
Norbert
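One way to pin down where the output stalls is to timestamp each line as it arrives and report long gaps. The following is only a minimal sketch and not part of the original setup: the helper names stamp and report_stalls are hypothetical, and it assumes a shell where awk and date +%s (a common extension, not strict POSIX) are available.

```shell
# Prefix each input line with the current time in epoch seconds.
# Assumption: `date +%s` is supported on the system where this runs.
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date +%s)" "$line"
    done
}

# Read "epoch text" lines on stdin and report every gap between
# consecutive lines that is at least $1 seconds long.
report_stalls() {
    awk -v limit="$1" '
        NR > 1 && ($1 - prev) >= limit {
            printf "stall of %d s before: %s\n", $1 - prev, $0
        }
        { prev = $1 }
    '
}
```

For example, pidin -n nto2 | stamp | report_stalls 5 would list every pause of five seconds or more, together with the line that arrived right after the pause.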
|
|
|
Robert Craig
12/17/2007 11:46 AM
post3829
|
Hi Norbert:
While pidin is the preferred mechanism for getting system
information, sin certainly shouldn't cause a crash. It's possible that your
program isn't correctly handling one of the messages that sin sends out and
that's what's causing the crash. That should be pretty easy to debug.
There certainly appears to be some sort of driver problem on nto2. I can
see one nicinfo output (which I'm assuming is from nto1?) but not the other.
Have you started up io-net manually or is this through the enumeration
process done at start up?
The nto2 logs show that packets have been sent to the driver to be
transmitted but are not being transmitted and the packet that was requested
to be sent is still "owned" by the driver (hence the "in progress" log...
This typically means that the packet has been placed in the transmit
descriptor ring but hasn't been transmitted). At this point, qnet sends a
flush request to the driver in order to free up the packet but that doesn't
appear to be having any effect.
If we can have the nicinfo information from nto2 and (if you have them) the
command lines used to start io-net that would help us see if there's a
driver problem.
If you're sending 1000 requests in short order, another possibility is
that you are running out of buffers.
You can slay io-net and re-start it as follows to provide more buffers to
the driver:
sloginfo -c
io-net -di82544 receive=2048,transmit=2048,verbose -pqnet
(you may also have to add the -ptcpip option if you need IP connectivity as
well)
Run through your tests and then do another "sloginfo" on both nodes and see
if there are any driver error messages printed out.
Robert.
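The restart sequence described above can be collected into one small script. This is a sketch under stated assumptions: the function names run and restart_ionet_with_buffers and the DRY_RUN variable are hypothetical helpers added for illustration, while the commands themselves are exactly the ones quoted in this post. With DRY_RUN=1 the commands are only printed, which allows a check before touching a live node.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop io-net, clear the system log, and start io-net again with the
# larger receive/transmit buffer counts suggested in the post.
restart_ionet_with_buffers() {
    run slay io-net
    run sloginfo -c    # clear the log so only new messages show up later
    run io-net -di82544 receive=2048,transmit=2048,verbose -pqnet
}
```

After the tests have run, sloginfo on both nodes should then show only messages logged since the restart.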
|
|
|
Xiaodan Tang(deleted)
12/17/2007 12:00 PM
post3830
|
I would suggest using a hub or switch instead of the crossover cable first, to see if that helps.
|
|
|
Norbert Wassmer
12/18/2007 7:19 AM
post3861
|
Hello Robert, hello Xiaodan,
thank you for your answers.
First I used a Cisco switch instead of the crossover cable
--> no luck, same behavior!
Next I started io-net as you suggested in your reply: io-net -di82544 receive=2048,transmit=2048,verbose -pqnet
--> no luck (pidin -n /net/nto2 needs up to 26 seconds to finish)
I have noticed the following:
If I call pidin -n /net/nto1 from nto2, qnetstats on nto1 fills up with error messages, while no error messages
appear in qnetstats on nto2, and vice versa. Only the "remote" host has error messages in its qnetstats file
(is this the expected behavior?).
In the attachment you can find the outputs...
Thanks,
Norbert
|
|
|
Xiaodan Tang(deleted)
12/18/2007 12:36 PM
post3877
|
The error message basically says that QNET gave the driver a packet to transmit and the driver isn't giving it back (the
driver is supposed to give it back once it has put the data on the wire). After about 10 seconds, QNET gives up.
This sounds to me like either a driver problem or a hardware issue.
Any chance you could change to a different network card? Is nto1 also running 6.3.2 with all the latest software?
|
|
|
Norbert Wassmer
12/19/2007 6:02 AM
post3897
|
Hello Xiaodan,
Both computers have exactly the same configuration (hardware and software).
I added a second network card to nto1 (8086h, 1229h, 82557/8/9 [Ethernet Pro 100], driver devn-speedo.so).
Now everything works fine (no delays)!
I think it doesn't matter in which computer (nto1 or nto2) I add the second network card;
the problem exists only when I connect two 82544 cards together. Can I do something to find out
what kind of problem we have (driver or hardware)?
BTW: how can I define at startup which network card is assigned to en0 and which one to en1?
Thanks,
Norbert
|
|
|
Robert Craig
12/19/2007 5:35 PM
post3939
|
At least we've isolated it down to a hardware or driver issue at this point.
Just to check, if you replace one card, does this mean that communications
in both directions now work correctly (i.e. no qnet errors?). If you put
the new card in the other machine, do you see the same behaviour? It's
possible that one of those network cards is simply broken.
The ordering of the interfaces is (I believe) determined by the order in
which they're found on the PCI bus. You can probably modify the enumerator
configuration files to get the numbering, but it may be just as easy to slay
and re-start io-net with specific driver ordering in the option list and
you'll get the numbers falling out from that.
Robert.
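The "specific driver ordering in the option list" idea can be sketched as follows. This is only an illustration: the function name build_ionet_cmd is hypothetical, the driver names are the ones mentioned in this thread (devn-speedo.so and devn-i82544.so, given without the devn- prefix and .so suffix), and whether the first driver listed really becomes en0 is the belief stated above, not something confirmed here. The function only prints the command line so it can be reviewed before io-net is actually restarted.

```shell
# Build an io-net command line that loads the given drivers in order
# and then the qnet protocol module. Nothing is executed here.
build_ionet_cmd() {
    cmd="io-net"
    for drv in "$@"; do
        cmd="$cmd -d$drv"
    done
    echo "$cmd -pqnet"
}
```

Calling build_ionet_cmd speedo i82544 prints io-net -dspeedo -di82544 -pqnet; swapping the two arguments reverses the load order.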
|
|
|
Mario Charest
12/19/2007 6:08 PM
post3940
|
> At least we've isolated it down to a hardware or driver issue at this point.
>
Just want to drop in to mention that we have a similar problem and are using similar hardware. I'm told Jason Clark (my
contact on this issue) is in touch with you on this matter.
- Mario
|
|
|
Norbert Wassmer
12/21/2007 7:36 AM
post3960
|
Hello Robert,
if I put the devn-speedo network card into the second computer, I see the same behavior:
If I call pidin -n /net/nto... on the node which uses devn-speedo.so for qnet, there are no delays
and no error messages. If I call pidin -n /net/nto... on the computer which uses the 82544 card for qnet,
there are short delays and a number of error messages in both qnetstats files (see the attachment).
Here is a summary:
1)
nto1 (Ethernet Pro 100, devn-speedo.so) --- nto2 (devn-i82544.so)
calling pidin -n /net/nto2 on nto1: no delays and no error messages in the qnetstats files on both computers
calling pidin -n /net/nto1 on nto2: short delays and a number of error messages in the qnetstats files on both computers
2)
nto1 (devn-i82544.so) --- nto2 (Ethernet Pro 100, devn-speedo.so)
calling pidin -n /net/nto2 on nto1: short delays and a number of error messages in the qnetstats files on both computers
calling pidin -n /net/nto1 on nto2: no delays and no error messages in the qnetstats files on both computers
3)
nto1 (3Com 9055, devn-el900.so) --- nto2 (Ethernet Pro 100, devn-speedo.so)
no delays and no error messages at all (everything is ok)
4)
nto1 (devn-i82544.so) --- nto2 (devn-i82544.so) 10 MBit, half duplex
no delays and no error messages at all (everything is ok)
Hope this helps.
Norbert
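To compare the configurations above by numbers rather than by eye, the error count in qnetstats can be sampled before and after each pidin run. A minimal sketch with several assumptions: the function name new_errors is hypothetical, the two snapshots are assumed to be plain-text copies of the qnetstats output saved to files (reading it from /proc/qnetstats is an assumption about where it lives), and the pattern "in progress" is taken from the log wording quoted earlier in this thread.

```shell
# Print how many more lines matching pattern $1 appear in snapshot
# file $3 than in snapshot file $2.
# grep -c exits non-zero when the count is 0, hence the "|| true".
new_errors() {
    before=$(grep -c -- "$1" "$2" || true)
    after=$(grep -c -- "$1" "$3" || true)
    echo $((after - before))
}
```

A possible use, under the same assumptions: save one snapshot, run pidin -n /net/nto2, save a second snapshot, then call new_errors "in progress" with the two file names. A result of 0 would match the "no error messages" rows in the summary.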
|
|
|
Robert Craig
12/23/2007 1:36 PM
post3980
|
Hmmm.... So it's definitely tied to the Intel Network cards. I believe that those cards are PCI-Express aren't they?
We have seen some strange things with PCI-Express involving the card FIFOs not being properly serviced resulting in
dropped packets, but this usually only happens for Gigabit rates. Out of curiosity, do you have the latest BIOS
installed?
There was a new version of the i82544 driver released recently
(http://www.qnx.com/download/feature.html?programid=17094). Is this the version of the driver that you're running? I
can't see anything specifically added that would fix the behaviour that you're seeing, but it wouldn't hurt to give it a
shot.
Robert.
|
|
|
Norbert Wassmer
12/28/2007 7:23 AM
post3998
|
Hello Robert,
I'm out of the office until Jan 07. I will give you feedback when I'm back in the office again.
Norbert
|
|
|
Norbert Wassmer
01/09/2008 1:14 AM
post4134
|
Hello Robert,
I have installed the new version of the driver from Nov 07.
The behavior is the same as before. There are long delays when I connect two i82544 cards together.
BIOS? I hope the BIOS is up to date (the computers are about 1 year old).
PCI-Express? The Ethernet controller is onboard.
Norbert
|
|
|
Robert Craig
01/09/2008 1:22 PM
post4152
|
Hi Norbert:
From the PCI device ID (0x108c) I can see that this is an Intel
82573E which is indeed a PCI Express card (just hardwired on the
motherboard). Nicinfo isn't showing any packets getting dropped, which is
good, but that "still in progress" error seems to point to the card not
transmitting the data.
I've asked someone else who is far more familiar with the driver to see if
he can figure out what's going on.
It would be very interesting to see if the io-pkt version of the driver has
the same problem. This would, of course, involve downloading and setting up
the new networking stack...
Robert.
|
|
|
Norbert Wassmer
01/11/2008 8:11 AM
post4212
|
Re: RE: AW: RE: qnet delays
Hi Robert,
I tried the io-pkt stack but I have some difficulties.
Because I have problems getting the source code with svn, I downloaded the binaries from Project Downloads.
I installed the package in /root/stage and set the path environment variables.
# slay io-net
First attempt: use the Ethernet Pro (speedo.so) to establish a qnet connection
# io-pkt-v4-hc & --> ok
# mount -T io-pkt devnp-speedo.so --> ok
# setup IP-Addr. with phlip (shelf)
# ping my second computer --> ok
# mount -T io-pkt /root/stage/x86/lib/dll/npm-qnet-l4_lite.so --> failed
mount: Can't mount / (type io-pkt)
mount: Possible reason: No such process
Next attempt: use the i82544 card for TCP/IP communication
# io-pkt-v4-hc & --> ok
# mount -T io-pkt devnp-i82544.so --> ok
# setup IP-Addr. with phlip (shelf)
# nicinfo
wm0: INTEL 82544 Gigabit (Copper) Ethernet Controller
Physical Node ID ........................... FFFFFF FFFFFF
Current Physical Node ID ................... FFFFFF FFFFFF
Current Operation Rate ..................... 1000.00 Mb/s full-duplex
Active Interface Type ...................... MII
(The Node ID is wrong and the Operation Rate is wrong; it should be 100 Mb/s.)
The card is connected to a 100 Mb/s network.
TCP/IP doesn't work.
In the attachment you can find the output from sloginfo and nicinfo.
What is going wrong?
I cannot check out the source code because svn fails (firewall?).
Here is what svn reports:
# svn checkout --username norbert.wassmer@de.bosch.com http://community.qnx.com/svn/repos/core_networking
svn: PROPFIND request failed on '/svn/repos/core_networking'
svn: PROPFIND of '/svn/repos/core_networking': Could not resolve hostname `community.qnx.com': No address associated with name (http://community.qnx.com)
Is there another way to get the source code?
Thanks,
Norbert
|
|
|
|