Forum Topic - qnet delays: (14 Items)
   
qnet delays  
Hi,

I hope this is the right forum for my question.....

I'm using QNX4 to control many of our test benches on a 25-node network.
Now I plan to switch over to Neutrino.
To test some things (especially qnet), I have connected two Neutrino computers called nto1 and nto2 (using a crossover connection).

I started qnet (with no options) and I can access nto2  from  nto1.

So far everything is ok, but

- When I call  sin -n nto2  from nto1, the program crashes with a core dump.

- When I call  pidin -n nto2  I get the list, but sometimes (often) the output stops for about 10 seconds and continues after that delay.

- I have started a small server program on nto2. Then I connect a client, which runs on nto1, to the server. The client sends 1000 requests to the server (message length 4 bytes). To see what happens, the server prints out the number of the current request. The output stops after a certain number of requests (again for about 10 seconds), continues, stops again, continues....
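
The pauses described above can be made measurable with a small filter. The following is only a sketch and was not part of the original setup: detect_stalls is a hypothetical helper name, and it assumes a shell whose date command supports the +%s format (seconds since the epoch), which may not hold on every system. Resolution is whole seconds, which is enough for pauses of about 10 seconds.

```shell
#!/bin/sh
# detect_stalls THRESHOLD
# Hypothetical helper: reads lines on stdin and reports every gap of at
# least THRESHOLD seconds between two consecutive lines.
# Assumption: `date +%s` is available (whole-second resolution only).
detect_stalls() {
    threshold=$1
    last=$(date +%s)
    while IFS= read -r line; do
        now=$(date +%s)
        gap=$((now - last))
        if [ "$gap" -ge "$threshold" ]; then
            echo "stall: ${gap}s before line: $line"
        fi
        last=$now
    done
}
```

For example,  pidin -n nto2 | detect_stalls 5  would print one line for each pause of five seconds or more in the pidin output; the same filter could be attached to the request counter printed by the server.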

What is the reason for these delays?

In the attachment you will find some info about my configuration.

Thanks 

Norbert
Attachment: Text qnet.txt 16.27 KB
RE: qnet delays  
Hi Norbert:
	While pidin is the preferred mechanism for getting system
information, sin certainly shouldn't cause a crash.  It's possible that your
program isn't correctly handling one of the messages that sin sends out and
that's what's causing the crash.  That should be pretty easy to debug.

There certainly appears to be some sort of driver problem on nto2.  I can
see one nicinfo output (which I'm assuming is from nto1?) but not the other.
Have you started up io-net manually or is this through the enumeration
process done at start up?

The nto2 logs show that packets have been sent to the driver to be
transmitted but are not being transmitted, and the packet that was requested
to be sent is still "owned" by the driver (hence the "in progress" log).
This typically means that the packet has been placed in the transmit
descriptor ring but hasn't been transmitted.  At this point, qnet sends a
flush request to the driver in order to free up the packet but that doesn't
appear to be having any effect.

If we can have the nicinfo information from nto2 and (if you have them) the
command lines used to start io-net that would help us see if there's a
driver problem.


If you're sending 1000 requests in short order, another possibility is
that you are running out of buffers.

You can slay io-net and re-start it as follows to provide more buffers to
the driver:

sloginfo -c 
io-net -di82544 receive=2048,transmit=2048,verbose -pqnet

(you may also have to add the -ptcpip option if you need IP connectivity as
well)

Run through your tests and then do another "sloginfo" on both nodes and see
if there are any driver error messages printed out.
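
To make the before/after comparison less dependent on reading the log by eye, the sloginfo output can be piped through a simple counter. This is a sketch only: count_matches is a hypothetical helper name, and the search text "in progress" is taken from the log wording mentioned earlier in this reply; the exact format of the driver messages is an assumption and may need adjusting.

```shell
#!/bin/sh
# count_matches PATTERN
# Hypothetical helper: prints how many stdin lines contain PATTERN
# (fixed string, case-insensitive). Prints 0 when nothing matches.
count_matches() {
    # grep -c exits non-zero when the count is 0; "|| true" hides that
    # so the function can be used safely in scripts run with "set -e".
    grep -ciF -- "$1" || true
}
```

On each node this could be used as  sloginfo | count_matches "in progress"  once before and once after running the tests, so the two counts can be compared.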


	Robert.


-----Original Message-----
From: Norbert Wassmer [mailto:norbert.wassmer@de.bosch.com] 
Sent: Monday, December 17, 2007 6:19 AM
To: technology-networking
Subject: qnet delays

Re: qnet delays  
I would suggest using a hub or switch instead of the crossover cable first, to see if that helps.
Re: qnet delays  
Hello Robert, hello Xiaodan,

thank you for your answers.

First I used a Cisco-Switch instead of the crossover cable
 --> no luck, same behavior!

Next I started io-net as you suggested in your reply   io-net -di82544 receive=2048,transmit=2048,verbose -pqnet
--> no luck (pidin -n /net/nto2  needs up to 26 seconds to finish)

I have noticed the following:
If I call pidin -n /net/nto1  from nto2, qnetstats on nto1 fills up with error messages. No error messages
appear in qnetstats on nto2, and vice versa. Only the "remote" host has error messages in its qnetstats file
(is this the expected behavior?).

In the attachment you can find the outputs...

Thanks,

Norbert
Attachment: Text qnet1.txt 30.94 KB
Re: qnet delays  
The error message basically says that QNET gave the driver a packet to transmit and the driver isn't giving it back (the driver is supposed to give it back once it has put the data on the wire). After about 10 seconds, QNET gives up.

This sounds to me like either a driver problem or a hardware issue.

Any chance you could change to a different network card? Is nto1 also running 6.3.2 with all the latest software?
Re: qnet delays  
Hello Xiaodan,

Both computers have exactly the same configuration (Hard- and Software).

I added a second network card into nto1 (8086h, 1229h, 82557/8/9 [Ethernet Pro 100] driver devn-speedo.so)

Now everything works fine (no delays) !!! 

I think it doesn't matter in which computer (nto1 or nto2) I add the second network card;
the problem exists only when I connect two 82544 cards together. Can I do something to find out
what kind of problem we have (driver or hardware)?

BTW: how can I define at start up which network card is assigned to en0 and which one is assigned to en1?

Thanks,

Norbert
RE: qnet delays  
At least we've isolated it down to a hardware or driver issue at this point.

Just to check, if you replace one card, does this mean that communications
in both directions now work correctly (i.e. no qnet errors?).  If you put
the new card in the other machine, do you see the same behaviour?  It's
possible that one of those network cards is simply broken.

The ordering of the interfaces is (I believe) determined by the order in
which they're found on the PCI bus.  You can probably modify the enumerator
configuration files to get the numbering you want, but it may be just as easy
to slay and re-start io-net with specific driver ordering in the option list
and you'll get the numbers falling out from that.
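
As a rough illustration of the second approach, the helper below only prints an io-net command line with the drivers in a chosen order; it does not run anything. io_net_cmd is a hypothetical name, and the sketch assumes that -d takes the driver name without the devn- prefix, as in the -di82544 example earlier in this thread. Whether en0 and en1 really follow the order of the -d options is the belief stated above, not something verified here.

```shell
#!/bin/sh
# io_net_cmd DRIVER...
# Hypothetical helper: prints an io-net command line that loads the given
# drivers in the order listed, followed by the qnet protocol module.
# Assumption: "-d<name>" loads devn-<name>.so, as with -di82544.
io_net_cmd() {
    cmd="io-net"
    for drv in "$@"; do
        cmd="$cmd -d$drv"
    done
    echo "$cmd -pqnet"
}
```

Calling  io_net_cmd speedo i82544  prints  io-net -dspeedo -di82544 -pqnet , which could then be run after  slay io-net  to check which card ends up as en0.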

	Robert.


-----Original Message-----
From: Norbert Wassmer [mailto:norbert.wassmer@de.bosch.com] 
Sent: Wednesday, December 19, 2007 6:02 AM
To: technology-networking
Subject: Re: qnet delays

Re: RE: qnet delays  
> At least we've isolated it down to a hardware or driver issue at this point.
> 

Just want to drop in to mention that we have a similar problem and are using similar hardware.  I'm told Jason Clark (my contact on this issue) is in touch with you on this matter.

- Mario
Re: RE: qnet delays  
Hello Robert,

if I put the devn-speedo network card into the second computer I can see the same behavior:
If I call pidin -n /net/nto...  on the node which uses devn-speedo.so for qnet, there are no delays
and no error messages. If I call pidin -n /net/nto... on the computer which uses the 82544 card for qnet,
there are short delays and a number of error messages in both qnetstats files (see the attachment).

Here is a summary:

1)
nto1 (Ethernet Pro 100, devn-speedo.so)  ---  nto2  (devn-i82544.so)
calling pidin -n /net/nto2 on nto1:  no delays and no error messages in the qnetstats files on both computers
calling pidin -n /net/nto1 on nto2:  short delays and a number of error messages in the qnetstats files on both computers

2)
nto1 (devn-i82544.so) --- nto2 (Ethernet Pro 100, devn-speedo.so)
calling pidin -n /net/nto2 on nto1:  short delays and a number of error messages in the qnetstats files on both computers
calling pidin -n /net/nto1 on nto2:  no delays and no error messages in the qnetstats files on both computers

3)
nto1 (3Com 9055, devn-el900.so) --- nto2 (Ethernet Pro 100, devn-speedo.so)
no delays and no error messages at all (everything is ok)

4)
nto1 (devn-i82544.so) --- nto2 (devn-i82544.so)   10 MBit, half duplex
no delays and no error messages at all (everything is ok)

hope this helps....

Norbert
Attachment: Text qnetstat.txt 9.97 KB
Re: RE: qnet delays  
Hmmm....  So it's definitely tied to the Intel network cards.  I believe that those cards are PCI-Express, aren't they?
We have seen some strange things with PCI-Express involving the card FIFOs not being properly serviced, resulting in
dropped packets, but this usually only happens at Gigabit rates.  Out of curiosity, do you have the latest BIOS
installed?

There was a new version of the i82544 driver released recently
(http://www.qnx.com/download/feature.html?programid=17094).  Is this the version of the driver that you're running?  I
can't see anything specifically added that would fix the behaviour that you're seeing, but it wouldn't hurt to give it a shot.


     Robert.
AW: RE: qnet delays  
Hello Robert,

I'm out of the office until Jan 07. I will give you feedback when I'm back in the office again.

Norbert
 

-----Original Message-----
From: Robert Craig [mailto:rcraig@qnx.com] 
Sent: Sunday, December 23, 2007 19:36
To: technology-networking
Subject: Re: RE: qnet delays

Re: AW: RE: qnet delays  
Hello Robert,

I have installed the new version of the driver from Nov 07. 
The behavior is the same as before. There are long delays when I connect two i82544 cards together.

BIOS?  I hope the BIOS is up to date (the computers are about 1 year old).
PCI-Express?  The Ethernet controller is onboard.

Norbert
RE: AW: RE: qnet delays  
Hi Norbert:
	From the PCI device ID (0x108c) I can see that this is an Intel
82573E which is indeed a PCI Express card (just hardwired on the
motherboard).  Nicinfo isn't showing any packets getting dropped, which is
good, but that "still in progress" error seems to point to the card not
transmitting the data.

I've asked someone else who is far more familiar with the driver to see if
he can figure out what's going on.  

It would be very interesting to see if the io-pkt version of the driver has
the same problem.  This would, of course, involve downloading and setting up
the new networking stack...

	Robert.



-----Original Message-----
From: Norbert Wassmer [mailto:norbert.wassmer@de.bosch.com] 
Sent: Wednesday, January 09, 2008 1:15 AM
To: technology-networking
Subject: Re: AW: RE: qnet delays

Re: RE: AW: RE: qnet delays  
Hi Robert,

I tried the io-pkt stack but I have some difficulties.
Because I have problems getting the source code with svn, I have downloaded the binaries from Project Downloads.
I installed the package in /root/stage and set the path environment variables.

# slay io-net

First try: the Ethernet Pro (speedo.so), to establish a qnet connection
# io-pkt-v4-hc &   --> ok
# mount -T io-pkt devnp-speedo.so  --> ok
# setup IP-Addr. with phlip  (shelf)
# ping my second computer --> ok
# mount -T io-pkt /root/stage/x86/lib/dll/npm-qnet-l4_lite.so  --> failed 
   mount: Can't mount / (type io-pkt)
   mount: Possible reason: No such process



Next try: the i82544 card, for TCP/IP communication

# io-pkt-v4-hc &   --> ok
# mount -T io-pkt devnp-i82544.so  --> ok
# setup IP-Addr. with phlip  (shelf)
# nicinfo

wm0: INTEL 82544 Gigabit (Copper) Ethernet Controller

  Physical Node ID ........................... FFFFFF FFFFFF
  Current Physical Node ID ................... FFFFFF FFFFFF
  Current Operation Rate ..................... 1000.00 Mb/s full-duplex
  Active Interface Type ...................... MII

(Node ID wrong; Operation Rate wrong, it's 100 Mb/s)

The card is connected to a 100 Mb/s network.
TCP/IP doesn't work....

In the attachment you can find the output from sloginfo and nicinfo

What is going wrong?

I cannot check out the source code because svn failed (Firewall,...???)
Here is what svn reports:
# svn checkout --username norbert.wassmer@de.bosch.com http://community.qnx.com/svn/repos/core_networking
svn: PROPFIND request failed on '/svn/repos/core_networking'
svn: PROPFIND of '/svn/repos/core_networking': Could not resolve hostname `community.qnx.com': No address associated with name (http://community.qnx.com)

Is there another way to get the source code?


Thanks,

Norbert
Attachment: Text iopkt.log 4.65 KB