Forum Topic - local UDP Echo test three times slower on QNX than on Linux: (12 Items)
   
local UDP Echo test three times slower on QNX than on Linux  
Hi Robert and all,

A customer recently reported that, in a comparison against a realtime Linux, a small local UDP echo test program
appears to be almost 3 times slower on QNX 6.4. ThomasH and I have analysed it and changed the way it measures the
duration of single echoes to use ClockCycles(). The client can print a nice histogram after running.
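
For readers without the attachment, the per-echo histogram idea can be sketched roughly as follows. This is not the code from udp.tar: the bucket count, bucket width and all function names are made up for illustration, and clock_gettime(CLOCK_MONOTONIC) stands in for ClockCycles() so the sketch also builds on Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical layout: 16 buckets of 10 us each; the last one collects overflow. */
enum { HIST_BINS = 16, HIST_BIN_WIDTH_US = 10 };

/* Monotonic timestamp in nanoseconds. On QNX one would read ClockCycles()
 * instead and scale by the cycles-per-second value from the system page. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Map a duration to a bucket index; anything too long lands in the last bucket. */
unsigned hist_bin(uint64_t duration_ns)
{
    uint64_t bin = duration_ns / 1000u / (uint64_t)HIST_BIN_WIDTH_US;
    return bin >= (uint64_t)HIST_BINS ? (unsigned)HIST_BINS - 1u : (unsigned)bin;
}

/* Record one echo that started at start_ns and completed at end_ns. */
void hist_add(unsigned long hist[HIST_BINS], uint64_t start_ns, uint64_t end_ns)
{
    assert(end_ns >= start_ns);
    hist[hist_bin(end_ns - start_ns)]++;
}

/* Print one line per bucket; the final line covers everything above the range. */
void hist_print(const unsigned long hist[HIST_BINS])
{
    unsigned i;
    for (i = 0; i + 1u < (unsigned)HIST_BINS; i++)
        printf("%3u-%3u us: %lu\n", i * (unsigned)HIST_BIN_WIDTH_US,
               (i + 1u) * (unsigned)HIST_BIN_WIDTH_US, hist[i]);
    printf("  >=%3u us: %lu\n", i * (unsigned)HIST_BIN_WIDTH_US, hist[i]);
}
```

A client would call now_ns() right before its sendto and right after the matching recvfrom, pass both values to hist_add(), and call hist_print() once the series is done.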

We've taken kernel logs and analyzed them while both the UDP client and server ran at priority 60. I'm attaching the
program (the log follows in the next post). The logs were taken on a 500 MHz x86 (their final target is an MPC 5200).

Most of the time is being spent in io-pkt. Of course the message passing brings in some overhead here for almost no
data in the packets, but is that all? Can the programs (server and client) be optimized, or are there tricks to
configure io-pkt to improve this test scenario?


- Malte
Attachment: Compressed file udp.tar 30 KB
Re: local UDP Echo test three times slower on QNX than on Linux  
Here's a screenshot of the log. 
Attachment: Image udp1.png 97.28 KB
Re: local UDP Echo test three times slower on QNX than on Linux  
The kernel trace log. It clearly shows io-pkt being very busy. It also seems that the client wakes up a second time
before the server processes the first echo, though I'm not sure my interpretation is correct.
Attachment: Compressed file udp-log.zip 2.97 MB
RE: local UDP Echo test three times slower on QNX than on Linux  
Hi Malte:
	We know that small packet performance is the worst case scenario
given our architecture, but you're right that there appears to be
"something" in the stack consuming CPU rather than it being purely an
artifact of the message passing.  This is going to take some time to
analyze so stay tuned.

	Robert.


-----Original Message-----
From: Malte Mundt [mailto:community-noreply@qnx.com] 
Sent: Monday, January 26, 2009 9:34 AM
To: technology-networking
Subject: Re: local UDP Echo test three times slower on QNX than on Linux

The kernel trace log. It clearly shows io-pkt being very busy, and also
it seems the client wakes up a 2nd time before the server processes the
first echo, not sure if my interpretation is correct though.

_______________________________________________
Technology
http://community.qnx.com/sf/go/post20779
Re: RE: local UDP Echo test three times slower on QNX than on Linux  
Thanks Robert, I'm staying tuned, and I've also let our customer know about this posting so he can follow it as well.


- Malte
RE: local UDP Echo test three times slower on QNX than on Linux  
Quick note on the client waking up a second time... 

You can see from the events that io-pkt is broken up into long and short
processing intervals.  

The shorter interval happens as a result of a recvfrom from the app
(i.e. the app is asking for data but there's none immediately
available).  io-pkt is squirreling away the information that someone is
waiting for data (and also does a quick check to see if any other
threads have dropped IP packets into the queue to be processed).
 
The long interval is a sendto transaction.  The localhost loopback
results in a transmitted packet being looped back and immediately
received, so you'll see a combination of a udp_output followed by a
udp_input (which then results in the data being sent to the other app,
which is reply blocked in its recvfrom function).

The long / short intervals ping pong back and forth between the svr and
client as they switch between sendto and recvfrom modes.
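
To make the sendto / recvfrom ping-pong concrete, here is a minimal single-process sketch of one echo over the loopback interface. It is not the attached udpcli / server code: the function names, the 512-byte buffer and the use of ephemeral ports are assumptions made for illustration, and both "client" and "server" sockets live in the same process to keep it short.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a UDP socket bound to 127.0.0.1 on an ephemeral port.
 * The address actually chosen by the stack is returned in *addr. */
int udp_loopback_socket(struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0; /* let the stack pick a free port */
    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* One echo: client sendto -> server recvfrom -> server sendto -> client recvfrom.
 * Returns the number of bytes echoed back to the client, or -1 on error. */
ssize_t echo_round_trip(int cli, int srv, const struct sockaddr_in *srv_addr,
                        const void *msg, size_t len, void *reply, size_t reply_cap)
{
    char buf[512]; /* illustrative limit for the server-side buffer */
    struct sockaddr_in peer;
    socklen_t peer_len = sizeof peer;
    ssize_t n;

    assert(len <= sizeof buf);
    if (sendto(cli, msg, len, 0,
               (const struct sockaddr *)srv_addr, sizeof *srv_addr) < 0)
        return -1;
    /* On loopback the datagram is already queued, so this does not block. */
    n = recvfrom(srv, buf, sizeof buf, 0, (struct sockaddr *)&peer, &peer_len);
    if (n < 0)
        return -1;
    if (sendto(srv, buf, (size_t)n, 0, (struct sockaddr *)&peer, peer_len) < 0)
        return -1;
    return recvfrom(cli, reply, reply_cap, 0, NULL, NULL);
}
```

In the real test the two halves run in separate processes, so each sendto hands the packet to the stack and wakes the peer that is blocked in recvfrom; that hand-off is the long interval described above.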

I found the placement of the trace events in the server app to be a bit
confusing and moved one to before the recvfrom.  It made things much
clearer for me...

	Robert.				

RE: local UDP Echo test three times slower on QNX than on Linux  
Hi Malte:

I've spent some time looking into this and need a bit of feedback.  I've
got two 2.1 GHz dual core Intel boxes with Intel GigE (PCI Express
cards) in them.  I've got them dual booting Ubuntu 8.04 and Neutrino
6.4.0.

I slightly modified the udpcli.c to time the series command (i.e. print
out the amount of time it takes to do the given number of series instead
of doing a histogram).

On Neutrino, localhost gives me (for 100,000 reps) ~2.2 s as compared to
~1.2 seconds for Linux... This is less than a factor of 2 which is not
bad (but we still have to do the analysis to figure out where the time
is going).

The really interesting thing is what happens when you try the same
application using real hardware instead of over localhost.  With the new
version of devnp-i82544 that we have, if you run the stack on both sides
with

io-pkt-v4 -di82544 irq_thresh=0  

100000 reps takes 8.6 seconds.  The same hardware on Linux is giving me
10.6 seconds, so we're 25% faster than Linux in this case.

Is there any way that you can confirm this? I suspect that this will be
HIGHLY dependent upon the driver...
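
For anyone comparing their own runs against these figures, the totals can be broken down per echo with a couple of trivial helpers. The function names below are made up for illustration; the inputs in the usage note are the numbers quoted above.

```c
#include <assert.h>

/* Average time per echo, in microseconds, for a timed series of `reps` echoes. */
double per_echo_us(double total_seconds, unsigned long reps)
{
    assert(reps > 0);
    return total_seconds * 1e6 / (double)reps;
}

/* How many times longer the slower run takes compared with the faster one. */
double slowdown(double slower_seconds, double faster_seconds)
{
    assert(faster_seconds > 0.0);
    return slower_seconds / faster_seconds;
}
```

With 100,000 reps, per_echo_us() gives about 22 us per echo for Neutrino and 12 us for Linux over localhost, and about 86 us versus 106 us over the GigE link. slowdown(2.2, 1.2) comes out at roughly 1.8 and slowdown(10.6, 8.6) at roughly 1.23.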

   Robert.



Re: RE: local UDP Echo test three times slower on QNX than on Linux  
Hi Robert, 

> Hi Malte:
> 
> I've spent some time looking into this and need a bit of feedback.  I've
> got two 2.1 GHz dual core Intel boxes with Intel GigE (PCI Express
> cards) in them.  I've got them dual booting Ubuntu 8.04 and Neutrino
> 6.4.0.
> 
> I slightly modified the udpcli.c to time the series command (i.e. print
> out the amount of time it takes to do the given number of series instead
> of doing a histogram).
> 
> On Neutrino, localhost gives me (for 100,000 reps) ~2.2 s as compared to
> ~1.2 seconds for Linux... This is less than a factor of 2 which is not
> bad (but we still have to do the analysis to figure out where the time
> is going).
> 
> The really interesting thing is what happens when you try the same
> application using real hardware instead of over localhost.  With the new
> version of devnp-i82544 that we have, if you run the stack on both sides
> with
> 
> io-pkt-v4 -di82544 irq_thresh=0  
> 
> 100000 reps takes 8.6 seconds.  The same hardware on linux is giving me
> 10.6 seconds, so we're 25% faster than Linux in this case.
> 
> Is there anyway that you can confirm this? I suspect that this will be
> HIGHLY dependent upon the driver...

I think you are right, the driver is the most important factor.
Some of our QNX embedded targets are very, very slow on the network.
The current network drivers for ppc405gpr, ppc440epx and mpc5200 are far from optimized. If I remember correctly, the
main problem is the TX path of those drivers: TX packets are defragmented/copied by the CPU before they are fetched by
the DMA part of the network chips. Additionally, there seems to be another strange problem in the mpc5200 driver, which
transfers more bytes/sec as an "echo server" than as a "chargen server"!?

Well, these drivers are stable and do not cause problems in a hard realtime environment, where some important threads
run above the io-pkt priority.

Perhaps the native ports of these drivers will get faster in the future.

Kind Regards
Michael



RE: local UDP Echo test three times slower on QNX than on Linux  
Dumb question: are they running an io-net driver, with
the shim?  If so, the extra thread switch during packet
rx might explain some of what you're seeing.  The output
of "ifconfig" and "pidin mem" will tell the story.

Most of the time, io-net "shim" drivers work just fine.

However, if you're doing performance testing - especially
when you are also considering cpu consumption, esp on
slower machines - a native driver is a "must".

--
aboyd
RE: local UDP Echo test three times slower on QNX than on Linux  
Maybe I misunderstand the network stack architecture,
but is the shim layer really involved in IP communication
between two local processes?

- Thomas

> -----Original Message-----
> From: Andrew Boyd [mailto:community-noreply@qnx.com] 
> Sent: 26 January 2009 15:42
> To: technology-networking
> Subject: RE: local UDP Echo test three times slower on QNX 
> than on Linux
> 
> 
> Dumb question: are they running an io-net driver, with the 
> shim?  If so, the extra thread switch during packet rx might 
> explain some of what you're seeing.  The output of "ifconfig" 
> and "piding mem" will tell the story.
> 
> Most of the time, io-net "shim" drivers work just fine.
> 
> However, if you're doing performance testing - especially 
> when you are also considering cpu consumption, esp on slower 
> machines - a native driver is a "must".
> 
> --
> aboyd
> 
> 
> _______________________________________________
> Technology
> http://community.qnx.com/sf/go/post20780
> 
> 
RE: local UDP Echo test three times slower on QNX than on Linux  
Sorry, my bad - awfully early yet.  It REALLY was a dumb question  :)

--
aboyd
RE: local UDP Echo test three times slower on QNX than on Linux  
I'd rather read a 'dumb' question than find myself working
with a dumb implementation...   ;-)

- Thomas

> -----Original Message-----
> From: Andrew Boyd [mailto:community-noreply@qnx.com] 
> Sent: 26 January 2009 15:46
> To: technology-networking
> Subject: RE: local UDP Echo test three times slower on QNX 
> than on Linux
> 
> 
> Sorry, my bad - awfully early yet.  It REALLY was a dumb question  :)
> 
> --
> aboyd
> 
> 
> _______________________________________________
> Technology
> http://community.qnx.com/sf/go/post20782
> 
>