Mark Dowdy(deleted)
Two instances of io-pkt interfering with each other
12/21/2012 3:19 PM
post98289
First, some background. We run two instances of io-pkt in our system: one for 'normal' network traffic (TCP/IP, UDP/IP, management, Qnet, Photon (that's another story), etc.) and a second for time-critical traffic (data extracted/injected via BPF, no IP address configured, Qnet not started on this interface). These are x86 machines: one multicore and two or three single-core. On the multicore machine we use the devnp-e1000 driver on both interfaces (NIC device ID 1096h); on the single-core machines we run the devnp-speedo driver on the 'normal' network (NIC device ID 1209h) and the devnp-e1000 driver on the time-critical network (NIC device ID 1076h). Time-critical system control traffic is sent through the second io-pkt instance every 1 ms.
Using an internally developed traffic test program, we observe time-critical data flowing from a single-core machine to the multicore machine and then back again, with a best-case round-trip time of ~290 µs. In some cases, however, the round-trip time increases to ~450 µs. You may be asking "what's 160 µs between friends?", but we use this time-critical channel for motor control data, and when we run our full system (multiple single-core machines sending data, with our full motor control algorithms and other system functions), the problem magnifies to the point that some traffic fails to return before the next cycle starts 1 ms later.
So we looked at what was going on using the System Profiler (i.e. kernel trace) and observed that traffic on the time-critical network (second io-pkt instance) slowed down when there was activity on the management network (first io-pkt instance). That was unexpected, since the whole reason we started two instances of io-pkt was to avoid exactly that scenario: regular traffic negatively impacting time-critical traffic. The second io-pkt instance is started at higher priority (see below). The kernel trace only gives us visibility into task activity, not NIC driver execution, so it's hard to say when network data actually "hits the wire". What could be going on in io-pkt that causes this kind of adverse interaction? On the single-core machines, it looks like the first io-pkt instance is running and delaying the driver from putting data on the wire. On the multicore machine, where both io-pkt instances use devnp-e1000, could the shared driver object be 'binding' the two instances together and altering performance?
Thanks.
First io-pkt instance start-up (i.e. 'normal' network on multicore machine)
io-pkt-v4-hc -ptcpip
mount -Tio-pkt -opci=0,vid=0x8086,did=0x1096 /lib/dll/devnp-e1000.so
Second io-pkt instance start-up (i.e. time critical network on multicore machine)
io-pkt-v4-hc -i1 -ptcpip prefix=/alt
mount -Tio-pkt1 -opci=1,vid=0x8086,did=0x1096,priority=32,receive=512,transmit=4096 /lib/dll/devnp-e1000.so