Nick Reilly | 03/28/2019 11:02 AM | post119620
Re: Concurrency and mutual exclusion in io-pkt network drivers
Hi John,
While the start callback is usually called from stack context, there are three scenarios where it is called from an
interrupt event handler thread:
1) Bridging. The Ethernet frame is received by the receiving interface's Rx interrupt event handler. The call to
ifp->if_input() performs a bridging lookup and calls the output interface's start callback, all within this same context.
2) Fast Forwarding. Similar to the bridging scenario, but with an IP flow.
3) Tx after the Tx descriptors are full. The start callback is usually called from stack context every time a packet is
added to the interface Tx queue. The start callback populates the Tx descriptors, but if it runs out of room it sets
IFF_OACTIVE, enables Tx completion interrupts and returns. When a Tx completion interrupt fires, the Tx interrupt event
handler runs. This should take the ifp->if_snd_ex lock, reap the used descriptors, and continue to populate the Tx
descriptors now that it has more space. It may run out of space again, in which case the pattern repeats. Eventually it
will drain the Tx queue, at which point it should clear IFF_OACTIVE and disable the Tx completion interrupt (to avoid
wasting CPU). With IFF_OACTIVE clear, stack context resumes calling the driver start callback each time it places a
packet to be transmitted on the Tx queue.
One clarification about ifp->if_snd_ex: it is actually the lock for the interface Tx queue, so io-pkt stack context
takes the lock each time it wants to add a packet to the queue. Because the start callback is always called with this
lock held, the stop callback can reuse it to guarantee that the start callback isn't running.
You are correct about nic_delay() yielding stack context. While it is yielded, the stack context can perform any other
stack-context operations, including driver callbacks, timer callouts, etc. N.B. this includes calling the detach
callback, which invariably leads to a crash: the detach callback removes the interface, the delay expires, and the
stack context then switches back to code that may no longer be in memory, or that refers to an interface that no longer
exists and whose structures have been freed. The most painful crashes to debug are the ones where that freed memory has
already been reused for another interface.
nic_mutex is used to provide mutex-like locking for stack-context operations; an actual mutex cannot be used because
the lock must remain correct across a stack-context yield. As I mentioned, the low-level call that yields stack context
is ltsleep(), which can be wrapped by tsleep(). In driver code it is usually a nic_delay() or blockop() call that
yields context.
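As a sketch only, a sequence that must stay atomic across a nic_delay() yield might look like the following. The nic_mutex_lock()/nic_mutex_unlock() names are assumed from the nic_mutex API, and write_phy_reg(), the register names and the softc layout are hypothetical:

```c
/* Sketch, not a verified implementation. */
static void phy_reset(struct my_softc *sc)
{
    nic_mutex_lock(&sc->drv_mutex);          /* safe across a yield */
    write_phy_reg(sc, PHY_CTRL, PHY_RESET);  /* hypothetical helper */
    nic_delay(10);                           /* yields stack context */
    write_phy_reg(sc, PHY_CTRL, 0);
    nic_mutex_unlock(&sc->drv_mutex);
}
```

A regular pthread mutex would be wrong here: while nic_delay() has yielded, another stack-context operation can run on the same thread and attempt the same lock.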
A note on delays:
io-pkt's best timer granularity is 8.5ms, but it can extend out to 50ms if no traffic is running. This means a call to
nic_delay(1) can be a 50ms delay rather than a 1ms delay. If there are many such calls in succession, a driver can take
many seconds to set up hardware. In these scenarios it is best to call blockop(). This yields stack context at that
point and executes the code in the io-pkt main thread (thread 1). nic_delay() detects the context it is running under
and switches to normal sleep calls with the granularity of the system timer (default 1ms). At the completion of the
blockop(), the original stack-context routine is placed in a ready state; note, though, that if the stack context is
currently performing some other operation, it may not run immediately.
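A sketch of that pattern, assuming blockop() takes a callback-plus-argument pair (check the io-pkt headers for the exact prototype; write_init_reg() and the softc are hypothetical):

```c
/* Sketch, not a verified implementation. */
static void init_hw_blocked(void *arg)
{
    struct my_softc *sc = arg;

    /* Runs in io-pkt thread 1, so nic_delay() uses normal sleeps with
     * system-timer granularity (default 1ms), not the stack timer. */
    for (int i = 0; i < 64; i++) {
        write_init_reg(sc, i);   /* hypothetical register setup */
        nic_delay(1);            /* ~1ms here, not up to 50ms */
    }
}

static void my_init(struct my_softc *sc)
{
    blockop(init_hw_blocked, sc);  /* yields stack context until done */
}
```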
Regards,
Nick.