Concurrency and mutual exclusion in io-pkt network drivers  
Hello,

I am developing a driver for a PCI-based network interface for QNX 7. I have read the Writing Network Drivers for io-pkt
 section in the Core Networking Stack User's Guide and looked at the ravb and mx6x driver source.

It is still not entirely clear to me if and when there may be more than one thread executing the driver code. I don't 
anticipate having to create my own separate driver threads.

My device has MSI-X interrupts so I plan to have multiple ISRs, one for each MSI-X interrupt source. I would therefore 
expect that multiple interrupt event handling threads could run concurrently. If that is possible I will need to protect
 driver resources, including device registers, from concurrent access. 

Can the io-pkt stack context thread run concurrently with the interrupt event handling threads? For example, can the 
stack queue packets for transmission or update the link speed, etc, while receive and transmit interrupt events are 
being processed by io-pkt worker threads?

The driver source I have looked at has almost no locking. There is a little use of nic_mutex_lock/unlock, but only in some peripheral cases. Is nic_mutex_lock/unlock the right mechanism to provide access serialisation in the driver code?

BTW, I cannot find mention of nic_mutex_lock/unlock and nic_delay in the documentation. Where are these functions 
described?

If anyone can shed some light on this it would be greatly appreciated.

Thanks,
John
Re: Concurrency and mutual exclusion in io-pkt network drivers  
Hi John,
Yes, the stack context thread and multiple interrupt event handling threads can run concurrently.

nic_mutex_lock() is for locking between the multiple coroutines that can run on the stack context thread; it will panic() if called from an interrupt event handling thread. Take a look at NW_SIGLOCK() / NW_SIGLOCK_P() for providing locking between an interrupt event handling thread and the stack context thread.

Regards,
Nick
Re: Concurrency and mutual exclusion in io-pkt network drivers  
Hi Nick,

Thanks - it's beginning to make more sense now.

I see the stack already uses ifp->if_snd_ex to serialise access to transmit resources. It appears this mutex is held 
when the start and stop callbacks are entered - is that correct? 

Are you saying I can create an additional mutex if I need to serialise access to other driver resources between the stack context and interrupt processing threads? Should the new mutex be initialised using NW_EX_INIT?

What are the rules for using nic_mutex_lock()? I mean, how do you recognise situations where it might be required?

Regards,
John
Re: Concurrency and mutual exclusion in io-pkt network drivers  
The ifp->if_snd_ex is only held by the stack when the driver start callback is called; the stop callback needs to lock it first to ensure that the driver start isn't currently running.

A new mutex would be initialised with NW_EX_INIT() and destroyed with NW_EX_DESTROY().
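Roughly like this, as a sketch (my_dev_t, rx_ex and the my_*() functions are placeholder names; declare rx_ex with the same type the io-pkt headers use for ifp->if_snd_ex, pull in the same io-pkt headers the shipped drivers do - nw_datastruct.h for the NW_* macros - and note that the (lock, iopkt) argument order I've assumed for NW_EX_INIT()/NW_EX_DESTROY() should be checked against the real prototypes):

/* Called from the driver's attach callback (stack context): */
void
my_locks_init(my_dev_t *dev)
{
    NW_EX_INIT(&dev->rx_ex, dev->iopkt);
}

/* Called from the driver's detach callback (stack context): */
void
my_locks_fini(my_dev_t *dev)
{
    NW_EX_DESTROY(&dev->rx_ex, dev->iopkt);
}

/* Stack context user, e.g. an ioctl handler or a timer callout: */
void
my_update_stats(my_dev_t *dev)
{
    NW_SIGLOCK(&dev->rx_ex, dev->iopkt);
    /* ... touch counters/registers shared with the Rx handler ... */
    NW_SIGUNLOCK(&dev->rx_ex, dev->iopkt);
}

/* Interrupt event handling thread, e.g. the Rx processing function: */
int
my_process_rx(void *arg, struct nw_work_thread *wtp)
{
    my_dev_t *dev = arg;

    NW_SIGLOCK_P(&dev->rx_ex, dev->iopkt, wtp);
    /* ... reap Rx descriptors, update the shared counters ... */
    NW_SIGUNLOCK_P(&dev->rx_ex, dev->iopkt, wtp);

    return 1;    /* processing complete */
}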

nic_mutex_lock() is used to synchronise across multiple io-pkt coroutines running on the stack context thread. This is 
necessary any time that the stack context thread can yield - at the lowest level it is a call to ltsleep(), but at the 
driver level this is often wrapped up in a nic_delay() call.
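As a sketch of the kind of section it protects (phy_mutex, my_mii_write() and the MY_* register names are placeholders, and the nic_mutex_t type and header names are my assumption of what the shipped drivers use):

#include <netdrvr/nicsupport.h>    /* nic_delay(); header name as in the shipped drivers */
#include <netdrvr/nic_mutex.h>     /* assumed home of nic_mutex_lock()/nic_mutex_unlock() */

/* The periodic MII callout and, say, an ioctl-driven renegotiation both run
 * as coroutines on the single stack context thread. Because nic_delay()
 * yields, one coroutine can be suspended mid-sequence and another can enter
 * the same code; nic_mutex_lock() keeps the sequence whole with respect to
 * the other coroutines. */
void
my_phy_restart_aneg(my_dev_t *dev)
{
    nic_mutex_lock(&dev->phy_mutex);    /* serialise against other coroutines */

    my_mii_write(dev, MY_BMCR, MY_BMCR_RESET);
    nic_delay(10);                      /* yields stack context */
    my_mii_write(dev, MY_BMCR, MY_BMCR_ANEG_EN | MY_BMCR_ANEG_RESTART);

    nic_mutex_unlock(&dev->phy_mutex);
}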

Remember that io-pkt stack context is single threaded, hence you need to yield rather than sit in a blocking call. If you don't yield, you can impact other stack context operations such as traffic on other interfaces.
Re: Concurrency and mutual exclusion in io-pkt network drivers  
Hi Nick,

> The ifp->if_snd_ex is only held by the stack when the driver start callback is
>  called, the stop callback needs to lock it first to ensure that the driver 
> start isn't currently running.

How is it possible for the stack to call the stop callback if the start callback is still running? Doesn't that imply 
another thread of execution? I thought the driver callbacks (and callouts) were invoked in the stack context.

> nic_mutex_lock() is used to synchronise across multiple io-pkt coroutines 
> running on the stack context thread. This is necessary any time that the stack
>  context thread can yield - at the lowest level it is a call to ltsleep(), but
>  at the driver level this is often wrapped up in a nic_delay() call.

Does this mean that if a driver callback uses nic_delay(), the stack context can invoke another driver callback (or timer callout)? The nic_mutex is therefore used to protect critical driver resources in this situation - is that correct?

Apart from nic_delay() what other calls result in a callback yielding to another stack coroutine that could result in 
the stack entering another driver callback?

Regards,
John

Re: Concurrency and mutual exclusion in io-pkt network drivers  
Hi John,

While the start callback is usually called from stack context, there are 3 scenarios where it is called from an 
interrupt event handler thread:

1) Bridging. The Ethernet frame is received by the receiving interface's Rx interrupt event handler. The call to ifp->if_input() will perform a bridging lookup and call the output interface's start callback, all within this same context.

2) Fast Forwarding. Similar to the Bridging scenario but with an IP flow.

3) Tx after the Tx descriptors are full. The start callback is usually called by stack context every time a packet is added to the interface Tx queue. The start callback populates the Tx descriptors, but if it runs out of room it sets IFF_OACTIVE, enables Tx completion interrupts and returns. When a Tx completion interrupt fires, the Tx interrupt event handler runs. This should take the ifp->if_snd_ex lock, reap the used descriptors, and continue to populate the Tx descriptors now that it has more space. It may run out of space again, in which case this pattern repeats. Eventually it will drain the Tx queue, at which point it should clear IFF_OACTIVE and disable the Tx completion interrupt (to avoid wasting CPU). Now that IFF_OACTIVE is clear, the stack context will call the driver start callback each time it places a packet to be transmitted on the Tx queue (see the sketch below).

Just to clarify one thing about the ifp->if_snd_ex. It is actually the lock for the interface Tx queue, so io-pkt stack 
context takes the lock each time it wants to add a packet on to the queue. We can reuse it in the stop callback to 
indicate that the start callback isn't running because the start callback is always called with the lock held.
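In code, the Tx-completion path from scenario 3 looks roughly like this sketch (my_dev_t, dev->ifp, my_reap_tx(), my_disable_tx_irq() and my_start() are placeholders; it assumes the usual driver convention that the start callback is entered with if_snd_ex held and releases the lock itself before returning):

#include <net/if.h>    /* struct ifnet, IFF_OACTIVE, IFQ_IS_EMPTY */
/* plus the same io-pkt headers the shipped drivers use for the NW_* macros
 * and struct nw_work_thread */

int
my_process_tx_interrupt(void *arg, struct nw_work_thread *wtp)
{
    my_dev_t     *dev = arg;
    struct ifnet *ifp = dev->ifp;        /* placeholder: cached at attach time */

    /* Serialise with stack context, which takes if_snd_ex before queueing a
     * packet and calling the start callback. */
    NW_SIGLOCK_P(&ifp->if_snd_ex, dev->iopkt, wtp);

    my_reap_tx(dev);                     /* free descriptors of completed packets */

    if (IFQ_IS_EMPTY(&ifp->if_snd)) {
        /* Queue drained: allow stack context to call start again on the next
         * packet, and stop Tx-completion interrupts to avoid wasting CPU. */
        ifp->if_flags_tx &= ~IFF_OACTIVE;
        my_disable_tx_irq(dev);
        NW_SIGUNLOCK_P(&ifp->if_snd_ex, dev->iopkt, wtp);
    } else {
        /* More packets queued and descriptor space has been freed: keep
         * transmitting. The start callback releases if_snd_ex itself. */
        my_start(ifp);
    }

    return 1;    /* processing complete */
}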


You are correct about nic_delay(): when it yields stack context, the stack can then perform any other stack context operations, including driver callbacks, timer callouts, etc. N.B. this includes calling the detach callback, which invariably leads to a crash: the detach callback removes the interface, the delay expires, and then the stack context tries to switch back to code that may no longer be in memory - or, if it is, it refers to an interface that no longer exists and whose structures have been freed. The most painful crashes to debug come from when that freed memory has been reused for another interface.
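One way to guard against that is a 'dying' flag: the detach callback sets it before tearing the interface down, and any code that resumes after a nic_delay() checks it before touching the device again (detach must of course defer freeing the structures until such in-flight code has finished). A sketch, with dev->dying, my_read_reg() and the MY_* names as placeholders:

void
my_wait_reset_done(my_dev_t *dev)
{
    int tries;

    for (tries = 0; tries < 100; tries++) {
        if ((my_read_reg(dev, MY_RESET_CTRL) & MY_RESET_BUSY) == 0) {
            return;                      /* reset has completed */
        }
        nic_delay(10);                   /* yields stack context */
        if (dev->dying) {
            return;                      /* detach ran while we were yielded */
        }
    }
    /* timed out - log it and give up */
}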

nic_mutex_lock()/nic_mutex_unlock() are used to provide mutex-like locking for stack context operations - an actual mutex cannot be used, because all of the coroutines run on the same thread.

As I mentioned, the low level call that yields stack context is ltsleep(), which can be wrapped in tsleep(). In driver code it is usually a nic_delay() or blockop() call that yields context.

A note on delays:
io-pkt has a best timer granularity of 8.5ms, but it can extend out to 50ms if no traffic is running. This means that a 
call to nic_delay(1) can be a 50ms delay rather than a 1ms delay. If there are many such calls in succession this can 
lead to a driver taking many seconds to setup some hardware. In these scenarios it is best to call blockop(). This will 
yield stack context at that point and execute the code in the io-pkt main thread (thread 1). nic_delay() detects the 
context it is running under and will switch to using normal sleep calls with the granularity of the system timer 
(default 1ms). At the completion of the blockop(), the original stack context routine is placed in a ready state - although note that if the stack context is currently performing some other operation it may not run immediately.

Regards,
Nick.
Re: Concurrency and mutual exclusion in io-pkt network drivers  
Hi Nick,

Thanks again for the clear and detailed information.

I take the point re nic_delay(). However, some drivers I've seen use nic_delay when waiting for register bits to change,
 e.g. following a reset, or waiting for the MII bus to become free at the start of the MII read/write routines.

Is that still preferable to doing a busy-wait using nanospin_ns()?

Regards,
John
Re: Concurrency and mutual exclusion in io-pkt network drivers  
Hi John,
In general it depends on how long the wait is. The rule of thumb that I use is: if the wait is under 1ms you can busy-wait, otherwise use nic_delay(). When you are busy-waiting you are blocking io-pkt stack context, so you are potentially affecting traffic on other interfaces / Unix Domain Sockets etc.
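For the short case, nanospin_ns() is the call to spin with. A sketch (my_dev_t, my_read_reg() and the MY_* names are placeholders):

#include <time.h>     /* nanospin_ns() */
#include <errno.h>    /* EOK, ETIMEDOUT */

/* Waiting for the MII bus to go idle, expected within tens of microseconds:
 * a bounded busy-wait avoids yielding stack context. Anything that can take
 * a millisecond or more should use nic_delay() instead, as in the earlier
 * sketches. */
static int
my_wait_mii_idle(my_dev_t *dev)
{
    int i;

    for (i = 0; i < 100; i++) {
        if ((my_read_reg(dev, MY_MII_STATUS) & MY_MII_BUSY) == 0) {
            return EOK;
        }
        nanospin_ns(10 * 1000);          /* spin 10us; total bound ~1ms */
    }
    return ETIMEDOUT;
}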
Regards,
Nick
Re: Concurrency and mutual exclusion in io-pkt network drivers  
Hi Nick,

> 
> nic_mutex() is used to provide mutex-like locking for the stack context 
> operations - an actual mutex cannot be used.
> 
> As I mentioned, the low level call that yields stack context is ltsleep() 
> which can be wrapped in tsleep(). In driver code it is usually the nic_delay()
>  call that yields context or blockop().

Does this mean that if a stack callback does not yield, it will run to completion? In other words, there is no preemption or time-slicing in the scheduling of co-routines?
Re: Concurrency and mutual exclusion in io-pkt network drivers  
Hi John,
Yes, it's a run to completion model with respect to other co-routines unless you yield. No preemption or time-slicing 
with respect to other co-routines. This is why you only need to consider locking when yielding and why you need to 
consider yielding when blocking.

Of course this is all still running on a POSIX thread which may be preempted or time-sliced with threads in other 
processes, so you aren't guaranteed to be on the processor all the time!

Regards,
Nick.