Forum Topic - io-pkt thread context nightmare: (8 Items)
   
io-pkt thread context nightmare  
Hi, 

We are currently porting our lsm module, which we developed years ago from the lsm.nraw sample in the good old QNX 6.x times.
The resource manager thread for our lsm interface is created with nw_pthread_create().
This thread calls ifp->if_ioctl() from time to time to get the interface statistics.

Under QNX 7 this results in an io-pkt panic, which is raised by nic_mutex_trylock(), called from the e1000 driver.
nic_mutex_trylock() tries to validate the run context by checking curlwp against proc0. If they are equal, it calls panic().

As far as I have learned so far, io-pkt seems to emulate half of the NetBSD kernel in order to run the BSD TCP/IP stack.

My thread, created by nw_pthread_create(), seems to be the wrong one for calling the e1000 driver ioctl.
How do I get the correct thread context to allow it?
pcreat(), kthread_create(), ...?

Do I lose performance if my thread is part of this emulated NetBSD scheduler?

I am a bit confused. Please advise.

Kind Regards
Michael
Re: io-pkt thread context nightmare  
Yes, the threading model for io-pkt is highly complex and if you get it wrong you will likely get panics and crashes.

Most driver callbacks including the ifp->if_ioctl() expect to run in stack context so you will need to use 
stk_context_callback_2() to get the code to run in the stack context.

Stack context is single threaded, so if something else is using it then your code will have to wait. Equally you should 
ensure that your code doesn't block the stack context and adversely affect other operations.
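
As a rough illustration of what "run it in stack context" means, here is a minimal, self-contained C model. It is not the io-pkt API: ctx_queue, ctx_post(), ctx_drain() and read_stats() are hypothetical names made up for this sketch, and the real stk_context_callback_2() has its own signature and does its own synchronisation. The sketch only shows the idea that the caller hands work over instead of running it, and that a single context runs each callback to completion, one after another.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, NOT the io-pkt API: a small ring of callbacks that a
 * single "stack context" loop drains one at a time. A real implementation
 * would need locking around the queue; it is omitted here for brevity. */
#define QUEUE_CAP 8

typedef void (*ctx_fn)(void *arg);

struct ctx_queue {
    ctx_fn fn[QUEUE_CAP];
    void  *arg[QUEUE_CAP];
    size_t head;   /* index of the oldest pending callback */
    size_t count;  /* number of pending callbacks */
};

/* Hand a callback over to the single context instead of calling it directly.
 * Returns 0 on success, -1 if the queue is full. */
static int ctx_post(struct ctx_queue *q, ctx_fn fn, void *arg)
{
    if (q->count == QUEUE_CAP)
        return -1;
    size_t tail = (q->head + q->count) % QUEUE_CAP;
    q->fn[tail] = fn;
    q->arg[tail] = arg;
    q->count++;
    return 0;
}

/* Runs "in" the single context: each callback runs to completion before the
 * next one starts, so a slow callback delays everything queued behind it.
 * Returns the number of callbacks that were run. */
static size_t ctx_drain(struct ctx_queue *q)
{
    size_t ran = 0;
    while (q->count > 0) {
        ctx_fn fn = q->fn[q->head];
        void *arg = q->arg[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        fn(arg);
        ran++;
    }
    return ran;
}

/* Stand-in for work that must run in the single context, e.g. a stats read. */
static void read_stats(void *arg)
{
    *(int *)arg += 1;
}
```

In this model, posting read_stats twice does nothing until ctx_drain() runs, and then both calls execute back to back. That is the waiting and blocking trade-off described above.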
Re: io-pkt thread context nightmare  
Hello Nick, 

> Most driver callbacks including the ifp->if_ioctl() expect to run in stack 
> context so you will need to use stk_context_callback_2() to get the code to 
> run in the stack context.

I found an example of this call in the e1000 driver for QNX 7.
I will try it...

Many thanks
Michael
Re: io-pkt thread context nightmare  
Hello Nick,

> Most driver callbacks including the ifp->if_ioctl() expect to run in stack 
> context so you will need to use stk_context_callback_2() to get the code to 
> run in the stack context.


stk_context_callback_2() alone does not seem to be enough. We still get the "proc0" assertion.
The e1000 driver additionally calls kthread_create1() inside the stack context callback. Why?

In the meantime we use another approach:
We removed our resmgr thread and attached our interface to the normal io-pkt resmgr threads by using their dispatch handle "skt_ctl.dpp".
This seems to work perfectly, but I wonder whether this approach could have any drawbacks.

Kind Regards
Michael
Re: io-pkt thread context nightmare  
Sorry, I forgot. You would need to call stk_context_callback_2() to get across from your thread to io-pkt stack context, but then this runs within proc0 on stack context, so you would then need to call kthread_create1() to get on to a proper coroutine.

Relying on the io-pkt resource manager places everything in the stack context, and thus raises the performance concerns I mentioned in my earlier message: io-pkt affecting your code and your code affecting io-pkt. Best performance would be to run in your own thread with your own resource manager and only do the stk_context_callback_2(), kthread_create1() dance for those pieces of code that need to run in io-pkt stack context.

I did mention that the io-pkt threading model is highly complicated ;-)

Regards,
Nick
Re: io-pkt thread context nightmare  
Hi Nick, 

> Relying on the io-pkt resource manager places everything in the stack context 
> and thus the concerns I mentioned in my earlier message with performance, both
>  io-pkt affecting your code and your code affecting io-pkt. Best performance 
> would be to run in your own thread with your own resource manager and only do 
> the stk_context_callback_2(), kthread_create1()  dance for those pieces of 
> code that need to run in io-pkt stack context.

The problem is finding out which functions have to be called in stack context and which do not.
Unfortunately this can be driver dependent. For example, our lsm sometimes reads the statistics counters with a simple direct call (ifp->if_ioctl()) to the driver. Here we got the assertion in the devctl part of the QNX 7 e1000 driver, which uses a nic_mutex:
 
    case DRVCOM_STATS:
        dstp = (struct drvcom_stats *)ifdc;

        if (ifdc->ifdc_len != sizeof (nic_stats_t)) {
            error = EINVAL;
            break;
        }
        nic_mutex_lock (&i82544->drv_mutex);
        update_stats (i82544);
        nic_mutex_unlock (&i82544->drv_mutex);
        memcpy (&dstp->dcom_stats, &i82544->stats, sizeof (i82544->stats));
        break;

This e1000 driver seems to expect that more than one thread of the BSD scheduler could run through this ioctl. Is that possible?
Is there a rule, e.g. "You must call the ioctl entry of a driver in stack context"?
BTW: why is the memcpy() not inside the critical section? OK, let us hope that src and dest are aligned and that memcpy() does not copy byte by byte. ;)

What about other calls? Our lsm uses ifq_enqueue(ifp, m) to send a packet. This seems to work without the stack context, but what is the rule here?

What about the mbuf handling functions?

Kind Regards
Michael
Re: io-pkt thread context nightmare  
All driver callbacks expect to run in stack context with the exception of receive and start (transmit), so the following expect to be in stack context:
Driver entry
Attach
Detach
Init
Stop
Ioctl

The driver receive function runs in an interrupt worker context. While the driver start function is often called from stack context, it must also be capable of running from interrupt worker context: in the case of bridging or fast forwarding, the Rx on the ingress interface will directly call the Tx on the egress interface without passing into stack context.

You may be lucky and have some driver callbacks not fail when called from a context other than stack context, but this 
is just luck with that particular driver.
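
The rule above can be written down as a small lookup. This is only a summary sketch of what this post says. The enum names and context_ok() are hypothetical and do not exist in io-pkt, and the model only distinguishes stack context, interrupt worker context and a plain pthread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the driver callbacks discussed in this thread. */
enum drv_callback {
    DRV_ENTRY, DRV_ATTACH, DRV_DETACH, DRV_INIT, DRV_STOP, DRV_IOCTL,
    DRV_START, DRV_RECEIVE
};

/* Hypothetical names for the contexts a call can come from. */
enum run_context { CTX_STACK, CTX_IRQ_WORKER, CTX_PLAIN_PTHREAD };

/* Returns true if, per the rule stated in this post, the callback may be
 * invoked from the given context. */
static bool context_ok(enum drv_callback cb, enum run_context ctx)
{
    switch (cb) {
    case DRV_RECEIVE:
        /* Receive runs in interrupt worker context. */
        return ctx == CTX_IRQ_WORKER;
    case DRV_START:
        /* Start is often called from stack context but must also cope with
         * interrupt worker context (bridging, fast forwarding). */
        return ctx == CTX_STACK || ctx == CTX_IRQ_WORKER;
    default:
        /* Entry, attach, detach, init, stop and ioctl expect stack context. */
        return ctx == CTX_STACK;
    }
}
```

For example, context_ok(DRV_IOCTL, CTX_PLAIN_PTHREAD) is false, which corresponds to the direct ifp->if_ioctl() call that triggered the panic.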

Stack context is single threaded but runs multiple coroutines in a run-to-completion model. The coroutines may choose to yield to avoid blocking stack context (which could block traffic on other interfaces that needs stack context). The stack context coroutines yield by calling ltsleep(), and nic_delay() is a wrapper for this. The e1000 driver (and most drivers) use this when waiting for hardware. The nic_mutex() call is used to serialise operations to the hardware: think of it as a standard mutex that works across stack context coroutines, but note that it introduces a yielding point as it may not be able to acquire the lock.
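
To make the "mutex across coroutines" idea more concrete, here is a toy model in plain C. It does not use io-pkt, ltsleep() or nic_mutex; co_mutex, coroutine, co_step() and run_two() are hypothetical names, and the coroutines are simple state machines driven round-robin by one thread. Each call to co_step() runs until the coroutine's next yield point. A coroutine that cannot take the lock yields and retries later, and a coroutine that holds the lock yields once while "waiting for hardware".

```c
#include <assert.h>
#include <stdbool.h>

/* Cooperative mutex: owner is 0 when free, otherwise the coroutine id. */
struct co_mutex { int owner; };

static bool co_mutex_try(struct co_mutex *m, int id)
{
    if (m->owner != 0)
        return false;
    m->owner = id;
    return true;
}

static void co_mutex_unlock(struct co_mutex *m, int id)
{
    if (m->owner == id)
        m->owner = 0;
}

enum co_state { CO_WANT_LOCK, CO_WAIT_HW, CO_DONE };

struct coroutine {
    int id;              /* non-zero coroutine id */
    enum co_state state;
    int lock_yields;     /* how often we had to yield because the lock was busy */
};

/* Run one coroutine until its next yield point. */
static void co_step(struct coroutine *co, struct co_mutex *m)
{
    switch (co->state) {
    case CO_WANT_LOCK:
        if (co_mutex_try(m, co->id))
            co->state = CO_WAIT_HW;  /* lock held; yield while hardware is busy */
        else
            co->lock_yields++;       /* lock busy: yield and retry next time */
        break;
    case CO_WAIT_HW:
        co_mutex_unlock(m, co->id);  /* hardware finished; release the lock */
        co->state = CO_DONE;
        break;
    case CO_DONE:
        break;
    }
}

/* Single-threaded round-robin scheduler; returns the number of rounds run. */
static int run_two(struct coroutine *a, struct coroutine *b, struct co_mutex *m)
{
    int rounds = 0;
    while (a->state != CO_DONE || b->state != CO_DONE) {
        co_step(a, m);
        co_step(b, m);
        rounds++;
    }
    return rounds;
}
```

In this model the second coroutine has to yield once because the first one is still holding the lock across its hardware wait. That is the extra yielding point mentioned above, and it is why only one coroutine touches the hardware at a time even though both are interleaved on the same thread.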

To answer your other questions:
The memcpy() isn't going to yield stack context or access the hardware, so it doesn't need to be inside the mutex. nic_mutex() is taken around the update stats call because that is going to read hardware and we need to ensure that other operations are not happening on the hardware at the same time.

ifq_enqueue(ifp,m) is potentially going to call the driver start routine. While this doesn't need to be in stack context, it does need to be an mbuf handling thread, so an nw_pthread and not just a pthread.

mbufs should only be touched from nw_pthreads and not plain pthreads, but note that an nw_pthread needs to have a quiesce function for when io-pkt is updating certain structures. Ensuring that the quiesce can happen in a multi-threaded scenario takes careful planning to avoid deadlocking.

Please name any threads that you do create. By default they get an io-pkt#0x01 style name, which makes people think that they are io-pkt threads and not that they belong to an lsm!

Regards,
Nick 
Re: io-pkt thread context nightmare  
Hello Nick, 

Thanks for your detailed explanation.
Things are much clearer to me now.

What a high price to pay to make the NetBSD stuff usable in a cleanly structured realtime OS...

Thanks again 
Michael