Forum Topic - open channels: what's a reasonable maximum?: (14 Items)
   
open channels: what's a reasonable maximum?  
I'm dealing with a customer problem.  They are seeing a ConnectAttach call
to devb-umass never returning.  Investigating has shown that the routine is
getting stuck in the search for a channel to re-use.  It loops over the
channels vector but gets preempted before it has a chance to complete.

There are 10348 entries in the channels vector.  We get two or three
interrupts every millisecond on this board.  The loop over the channels
vector gets no further than about 5000 iterations (usually much less)
before it gets preempted.
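
A minimal user-space model of the symptom described above, not kernel code.
It assumes that a preempted search keeps no state and starts again at index 0,
and it models preemption as a fixed budget of entries examined per attempt.
The names scan_once, scan_with_restarts and the SCAN_* codes are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; they are not QNX kernel values. */
enum { SCAN_NOT_FOUND = -1, SCAN_PREEMPTED = -2 };

/*
 * One attempt at the search. `budget` models how many entries the loop can
 * examine before an interrupt preempts it. A preempted attempt keeps no
 * state, so the next attempt starts again at index 0.
 */
static int scan_once(const int *entries, size_t n, int wanted, size_t budget)
{
    assert(entries != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i >= budget)
            return SCAN_PREEMPTED;
        if (entries[i] == wanted)
            return (int)i;
    }
    return SCAN_NOT_FOUND;
}

/* Retry up to max_attempts times, the way a restarted call would. */
static int scan_with_restarts(const int *entries, size_t n, int wanted,
                              size_t budget, unsigned max_attempts)
{
    int rc = SCAN_PREEMPTED;
    for (unsigned a = 0; a < max_attempts && rc == SCAN_PREEMPTED; a++)
        rc = scan_once(entries, n, wanted, budget);
    return rc;
}
```

With 10348 entries and a budget of 5000, a match below index 5000 is found on
the first attempt, while a match above it is never reached no matter how many
times the search is retried.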

My question: is it a bug for devb-umass to have that many channels open?
Or is it a bug that the kernel doesn't handle it correctly?

I think we need to change ker_connect_attach to handle an arbitrary number
of channels without preemption troubles...  Agree?  Disagree?

There may also be a bug in devb-umass, though.  It seems suspicious that it
has that many entries in its chancons vector...

btw, the problem was spotted in 6.3.0...
RE: open channels: what's a reasonable maximum?  
 

> -----Original Message-----
> From: Douglas Bailey [mailto:community-noreply@qnx.com] 
> Sent: November 6, 2008 11:34 AM
> To: ostech-core_os
> Subject: open channels: what's a reasonable maximum?
> 
> 
> I'm dealing with a customer problem.  They are seeing a 
> ConnectAttach call to devb-umass never returning.  
> Investigating has shown that the routine is getting stuck in 
> the search for a channel to re-use.  It loops over the 
> channels vector but gets preempted before it has a chance to complete.
> 
> There are 10348 entries in the channels vector.  We get 2 or 
> three interrupts every millisecond on this board.  The loop 
> over the channels vector gets no further than about 5000 
> iterations (usually much less) before it gets preempted.
> 
> My question: is it a bug for devb-umass to have that many 
> channels open?  Or is it a bug that the kernel doesn't handle 
> it correctly?
> 
> I think we need to change ker_connect_attach to handle an 
> arbitrary number of channels without preemption troubles...  
> Agree?  Disagree?

Either change it, or have a hard limit, and promise that we always work
up to that limit.

> 
> There may also be a bug in devb-umass, though.  It seems 
> suspicious that it has that many entries in its chancons vector...

Also agree.  I don't see why it needs more than a handful of open
channels.  We're talking channels, and not coids, right?

> 
> btw, the problem was spotted in 6.3.0...
> 
> 
> _______________________________________________
> OSTech
> http://community.qnx.com/sf/go/post16111
> 
> 
Re: open channels: what's a reasonable maximum?  
Not being entirely familiar with the way devb-umass works, I can't say whether 
or not there's a bug in devb-umass ... but it sure does seem odd that the 
kernel wouldn't handle an arbitrary number of channels (well ... *large* 
arbitrary number).

That being said, I would classify it as a kernel bug, although, the kernel bug 
may not be the only one in play here.

I agree on changing ker_connect_attach to deal with preemption ... a good 
approach in this case may be to save the state of the iteration ... so, even 
if the call is preempted, and something happens to the list, it should 
continue to look from that spot until it reaches the end of the list, and 
start again ... this isn't an optimal solution, but it should prevent the 
problem of the higher channels never getting reused, although, it may lead to 
timeout problems if it keeps checking higher channels and low ones are 
getting freed (with a long enough list).
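
A rough sketch of the "save the state of the iteration" idea, written as a
user-space model rather than kernel code. The cursor structure, the function
name and the per-call budget (standing in for preemption) are all assumptions
for illustration, and the sketch ignores the case where the list changes
between calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; they are not QNX kernel values. */
enum { SCAN_NOT_FOUND = -1, SCAN_PREEMPTED = -2 };

struct scan_cursor {
    size_t next;     /* index to examine on the next entry into the scan */
    size_t examined; /* entries examined so far in this pass */
};

/*
 * Examine at most `budget` entries per call. Progress is kept in *cur, so a
 * restarted call continues where the previous one stopped, wrapping at the
 * end of the array until every entry has been examined once.
 */
static int scan_resumable(const int *entries, size_t n, int wanted,
                          struct scan_cursor *cur, size_t budget)
{
    assert(entries != NULL && cur != NULL);
    if (n == 0)
        return SCAN_NOT_FOUND;
    assert(cur->next < n);

    while (cur->examined < n) {
        if (budget == 0)
            return SCAN_PREEMPTED;      /* cursor already holds our position */
        budget--;

        size_t i = cur->next;
        cur->next = (i + 1) % n;        /* wrap to the start of the list */
        cur->examined++;

        if (entries[i] == wanted)
            return (int)i;
    }
    return SCAN_NOT_FOUND;
}
```

With 10348 entries and a budget of 5000 per call, a match at index 9000 is
reached on the second call, because the second call starts at index 5000
instead of 0.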

Okay ... I'm done ... now it's time for me to step aside, and let someone 
smarter correct me ;)

Re: open channels: what's a reasonable maximum?  
On Thu, Nov 06, 2008 at 11:33:58AM -0500, Douglas Bailey wrote:
> My question: is it a bug for devb-umass to have that many channels open?
> Or is it a bug that the kernel doesn't handle it correctly?

It's probably server connection id's rather than channels. It is a bug
though: we're supposed to be able to handle up to 32K of them.

> I think we need to change ker_connect_attach to handle an arbitrary number
> of channels without preemption troubles...  Agree?  Disagree?

Yup.

> There may also be a bug in devb-umass, though.  It seems suspicious that it
> has that many entries in its chancons vector...

Rather than devb-umass, it could just as well be the clients connecting
to it that are forgetting to close things. If it is scoid's like I think,
we can tell who's doing the opens and the MME folks can check if they've
got a bug there.

	Brian

-- 
Brian Stecher (bstecher@qnx.com)        QNX Software Systems
phone: +1 (613) 591-0931 (voice)        175 Terence Matthews Cr.
       +1 (613) 591-3579 (fax)          Kanata, Ontario, Canada K2M 1W8
RE: open channels: what's a reasonable maximum?  
How can we easily tell who the clients are?  The first thing we thought
of was "pidin fds", but that isn't supported in 6.3.0... 

Re: open channels: what's a reasonable maximum?  
I think it'll have to be a special version of procnto that kprintf's
things out.

	Brian

Re: open channels: what's a reasonable maximum?  

Brian Stecher wrote:
> On Thu, Nov 06, 2008 at 11:33:58AM -0500, Douglas Bailey wrote:
>> My question: is it a bug for devb-umass to have that many channels open?
>> Or is it a bug that the kernel doesn't handle it correctly?
> 
> It's probably server connection id's rather than channels. It is a bug
> though: we're supposed to be able to handle up to 32K of them.

This seems to imply that we should have some sort of generic vector_lookup
mechanism to handle preemption, no?

>> I think we need to change ker_connect_attach to handle an arbitrary number
>> of channels without preemption troubles...  Agree?  Disagree?
> 
> Yup.
> 
>> There may also be a bug in devb-umass, though.  It seems suspicious that it
>> has that many entries in its chancons vector...
> 
> Rather than devb-umass, it could be the clients connecting to it as well
> that are forgetting close things. If it is scoid's like I think, we can
> tell who's doing the opens and the MME folks can check if they've got a
> bug there.
> 
> 	Brian
> 

-- 
cburgess@qnx.com
Re: open channels: what's a reasonable maximum?  
On Thu, Nov 06, 2008 at 12:42:19PM -0500, Colin Burgess wrote:
> Brian Stecher wrote:
> > On Thu, Nov 06, 2008 at 11:33:58AM -0500, Douglas Bailey wrote:
> >> My question: is it a bug for devb-umass to have that many channels open?
> >> Or is it a bug that the kernel doesn't handle it correctly?
> > 
> > It's probably server connection id's rather than channels. It is a bug
> > though: we're supposed to be able to handle up to 32K of them.
> 
> This seems to imply that we should have some sort of generic vector_lookup mechanism to handle preemption, no?

A vector_lookup isn't a problem - it's O(1). The issue is code that scans
the vector array: vector_search or, in this case, the traversal looking
for a connection to share. I don't think that you can do a generic
mechanism to deal with it.

In this case, there's a fairly trivial fix that can be made at the cost
of not always sharing a connection. Somewhat harder, but not impossible,
would be something that picks up the search from where it left off when
the call got preempted.
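
The thread does not spell out what the trivial fix is. One possible reading,
sketched here as an assumption only, is to cap how many slots the sharing
search may examine and fall back to creating a new connection when the cap is
reached. The struct, the function name and the SHARE_SCAN_LIMIT value are
invented for this example and are not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

#define SHARE_SCAN_LIMIT 256   /* hypothetical cap, chosen arbitrarily */

/* Hypothetical stand-in for a connection entry. */
struct conn {
    int server_id;
    int refs;
};

/*
 * Look for an existing connection to server_id, examining no more than
 * SHARE_SCAN_LIMIT slots so the loop has a fixed upper bound. Returns the
 * slot index, or -1 to tell the caller to create a new connection. A
 * shareable connection beyond the cap is deliberately missed: that is the
 * "not always sharing" cost.
 */
static int find_shareable(struct conn *const *slots, size_t n, int server_id)
{
    assert(slots != NULL || n == 0);
    size_t limit = n < SHARE_SCAN_LIMIT ? n : SHARE_SCAN_LIMIT;

    for (size_t i = 0; i < limit; i++) {
        if (slots[i] != NULL && slots[i]->server_id == server_id)
            return (int)i;
    }
    return -1;
}
```

The bounded loop always finishes within a known number of iterations, so it
cannot be starved by preemption the way an unbounded scan over ten thousand
entries can.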

	Brian

Re: open channels: what's a reasonable maximum?  
On Thursday 06 November 2008 13:26:12 Brian Stecher wrote:
>  Somewhat harder but not impossible,
> would be something that picked up the search from where it left off when
> the call got preempted.

That's what I said ... only in less words. Thank you Brian, for affirming my 
belief that there's still hope out there for me ;)
Re: open channels: what's a reasonable maximum?  
Sorry, vector_search was, of course, what I meant. :-)

How about saving the current index, and a generation, in the thp args?
We would then restart the search there, and a change to the vector would
bump the generation.
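
A small user-space sketch of the index-plus-generation idea, under several
assumptions: the vector carries a generation counter that is bumped on every
change, the caller supplies the storage for the saved state, and a `valid`
flag marks whether that state came from an earlier preempted pass. All type
and function names here are hypothetical, and the budget parameter only
simulates preemption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; they are not QNX kernel values. */
enum { SEARCH_NOT_FOUND = -1, SEARCH_PREEMPTED = -2 };

struct gen_vector {
    const int *entries;
    size_t     count;
    unsigned   generation;  /* assumed field: bumped on every insert/remove */
};

struct search_state {
    size_t   index;         /* where the preempted pass stopped */
    unsigned generation;    /* generation seen by that pass */
    int      valid;         /* nonzero only after a preempted pass */
};

static int search_with_generation(const struct gen_vector *v, int wanted,
                                  struct search_state *st, size_t budget)
{
    assert(v != NULL && st != NULL);
    size_t i = 0;

    /* Resume only if the vector has not changed since the state was saved. */
    if (st->valid && st->generation == v->generation && st->index <= v->count)
        i = st->index;

    for (; i < v->count; i++) {
        if (budget == 0) {
            st->index = i;
            st->generation = v->generation;
            st->valid = 1;
            return SEARCH_PREEMPTED;
        }
        budget--;
        if (v->entries[i] == wanted) {
            st->valid = 0;
            return (int)i;
        }
    }
    st->valid = 0;
    return SEARCH_NOT_FOUND;
}
```

If the generation still matches on re-entry the search continues from the
saved index; if the vector changed in the meantime the saved index is ignored
and the search starts over from 0.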


-- 
cburgess@qnx.com
Re: open channels: what's a reasonable maximum?  
On Thu, Nov 06, 2008 at 01:38:15PM -0500, Colin Burgess wrote:
> Sorry, vector_search was, of course what I meant. :-)
> 
> How about saving the current index, and a generation, in the thp args.
> We would then restart the search there, and a change to the vector would
> bump the generation.

Two (and 1/2) problems:

As a utility routine, vector_search doesn't know if the calling routine
is using any of the args fields already. That could be dealt with by
adding an extra parm to the routine that passes in where it's supposed
to store the info.

We don't have a generic mechanism to tell if this is the first time into
a kernel call or it's being restarted due to a preemption - vector_search
wouldn't be able to tell if the args values are valid or not. This one
is a bit tricky - there are a number of corner cases.

(1/2 problem) - we don't have any place right now to store the generation
number on the vector - we'd have to grow the VECTOR structure.

	Brian


Re: open channels: what's a reasonable maximum?  
Brian Stecher wrote:
> On Thu, Nov 06, 2008 at 01:38:15PM -0500, Colin Burgess wrote:
>> Sorry, vector_search was, of course what I meant. :-)
>>
>> How about saving the current index, and a generation, in the thp args.
>> We would then restart the search there, and a change to the vector would
>> bump the generation.
> 
> Two (and 1/2) problems:
> 
> As a utility routine, vector_search doesn't know if the calling routine 
> is using any of args fields already. That could be dealt by adding an
> extra parm to the routine that passes in where it's supposed to store
> the info.

I was thinking of a new parm (or a new version of vector_search that takes the parm(s))

> We don't have a generic mechanism to tell if this is the first time into
> a kernel call or it's being restarted due to a preemption - vector_search
> wouldn't be able to tell if the args values are valid or not. This one
> is a bit tricky - there are a number of corner cases.

Yes, I see a few issues there.  It will bear more thinking...

> (1/2 problem) - we don't have any place right now to store the generation
> number on the vector - we'd have to grow the VECTOR structure.

Phooey! :-)

> 	Brian
> 
> 

-- 
cburgess@qnx.com
RE: open channels: what's a reasonable maximum?  
Looking at the trace and at ker_connect.c again...

The vectors being searched in ker_connect_attach() are the active
thread's fdcons and chancons.  So isn't the problem with the client
(MediaPlayer in this case) rather than the server (devb-umass)?

Doug 

Re: open channels: what's a reasonable maximum?  
Sorry, I thought devb-umass was the active process. In that case, yeah,
the problem might be in MediaPlayer and the MME folks should be queried
if it's usual for it to have that many open files/side channels.

	Brian
