Forum Topic - resource leak?: (4 Items)

Steve Graves

11/04/2009 7:02 PM

post41383

resource leak?

Hello,

We have created the following test case that might reveal a resource leak. Would appreciate more sets of eyes and
feedback from the community.

Test case description:
1. Process A creates a channel via ChannelCreate() and waits for incoming data requests.
2. Process B creates an incoming data channel via ChannelCreate(), connects to process A (ConnectAttach() to procA's
channel), and sends a handshake message over this connection that passes the ID of its incoming data channel.
3. Process A picks up this message via MsgRead(), calls ConnectAttach() on procB's channel, creates an additional channel
via ChannelCreate(), and passes this channel ID to procB via MsgReplyv().
4. Process B gets the reply and calls ConnectAttach() on the newly created channel. So basically we have one
request-processing channel, and for every new connection we create read and write channels. This was done to support
TCP-like server connection handling.
5. Process B then sends a data message to process A.
6. Process A picks up this message and sends it back (like a ping).
7. Process B gets the message from process A, disconnects from process A via ConnectDetach(), and destroys its own
channel via ChannelDestroy().
8. Process A does the same as process B in step 7, but for its own part of the connection. Both ConnectDetach()
and ChannelDestroy() succeed.
9. Process B then goes back to step 2 and repeats all the steps again and again (until it gets an error).

This scenario works fine, but we found that every time we create a new channel in process A, it gets a new (incremented)
ID. So at some point process A can no longer create channels (ChannelCreate() fails). We suspect
that we have some kind of resource leak here but, as far as we can tell, we are doing everything needed to clean up
resources. Maybe you can shed some light on this case.

Also, we found that if we eventually close the accepting channel (in process A), all outstanding channel IDs become free
and we can start over, meaning the channel IDs (counters) are reset. But even with this, the system eventually grinds to
a halt after a couple of weeks, which makes us suspect that there is a resource leak even if we periodically close
the accepting channel.

The attached test case illustrates this problem.

One more thing: when you start process A and process B in the same terminal (the way test.sh in the attachment does),
you see the resource leak after a while. But when you start them in different terminals, you see warning messages on
the second iteration.

Thanks in advance for any insights.

Attachment:

testcase01.zip 4.75 KB

Peter Weber

11/04/2009 8:07 PM

post41386

Re: resource leak?

You have to close the server connection ID (scoid) on the server side.
If a client closes its connection to a server, the server receives a pulse that contains the scoid. The kernel
synthesizes this pulse to tell you that the client is gone.
 
See also http://www.qnx.com/developers/docs/6.3.0SP3/neutrino/lib_ref/c/channelcreate.html


_NTO_CHF_COID_DISCONNECT

Pulse code:
    _PULSE_CODE_COIDDEATH 
Pulse value:
    Connection ID (coid) of a connection that was attached to a destroyed channel.

Deliver a pulse to this channel for each connection that belongs to the calling process when the channel that the 
connection is attached to is destroyed. Only one channel per process can have this flag set.

Steve Graves

11/05/2009 6:06 PM

post41468

Re: resource leak?

Thanks for the reply.

We've tried to follow this logic and found the following:
1. Yes, we got the pulse about closing the server-side connection.
2. The pulse carried the same coid that the server had closed just before getting the pulse. Maybe because of that,
ConnectDetach() (inside the pulse handler) failed with errcode: Invalid argument.
So basically it doesn't change anything, because we were already tracking channel IDs and coids on the server and on the
client side without any special notifications. And we still have the same behaviour (resource leak?). The updated source
file is attached.

Attachment:

main.c 12.28 KB

Peter Weber

11/05/2009 7:42 PM

post41470

Re: resource leak?

Sorry, my fault.
Your qnxmsg_read has to check for _PULSE_CODE_DISCONNECT too, because you have set _NTO_CHF_DISCONNECT in
the channel flags.
The pulse for _PULSE_CODE_COIDDEATH provides a coid, not a scoid.
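
A minimal sketch of that check, not taken from the attached main.c. It assumes the server has already received a pulse into a struct _pulse. The names pulse_detach_id() and handle_pulse() are hypothetical, and the CODE_* values in the non-QNX branch are placeholders that only let the selection logic compile off-target; on QNX the real values come from <sys/neutrino.h>.

```c
#include <stdint.h>
#include <stdio.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>  /* struct _pulse, ConnectDetach(), _PULSE_CODE_* */
#define CODE_DISCONNECT _PULSE_CODE_DISCONNECT
#define CODE_COIDDEATH  _PULSE_CODE_COIDDEATH
#else
/* Placeholder values (assumption) so the selection logic builds off-target. */
#define CODE_DISCONNECT (-33)
#define CODE_COIDDEATH  (-35)
#endif

/*
 * Hypothetical helper: pick the id that should be handed to ConnectDetach()
 * for a given pulse, or -1 if the pulse needs no detach.
 *   - disconnect pulse (channel has _NTO_CHF_DISCONNECT): use the pulse's scoid
 *   - coid-death pulse (channel has _NTO_CHF_COID_DISCONNECT): the pulse value
 *     is a coid owned by this process
 */
int32_t pulse_detach_id(int code, int32_t scoid, int32_t value)
{
    switch (code) {
    case CODE_DISCONNECT:
        return scoid;
    case CODE_COIDDEATH:
        return value;
    default:
        return -1;
    }
}

#ifdef __QNXNTO__
/* Hypothetical pulse handler, called when MsgReceive() returns 0 (a pulse). */
void handle_pulse(const struct _pulse *p)
{
    int32_t id = pulse_detach_id(p->code, p->scoid, p->value.sival_int);

    if (id == -1) {
        return;  /* some other pulse: nothing to release here */
    }
    /*
     * For a coid-death pulse the process may already have detached this coid
     * itself, in which case ConnectDetach() is expected to fail; this sketch
     * only logs the error and carries on.
     */
    if (ConnectDetach(id) == -1) {
        perror("ConnectDetach");
    }
}
#endif
```

The point of splitting out pulse_detach_id() is only to make the two cases explicit: the disconnect pulse is the one that carries the scoid the server still has to release, while the coid-death pulse refers to one of the server's own outgoing connections.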
