Forum Topic - Asynchronous Message/Data passing: local and via QNet: (2 Items)

Marc Roessler

06/24/2008 11:02 AM

post9587

Asynchronous Message/Data passing: local and via QNet

Hello Guys,

I'm currently designing a system that will consist of multiple CPUs.
Within the main CPU, there will be several threads. Those will
send each other messages, and they will also exchange messages
with the threads running on the remote CPUs. Communication will
be asynchronous, because threads must not block when sending a message.
Also this will not be a simple request/reply protocol (a la call/return)
but more like a thread being started and then continuously receiving and
sending commands/data asynchronously.
Ideally, handing a message to a local or remote thread should not differ,
i.e. the API should be very similar or identical.

My first idea was to use POSIX message queues (mq_*) to hand around
references to objects in memory (not a problem for the local threads
because they run in the same address space). However, this does not
work for the remote threads, of course. Even if this problem were solved,
for message queue access via QNet I would need to use "mqueue" instead
of "mq" - and mqueue is said to be several times slower than mq, which
does not sound too promising.

Is there any simple and elegant solution to this? I would love to use
QNet for this due to its simplicity.

Regards, Marc

Peter Weber

07/02/2008 9:34 AM

post9915

Re: Asynchronous Message/Data passing: local and via QNet

Hi Marc,
there are different ways to do this kind of communication, and the choice is often dictated by the available resources
and the performance requirements. As always, it is a balance between comfort and speed.
To make a decision, it would be good to get some more parameters. E.g., how do you plan to dispatch the messages
(centralized vs. decentralized)?
Is it 1:1 communication between each pair of threads, or do you need a kind of broadcast mechanism (1:many)?
How do you plan to announce the availability of communication partners
(symbolically? on the fly at runtime? hard coded?)

Because QNX is a POSIX certified OS, a resource manager would be a very robust and easy way to implement services. The
mqueue server, for example, is implemented as an RM using synchronous MP (message passing). The decoupling in time
takes place at a higher level, not at the lower transport level. In other words, there is no blocking point in the
mqueue server (unless you want to block when a queue is full).

To do the decoupling at application level, you could use a separate thread in each process, a so-called message thread,
which does the work of handshaking and/or transfers and connection management. You could still use the synchronous
transport mechanism with QNET, since the message thread is decoupled from the main threads by an
application-local queue/message buffer.
The advantage of this approach is that the number of copies is only one: from sender to receiver. In case of a burst of
messages, you could also use a thread pool to send and receive the messages in parallel.
A typical implementation would be a mutex to protect the local queue together with a condvar to broadcast 'fill events'.
From the application API perspective, the mechanism can be completely transparent (local or remote).
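
As an illustration of the mutex-plus-condvar queue described above, here is a minimal sketch in portable POSIX C. It is
not taken from any QNX library: the names msg_t, msgq_t, msgq_init, msgq_try_send, msgq_receive and the fixed
MSGQ_CAPACITY are all hypothetical, and the choice to reject a message when the queue is full (rather than block the
sender) is an assumption based on the requirement that senders must not wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MSGQ_CAPACITY 8          /* hypothetical fixed queue depth */

typedef struct {
    int   type;                  /* hypothetical command code */
    void *payload;               /* only meaningful inside one address space */
} msg_t;

typedef struct {
    msg_t           slots[MSGQ_CAPACITY];
    size_t          head;        /* index of the oldest queued message */
    size_t          count;       /* number of queued messages */
    pthread_mutex_t lock;        /* protects head, count and slots */
    pthread_cond_t  not_empty;   /* signals 'fill events' to waiting readers */
} msgq_t;

void msgq_init(msgq_t *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Called by application threads. Does not wait for the receiver:
 * returns 0 if the message was queued, -1 if the queue is full. */
int msgq_try_send(msgq_t *q, msg_t m)
{
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < MSGQ_CAPACITY) {
        q->slots[(q->head + q->count) % MSGQ_CAPACITY] = m;
        q->count++;
        pthread_cond_broadcast(&q->not_empty);
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Called by the message thread. Blocks until a message is available. */
msg_t msgq_receive(msgq_t *q)
{
    msg_t m;
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)        /* loop guards against spurious wakeups */
        pthread_cond_wait(&q->not_empty, &q->lock);
    m = q->slots[q->head];
    q->head = (q->head + 1) % MSGQ_CAPACITY;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return m;
}
```

In this sketch the application threads only ever call msgq_try_send(), while the message thread loops on msgq_receive()
and forwards each message over whatever transport is in use. For a remote destination the payload would have to be
copied into the message rather than passed as a pointer.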

Another way to do non-blocking transfers could be reply-driven messaging. Here the messages are transferred by a
reply operation (which is non-blocking), and the remote application unblocks and consumes the message. This approach
also implies an extra thread on the receiver side, which does the initial MsgSend first and then stays reply-blocked,
waiting for a 'reply with the message'. The sender maintains slots of reply-blocked 'receivers'. The disadvantage here
is the number of threads necessary to get messages from many other senders. Typically, this is combined with or
replaced by the following approach.

It could be a 'notify - fetch' implementation. The sender fires a pulse (non-blocking) to the receiver, telling it,
'here is a message for you - please read it'. This is also how a typical resource manager implements support for the
ionotify() and select() libc functions (and mqueue :-).
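
To make the 'notify - fetch' idea concrete, here is a small portable sketch. It does NOT use QNX pulses: as a stand-in,
the notification is a single byte written to a non-blocking pipe, and the 'fetch' is a read from the sender's outbox.
On QNX the notification would be a pulse and the fetch a normal message exchange. All names (link_t, link_init,
link_post, link_fetch, OUTBOX_SLOTS) are hypothetical, and the sketch is single-threaded; with real threads the outbox
would need a mutex.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define OUTBOX_SLOTS 4            /* hypothetical number of pending messages */

typedef struct {
    int notify_rd;                /* receiver waits here for notifications */
    int notify_wr;                /* sender writes one byte per message */
    int outbox[OUTBOX_SLOTS];     /* messages stay with the sender until fetched */
    int used[OUTBOX_SLOTS];       /* 1 while a slot holds an unfetched message */
} link_t;

int link_init(link_t *l)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    /* Non-blocking write end, so notifying can never stall the sender. */
    if (fcntl(fds[1], F_SETFL, O_NONBLOCK) != 0)
        return -1;
    memset(l, 0, sizeof *l);
    l->notify_rd = fds[0];
    l->notify_wr = fds[1];
    return 0;
}

/* Sender: park the message in a free slot, then notify with the slot number.
 * Returns 0 on success, -1 if no slot is free or the notification failed. */
int link_post(link_t *l, int value)
{
    for (int i = 0; i < OUTBOX_SLOTS; i++) {
        if (!l->used[i]) {
            unsigned char slot = (unsigned char)i;
            l->outbox[i] = value;
            l->used[i] = 1;
            if (write(l->notify_wr, &slot, 1) != 1) {
                l->used[i] = 0;   /* roll back: the receiver was not told */
                return -1;
            }
            return 0;
        }
    }
    return -1;
}

/* Receiver: block until notified, then fetch the message and free the slot. */
int link_fetch(link_t *l, int *out)
{
    unsigned char slot;
    if (read(l->notify_rd, &slot, 1) != 1)
        return -1;
    assert(slot < OUTBOX_SLOTS);
    *out = l->outbox[slot];
    l->used[slot] = 0;
    return 0;
}
```

The point of the split is that the notification itself is tiny and never blocks the sender, while the receiver decides
when to pull the actual data.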

If your application design is not strongly hierarchical (top -> down), the notification mechanism would be the best
solution to prevent deadlocks.

Please keep in mind that the connection management in an m:n communication relation grows quickly and can
become complex.
An RM covers all this complex context management code in a framework (named the 'resource manager framework').

Marc, please provide some numbers for a more detailed discussion.
