Forum Topic - can msgsendvs fail?: (2 Items)
   
can msgsendvs fail?  
Hi all,

This is my first post here, so I don't know if this is the correct place, but since this is a severe problem, any
place should be OK ;)

In our project we have a process with one thread that sends messages to another process, which receives messages and
pulses from several other processes. Once in a while a message is not sent correctly. I added TraceEvent calls to
the code we use and filled in some fixed values in our header to clarify the problem. The system must be under heavy
load for it to occur; more logging and more tracing seem to increase the chance of it happening.

Below are the code and the traces: first the successful send (dummy=0x000e1b0e), then the failed send
(dummy=0x000e1b0f).

At t:0xe55eccca one can see that MSG_SENDV/11 sends a message containing our dummy value correctly.
In the failed send, at t:0xe570fc8a, the dummy value does not appear and an incorrect message is sent.
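
To make the comparison of the two traces less error-prone, the three 32-bit words that traceprinter shows after msg: can be split back into header fields. The following is only a sketch: the field widths (16-bit type, 8-bit system, 8-bit align, then 32-bit dummy and sizeRequest) are an assumption inferred from the word 0x78561234 on this little-endian x86 target, and the names decoded_header_t, decode_trace_words and check_successful_trace are hypothetical, not part of our code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed header layout, inferred from the trace; the real definition
 * is not shown in this post. */
typedef struct {
    uint16_t type;
    uint8_t  system;
    uint8_t  align;
    uint32_t dummy;
    uint32_t sizeRequest;
} decoded_header_t;

/* Split the three 32-bit words printed by traceprinter into header fields.
 * On a little-endian machine the first field sits in the low bits of w0. */
static decoded_header_t decode_trace_words(uint32_t w0, uint32_t w1, uint32_t w2)
{
    decoded_header_t h;
    h.type        = (uint16_t)(w0 & 0xFFFFu);
    h.system      = (uint8_t)((w0 >> 16) & 0xFFu);
    h.align       = (uint8_t)((w0 >> 24) & 0xFFu);
    h.dummy       = w1;
    h.sizeRequest = w2;
    return h;
}

/* Self-check against the words shown for the successful send. */
static void check_successful_trace(void)
{
    decoded_header_t h = decode_trace_words(0x78561234u, 0x000e1b0eu, 0x00000008u);
    assert(h.type == 0x1234 && h.system == 0x56 && h.align == 0x78);
    assert(h.dummy == 0x000e1b0eu && h.sizeRequest == 8u);
}
```

Running the same decoding on the words of the failed send shows at a glance which fields differ from the values that were filled in.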

My question is:
Is it possible that MsgSendvs sometimes sends an incorrect message, wrongly reads the IOV structure, or ....?
I am clueless about what to do next, since I could not reproduce the problem with a simple test program (just sending
and receiving), only in our system, which is under heavy load (60% average CPU usage).

I hope this information is sufficient; if there are any questions, please ask.

Kind regards,
Erwin Chabot

<versioninfo>
TRACEPRINTER version 1.02
TRACEPARSER LIBRARY version 1.02
 -- HEADER FILE INFORMATION -- 
       TRACE_FILE_NAME:: /root/tracefinal
            TRACE_DATE:: Wed Mar 12 10:38:00 2008
       TRACE_VER_MAJOR:: 1
       TRACE_VER_MINOR:: 01
   TRACE_LITTLE_ENDIAN:: TRUE
        TRACE_ENCODING:: 16 byte events
       TRACE_BOOT_DATE:: Tue Mar 11 17:24:13 2008
  TRACE_CYCLES_PER_SEC:: 2392482800
         TRACE_CPU_NUM:: 1
         TRACE_SYSNAME:: QNX
        TRACE_NODENAME:: I2
     TRACE_SYS_RELEASE:: 6.3.2
     TRACE_SYS_VERSION:: 2006/03/16-14:19:50EST
         TRACE_MACHINE:: x86pc
     TRACE_SYSPAGE_LEN:: 2280
</versioninfo>

<mycode>

//fill the header with some values to show up in the traces
//message is a union of a _pulse type and our header type
message.header.type        = 0x1234;
message.header.system      = 0x56;
message.header.align       = 0x78;
message.header.dummy       = dummy;
message.header.sizeRequest = sizeRequestData;

// trace dummy, requestid and processid
TraceEvent(_NTO_TRACE_INSERTSUSEREVENT, 111, pConnection->channelId, dummy );
TraceEvent(_NTO_TRACE_INSERTSUSEREVENT, 111, pConnection->channelId, requestId );
TraceEvent(_NTO_TRACE_INSERTSUSEREVENT, 111, pConnection->channelId, getpid() );

// fill IOV struct and send the message
SETIOV(&iov[0], &message, sizeof(message.header));
SETIOV(&iov[1], pRequestData, sizeRequestData);
int result = MsgSendvs(channelId, iov, 2, pReplyData, sizeReplyData);
</mycode>
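
One possible way to narrow this down is to look at exactly the bytes the IOV array describes just before the send. The sketch below gathers the parts into one flat buffer and checks that the second 32-bit word is the expected dummy value. It is written against the portable struct iovec so it can be tried outside the target; the assumption is that the iov_t filled by SETIOV carries the same base/length pair. The helpers gather_iov and iov_carries_dummy are hypothetical names, and the 12-byte header with dummy at offset 4 is inferred from the trace, not from a shown definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec (iov_base, iov_len) */

/* Copy the bytes described by an iovec array into one flat buffer, in order.
 * Returns the number of bytes copied, truncated to cap. */
static size_t gather_iov(const struct iovec *iov, int parts, void *out, size_t cap)
{
    size_t off = 0;
    assert(iov != NULL && out != NULL);
    for (int i = 0; i < parts && off < cap; i++) {
        size_t n = iov[i].iov_len;
        if (n > cap - off)
            n = cap - off;              /* do not overrun the output buffer */
        if (n > 0)
            memcpy((uint8_t *)out + off, iov[i].iov_base, n);
        off += n;
    }
    return off;
}

/* Pre-send check: returns 1 if the gathered message starts with a 12-byte
 * header whose second word equals the expected dummy, 0 otherwise. */
static int iov_carries_dummy(const struct iovec *iov, int parts, uint32_t dummy)
{
    uint32_t words[3];
    if (gather_iov(iov, parts, words, sizeof words) < sizeof words)
        return 0;                       /* message shorter than the header */
    return words[1] == dummy;
}
```

Calling iov_carries_dummy(iov, 2, dummy) directly before MsgSendvs and emitting an extra trace event when it returns 0 would show whether the buffer was already wrong before the kernel call. A passing check does not rule out another thread changing the buffer between the check and the send.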

<successfull>
t:0xe55eb12e CPU:00 KER_EXIT:TRACE_EVENT/01 ret_val:0x00000000 empty:0x00000000

t:0xe55eb9ea CPU:00 USREVENT:EVENT:111, d0:0x4000000b d1:0x000e1b0e
t:0xe55ebf2a CPU:00 USREVENT:EVENT:111, d0:0x4000000b d1:0x00000005
t:0xe55ec68a CPU:00 USREVENT:EVENT:111, d0:0x4000000b d1:0x0099703f
t:0xe55eccca CPU:00 KER_CALL:MSG_SENDV/11
                      coid:0x4000000b
                    sparts:2
                    rparts:0
                       msg:"" (0x78561234 0x000e1b0e 0x00000008)
t:0xe55ed60a CPU:00 COMM    :SND_MESSAGE   rcvid:0x0000000e pid:10055729
t:0xe55ee2fe CPU:00 THREAD  :THREPLY       pid:10055743 tid:1 priority:15 policy:2
t:0xe55ee6a2 CPU:00 THREAD  :THRUNNING     pid:10055729 tid:2 priority:15 policy:2
t:0xe55eea8a CPU:00 COMM    :REC_MESSAGE   rcvid:0x0000000e pid:10055729
t:0xe55ef306 CPU:00 KER_EXIT:MSG_RECEIVEV/14
                     rcvid:0x0000000e
                      rmsg:"" (0x78561234 0x000e1b0e 0x00000008)
                  info->nd:0
               info->srcnd:0
                ...
Re: can msgsendvs fail?  
On Wed, Mar 12, 2008 at 6:04 AM, E Chabot <erwin.chabot@vanderlande.com>
wrote:

[ Trace bits snipped ]

> My question is:
> is it be possible that msgsendvs sometimes sends an incorrect message?
> wrongly reads the IOV structure, or ....
> I am clueless on what to do next since I could not reproduce the problem
> with a simple test program(just sending and receiving), but only in our
> system which is under heavy load (60% average cpu usage)
>
> I hope this information is sufficient if there are any questions please
> ask.
>
I suspect that you have a bug in your program.  It is highly unlikely that
there is a problem with the MsgSend[v] call and more likely (i.e. 99% certain)
that something is not correct with what you are doing in your code, perhaps
latent corruption of the data elsewhere.  Since this only shows up under
heavy load, I would suggest that the following problems may be present:
* Edge conditions in your storage or queuing schemes that are corrupting data
* Thread race conditions on processing and handling the data
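
To illustrate the thread race case, here is a minimal sketch of how a message buffer shared between senders can end up carrying someone else's data. It is not taken from the poster's code: header_t, fake_send and the two send functions are hypothetical, and the preemption is simulated deterministically in a single thread rather than with real threads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 12-byte header: marker word, dummy, request size. */
typedef struct {
    uint32_t marker;
    uint32_t dummy;
    uint32_t size;
} header_t;

/* One buffer reused by every sender: this is the unsafe part. */
static header_t shared_msg;

/* Stand-in for the kernel copying the message at send time. */
static header_t fake_send(const header_t *msg)
{
    return *msg;
}

/* Unsafe pattern: sender A fills the shared buffer, sender B runs before
 * A reaches its send call, and A then sends B's data. */
static uint32_t send_shared_interleaved(uint32_t my_dummy, uint32_t other_dummy)
{
    shared_msg.marker = 0x78561234u;
    shared_msg.dummy  = my_dummy;       /* sender A fills the header */
    shared_msg.dummy  = other_dummy;    /* sender B preempts A under load */
    return fake_send(&shared_msg).dummy;
}

/* Safer pattern: each sender builds the message in its own stack buffer,
 * so the other sender's write cannot reach it. */
static uint32_t send_local(uint32_t my_dummy, uint32_t other_dummy)
{
    header_t mine  = { 0x78561234u, my_dummy, 0 };
    header_t other = { 0x78561234u, other_dummy, 0 };
    (void)other;                        /* B's buffer, untouched by A */
    return fake_send(&mine).dummy;
}

/* Self-check: the shared buffer delivers the wrong dummy, the local one does not. */
static void check_race_sketch(void)
{
    assert(send_shared_interleaved(0x000e1b0fu, 0x11111111u) != 0x000e1b0fu);
    assert(send_local(0x000e1b0fu, 0x11111111u) == 0x000e1b0fu);
}
```

If the message union in the real program is a global, a class member, or anything else reachable from more than one thread, either give each sender its own buffer as in send_local or hold a mutex from the first header write until the send call returns.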

Without seeing your code, it is hard to speculate any further.  However, one
thing is for sure: traces don't lie, so if you are getting bad data in your
MsgSend, it is your program putting it there.

Thomas