Forum Topic - A bug with MsgReadv in 6.3.0 SP2 kernel???: (13 Items)
   
A bug with MsgReadv in 6.3.0 SP2 kernel???  
Hi All,

I have a process locked up in a call to MsgReadv(), although the manual states that this function never blocks.

(gdb) she  pidin -p 12898399
     pid tid name               prio STATE       Blocked
12898399   1 ../bin/drm.bin      10o MUTEX       12898399-04 #1
12898399   2 ../bin/drm.bin      10o CONDVAR     816b89c
12898399   3 ../bin/drm.bin      10o RECEIVE     2
12898399   4 ../bin/drm.bin      10o RECEIVE     30
12898399   5 ../bin/drm.bin      10o RECEIVE     18
12898399   6 ../bin/drm.bin      10o RECEIVE     18
12898399   7 ../bin/drm.bin      10o RECEIVE     22
12898399   8 ../bin/drm.bin      10o RECEIVE     22
12898399   9 ../bin/drm.bin      10o RECEIVE     26
12898399  10 ../bin/drm.bin      10o RECEIVE     26
12898399  11 ../bin/drm.bin      10o RECEIVE     30
12898399  12 ../bin/drm.bin      10o RECEIVE     87
12898399  13 ../bin/drm.bin      10o RECEIVE     87
12898399  14 ../bin/drm.bin      10o RECEIVE     99
12898399  15 ../bin/drm.bin      10o RECEIVE     99
12898399  16 ../bin/drm.bin      10o RECEIVE     112
12898399  17 ../bin/drm.bin      10o RECEIVE     112
12898399  18 ../bin/drm.bin      10o RECEIVE     121
12898399  19 ../bin/drm.bin      10o RECEIVE     121
12898399  20 ../bin/drm.bin      10o RECEIVE     30
12898399  21 ../bin/drm.bin      10o RECEIVE     128
12898399  22 ../bin/drm.bin      10o RECEIVE     38
12898399  23 ../bin/drm.bin      10o RECEIVE     128
12898399  24 ../bin/drm.bin      10o RECEIVE     51
12898399  25 ../bin/drm.bin      10o RECEIVE     136
12898399  26 ../bin/drm.bin      16o RECEIVE     136
12898399  27 ../bin/drm.bin      10o RECEIVE     143
12898399  28 ../bin/drm.bin      10o RECEIVE     30
12898399  29 ../bin/drm.bin      10o RECEIVE     143
12898399  30 ../bin/drm.bin      10o RECEIVE     149
12898399  31 ../bin/drm.bin      10o RECEIVE     18
12898399  32 ../bin/drm.bin      10o RECEIVE     63
12898399  33 ../bin/drm.bin      10o RECEIVE     149
12898399  34 ../bin/drm.bin      10o RECEIVE     154
12898399  35 ../bin/drm.bin      10o RECEIVE     63
12898399  36 ../bin/drm.bin      10o RECEIVE     76
12898399  37 ../bin/drm.bin      10o RECEIVE     154
12898399  38 ../bin/drm.bin      10o RECEIVE     160
12898399  39 ../bin/drm.bin      10o RECEIVE     26
12898399  40 ../bin/drm.bin      10o RECEIVE     76
12898399  41 ../bin/drm.bin      10o RECEIVE     160
12898399  42 ../bin/drm.bin      10o RECEIVE     38
12898399  43 ../bin/drm.bin      10o RECEIVE     165
12898399  44 ../bin/drm.bin      10o RECEIVE     165
12898399  45 ../bin/drm.bin      10o RECEIVE     51
12898399  46 ../bin/drm.bin      10o RECEIVE     170
12898399  47 ../bin/drm.bin      10o RECEIVE     170
12898399  48 ../bin/drm.bin      10o RECEIVE     175
12898399  49 ../bin/drm.bin      10o RECEIVE     175
12898399  50 ../bin/drm.bin      10o RECEIVE     180
12898399  51 ../bin/drm.bin      10o RECEIVE     180
12898399  52 ../bin/drm.bin      10o RECEIVE     30
12898399  53 ../bin/drm.bin      10o RECEIVE     30
12898399  54 ../bin/drm.bin      12o MUTEX       12898399-64 #1
12898399  55 ../bin/drm.bin      12o MUTEX       12898399-64 #1
12898399  56 ../bin/drm.bin      10o RECEIVE     190
12898399  57 ../bin/drm.bin      10o RECEIVE     190
12898399  58 ../bin/drm.bin      10o RECEIVE     195
12898399  59 ../bin/drm.bin      10o RECEIVE     195
12898399  60 ../bin/drm.bin      10o RECEIVE     200
12898399  61 ../bin/drm.bin      10o RECEIVE     200
12898399  62 ../bin/drm.bin      12o MUTEX       12898399-64 #1
12898399  63 ../bin/drm.bin      10o RECEIVE     195
12898399  64 ../bin/drm.bin      16o REPLY       12898399
12898399  65 ../bin/drm.bin      16o MUTEX       12898399-64 #1
12898399  66 ../bin/drm.bin      10o RECEIVE     18
12898399  67 ../bin/drm.bin      12o MUTEX       12898399-64 #1
12898399  68 ../bin/drm.bin      10o RECEIVE     18
12898399  69 ../bin/drm.bin      16o RECEIVE     136
12898399  70...
Re: A bug with MsgReadv in 6.3.0 SP2 kernel???  
I think you are confusing gdb thread id with kernel thread id, perhaps?

Re: A bug with MsgReadv in 6.3.0 SP2 kernel???  
No, GDB thread 64 is also kernel thread 64

(gdb) p {CUNIXThreadDescriptor}m_pHostThread
$5 = {m_pThreadID = 0x40, m_lRefCount = 2, m_ai_RunningMutexStorage = {65538, 64, -2147483647, 1348403264}}

Here m_pThreadID = 0x40 is the thread ID returned by pthread_self() when a pool thread starts executing.
RE: A bug with MsgReadv in 6.3.0 SP2 kernel???  
A REPLY block "on itself" is usually a sign that a server thread
called MsgRead() on a client message from a remote node.
Is this the case?

-xtang

Re: A bug with MsgReadv in 6.3.0 SP2 kernel???  
Looking at the code, that seems to be the case (ker_msg_readv -> lookup_rcvid -> net_send2),
but why is pidin not reporting the pid@<node> information?


-- 
cburgess@qnx.com
RE: A bug with MsgReadv in 6.3.0 SP2 kernel???  
On the client side, if you MsgSend() across Qnet, pidin shows the thread
blocked on server_pid@server.

On the server side, it is Qnet (a vthread) sending to the real server (NETCON);
when the server decides to MsgRead(), it is blocked on the same
connection, hence pidin reports it blocked on itself (cop->chn->proc).
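
Until pidin prints something more meaningful, the "REPLY blocked on its own pid" pattern can be spotted by post-processing pidin output. The following is only a rough sketch: it assumes the column layout seen in the listings in this thread (pid, tid, name, prio, STATE, Blocked), and the function name reply_blocked_on_self is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if a single pidin thread line shows a thread that is
 * REPLY blocked on its own process id, 0 otherwise.
 * Assumed column layout (taken from the listings in this thread):
 *   pid tid name prio STATE Blocked
 */
int reply_blocked_on_self(const char *line)
{
    long pid, tid, target;
    char name[128], prio[16], state[16], blocked[64];
    char *end;

    assert(line != NULL);

    /* Header lines and lines without a Blocked column do not match. */
    if (sscanf(line, "%ld %ld %127s %15s %15s %63s",
               &pid, &tid, name, prio, state, blocked) != 6)
        return 0;

    if (strcmp(state, "REPLY") != 0)
        return 0;

    /* The Blocked column must be a bare number equal to the thread's own pid. */
    target = strtol(blocked, &end, 10);
    return end != blocked && *end == '\0' && target == pid;
}
```

Feeding it the line for thread 64 from the first listing flags it, while the MUTEX and RECEIVE lines around it do not. Whether every such hit really is a server thread reading a remote client's message is an assumption; the check only narrows down which threads to look at.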

-xtang
 

Re: A bug with MsgReadv in 6.3.0 SP2 kernel???  
(qnet makes my brain squishy)

Can pidin detect this and print something more meaningful?


-- 
cburgess@qnx.com
Re: RE: A bug with MsgReadv in 6.3.0 SP2 kernel???  
Yes, the request came from another node.

(gdb) fr 6
#6  0x080ad8b1 in CResourceManager::ProcessRequest (this=0x83dca28, pdcRequest=0x8425288) at /home/masha/rel6776/Src/shared/common/rm.cpp:774
774             dispatch_handler(pdcRequest);
(gdb) p pdcRequest[0]
$6 = {resmgr_context = {rcvid = 8650979, info = {nd = 7, srcnd = 155, pid = 14934133, tid = 24, chid = 185, scoid = 1073742051, coid = 12, msglen = 16,
      srcmsglen = 16, dstmsglen = 2147483647, priority = 12, flags = 256, reserved = 0}, msg = 0x84252e4, dpp = 0x8347888, id = -1, tid = 0, msg_max_size = 4104,
    status = 0, offset = 0, size = 4, iov = {{iov_base = 0x84252e4, iov_len = 60}}}, message_context = {rcvid = 8650979, info = {nd = 7, srcnd = 155,
      pid = 14934133, tid = 24, chid = 185, scoid = 1073742051, coid = 12, msglen = 16, srcmsglen = 16, dstmsglen = 2147483647, priority = 12, flags = 256,
      reserved = 0}, msg = 0x84252e4, dpp = 0x8347888, id = -1, tid = 0, msg_max_size = 4104, status = 0, offset = 0, size = 4, iov = {{iov_base = 0x84252e4,
        iov_len = 60}}}, select_context = {rcvid = 8650979, info = {msginfo = {nd = 7, srcnd = 155, pid = 14934133, tid = 24, chid = 185, scoid = 1073742051,
        coid = 12, msglen = 16, srcmsglen = 16, dstmsglen = 2147483647, priority = 12, flags = 256, reserved = 0}, siginfo = {si_signo = 7, si_code = 155,
        si_errno = 14934133, __data = {__pad = {24, 185, 1073742051, 12, 16, 16, 2147483647}, __proc = {__pid = 24, __pdata = {__kill = {__uid = 185, __value = {
                  sival_int = 1073742051, sival_ptr = 0x400000e3}}, __chld = {__utime = 185, __status = 1073742051, __stime = 12}}}, __fault = {__fltno = 24,
            __fltip = 0xb9, __addr = 0x400000e3}}}}, msg = 0x84252e4, dpp = 0x8347888, fd = -1, tid = 0, reserved = 4104, flags = 0, reserved2 = {0, 4}, iov = {{
        iov_base = 0x84252e4, iov_len = 60}}}, sigwait_context = {signo = 8650979, info = {msginfo = {nd = 7, srcnd = 155, pid = 14934133, tid = 24, chid = 185,
        scoid = 1073742051, coid = 12, msglen = 16, srcmsglen = 16, dstmsglen = 2147483647, priority = 12, flags = 256, reserved = 0}, siginfo = {si_signo = 7,
        si_code = 155, si_errno = 14934133, __data = {__pad = {24, 185, 1073742051, 12, 16, 16, 2147483647}, __proc = {__pid = 24, __pdata = {__kill = {
                __uid = 185, __value = {sival_int = 1073742051, sival_ptr = 0x400000e3}}, __chld = {__utime = 185, __status = 1073742051, __stime = 12}}},
          __fault = {__fltno = 24, __fltip = 0xb9, __addr = 0x400000e3}}}}, msg = 0x84252e4, dpp = 0x8347888, status = -1, tid = 0, set = {bits = {4104, 0}},
    reserved2 = {0, 4}, iov = {{iov_base = 0x84252e4, iov_len = 60}}}}
(gdb) x/16b pdcRequest[0].resmgr_context.msg
0x84252e4:      0x01    0x01    0x10    0x00    0x05    0x00    0x00    0x00
0x84252ec:      0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
(gdb)



Here is also the state of process 14934133 on the remote node:

# pidin -p 14934133
     pid tid name               prio STATE       Blocked
14934133   1 ../bin/axis.bin     10o MUTEX       14934133-06 #1
14934133   2 ../bin/axis.bin     10o CONDVAR     83446b4
14934133   3 ../bin/axis.bin     10o CONDVAR     8326e98
14934133   4 ../bin/axis.bin     10o RECEIVE     2
14934133   5 ../bin/axis.bin     10o RECEIVE     5
14934133   6 ../bin/axis.bin     10o CONDVAR     8366d74
14934133   7 ../bin/axis.bin     10o CONDVAR     837df2c
14934133   8 ../bin/axis.bin     10o CONDVAR     837d0bc
14934133   9 ../bin/axis.bin     14o CONDVAR     837cdcc
14934133  10 ../bin/axis.bin     21o RECEIVE     21
14934133  11 ../bin/axis.bin     10o CONDVAR     837c66c
14934133  12 ../bin/axis.bin     10o CONDVAR     83612c4
14934133  13 ../bin/axis.bin     15o NANOSLEEP
14934133  14 ../bin/axis.bin     17o RECEIVE     25
14934133  15 ../bin/axis.bin     17o RECEIVE     25
14934133  16 ../bin/axis.bin     17o RECEIVE     25
14934133  17 ../bin/axis.bin...
RE: RE: A bug with MsgReadv in 6.3.0 SP2 kernel???  
The message looks like an _IO_READ (0x0101) with nbytes 5;
I am not sure why this would cause a MsgRead().

That said, blocking in MsgRead() forever is also not the right
thing to happen...
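
For anyone who wants to check that reading of the dump, here is a small sketch that decodes the first 8 bytes shown by x/16b. It assumes a little-endian sender and a header made of a 16-bit type, a 16-bit combine_len and a 32-bit nbytes, which is how I understand the start of the _IO_READ request to be laid out; the struct and function names below are hypothetical and not taken from any QNX header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value quoted in this thread for the _IO_READ message type. */
#define IO_READ_TYPE 0x0101u

/* Hypothetical mirror of the leading fields of an _IO_READ request. */
struct io_read_hdr {
    uint16_t type;         /* message type */
    uint16_t combine_len;  /* length field for combined messages */
    int32_t  nbytes;       /* number of bytes the client asked to read */
};

/*
 * Decode the header from raw little-endian bytes.
 * Returns 0 on success, -1 if fewer than 8 bytes are available.
 */
int decode_io_read_hdr(const uint8_t *buf, size_t len, struct io_read_hdr *out)
{
    uint32_t n;

    assert(buf != NULL && out != NULL);
    if (len < 8)
        return -1;

    out->type        = (uint16_t)(buf[0] | (buf[1] << 8));
    out->combine_len = (uint16_t)(buf[2] | (buf[3] << 8));
    n = (uint32_t)buf[4] | ((uint32_t)buf[5] << 8) |
        ((uint32_t)buf[6] << 16) | ((uint32_t)buf[7] << 24);
    out->nbytes = (int32_t)n;
    return 0;
}
```

Applied to the bytes 0x01 0x01 0x10 0x00 0x05 0x00 0x00 0x00 from the dump, this gives type 0x0101, combine_len 0x10 and nbytes 5, which also agrees with msglen = 16 in the message info.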

-xtang


Re: A bug with MsgReadv in 6.3.0 SP2 kernel???  
The MsgReadv is in _resmgr_unblock_handler.

The thread 24 on the client should still be REPLY blocked on the server at this point... I would think!

Re: A bug with MsgReadv in 6.3.0 SP2 kernel???  
I'm not sure whether this is related, and I don't know whether the event I'm about to describe actually took place, but 
I have a mechanism for unblocking threads from requests to the server: the client internally sends a SIGINT signal to 
the thread with pthread_kill(). I know that this should send an unblock pulse to the server and that the server must 
unblock the request explicitly. Maybe the pulse was lost somehow, or was not processed, or was processed incorrectly 
while the request was entering dispatch or was just waiting on a mutex before dispatch_handler() for another thread to 
finish using the server objects.
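
To make the client-side mechanism concrete, here is a minimal sketch of it in plain POSIX C. This is not the actual application code: a pipe stands in for the connection to the server, the names are hypothetical, and on a generic POSIX system it only shows read() failing with EINTR. The unblock pulse and the server's explicit unblock described above are Neutrino-specific and are not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

struct read_result {
    int        fd;     /* descriptor the worker reads from */
    ssize_t    nread;  /* return value of read() */
    int        err;    /* errno captured right after read() */
    atomic_int done;   /* set to 1 once read() has returned */
};

/* Empty handler: its only purpose is to interrupt the blocking call. */
static void on_sigint(int signo) { (void)signo; }

static void *reader_thread(void *arg)
{
    struct read_result *r = arg;
    char buf[5];

    r->nread = read(r->fd, buf, sizeof buf);  /* blocks: nothing is ever written */
    r->err = errno;
    atomic_store(&r->done, 1);
    return NULL;
}

/* Starts a thread blocked in read() and unblocks it with pthread_kill().
 * Returns 0 and fills *r on success, -1 on a setup failure. */
int interrupt_blocked_read(struct read_result *r)
{
    int fds[2];
    pthread_t tid;
    struct sigaction sa, old;
    struct timespec pause_1ms = { 0, 1000000L };

    assert(r != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  /* no SA_RESTART, so read() is not restarted */
    if (sigaction(SIGINT, &sa, &old) != 0)
        return -1;
    if (pipe(fds) != 0) {
        sigaction(SIGINT, &old, NULL);
        return -1;
    }

    r->fd = fds[0];
    r->nread = 0;
    r->err = 0;
    atomic_init(&r->done, 0);
    if (pthread_create(&tid, NULL, reader_thread, r) != 0) {
        close(fds[0]);
        close(fds[1]);
        sigaction(SIGINT, &old, NULL);
        return -1;
    }

    /* Resend until the worker reports that read() returned; this covers
     * the case where the first signal arrives before read() is entered. */
    while (!atomic_load(&r->done)) {
        pthread_kill(tid, SIGINT);
        nanosleep(&pause_1ms, NULL);
    }

    pthread_join(tid, NULL);
    close(fds[0]);
    close(fds[1]);
    sigaction(SIGINT, &old, NULL);
    return 0;
}
```

After interrupt_blocked_read() returns 0, the result should show nread == -1 and err == EINTR. The resend loop is a design choice of this sketch to avoid a startup race; the real client may well send the signal only once.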

Re: A bug with MsgReadv in 6.3.0 SP2 kernel???  
Also, the client node runs 6.3.2 and does not have patch #0284 integrated into the kernel. Sorry, I forgot to 
mention it.




Re: A bug with MsgReadv in 6.3.0 SP2 kernel???  
I can also confirm that thread 24 on the client was indeed calling read() with a 5-byte buffer. I analyzed the 
in-memory program logs and found the code the thread was executing. Unfortunately, the log level is not high enough, 
so I can't see the result of the operation directly. Most likely the read() was aborted by another thread closing 
the file handle.
