Kostadin Vardin(deleted)
05/12/2009 1:58 PM
post29286
|
Hi guys,
I have a client and a server process running on different QNET nodes.
The server is a resource manager, and the client sends it devctl() commands.
I call TimerTimeout() before each devctl() so that the client gets unblocked if the server does not respond.
When both run on the same machine this works fine, i.e. when the server stops, the client gets the kernel timeout and unblocks.
When I start the client and server on different nodes and disconnect the LAN cable to test the timeout, the client gets unblocked only after around 15 s, no matter what timeout I set. I tried different timeout values and the result is always the same.
Please find my test case attached.
Any hints / help?
Kosta
|
Colin Burgess(deleted)
05/12/2009 2:36 PM
post29297
|
Re: TimerTimeout and QNET
When your client gets unblocked by the timeout, an unblock pulse must be sent to the server first. Since
you pulled the plug on the network, the pulse cannot be delivered, and you have to wait for the qnet timeout.
Colin
--
cburgess@qnx.com
|
Kostadin Vardin(deleted)
05/12/2009 2:51 PM
post29299
|
Re: TimerTimeout and QNET
Hi Colin,
Please find attached the sloginfo output, which shows what QNET is reporting.
The first line shows tx_xmit_init_conn_pkt(): to nd17 on L4 1.
This is exactly the moment when I start my test case, connect to the server on the other node (apparently node 17), and start sending heartbeats.
The second line and everything after it were logged after I pulled out the LAN cable.
QNET immediately starts reporting l4_tx_timeout(), and as you can see this happens every 400 ms, until the line where it reports
- l4_tx_service(): exceeded 25 retries during....
and then we see some QOS messages. At this point my client gets unblocked.
My question is:
How can I get notified by QNET when it detects timeouts, so that I can use that as the unblocking event?
This is very critical, as we deal with surgery and I need to get unblocked in such a case. 400 ms is acceptable as a timeout.
Perhaps I can somehow set the number of retries to 1, and then I will be notified earlier.
Thanks for your help,
Kosta
|
|
|
Colin Burgess(deleted)
|
Re: TimerTimeout and QNET
|
Colin Burgess(deleted)
05/12/2009 2:54 PM
post29300
|
Re: TimerTimeout and QNET
The qnet gurus here may have a better idea, but it sounds like you want to reduce your qnet timeouts. Looking
at the usemsg for lsm-qnet.so the tx_retries option appears to be the one you are hitting.
Andrew?
Colin
--
cburgess@qnx.com
|
Xiaodan Tang(deleted)
05/12/2009 3:03 PM
post29301
|
RE: TimerTimeout and QNET
Yeah, pulling the cable is the worst-case scenario.
Two numbers decide how long it takes QNET to declare that the network is down:
1) periodic_ticks (by default this is 5, so we have a 200 ms tick)
2) tx_retries (how many retries QNET will do before it gives up; the default is 25)
So by default, if you have data being sent across the network (if you are not transmitting data, that's a different story), it takes about 5 seconds to declare that the other node is down, and every connection to that node is then cleaned up.
To meet your 400 ms, you may try reducing tx_retries to something like 1, or you can try setting periodic_ticks to 10 (100 ms tick) and adjusting tx_retries accordingly.
-xtang
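The arithmetic in the post above can be sketched in a few lines of portable C. This is only a rough model under stated assumptions: it treats periodic_ticks as "ticks per second" (which matches the 5 -> 200 ms and 10 -> 100 ms figures quoted above) and assumes one tick per retry. The helper names tick_ms() and detect_ms() are hypothetical and are not part of any QNX API; the delay actually observed in this thread (around 15 s) was longer than this model predicts, so treat the result as a lower-bound estimate rather than a guarantee.

```c
#include <stdio.h>

/* Assumption from the thread: periodic_ticks is the number of QNET
 * ticks per second, so 5 gives a 200 ms tick and 10 gives 100 ms. */
static unsigned tick_ms(unsigned periodic_ticks)
{
    return 1000u / periodic_ticks;
}

/* Rough estimate of the time before QNET gives up on a node,
 * assuming one tick elapses per retry (a simplification). */
static unsigned detect_ms(unsigned periodic_ticks, unsigned tx_retries)
{
    return tx_retries * tick_ms(periodic_ticks);
}

/* Convenience printer for comparing candidate settings. */
static void print_estimate(unsigned periodic_ticks, unsigned tx_retries)
{
    printf("periodic_ticks=%u tx_retries=%u -> about %u ms\n",
           periodic_ticks, tx_retries,
           detect_ms(periodic_ticks, tx_retries));
}
```

With the defaults, detect_ms(5, 25) evaluates to 5000 ms, the "about 5 seconds" mentioned above. Under the same model, detect_ms(10, 4) evaluates to 400 ms, which is one way of reading "set periodic_ticks to 10 and adjust tx_retries".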
|
Kostadin Vardin(deleted)
05/12/2009 3:14 PM
post29302
|
Re: RE: TimerTimeout and QNET
Thank you so much guys!!!
It looks like you are in the room next to me!
Excellent!
|
Yiry Estrinov
06/11/2009 12:27 AM
post31455
|
Hi,
I've got a problem with TimerTimeout().
I want a read() call to time out.
On QNX 6.4.0 and 6.4.1 my program never unblocks from the blocked state,
but the same code worked correctly on QNX 6.3.2.
Here's my code:
...
    fd = open("/dev/ttyq3", O_RDWR);   // may also be /dev/ser..
    if (fd == -1) perror("open");
    timeout = 1ULL;                    // TimerTimeout() takes the timeout in nanoseconds
    size = 10;
    flag = _NTO_TIMEOUT_REPLY;         // arm the timeout for the REPLY-blocked state only
    event.sigev_notify = SIGEV_UNBLOCK;
    TimerTimeout(CLOCK_REALTIME, flag, &event, &timeout, NULL);
    size = read(fd, buffer, size);     // the blocking call the timeout should interrupt
    printf("%d\n", size);
    if (size == -1) perror("read");
    close(fd);
}
Thanks,
Yiry Estrinov.
|