qnet: kif_server@899 kif_server_msgs Overflow(0) [and process not terminating as expected]
Davide Ancri
01/12/2017 5:36 AM
post117324
Hello everyone,
I am investigating a problem where a remote process (spawned on a remote qnet node via "on -f") does not terminate when
a SIGTERM is sent to on.
Here are the steps followed during a full run:
1) process procA is launched on node nodeA
2) procA spawns: on -f nodeB scriptB
3) scriptB ends with: exec procB
4) our test/simulation runs until there is a reason to terminate
5) procA sends a SIGTERM to its on child (spawned in step 2), expecting procB to be terminated
6) procA waits for SIGCHLD from on; if it does not arrive within a timeout, procA exits anyway
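Steps 2, 5 and 6 can be sketched roughly as follows. This is a minimal illustration in Python, not the actual procA code; the function names spawn_remote and stop_child and the 5-second timeout are hypothetical, and the "on -f" command line is taken from step 2.

```python
import signal
import subprocess


def spawn_remote(node, script):
    """Step 2: start `on -f <node> <script>` as a child (QNX only)."""
    return subprocess.Popen(["on", "-f", node, script])


def stop_child(child, timeout_s):
    """Steps 5 and 6: send SIGTERM, then wait up to timeout_s seconds.

    Returns the child's exit status, or None if it is still alive
    when the timeout expires.
    """
    child.send_signal(signal.SIGTERM)
    try:
        return child.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    on_child = spawn_remote("nodeB", "scriptB")
    # ... step 4: the test/simulation would run here ...
    status = stop_child(on_child, 5.0)  # 5.0 s is an assumed timeout
    print("on exit status:", status)
```

A result of None corresponds to the case where on is still alive when the timeout expires.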
What sometimes happens is: on remains alive with a pending SIGTERM, and procB does not terminate.
I noticed this strange message in the sloginfo output on nodeB, at the time of the termination attempt:
Jan 05 17:09:42 7 15 0 qnet(stats): kif_server@899 kif_server_msgs Overflow(0)
Could this explain why something went wrong in the signal delivery procA->on->nodeB->procB?
Is it a non-recoverable qnet error?
Is there a way to avoid this kind of error / critical situation?
I'm running an x86 system with QNX 6.5.0. If more information is needed, feel free to ask :)
Thanks in advance!
Davide