QNX RTOS v4

Forum Topic - alive command: (12 Items)
   
 
 
alive command  
A customer has a QNX node that locks up on a 16-node QNX network. We cannot cd to the locked-up node from any other 
node, yet when the alive command is issued from another node, it reports the locked-up node as "UP".

This does not make sense to us, and we are wondering about possible causes and how to troubleshoot this.

- Thanks,

    Kevin
Re: alive command  
Hi Kevin,

Please attach output of:

# sin ver > sin_ver.txt
# sin arg > sin_arg.txt
# sin ti > sin_time.txt
# licinfo -a > licinfo.txt

from a node on your 16-node network and from the PC on which you run alive. 
What do you mean by "locked up node"? What exactly do you do? Please 
describe your actions.

-- 
Respectfully,
Oleg

Re: alive command  
Hi Oleg:

Kevin,
Attached is the requested output. Node 3 was the one that locked up. Node 15 was used for alive and other commands.
"Locked up node": 
	- Screen frozen, no data or timer changes visible, no keyboard or mouse input accepted. 
	- Other nodes show this node as alive (UP). 
Actions from another node (15):
	- "on -f 3 sin" doesn't return to the prompt; there is no output
	- Ethernet cable disconnected from node 3
	- "on -f 3 sin" returns 
	- "alive" shows node as DOWN
	- Node 3 reconnected to the network
	- "alive" shows node as "UP"
	- "ls //3/tmp" doesn't return to the prompt; there is no output
	- Node 3 is rebooted.
Attachment: Compressed file test3.zip 9.7 KB
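The manual checks above (a remote command that never returns while alive still reports "UP") could be scripted so that a hung command is detected instead of blocking the prompt. The following is only a minimal sketch, not anything from the attached output: it assumes a host where both a Python interpreter and the utilities named in this thread are available on the PATH, and the function names probe and check_node and the 5-second timeout are hypothetical choices. It does not parse alive output, because its format is not shown in this thread.

```python
import subprocess


def probe(command, timeout_s=5.0):
    """Run a command and classify the outcome as 'ok', 'failed', or 'hung'.

    'hung' means the command did not finish within timeout_s seconds,
    which is the symptom described for "on -f 3 sin" and "ls //3/tmp".
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "hung"
    except OSError:
        # The command could not be started at all (for example, not found).
        return "failed"
    return "ok" if result.returncode == 0 else "failed"


def check_node(node, timeout_s=5.0):
    """Repeat the two remote checks from this thread against one node.

    The command lines are copied from the post above; whether they behave
    the same way on another installation is an assumption.
    """
    checks = {
        "remote sin": ["on", "-f", str(node), "sin"],
        "remote ls": ["ls", "//%s/tmp" % node],
    }
    return {label: probe(cmd, timeout_s) for label, cmd in checks.items()}
```

Calling check_node(3) from a healthy node would return a small dictionary such as one entry per check; a node that alive reports as "UP" while both checks come back "hung" matches the behaviour described in this thread.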
Re: alive command  
Hi Kevin,

What is causing node 3 to lock up?

-- 
Respectfully,
Oleg

Re: alive command  
Hi Oleg:

   At this point we are not sure what is causing the lockup. We 
suspect a faulty network connection somewhere, but netinfo does not show 
any error counts that are growing.

   - Kevin



In reply to Oleg Bolshakov, 04/02/2010 08:29 AM.


Re: alive command  
We finally have a system in our lab that fails this way: a LAN of 5 HP xw4600 nodes, all using 160 GB SATA HDs with 
IDE emulation. On one of the nodes Fsys.atapi crashes, which creates the previously described problem. 
Surprisingly, the problem started when the system was reloaded using the QNX Suite 2009 instead of the previous 
QNX 4.25G version. The failing node has a Samsung HD; the other nodes have Seagate HDs. The specs of both HD types are 
identical. 
Re: alive command  
Hi Zbigniew,

If I understand you correctly, Fsys.atapi crashes only on the node with the 
Samsung HD, and then this node is no longer reachable via the network. If so, 
I think the easiest way to fix your issue is to replace the Samsung HD with 
another HD.

You can also test how your system works with the QNX 4 Product Suite CD 
2010 EXPERIMENTAL:

http://community.qnx.com/sf/wiki/do/viewPage/projects.qnx4/wiki/SoftwareUpdates2010

Does the issue occur with the QNX 4 2010 CD?

-- 
Respectfully,
Oleg

Re: alive command  
Hi Oleg:

   Thanks. We have not yet tried the new QNX 4 Product Suite CD 2010, 
but I will try it as well.
   We did notice that the hard drive seems to work with the older QNX 4.25G, 
which was using Fsys.eide.

   - Kevin
 



In reply to Oleg Bolshakov, 02/22/2011 08:54 AM.


Re: alive command  
Hi Kevin,

If Fsys.atapi doesn't work correctly with your hardware but Fsys.eide 
does, then you can use Fsys.eide instead of Fsys.atapi. 
The 2009 QNX4 CD contains Fsys.eide as well.

To resolve the issue with the Samsung HD, we would need your hardware in 
order to reproduce the issue.

-- 
Respectfully,
Oleg

Re: alive command  
Hi Oleg:

   Thanks. That is another possible option to consider.
   We are currently testing a BIOS upgrade for the HP xw4600 computer that 
has the Samsung HD, and so far we have not seen the crash.
   We have some test scripts that we use to trigger the problem, and so far so 
good. We are continuing to monitor this scenario.
 
   NOTE:

     We have not seen this issue on the HP Z400, which is the latest HP 
computer that we have validated.
     The HP xw4600 is no longer being produced, so this issue really only 
pertains to existing HP xw4600s out in the field that have this suspect 
Samsung HD.
     The other HD that was used in the HP xw4600, a Seagate drive, works 
well.

    Just wondering whether you have heard of any similar problems with 
Fsys.atapi?

    - Thanks,
        Kevin 



In reply to Oleg Bolshakov, 02/23/2011 01:42 PM.


Re: alive command  
Hi Kevin,

I haven't heard about similar problems with the Fsys.atapi.

-- 
Respectfully,
Oleg

Re: alive command  
Hi Oleg:

   Just wanted to let you know that we have tried the 2010 Experimental CD 
as well, and it did not resolve the issue either.
   As mentioned, we are running tests with an upgraded BIOS for the HP xw4600.

   - Kevin





In reply to Oleg Bolshakov, 02/23/2011 01:42 PM.