Forum Topic - QNX 6.5.0 io-pkt-v4 crash: (15 Items)
   
QNX 6.5.0 io-pkt-v4 crash  
We are using the base 6.5.0 release (not SR1) on an x86 target.  Occasionally, io-pkt-v4 will crash.  I've managed to
capture slogger output and a .core file of the crash.  The crash happens after between one and three days of operation.
We are reluctant to upgrade to SR1 unless it will fix the problem.  An upgrade would require a lot of regression testing
and revalidation, and we would like to avoid this.  Here are the details of the crash.  If this is a known issue, please
let me know how to avoid it.

Slogger details:
Apr 23 18:21:39    5    21     0 run fault pid 20491 tid 2 signal 11 code 1 ip 0x8085844 proc/boot/io-pkt-v4

/qpm/bin/io-pkt-v4.core:
 processor=X86 num_cpus=1
  cpu 1 cpu=586 name=Vortex86 SoC 586 F5M2S2 speed=800
   flags=0xc000001f FPU MMU CPUID RDTSC INVLPG WP BSWAP
 cyc/sec=800223700 tod_adj=1461272031260566669 nsec=163668191934739 inc=999847
 boot=2580547226 epoch=1970 intr=0
 rate=838095345 scale=-15 load=1193
   MACHINE="x86pc" HOSTNAME="localhost"
 pid=20491 parent=1 child=0 pgrp=20491 sid=1
 flags=0x003210 umask=0 base_addr=0x8048000 init_stack=0x8047ea0
 ruid=0 euid=0 suid=0  rgid=0 egid=0 sgid=0
 ign=0000000006800000 queue=ff00000000010000 pending=0000000000000000
 fds=5 threads=2 timers=1 chans=19
 canstub=0xb0320680 sigstub=0xb031a30c
 thread 1
  ip=0xb033e262 sp=0x8047dcc stkbase=0x7fc7000 stksize=528384
  state=SIGWAITINFO flags=80000000 last_cpu=1 timeout=00000000
  pri=21 realpri=21 policy=RR
 thread 2 SIGNALLED-SIGSEGV code=1 MAPERR refaddr=0 fltno=11
  ip=0x8085844 sp=0x7fa1528 stkbase=0x7fa6000 stksize=135168
  state=STOPPED flags=84000000 last_cpu=1 timeout=00000000
  pri=14 realpri=14 policy=RR
Re: QNX 6.5.0 io-pkt-v4 crash  
Can you get the backtrace from the corefile?
Re: QNX 6.5.0 io-pkt-v4 crash  
I don't know how to get the backtrace info, but here is the core file.
Attachment: Text io-pkt-v4.core 3.13 MB
Re: QNX 6.5.0 io-pkt-v4 crash  
It's a crash in the devn-vortex.so driver when handling a request to add a multicast address. Here's a sanitised 
backtrace:

#0  segv_handler
#1  0xb031a32d in ?? ()
#2  0xb820e370 in ?? ()
#3  0xb820bbcf in ?? ()
#4  0xb8205297 in shim_filter
#5  0xb8202887 in shim_ioctl
#6  0x08072273 in in_addmulti

I don't think the devn-vortex.so is a QNX driver so you will have to contact the supplier of it for support with that. 
To help with decoding, devn-vortex.so is loaded in at 0xb8209000 so you should be able to objdump it and work out where 
it is crashing.
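
The offset arithmetic suggested above can be sketched as follows. This is only an illustration, assuming the shared object is linked at base address 0 (usual for a .so, but worth confirming against the section headers); the helper name `offset_in_library` is hypothetical. Only frames #2 and #3 lie at or above the load address, so those are the ones inside devn-vortex.so; frames #4 and #5 sit below it and so fall outside this library.

```python
# Load address of devn-vortex.so as reported in the thread.
LOAD_BASE = 0xB8209000

# Addresses of the two unresolved frames (#2 and #3) from the backtrace.
FRAMES = {2: 0xB820E370, 3: 0xB820BBCF}


def offset_in_library(addr, base=LOAD_BASE):
    """Return addr's offset from the load base, or None if addr is below it.

    Hypothetical helper; assumes the library's link-time base is 0.
    """
    if addr < base:
        return None
    return addr - base


offsets = {n: offset_in_library(a) for n, a in FRAMES.items()}
for n, off in sorted(offsets.items()):
    print(f"frame #{n}: offset {off:#x}")
```

The resulting offsets (0x5370 and 0x2bcf) are the values to look up in the disassembly, for example the output of `objdump -d devn-vortex.so`.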
Re: QNX 6.5.0 io-pkt-v4 crash  
Thank you for this very good information.  We have the source code to the devn-vortex driver and I'll be able to pin it 
down.
Re: QNX 6.5.0 io-pkt-v4 crash  
We had another io-pkt-v4 crash, this time on a port with the devnp-i82544.so driver.  I used the gdb backtrace command
and got the following:

#0  0x08085844 in segv_handler ()
#1  <signal handler called>
#2  0xb034bcfb in _resmgr_detach_id () from C:/QNX650/target/qnx6/x86/lib/libc.so.3
#3  0xb0335a52 in resmgr_detach () from C:/QNX650/target/qnx6/x86/lib/libc.so.3
#4  0x00000000 in ?? ()

I'm not sure what to make of this info, or if I'm using gdb correctly.  Any insight as to the cause of the crash?  Core 
file is attached.
Attachment: Text io-pkt-v4_031314014_5-4.core 3.18 MB
Re: QNX 6.5.0 io-pkt-v4 crash  
I can't make any sense of the core file, but why are you using the
devnp-i82544.so driver? This driver has been replaced by the
devnp-e1000.so driver. You can find the e1000 driver here:

devnp-e1000.so 
<http://community.qnx.com/sf/frs/do/downloadFile/projects.bsp/frs.network_driver_updates.latest_io_pkt_network_drivers_0/frs134039?dl=1>



On 2016-05-05, 2:35 PM, "Robert Murrell" <community-noreply@qnx.com> wrote:

>We had another io-pkt-v4 crash, this time on a port with the
>devnp-i82544.so driver.  I use gdb backtrace command and get the
>following:
>
>#0  0x08085844 in segv_handler ()
>#1  <signal handler called>
>#2  0xb034bcfb in _resmgr_detach_id () from
>C:/QNX650/target/qnx6/x86/lib/libc.so.3
>#3  0xb0335a52 in resmgr_detach () from
>C:/QNX650/target/qnx6/x86/lib/libc.so.3
>#4  0x00000000 in ?? ()
>
>I'm not sure what to make of this info, or if I'm using gdb correctly.
>Any insight as to the cause of the crash?  Core file is attached.

Re: QNX 6.5.0 io-pkt-v4 crash  
I found the root cause of the crashes.  After a day or so of normal running, dhcp.client will spontaneously request a
new address.  Most times, the server will reassign the same address.  Sometimes the application layers will not detect a
problem.  Other times they will detect a socket error but reconnect.  Other times the server will assign a new address
and the application layer will disconnect and remain so until manually reconfigured.  A very few times, the Ethernet
card driver will generate a segmentation fault and crash io-pkt-v4.

Our DHCP server provides leases for 24 hours.  It periodically sends ARP requests to see what addresses are still being
used and will reserve an address that is still in use even if the lease has expired.  I am using Wireshark to monitor
all activity through a hub (not a switch).  I see normal communications right up to the point of the DHCP request.

So, under what conditions will dhcp.client spontaneously request a new address?

P.S.  I want to thank everyone for being responsive concerning this older product.
Re: QNX 6.5.0 io-pkt-v4 crash  
dhcp.client will renew the lease at the time specified in the options if the server provides it. If the server doesn't 
provide a renewal time option then it will use half of the lease time as the renewal time.
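
As a rough illustration of the rule described above (not dhcp.client's actual code), the renewal point can be computed as below. The function name `renewal_delay` and its parameters are hypothetical; the renewal time option is DHCP option 58 (Renewal (T1) Time Value) in RFC 2132.

```python
def renewal_delay(lease_seconds, t1_seconds=None):
    """Seconds after the lease is obtained at which renewal starts.

    Hypothetical helper: uses the server-supplied renewal time (T1) when
    present, otherwise falls back to half of the lease time.
    """
    if t1_seconds is not None:
        return t1_seconds
    return lease_seconds // 2


LEASE_24H = 24 * 60 * 60  # the 24-hour lease mentioned in the thread

# No renewal time option from the server: renew at half the lease.
default_delay = renewal_delay(LEASE_24H)

# Server supplies an explicit renewal time of one hour (assumed value).
explicit_delay = renewal_delay(LEASE_24H, t1_seconds=3600)

print(default_delay / 3600, "hours")
print(explicit_delay / 3600, "hours")
```

With a 24-hour lease and no renewal time option, this puts the first renewal attempt about 12 hours after the lease was obtained.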

Doing renewal shouldn't crash your driver.

Regards,
Nick.
Re: QNX 6.5.0 io-pkt-v4 crash  
Thanks, Nick.  I never knew of this "feature" of the DHCP protocol.  Our vendor is looking into the devn-vortex driver
crash, and we can switch from the devnp-i82544 driver to the latest devnp-e1000 driver.  We can also restrict the use of
our devices to have infinite leases or static IP addresses.

One last question, though.  If dhcp.client gets the same address that it had been using, will it do anything to disrupt 
open TCP connections?  Our devices run a proprietary protocol server.  When an external device connects, it keeps the 
socket open indefinitely exchanging information.  We are seeing this connection broken sometimes when the renewal 
address is the same as the original lease address.
Re: QNX 6.5.0 io-pkt-v4 crash  
I looked a little closer at the Wireshark traces at the lease renewal and have found something odd.  I've attached a
partial Wireshark trace as a CSV file.  The first line shows 'AdlinkTe_0e:b0:e5' renewing its lease.  This is our device
running QNX 6.5.0 dhcp.client.  The second line shows the DHCP server ACKing the request.  The last line shows our
device issuing a DHCP Decline on the renewal.  Shortly after, it issues a DHCP Request.

This is likely the root cause of our problem.  Why did dhcp.client issue DHCP Decline?
Attachment: Text dhcp.client_anomoly.csv 3.42 KB
Re: QNX 6.5.0 io-pkt-v4 crash  
When given an address, dhcp.client will send ARP requests to validate that the address is not in use before using it.
You can see the first ARP at row 4 in the CSV; note that it sends an ARP broadcast. What's interesting is row 18: the
Cisco sends a directed ARP to the device and the device replies. I suspect this is getting picked up as an ARP reply
from someone else and causing dhcp.client to believe that someone else has the address already.

1) Is proxy ARP enabled on the Cisco device?

2) Are you up to date on your version of dhcp.client?
Re: QNX 6.5.0 io-pkt-v4 crash  
Nick,

I don't know about the Cisco device.  It is managed by our IT department and locked somewhere in a dark room.

We are using the base release QNX 6.5.0 dhcp.client.  Can you confirm that the latest dhcp.client does not do this?
Re: QNX 6.5.0 io-pkt-v4 crash  
Had a quick look at the code and I see something suspicious. I think that if we get a legitimate ARP request while we
are probing with ARP, then we may believe that it comes from another host with that address.
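
To make the distinction concrete, here is a minimal sketch of how a probing client could tell a genuine conflict from a legitimate ARP request. It follows the conflict rules of RFC 5227 (IPv4 Address Conflict Detection) and is not taken from the dhcp.client source; the `ArpPacket` type, the `is_conflict` function, and all MAC and IP addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArpPacket:
    op: str          # "request" or "reply"
    sender_mac: str
    sender_ip: str
    target_ip: str


def is_conflict(pkt, probed_ip, own_mac):
    """Return True only if pkt shows another host using or probing probed_ip."""
    if pkt.sender_mac == own_mac:
        return False  # our own probe or our own reply
    if pkt.sender_ip == probed_ip:
        return True   # another host claims the address as its own
    # Another host is simultaneously probing for the same address.
    if pkt.op == "request" and pkt.sender_ip == "0.0.0.0" and pkt.target_ip == probed_ip:
        return True
    return False


OWN_MAC = "02:00:00:00:00:01"   # hypothetical device MAC
PROBED_IP = "192.0.2.50"        # hypothetical address being renewed

# A router asking "who has 192.0.2.50?": a legitimate request, not a conflict.
router_request = ArpPacket("request", "02:00:00:00:00:99", "192.0.2.1", PROBED_IP)
# The device's own reply to that request.
own_reply = ArpPacket("reply", OWN_MAC, PROBED_IP, "192.0.2.1")
# A different host answering for the probed address: a real conflict.
foreign_reply = ArpPacket("reply", "02:00:00:00:00:42", PROBED_IP, "192.0.2.1")

for pkt in (router_request, own_reply, foreign_reply):
    print(pkt.op, pkt.sender_ip, is_conflict(pkt, PROBED_IP, OWN_MAC))
```

Under these rules, neither the directed request nor the device's own reply counts as a conflict, so neither would justify a DHCP Decline.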

Do you have a support contract in place to raise this issue through?
Re: QNX 6.5.0 io-pkt-v4 crash  
Sorry, no.  It expired 5 years ago.