Benjamin Richner
01/21/2019 7:43 AM
post119425
|
Hi,
I need to get the default gateway (on QNX6.6) and display it live on our display.
With "netstat -rn" I can easily get the default gateway, but I need to do it programmatically in C. A simple approach
would be to just do a string search on the stdout of "netstat -rn" and find the default gateway. However, our realtime
system is really sensitive and easily disturbed by activities like starting and killing processes (this is a whole
different story, bottom line, a QNX kernel engineer told us we cannot start or kill threads at runtime if we ever want
stable timings).
So, is there a way to get the current default gateway, e.g. with ioctl, without launching route or netstat? I found some
solutions online but they are Linux only.
Cheers,
Benjamin
|
Benjamin Richner
01/21/2019 7:47 AM
post119426
|
Re: Get default gateway in C
> a QNX kernel engineer told us we cannot start or kill threads at runtime if we
Correction: cannot start or kill processes*
|
Will Miles
01/21/2019 4:44 PM
post119428
|
Re: Get default gateway in C
> > a QNX kernel engineer told us we cannot start or kill threads at runtime if
> we
>
> Correction: cannot start or kill processes*
Sorry to derail your question, but I'm curious about what problems you were running into with process management. Does
this have to do with an IPI_TLB_FLUSH flood on a multicore system?
-Will
|
Benjamin Richner
01/22/2019 3:04 AM
post119429
|
Re: Get default gateway in C
Hi Will,
Yes, that's it, exactly. How do you know about the TLB flushes? Is this common knowledge? It took us a lot of digging
and debugging to find out about it.
-Benjamin
|
Will Miles
01/22/2019 1:01 PM
post119437
|
Re: Get default gateway in C
For me, it was a long and sordid tale of woe about trying to track down why a hardware driver we wrote for QNX 6.3
failed to meet its targets on a new multicore SBC using QNX 6.6. I spent a lot of time with tracelogger trying to pin
down what was causing the mysterious delays. (In the end, I never did manage to solve it -- it seemed like there's some
kind of intermittent PCIe root complex stall with that board. When it gets stuck, the other core can still access RAM,
but any attempt to perform memory-mapped IO through any PCIe port would also stall.)
Anyhow, we run Samba on our target boards to provide easy file transfers for Windows clients; and Samba loves to create
and destroy processes with a zillion shared objects, so I saw a lot of these TLB flush floods in my traces. I can't
speak for other customers, but I imagine that pretty much anyone with a multicore x86 system who catches a process
destruction in tracelogger would've seen it. I don't know if it affects other architectures; so far we're still an
x86-only shop here. To be fair, I still have an old copy of the QNX 6.4 kernel sources from way back before BB screwed up
all the policies, and that was somewhat helpful for understanding what the IPI was meant to be doing.
To mitigate the floods we (a) set the runmask on all of our real-time components to lock them to the first core; and
then (b) use 'slay -i -C 0' to also lock procnto's non-idle threads to that core as well. By locking procnto's threads,
we ensure that CPU 0 can never be flooded out as it's always the *source* of the TLB flush interrupts; those procnto
threads will just be preempted by a real-time thread when needed. It's hardly ideal -- essentially only the one core
can be treated as real-time, and there's a performance penalty introduced with every procnto call on other cores -- but
at least we get one core that we can still rely on.
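Step (a) can also be applied from inside a process. The following is a minimal sketch, assuming the QNX Neutrino ThreadCtl() call with the _NTO_TCTL_RUNMASK command; runmask_for_cpu() and pin_calling_thread() are hypothetical helper names, and on any other OS the sketch only reports ENOTSUP:

```c
#include <errno.h>
#include <stdint.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>   /* ThreadCtl(), _NTO_TCTL_RUNMASK */
#endif

/* Build a runmask with only the given CPU's bit set (CPU 0 -> 0x1).
 * Returns 0 for CPU numbers that do not fit into a 32-bit mask. */
unsigned runmask_for_cpu(unsigned cpu)
{
    return cpu < 32 ? 1u << cpu : 0u;
}

/* Restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 on failure with errno set. */
int pin_calling_thread(unsigned cpu)
{
    uintptr_t mask = runmask_for_cpu(cpu);

    if (mask == 0) {
        errno = EINVAL;
        return -1;
    }
#ifdef __QNXNTO__
    /* The runmask is passed by value in the data argument. */
    return ThreadCtl(_NTO_TCTL_RUNMASK, (void *)mask) == -1 ? -1 : 0;
#else
    /* Not QNX: nothing to call in this sketch. */
    errno = ENOTSUP;
    return -1;
#endif
}
```

Calling pin_calling_thread(0) early in a real-time thread would lock that thread to the first core. The call only changes the calling thread's mask, so each real-time thread would have to make it itself.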
For posterity's sake, I'd also like to note here that I recall reading that QNX 7.0 had a major rewrite of the memory
manager, so it's quite possible this issue has been fixed in the current release. I haven't tried it myself --
unfortunately we've still got too much Photon code to migrate to something else before we can look at making that jump.
-Will
|
Benjamin Richner
01/23/2019 4:23 AM
post119438
|
Re: Get default gateway in C
Hey Will,
> For me, it was a long and sordid tale of woe about trying to track down why a hardware driver we wrote for QNX 6.3
> failed to meet its targets on a new multicore SBC using QNX 6.6. I spent a lot of time with tracelogger trying to pin
> down what was causing the mysterious delays.
We have the exact same problem with our EtherCAT driver. We updated from QNX6.5 to QNX6.6. We do not start or stop any
processes anymore and that helped (no more massive TLB flush storms), but the controller is still not as stable as it
used to be.
> and Samba loves to create and destroy processes with a zillion shared objects, so I saw a lot of these TLB flush
> floods in my traces.
What a nightmare.
> I don't know if it affects other architectures, so far we're still an x86-only shop here.
We are also x86-only but from what I've heard so far, it seems like it happens on all SMP versions of QNX6.6. They said
they fixed a race condition that was present in the QNX6.5 kernel, and the fix is causing those TLB flushes. So
basically they had to sacrifice responsiveness for correctness. I do not know how QNX7.0 performs in those regards.
> To be fair, I still have an old copy of the QNX 6.4 kernel sources from way back before BB screwed up all the
> policies
I wish I did too. I don't have any kernel source. Sometimes I look into openqnx on github, but it's old and doesn't map
that well onto the newer kernels anymore.
> To mitigate the floods we (a) set the runmask on all of our real-time components to lock them to the first core; and
> then (b) use 'slay -i -C 0' to also lock procnto's non-idle threads to that core as well. By locking procnto's threads,
> we ensure that CPU 0 can never be flooded out as it's always the *source* of the TLB flush interrupts
That's clever. We did (a) because it makes timing much more stable, but we did not know about (b).
> I haven't tried it myself -- unfortunately we've still got too much Photon code to migrate to something else before we
> can look at making that jump.
Didn't you already have to get rid of Photon when using QNX6.6? For graphics I can recommend Crank Storyboard. It's
solid cross-platform software with various renderers (hardware OpenGL based, software, etc). It's pretty cheap too. It
is supported both on photon and screen based QNX versions.
Anyway, we cannot go up to QNX7.0 because of old ethernet, wlan and graphics drivers for the old hardware that do not
exist on QNX7.0 and cost a fortune to port (they're mostly closed source to us, otherwise I'd do it myself). The entire
thing is unfortunate because I ported everything else to QNX7.0 and it's generally running well. The biggest change is
that they completely reworked the PCI server - that cost me the most time.
Cheers,
Benjamin
|
Will Miles
01/23/2019 2:24 PM
post119445
|
Re: Get default gateway in C
Hi Benjamin,
On 2019-01-23 4:23 a.m., Benjamin Richner wrote:
>> To mitigate the floods we (a) set the runmask on all of our real-time components to lock them to the first core; and
>> then (b) use 'slay -i -C 0' to also lock procnto's non-idle threads to that core as well. By locking procnto's threads,
>> we ensure that CPU 0 can never be flooded out as it's always the *source* of the TLB flush interrupts
> That's clever. We did (a) because it makes timing much more stable, but we did not know about (b).
For posterity, I've attached the OpenRC init.d script we use that does (b). Note it loops until it confirms that all
threads have the right runmask, because sometimes they come and go during the operation.
>
>> I haven't tried it myself -- unfortunately we've still got too much Photon code to migrate to something else before
>> we can look at making that jump.
> Didn't you already have to get rid of Photon when using QNX6.6? For graphics I can recommend Crank Storyboard. It's
> solid cross-platform software with various renderers (hardware OpenGL based, software, etc). It's pretty cheap too. It
> is supported both on photon and screen based QNX versions.
Ironically in this context, QNX has generally been pretty good about arranging that binaries compiled for the previous
release just work on the new one. I just boxed up Photon from 6.5 (including the io-graphics framework and drivers)
into a package we could deploy on our 6.6 platform. It seems to work as well as it ever did in 6.5 (which is to say,
there are still bugs that have to be worked around, but it gets the job done).
Truthfully we're looking at migrating away from QNX for our future human interfaces; currently we're looking at
commodity tablets running commodity operating systems. The goal being that we can focus on writing the UI using FOSS
tools without getting bogged down in graphics driver woes or display signal timing issues anymore.
> Anyway, we cannot go up to QNX7.0 because of old ethernet, wlan and graphics drivers for the old hardware that do not
> exist on QNX7.0 and cost a fortune to port (they're mostly closed source to us, otherwise I'd do it myself). The entire
> thing is unfortunate because I ported everything else to QNX7.0 and it's generally running well. The biggest change is
> that they completely reworked the PCI server - that cost me the most time.
Yeah, I looked over some of the beta release of the new PCI server. I can certainly understand the need for it -- I
had to patch in additional IRQ routing capabilities for our hardware to the old server and I can appreciate the desire
to reorganize. I'm not looking forward to making the jump; right now our current system is a single platform image using
enum-devices to self-configure for whatever hardware it lands on, so we'll either have to build out a new device
detection framework or completely rethink how we deploy.
-Will
|
Benjamin Richner
01/24/2019 5:25 AM
post119446
|
Re: Get default gateway in C
Thanks, Will. I'll take a look into that script.
We have significant vendor lock-in with QNX so we will probably not be migrating anytime soon. Our systems run fairly
stable now so I am not that unhappy with it at the moment.
|
Nick Reilly
01/21/2019 11:24 AM
post119427
|
Re: Get default gateway in C
There are 2 different possibilities. Either open a routing socket and send an RTM_GET message or issue a sysctl for:
mib[0] = CTL_NET;
mib[1] = PF_ROUTE;
mib[2] = 0;
mib[3] = 0;
mib[4] = NET_RT_DUMP;
mib[5] = 0;
Probably best to do the RTM_GET because then you can just get the default gateway rather than dumping the entire table
and then having to parse out the default gateway.
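The RTM_GET variant could look roughly like the sketch below. It assumes the BSD-style routing socket that the QNX network stack exposes (struct rt_msghdr, sockaddrs with a length byte, padding to sizeof(long)) and that a lookup of 0.0.0.0 without a netmask resolves to the default route. sa_roundup() and default_gateway_rtm_get() are hypothetical names, and outside QNX the function only reports ENOTSUP:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <netinet/in.h>

#ifdef __QNXNTO__            /* the routing-socket code is QNX/BSD specific */
#include <sys/types.h>
#include <sys/socket.h>
#include <net/route.h>
#include <unistd.h>
#endif

/* Sockaddrs in a routing message are padded to a multiple of sizeof(long);
 * a zero-length sockaddr still occupies one slot (assumed BSD rule). */
size_t sa_roundup(size_t len)
{
    const size_t a = sizeof(long);
    return len ? (len + a - 1) & ~(a - 1) : a;
}

/* Look up the route to 0.0.0.0 and copy out its IPv4 gateway.
 * Returns 0 on success, -1 on failure with errno set. */
int default_gateway_rtm_get(struct in_addr *gw)
{
#ifdef __QNXNTO__
    struct { struct rt_msghdr hdr; char addrs[512]; } msg;
    struct sockaddr_in *dst = (struct sockaddr_in *)msg.addrs;
    const pid_t pid = getpid();
    const int seq = 1;
    ssize_t n;
    char *p;
    int bit, s;

    if ((s = socket(PF_ROUTE, SOCK_RAW, 0)) == -1)
        return -1;

    /* Request: header followed by one sockaddr, the destination 0.0.0.0. */
    memset(&msg, 0, sizeof msg);
    msg.hdr.rtm_msglen = sizeof msg.hdr + sizeof *dst;
    msg.hdr.rtm_version = RTM_VERSION;
    msg.hdr.rtm_type = RTM_GET;
    msg.hdr.rtm_addrs = RTA_DST;
    msg.hdr.rtm_pid = pid;
    msg.hdr.rtm_seq = seq;
    dst->sin_len = sizeof *dst;
    dst->sin_family = AF_INET;

    if (write(s, &msg, msg.hdr.rtm_msglen) == -1) {
        close(s);
        return -1;
    }
    /* The socket also carries unrelated messages; wait for our own reply. */
    do {
        n = read(s, &msg, sizeof msg);
    } while (n > 0 && (msg.hdr.rtm_type != RTM_GET ||
                       msg.hdr.rtm_seq != seq || msg.hdr.rtm_pid != pid));
    close(s);
    if (n <= 0)
        return -1;

    /* Walk the returned sockaddrs in bit order up to RTA_GATEWAY. */
    p = msg.addrs;
    for (bit = 1; bit <= RTA_GATEWAY && p < (char *)&msg + n; bit <<= 1) {
        struct sockaddr *sa = (struct sockaddr *)p;
        if (!(msg.hdr.rtm_addrs & bit))
            continue;
        if (bit == RTA_GATEWAY && sa->sa_family == AF_INET &&
            (msg.hdr.rtm_flags & RTF_GATEWAY)) {
            *gw = ((struct sockaddr_in *)sa)->sin_addr;
            return 0;
        }
        p += sa_roundup(sa->sa_len);
    }
    errno = ESRCH;           /* route found, but no IPv4 gateway in it */
    return -1;
#else
    (void)gw;
    errno = ENOTSUP;
    return -1;
#endif
}
```

No process is started here; the whole lookup is one socket(), one write() and a few read() calls, so it can be polled from an existing thread. On QNX the socket calls normally come from libsocket.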
|
Benjamin Richner
01/22/2019 3:06 AM
post119430
|
Re: Get default gateway in C
Hi Nick,
Thanks for your answer, it's really helpful. I'll try your approaches.
-Benjamin
|
Benjamin Richner
01/22/2019 7:41 AM
post119432
|
Re: Get default gateway in C
Just as a small followup: I got it to work. The following GitHub repo was a great help:
https://github.com/k84d/unpv13e/blob/master/libroute/net_rt_dump.c
The API calls can be copied verbatim and it works in QNX. Cool!
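For readers who want the shape of the sysctl approach without following the link, here is a minimal sketch. It uses the mib values Nick listed and the usual two-step sysctl() pattern (ask for the size, then fetch the table), and it assumes the BSD-style layout of the dump (one struct rt_msghdr per route, followed by sockaddrs padded to sizeof(long)). is_default_dst() and default_gateway_sysctl() are hypothetical names, and outside QNX the function only reports ENOTSUP:

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <arpa/inet.h>
#include <netinet/in.h>

#ifdef __QNXNTO__            /* the sysctl route dump is QNX/BSD specific */
#include <sys/param.h>
#include <sys/socket.h>
#include <sys/sysctl.h>
#include <net/route.h>
#endif

/* A default route has the IPv4 destination 0.0.0.0. */
int is_default_dst(const struct sockaddr_in *sin)
{
    return sin->sin_family == AF_INET &&
           sin->sin_addr.s_addr == htonl(INADDR_ANY);
}

/* Dump the routing table and copy out the gateway of the default route.
 * Returns 0 on success, -1 on failure with errno set. */
int default_gateway_sysctl(struct in_addr *gw)
{
#ifdef __QNXNTO__
    int mib[6] = { CTL_NET, PF_ROUTE, 0, 0, NET_RT_DUMP, 0 };
    const int need = RTA_DST | RTA_GATEWAY;
    size_t len = 0, step;
    char *buf, *p;
    int rc = -1;

    /* First call reports the required buffer size, second one fills it. */
    if (sysctl(mib, 6, NULL, &len, NULL, 0) == -1)
        return -1;
    if ((buf = malloc(len)) == NULL)
        return -1;
    if (sysctl(mib, 6, buf, &len, NULL, 0) == -1) {
        free(buf);
        return -1;
    }

    for (p = buf; p < buf + len; ) {
        struct rt_msghdr *rtm = (struct rt_msghdr *)p;
        struct sockaddr_in *dst = (struct sockaddr_in *)(rtm + 1);
        struct sockaddr_in *gwsa;

        if (rtm->rtm_msglen == 0)
            break;                       /* malformed entry, stop */
        p += rtm->rtm_msglen;

        if (!(rtm->rtm_flags & RTF_GATEWAY) ||
            (rtm->rtm_addrs & need) != need || !is_default_dst(dst))
            continue;

        /* The gateway sockaddr follows the padded destination sockaddr. */
        step = dst->sin_len ? (dst->sin_len + sizeof(long) - 1) &
                              ~(sizeof(long) - 1) : sizeof(long);
        gwsa = (struct sockaddr_in *)((char *)dst + step);
        if (gwsa->sin_family == AF_INET) {
            *gw = gwsa->sin_addr;
            rc = 0;
            break;
        }
    }
    free(buf);
    if (rc == -1)
        errno = ESRCH;                   /* no IPv4 default route found */
    return rc;
#else
    (void)gw;
    errno = ENOTSUP;
    return -1;
#endif
}
```

The malloc() per call is only for brevity; in a timing-sensitive process a buffer allocated once at startup and reused for every poll would probably be the better fit.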