Forum Topic - TCP Retransmissions: (2 Items)
   
TCP Retransmissions  
Hey there!


Occasionally, we have problems with TCP communication from an embedded PPC system (running QNX 6.3.0SP3, but 
with the io-pkt-v4-hc network stack from QNX 6.4) to an industrial PC (running Windows XP Embedded).


TCP retransmissions occur from time to time, which in itself is nothing unusual. In some cases, however, the error seems 
to build up, i.e. the number of retransmissions increases steadily. After some time, there are so many TCP 
retransmissions that the normal communication cycle (Modbus request -> Modbus response -> next Modbus request ...) is 
basically broken.

The two systems are connected directly over Ethernet (no switches or hubs in between, static IP addresses 
assigned), and the link is used for Modbus/TCP communication only. Both the PPC and the WinXP systems, as well as the 
crossover cable between them, have already been replaced, so this is almost certainly not a hardware issue. Other 
systems with the same or a similar setup are working fine so far.

I don't expect anyone to come up with a complete solution from this brief description. Hence, for now I have only 
some basic questions (I am not as much of a networking expert as others here, who will hopefully read this):
1. Once a TCP connection has been closed (using the ordinary FIN/ACK handshake), shouldn't any outstanding 
retransmissions (or any other communication to the remote host with this IP address and TCP port) be discarded 
immediately? I'm asking because in our case, packets are still retransmitted for a while after the connection has been 
closed (presumably until a maximum retry count and/or time is reached). In addition, the QNX system 
also keeps sending RST/ACK packets to the remote system.
The answer might be trivial, as I am not a networking expert. But even if so, please hit me with it. (Wireshark traces 
can be provided if desired for a more detailed analysis.)

2. Are there any known issues with using io-pkt-v4-hc on a QNX 6.3.0SP3 system (e.g. known bugs in the TCP stack) 
that could cause this error?

3. On the QNX system, we use the Packet Filter pseudo-device (pf) as a "firewall". Here's the relevant configuration for 
pfctl:
    ####################
    ### NORMALIZATION
    scrub 	in 		all
    scrub 	out 	all
    ### INITIAL BLOCK ALL ON ANY ITF
    block in all
    block out all
    ### SERVICE: modbus server
    pass in quick on en0 proto tcp from any to (en0) port 501 keep state (max-src-conn 10, max-src-conn-rate 5/10)
    ####################
With this configuration, pf should work on a per-connection basis and therefore only block network communication if there 
are too many connection requests within a certain amount of time. However, in our case, there's only one connection, 
which is maintained permanently (unless the connection is terminated manually, in which case questions 1 and 2 
apply). Still: could it be that pf is somehow responsible for blocking the outgoing TCP response packets in such a way 
that the described error is provoked?


Thanks in advance for any hints regarding this topic.

Regards,
Markus
Re: TCP Retransmissions  
Problem solved:
The pf configuration was wrong. The retransmissions were caused by the source-tracking options in the rule 
(max-src-conn 10, max-src-conn-rate 5/10): pf kept accepting incoming connections but refused the actual data traffic 
over them as soon as one of the limits (10 connections, or 5 new connections within 10 seconds) was exceeded.
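
To illustrate how a reconnecting client can trip the 5/10 rate limit, here is a minimal Python sketch. It is a toy model only, not pf's actual implementation: the class name SourceRateLimit, the sliding-window bookkeeping, and the choice to count refused attempts are all assumptions made for illustration.

```python
from collections import deque


class SourceRateLimit:
    """Toy model of a per-source 'max-src-conn-rate N/T' check (not pf's real code)."""

    def __init__(self, max_conns, window_s):
        self.max_conns = max_conns    # N: allowed new connections ...
        self.window_s = window_s      # ... per T seconds
        self.attempts = deque()       # timestamps of recent connection attempts

    def on_connect(self, now):
        """Record an attempt at time `now` (seconds); return True if within the limit."""
        # Forget attempts that have fallen out of the time window.
        while self.attempts and now - self.attempts[0] >= self.window_s:
            self.attempts.popleft()
        # Assumption of this model: refused attempts are counted as well.
        self.attempts.append(now)
        return len(self.attempts) <= self.max_conns


# A client that reconnects once per second after each failed exchange.
fast = SourceRateLimit(max_conns=5, window_s=10)
fast_results = [fast.on_connect(t) for t in range(8)]
print(fast_results)  # [True, True, True, True, True, False, False, False]

# A client that reconnects only every 3 seconds stays below the limit.
slow = SourceRateLimit(max_conns=5, window_s=10)
slow_results = [slow.on_connect(t) for t in range(0, 30, 3)]
print(all(slow_results))  # True
```

In this model, a client that closes and reopens its connection quickly after every failure keeps itself over the limit, which would fit the observation that the error builds up over time.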

We now employ a per-rule "global" maximum connection limit rather than per-source limits:
pass in quick on en0 proto tcp from any to (en0) port = modbus keep state (max 60)

This way, pf rejects any new incoming connection once 60 connections are already established. But it also allows 
for "unlimited" data traffic over the existing connections, which is essentially what we wanted.
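
The behaviour we wanted from the "max 60" rule can be sketched in Python as follows. Again, this is only a simplified model under assumptions: StateTable and its methods are hypothetical names, and pf's real state handling (timeouts, TCP state tracking) is not modelled.

```python
class StateTable:
    """Toy model of a per-rule 'keep state (max N)' limit (not pf's real code)."""

    def __init__(self, max_states):
        self.max_states = max_states
        self.states = set()  # identifiers of currently tracked connections

    def open(self, conn_id):
        """Try to create state for a new connection; refuse once the table is full."""
        if len(self.states) >= self.max_states:
            return False
        self.states.add(conn_id)
        return True

    def pass_data(self, conn_id):
        """Data passes for any connection that has state, however much is sent."""
        return conn_id in self.states

    def close(self, conn_id):
        """Remove the state again, freeing a slot for a new connection."""
        self.states.discard(conn_id)


table = StateTable(max_states=60)
opened = [table.open(i) for i in range(61)]
print(opened.count(True))   # 60
print(table.pass_data(0))   # True
print(table.pass_data(60))  # False
```

The limit applies only when a new connection is opened. Traffic on connections that already have state is never counted against it.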

We didn't find the cause of the initial TCP retransmissions, though (the ones that make the client side close 
the current TCP connection and open a new one). It is possibly EMC interference from a nearby hardware device.