foundry27 : Post

Forum Topic - TCP Retransmissions: (2 Items)

Markus Kohler(deleted)

09/17/2012 10:24 AM

post95603

TCP Retransmissions

Hey there!

Occasionally, we have problems with TCP communication from an embedded PPC system (running QNX 6.3.0SP3, but
with the io-pkt-v4-hc network stack from QNX 6.4) to an industrial PC (running Windows XP Embedded).

TCP retransmissions occur from time to time, which is nothing unusual so far. In some cases, however, the error seems
to build up, i.e. the number of retransmissions keeps increasing. After some time, there are so many TCP
retransmissions that the ordinary communication (Modbus request -> Modbus response -> next Modbus request ...) is
basically broken.

The underlying Ethernet link is a direct connection (no switches or hubs in between, static IP addresses assigned)
and is used for Modbus/TCP communication only. The PPC system, the WinXP system, and the crossover cable have all been
replaced already, so this is almost certainly not a hardware issue. Other systems with the same or a similar setup
have been working fine so far.

I don't expect anyone to come up with a complete solution from such a brief description. So for now I only have a few
basic questions (I am not as much of a networking expert as others here who will hopefully read this):
1. Once a TCP connection has been closed (using the ordinary FIN/ACK handshake), shouldn't any outstanding
retransmissions (or any other traffic to the remote host on this IP address and TCP port) be discarded immediately?
I'm asking because in our case, packets are still being retransmitted for a while after the connection has been closed
(presumably until a maximum retry count and/or timeout is reached). In addition, the QNX system keeps sending RST/ACK
segments to the remote system.
The answer might be trivial, since I am not a networking expert, but even if so, please hit me with it. (Wireshark
traces can be provided if a more detailed analysis is desired.)

2. Is there any known issue with using io-pkt-v4-hc on a QNX 6.3.0SP3 system (e.g. known bugs in the TCP stack)
which could cause this error?

3. On the QNX system, we use the Packet Filter pseudo-device (pf) as a "firewall". Here's the relevant configuration for
pfctl:
####################
### NORMALIZATION
scrub in all
scrub out all
### INITIAL BLOCK ALL ON ANY ITF
block in all
block out all
### SERVICE: modbus server
pass in quick on en0 proto tcp from any to (en0) port 501 keep state (max-src-conn 10, max-src-conn-rate 5/10)
####################
With this configuration, pf should work "connection-based" and therefore only block network communication if there
are too many connection requests within a certain amount of time. In our case, however, there is only one connection,
which is maintained permanently (unless it is terminated manually, in which case questions 1. and 2. apply). Still:
could it be that pf is somehow blocking the outgoing TCP response packets in such a way that the described error is
provoked?

Thanks in advance for any hints regarding this topic.

Regards,
Markus

Markus Kohler(deleted)

05/28/2013 9:06 AM

post101747

Re: TCP Retransmissions

Problem solved:
The pf configuration was wrong. The retransmissions were caused by the source-tracking options in the rule, which make
pf accept incoming connections but refuse the actual data traffic over them as soon as one of the per-source limits
(10 simultaneous connections, or 5 new connections per 10 seconds) is exceeded.
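
For readers unfamiliar with the two options, here is the old rule again with comments. The comments paraphrase the
upstream pf.conf manual page; that the QNX port of pf behaves identically is an assumption.

```
# Old rule from the first post, unchanged apart from the line continuation.
pass in quick on en0 proto tcp from any to (en0) port 501 \
    keep state (max-src-conn 10, max-src-conn-rate 5/10)
# max-src-conn 10       : at most 10 simultaneous TCP connections per source
#                         host, counted once the 3-way handshake has completed.
# max-src-conn-rate 5/10: at most 5 new connections per source host within
#                         10 seconds (pf approximates the rate).
```

As far as the manual describes it, both limits are evaluated only after the handshake has completed, which would
explain why the client sees its connection accepted before the traffic on it is dropped.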

We now use a per-rule "global" limit on the number of connections rather than per-source limits:
pass in quick on en0 proto tcp from any to (en0) port = modbus keep state (max 60)

This way, pf rejects new incoming connections once 60 connections are already being tracked for this rule, but it
places no limit on the data traffic over the existing connections, and this is essentially what we wanted.
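
Put together with the unchanged parts of the ruleset from the first post, the configuration would look roughly like
this. It is a sketch assembled from the two posts, not a dump of the running system: the pass rule is copied as posted
above, so the name modbus is assumed to resolve through the services database on the target (the first post used the
literal port 501).

```
### NORMALIZATION
scrub in all
scrub out all
### INITIAL BLOCK ALL ON ANY ITF
block in all
block out all
### SERVICE: modbus server
# "max 60" caps the number of concurrent states this rule may create.
# There is no per-source connection or connection-rate limit any more.
pass in quick on en0 proto tcp from any to (en0) port = modbus keep state (max 60)
```

Replies from the Modbus server still leave the box despite "block out all", because they match the state created by
the pass rule.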

We never found the cause of the initial TCP retransmissions, though (the ones that make the client side close the
current TCP connection and open a new one). Possibly it is EMC interference from a nearby hardware device.
