From: Andrew Morton <akpm@osdl•org>
To: sri@us•ibm.com
Cc: netdev@vger•kernel.org, Steve Hill <steve.hill@dialogic•com>
Subject: Fw: Intermittent SCTP multihoming breakage
Date: Wed, 3 Jan 2007 15:46:34 -0800
Message-ID: <20070103154634.b40d9cde.akpm@osdl.org>



Begin forwarded message:

Date: Wed, 3 Jan 2007 11:54:26 +0000
From: Steve Hill <steve.hill@dialogic•com>
To: Linux Kernel Mailing List <linux-kernel@vger•kernel.org>
Subject: Intermittent SCTP multihoming breakage



Apologies if I'm posting to the wrong list - the lksctp lists seem to be a
bit dead these days and a bit of Googling seemed to indicate that SCTP
development discussions might have moved here.

I'm running under the 2.6.16.1 kernel and have an intermittent problem
with the SCTP stack.  Having reviewed the git logs I can't see any
indication that the problem has been fixed in more recent kernels, but it
is very difficult to test since it is so intermittent.

I am running a multihomed connection between 2 machines (2 NICs on
each machine, so 2 paths for the connection) and tcpdump shows heartbeat
requests and acks on both paths.  Putting data over the link correctly
sends it over the first path.

If I drop the traffic on one of the NICs then most of the time it
correctly fails over to the second path and I see the data being sent
and acknowledged correctly on the second path.  However, I also
intermittently see two failure conditions:

1. Sometimes, just after failing over to the second path I see an ABORT.
2. More frequently, the association stays up indefinitely, with heartbeat
requests and acks on the second path, but no data chunks are sent even
though the transmit queue on the transmitting end appears to be full and
the socket is blocking writes.
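
A minimal sketch of how the "transmit queue full, writes blocking" state in
the second condition might be confirmed from userspace. It is illustrative
only: the function name fill_send_queue and its parameters are hypothetical,
and it works on any connected stream socket rather than being specific to
SCTP. It sends without blocking until the kernel refuses more data, which is
the point at which a blocking write would stall.

```python
import socket


def fill_send_queue(sock, chunk=b"x" * 4096, max_bytes=64 * 1024 * 1024):
    """Send on sock without blocking until the kernel refuses more data.

    Returns the number of bytes accepted before the send queue filled,
    or None if max_bytes were accepted without the queue ever filling.
    (Hypothetical helper for illustration, not part of any SCTP API.)
    """
    sock.setblocking(False)
    queued = 0
    try:
        while queued < max_bytes:
            try:
                queued += sock.send(chunk)
            except BlockingIOError:
                # The send queue is full: a blocking write would stall here.
                return queued
    finally:
        # Restore blocking mode so the caller's socket behaves as before.
        sock.setblocking(True)
    return None
```

On a healthy association the queue should drain and accept more data once
the peer acknowledges it. If repeated calls keep returning 0 while tcpdump
still shows heartbeats on the second path, that would match the second
condition. An SCTP socket could be created with
socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_SCTP),
assuming the kernel has SCTP support available.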

I have been adding debugging to the kernel in an attempt to track down the
source of the second failure condition, and I am wondering if anyone else
has seen similar behaviour?

-- 
 - Steve Hill
   Software Engineer
   Dialogic
   Fordingbridge, Hampshire, UK
   +44-1425-651392
   steve.hill@dialogic•com

Thread overview: 8+ messages
2007-01-03 23:46 Andrew Morton [this message]
2007-01-04  0:59 ` Fw: Intermittent SCTP multihoming breakage Sridhar Samudrala
2007-01-10 11:55   ` Steve Hill
2007-01-10 20:10     ` Sridhar Samudrala
2007-01-11 10:10       ` Steve Hill
2007-01-25 16:32         ` [Lksctp-developers] " Vlad Yasevich
2007-01-25 16:37           ` Vlad Yasevich
2007-01-10 20:49     ` Vlad Yasevich
