public inbox for netdev@vger.kernel.org 
From: Evgeniy Polyakov <johnpol@2ka•mipt.ru>
To: David Miller <davem@davemloft•net>
Cc: rdreier@cisco•com, ak@suse•de, tom@opengridcomputing•com,
	netdev@vger•kernel.org, akpm@osdl•org
Subject: Re: RDMA will be reverted
Date: Tue, 25 Jul 2006 09:51:28 +0400
Message-ID: <20060725055127.GA5103@2ka.mipt.ru>
In-Reply-To: <20060724.150613.54186472.davem@davemloft.net>

On Mon, Jul 24, 2006 at 03:06:13PM -0700, David Miller (davem@davemloft•net) wrote:
> Don't get too excited about VJ netchannels, more and more roadblocks
> to their practicality are being found every day.
> 
> For example, my idea to allow ESTABLISHED TCP socket demux to be done
> before netfilter is flawed.  Connection tracking and NAT can change
> the packet ID and loop it back to us to hit exactly an ESTABLISHED TCP
> socket, therefore we must always hit netfilter first.

There is no problem with netfilter and process-context processing: once
an skb has been removed from the hardware list/array and is being
processed by netfilter in a netchannel (or in process context in
general), nothing breaks if the changed skb is rerouted into a different
queue and state.
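
A minimal sketch of that rerouting, as a toy model rather than kernel
code. Every name here (channels, channel_for, nat_hook, process_channel)
is hypothetical, and the NAT step is assumed to be a plain table lookup
that may rewrite the flow ID:

```python
from collections import deque

# Hypothetical model: one queue ("netchannel") per flow ID, where a flow ID
# is a (src, sport, dst, dport) tuple. None of these names exist in the kernel.
channels = {}


def channel_for(flow):
    """Return the queue for a flow, creating it on first use."""
    return channels.setdefault(flow, deque())


def nat_hook(flow, payload, nat_table):
    """Stand-in for netfilter: may rewrite the flow ID (assumed table lookup)."""
    return nat_table.get(flow, flow), payload


def process_channel(flow, nat_table):
    """Drain one channel in the context of the process that owns it."""
    delivered = []
    queue = channel_for(flow)
    while queue:
        payload = queue.popleft()  # packet has left the hardware-side queue
        new_flow, payload = nat_hook(flow, payload, nat_table)
        if new_flow != flow:
            # The packet ID changed: hand it to the channel it now belongs to.
            channel_for(new_flow).append(payload)
        else:
            delivered.append(payload)
    return delivered
```

In this model a rewritten packet is simply appended to another queue
and picked up when that channel's owner runs.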

> All the original costs of route, netfilter, TCP socket lookup all
> reappear as we make VJ netchannels fit all the rules of real practical
> systems, eliminating their gains entirely.  I will also note in
> passing that papers on related ideas, such as the Exokernel stuff, are
> very careful to not address the issue of how practical 1) their demux
> engine is and 2) the negative side effects of userspace TCP
> implementations.  For an example of the latter, if you have some 1GB
> JAVA process you do not want to wake that monster up just to do some
> ACK processing or TCP window updates, yet if you don't you violate
> TCP's rules and risk spurious unnecessary retransmits.

I still plan to continue the userspace implementation.

If the gigantic-java-monster (tm) is going to read some data, it has
already been awakened, so it is in memory (with the linked TCP lib) and
there is zero overhead.

> Furthermore, the VJ netchannel gains can be partially obtained from
> generic stateless facilities that we are going to get anyways.
> Networking chips supporting multiple MSI-X vectors, chosen by hashing
> the flow ID, can move TCP processing to "end nodes" which are cpu
> threads in this case, by having each such MSI-X vector target a
> different cpu thread.

And what if that CPU is very busy?
Linux would somehow have to tell the NIC which CPUs are valid targets
right now and which are not, and that can change a second later, so the
scheduler would have to be tightly bound to the network internals.
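
A rough sketch of what that coupling could look like, assuming the
scheduler publishes a set of currently allowed CPUs and the flow hash
is a CRC32 over the flow ID. Both assumptions and both function names
(flow_hash, pick_cpu) are hypothetical, not an existing NIC or kernel
interface:

```python
import zlib


def flow_hash(src, sport, dst, dport):
    """Hash a flow ID to an integer (CRC32 is an assumption for illustration)."""
    key = f"{src}:{sport}-{dst}:{dport}".encode()
    return zlib.crc32(key)


def pick_cpu(flow, allowed_cpus):
    """Map a flow to one of the CPUs the scheduler currently allows."""
    if not allowed_cpus:
        raise ValueError("no CPU is currently allowed to take network work")
    cpus = sorted(allowed_cpus)  # stable order, so the mapping is deterministic
    return cpus[flow_hash(*flow) % len(cpus)]
```

Note that whenever the allowed set changes, a flow may land on a
different CPU, which is exactly the kind of state the NIC and the
scheduler would have to keep in sync.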

Just my 2 coins.

-- 
	Evgeniy Polyakov
