public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: "Shawn O. Pearce" <spearce@spearce•org>
To: Trond Myklebust <Trond.Myklebust@netapp•com>
Cc: "J. Bruce Fields" <bfields@fieldses•org>,
	Junio C Hamano <gitster@pobox•com>,
	logank@sent•com, Christian Holtje <docwhat@gmail•com>,
	git@vger•kernel.org, Trond Myklebust <trond@netapp•com>
Subject: Re: pread() over NFS (again) [1.5.5.4]
Date: Sun, 29 Jun 2008 20:32:03 -0400	[thread overview]
Message-ID: <20080630003203.GJ11793@spearce.org> (raw)
In-Reply-To: <1214578229.7437.14.camel@localhost>

Trond Myklebust <Trond.Myklebust@netapp•com> wrote:
> On Thu, 2008-06-26 at 22:57 -0400, J. Bruce Fields wrote:
> > On Thu, Jun 26, 2008 at 04:38:40PM -0700, Junio C Hamano wrote:
> > > logank@sent•com writes:
> > > 
> > > > On Jun 26, 2008, at 1:56 PM, Junio C Hamano wrote:
> > > >
> > > >>> "The file shouldn't be short unless someone truncated it, or there
> > > >>> is a bug in index-pack.  Neither is very likely, but I don't think
> > > >>> we would want to retry pread'ing the same block forever.
> > > >>
> > > >> I don't think we would want to retry even once.  Return value of 0
> > > >> from
> > > >> pread is defined to be an EOF, isn't it?
> > > >
> > > > No, it seems to be a simple error-out in this case. We have 2.4.20
> > > > systems with nfs-utils 0.3.3 and used to frequently get the same error
> > > > while pushing. I made a similar change back in February and haven't
> > > > had a problem since:
> > > >
> > > > diff --git a/index-pack.c b/index-pack.c
> > > > index 5ac91ba..85c8bdb 100644
> > > > --- a/index-pack.c
> > > > +++ b/index-pack.c
> > > > @@ -313,7 +313,14 @@ static void *get_data_from_pack(struct
> > > > object_entry *obj)
> > > > 	src = xmalloc(len);
> > > > 	data = src;
> > > > 	do {
> > > > +		// It appears that if multiple threads read across NFS, the
> > > > +		// second read will fail. I know this is awful, but we wait for
> > > > +		// a little bit and try again.
> > > > 		ssize_t n = pread(pack_fd, data + rdy, len - rdy, from + rdy);
> > > > +		if (n <= 0) {
> > > > +			sleep(1);
> > > > +			n = pread(pack_fd, data + rdy, len - rdy, from + rdy);
> > > > +		}
> > > > 		if (n <= 0)
> > > > 			die("cannot pread pack file: %s", strerror(errno));
> > > > 		rdy += n;
> > > >
> > > > I use a sleep request since it seems less likely that the other thread
> > > > will have an outstanding request after a second of waiting.
> > > 
> > > Gaah.  Don't we have NFS experts in house?  Bruce, perhaps?
> > 
> > Trond, you don't have any idea why a 2.6.9-42.0.8.ELsmp client (2.4.28
> > server) might be returning spurious 0's from pread()?
> > 
> > Seems like everything is happening from that one client--the file isn't
> > being simultaneously accessed from the server or from another client.
> 
> Is the file only being read, or could there be a simultaneous write to
> the same file? I'm surmising this could be an effect resulting from
> simultaneous cache invalidations: prior to Linux 2.6.20 or so, we
> weren't rigorously following the VFS/VM rules for page locking, and so
> page cache invalidation in particular could have some curious
> side-effects.

The file was created and opened O_CREAT|O_EXCL|O_RDWR, by this
process, written linearly using write(2), without any lseeks.
We kept the file descriptor open and starting issuing pread(2)
calls for earlier offsets we had alread written.  One of those
kicks back EOF far too early (and results in this bug report).

Note the only accesses we are using is write(2) and pread(2), and
once we start reading we don't ever go back to writing.  The pread(2)
calls are typically issued in ascending offsets, and we read each
position only once.  This is to try and take advantage of any
read-ahead the kernel may be able to do.  The pread(2) calls are
rarely (if ever) on a block/page boundary.

Nobody else should know about this file. Its written to a temporary
name and no other well behaved processes would attempt to read
the file until it gets closed and renamed to its final destination.
We haven't reached that far in the processing when we get this error,
so there should be only one file descriptor open on the inode, and
its the same one that wrote the data.

-- 
Shawn.

  reply	other threads:[~2008-06-30  0:33 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-26 16:40 pread() over NFS (again) [1.5.5.4] Christian Holtje
2008-06-26 20:46 ` Shawn O. Pearce
2008-06-26 20:56   ` Junio C Hamano
2008-06-26 21:05     ` Shawn O. Pearce
2008-06-26 21:36       ` Christian Holtje
2008-06-26 22:04       ` Junio C Hamano
2008-06-26 22:07         ` Shawn O. Pearce
2008-06-26 23:36     ` logank
2008-06-26 23:38       ` Junio C Hamano
2008-06-27  2:57         ` J. Bruce Fields
2008-06-27 14:50           ` Trond Myklebust
2008-06-30  0:32             ` Shawn O. Pearce [this message]
2008-06-30 19:09               ` Nicolas Pitre
2008-06-27  2:54       ` J. Bruce Fields
2008-06-27 13:44         ` Christian Holtje
2008-06-27 13:54       ` Christian Holtje

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080630003203.GJ11793@spearce.org \
    --to=spearce@spearce$(echo .)org \
    --cc=Trond.Myklebust@netapp$(echo .)com \
    --cc=bfields@fieldses$(echo .)org \
    --cc=docwhat@gmail$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=logank@sent$(echo .)com \
    --cc=trond@netapp$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox