public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox•com>
To: Yiannis Marangos <yiannis.marangos@gmail•com>
Cc: git@vger•kernel.org
Subject: Re: [PATCH v2] Verify index file before we opportunistically update it
Date: Wed, 09 Apr 2014 16:05:33 -0700	[thread overview]
Message-ID: <xmqqwqeyxfwi.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <1397082869-24873-1-git-send-email-yiannis.marangos@gmail.com> (Yiannis Marangos's message of "Thu, 10 Apr 2014 01:34:29 +0300")

Yiannis Marangos <yiannis.marangos@gmail•com> writes:

> Before we proceed to "opportunistic update" we must verify that the
> current index file is the same as the one that we read before. There
> is a possible race if we don't do this:

Please somehow make it clear that the race is in general, and use of
"git rebase" in this description is merely an example.

>   1. process A calls git-rebase

... or does anything that uses the index.

>   2. process A applies 1st commit
>
>   3. process B calls git-status
>
>   4. process B reads index
>
>   5. process A applies 2nd commit

... or does anything that updates the index.

>   6. process B takes the lock, then overwrites process A's changes.
>
>   7. process A applies 3rd commit
>
> As an end result the 3rd commit will have a revert of the 2nd commit.

And the missing piece is...?  "When process B takes the lock, it
needs to make sure that the index hasn't changed since it read at
step 4."

> Signed-off-by: Yiannis Marangos <yiannis.marangos@gmail•com>
> ---

This is a place for you to describe how this version is different
from the previous round.  I am guessing that the only change is that
you removed the unnecessary printf() from the top of if-able, but I
didn't compare (nor read either versions carefully).

>  cache.h      |  3 +++
>  read-cache.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++---------
>  2 files changed, 69 insertions(+), 11 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index 107ac61..b0eedad 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -279,6 +279,7 @@ struct index_state {
>  		 initialized : 1;
>  	struct hashmap name_hash;
>  	struct hashmap dir_hash;
> +	unsigned char sha1[20];
>  };
>  
>  extern struct index_state the_index;
> @@ -456,6 +457,8 @@ extern int daemonize(void);
>  	} while (0)
>  
>  /* Initialize and use the cache information */
> +extern int verify_index_from(const struct index_state *, const char *path);
> +extern int verify_index(struct index_state *);

I do not think these should be extern; they are implementation
details of the opportunistic update, and update-if-able is the only
thing the outside world needs to know about, I think.

>  extern int read_index(struct index_state *);
>  extern int read_index_preload(struct index_state *, const struct pathspec *pathspec);
>  extern int read_index_from(struct index_state *, const char *path);
> diff --git a/read-cache.c b/read-cache.c
> index ba13353..a49a441 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -14,6 +14,7 @@
>  #include "resolve-undo.h"
>  #include "strbuf.h"
>  #include "varint.h"
> +#include "git-compat-util.h"

An unnecessary change; we include "cache.h" as the very first thing
in this file.

> @@ -1321,6 +1322,59 @@ static int verify_hdr(struct cache_header *hdr, unsigned long size)
>  	return 0;
>  }
>  
> +/* This function verifies if index_state has the correct sha1 of an index file.
> + * Don't die if we have any other failure, just return 0. */

	/*
         * Please write multi-line comment
         * like this.
         */

> +int verify_index_from(const struct index_state *istate, const char *path)
> +{

Also, this does not have to be extern.  At least not yet.

> +	int fd;
> +	struct stat st;
> +	struct cache_header *hdr;
> +	void *mmap_addr;
> +	size_t mmap_size;
> +
> +	if (!istate->initialized)
> +		return 0;
> +
> +	fd = open(path, O_RDONLY);
> +	if (fd < 0)
> +		return 0;
> +
> +	if (fstat(fd, &st))
> +		return 0;
> +
> +	/* file is too big */
> +	if (st.st_size > (size_t)st.st_size)
> +		return 0;
> +
> +	mmap_size = (size_t)st.st_size;
> +	if (mmap_size < sizeof(struct cache_header) + 20)
> +		return 0;
> +
> +	mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

Why PROT_WRITE?

> +	close(fd);
> +	if (mmap_addr == MAP_FAILED)
> +		return 0;
> +
> +	hdr = mmap_addr;
> +	if (verify_hdr(hdr, mmap_size) < 0)
> +		goto unmap;
> +
> +	if (hashcmp(istate->sha1, (unsigned char *)hdr + mmap_size - 20))
> +		goto unmap;
> +
> +	munmap(mmap_addr, mmap_size);
> +	return 1;
> +
> +unmap:
> +	munmap(mmap_addr, mmap_size);
> +	return 0;
> +}
> +
> +int verify_index(struct index_state *istate)
> +{
> +	return verify_index_from(istate, get_index_file());
> +}
> +
>  static int read_index_extension(struct index_state *istate,
>  				const char *ext, void *data, unsigned long sz)
>  {
> @@ -1445,7 +1499,7 @@ int read_index_from(struct index_state *istate, const char *path)
>  	struct stat st;
>  	unsigned long src_offset;
>  	struct cache_header *hdr;
> -	void *mmap;
> +	void *mmap_addr;

This introduces noise to the patch to make it harder to review, and
does not look a necessary change.  Am I mistaken?

>  	size_t mmap_size;
>  	struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
>  
> @@ -1468,15 +1522,16 @@ int read_index_from(struct index_state *istate, const char *path)
>  	if (mmap_size < sizeof(struct cache_header) + 20)
>  		die("index file smaller than expected");
>  
> -	mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
> -	if (mmap == MAP_FAILED)
> +	mmap_addr = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
> +	if (mmap_addr == MAP_FAILED)
>  		die_errno("unable to map index file");
>  	close(fd);
>  
> -	hdr = mmap;
> +	hdr = mmap_addr;
>  	if (verify_hdr(hdr, mmap_size) < 0)
>  		goto unmap;
>  
> +	hashcpy(istate->sha1, (unsigned char *)hdr + mmap_size - 20);
>  	istate->version = ntohl(hdr->hdr_version);
>  	istate->cache_nr = ntohl(hdr->hdr_entries);
>  	istate->cache_alloc = alloc_nr(istate->cache_nr);
> @@ -1494,7 +1549,7 @@ int read_index_from(struct index_state *istate, const char *path)
>  		struct cache_entry *ce;
>  		unsigned long consumed;
>  
> -		disk_ce = (struct ondisk_cache_entry *)((char *)mmap + src_offset);
> +		disk_ce = (struct ondisk_cache_entry *)((char *)mmap_addr + src_offset);
>  		ce = create_from_disk(disk_ce, &consumed, previous_name);
>  		set_index_entry(istate, i, ce);
>  
> @@ -1512,21 +1567,21 @@ int read_index_from(struct index_state *istate, const char *path)
>  		 * in 4-byte network byte order.
>  		 */
>  		uint32_t extsize;
> -		memcpy(&extsize, (char *)mmap + src_offset + 4, 4);
> +		memcpy(&extsize, (char *)mmap_addr + src_offset + 4, 4);
>  		extsize = ntohl(extsize);
>  		if (read_index_extension(istate,
> -					 (const char *) mmap + src_offset,
> -					 (char *) mmap + src_offset + 8,
> +					 (const char *) mmap_addr + src_offset,
> +					 (char *) mmap_addr + src_offset + 8,
>  					 extsize) < 0)
>  			goto unmap;
>  		src_offset += 8;
>  		src_offset += extsize;
>  	}
> -	munmap(mmap, mmap_size);
> +	munmap(mmap_addr, mmap_size);
>  	return istate->cache_nr;
>  
>  unmap:
> -	munmap(mmap, mmap_size);
> +	munmap(mmap_addr, mmap_size);
>  	die("index file corrupt");
>  }
>  
> @@ -1779,7 +1834,7 @@ static int has_racy_timestamp(struct index_state *istate)
>  void update_index_if_able(struct index_state *istate, struct lock_file *lockfile)
>  {
>  	if ((istate->cache_changed || has_racy_timestamp(istate)) &&
> -	    !write_index(istate, lockfile->fd))
> +	    verify_index(istate) && !write_index(istate, lockfile->fd))
>  		commit_locked_index(lockfile);
>  	else
>  		rollback_lock_file(lockfile);

  reply	other threads:[~2014-04-09 23:05 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-09 22:06 [PATCH] Verify index file before we opportunistically update it Yiannis Marangos
2014-04-09 22:06 ` Yiannis Marangos
2014-04-09 22:34 ` [PATCH v2] " Yiannis Marangos
2014-04-09 23:05   ` Junio C Hamano [this message]
2014-04-09 22:52 ` [PATCH v3] " Yiannis Marangos
2014-04-10  5:22 ` [PATCH v4] " Yiannis Marangos
2014-04-10  5:34 ` [PATCH v5] " Yiannis Marangos
2014-04-10 10:40   ` Duy Nguyen
2014-04-10 11:57     ` Yiannis Marangos
2014-04-10 16:57     ` Junio C Hamano
2014-04-10 13:11 ` [PATCH v6] " Yiannis Marangos
2014-04-10 18:31 ` [PATCH v7 1/2] Add xpread() and xpwrite() Yiannis Marangos
2014-04-10 18:31   ` [PATCH v7 2/2] Verify index file before we opportunistically update it Yiannis Marangos
2014-04-10 19:28     ` Junio C Hamano
2014-04-11  2:57       ` Duy Nguyen
2014-04-11 19:24         ` Junio C Hamano
2014-04-11 20:43           ` Junio C Hamano
2014-04-11 23:30             ` Yiannis Marangos
2014-04-12  0:10             ` Duy Nguyen
2014-04-12  4:19               ` Junio C Hamano
2014-04-12  7:05                 ` Junio C Hamano
2014-04-12 10:13                 ` Duy Nguyen
2014-04-14 18:50                   ` Junio C Hamano
2014-04-11  7:47       ` Torsten Bögershausen
2014-04-11 15:58         ` Yiannis Marangos
2014-04-11 10:36       ` Duy Nguyen
2014-04-10 18:35   ` [PATCH v7 1/2] Add xpread() and xpwrite() Junio C Hamano
2014-04-10 18:44     ` Yiannis Marangos
2014-04-10 18:54 ` [PATCH v8 1/2] Add xpread() Yiannis Marangos
2014-04-10 19:12   ` Johannes Sixt
2014-04-10 19:20     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqwqeyxfwi.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=yiannis.marangos@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox