From: Taylor Blau <me@ttaylorr•com>
To: Derrick Stolee <stolee@gmail•com>
Cc: Junio C Hamano <gitster@pobox•com>, Taylor Blau <me@ttaylorr•com>,
	git@vger•kernel.org, peff@peff•net, szeder.dev@gmail•com,
	dstolee@microsoft•com
Subject: Re: [PATCH 4/4] commit-graph: close descriptors after mmap
Date: Fri, 24 Apr 2020 10:35:53 -0600
Message-ID: <20200424163553.GA58219@syl.local>
In-Reply-To: <2232c379-d0ec-0b52-96b4-379438642785@gmail.com>

On Fri, Apr 24, 2020 at 09:17:16AM -0400, Derrick Stolee wrote:
> On 4/23/2020 6:04 PM, Junio C Hamano wrote:
> > Taylor Blau <me@ttaylorr•com> writes:
> >
> >> From: Jeff King <peff@peff•net>
> >>
> >> We don't ever refer to the descriptor after mmap-ing it. And keeping it
> >> open means we can run out of descriptors in degenerate cases (e.g.,
> >> thousands of split chain files). Let's close it as soon as possible.
> >
> > Yikes.
> >
> > Sorry, I should have looked at the use of mmap in this topic more
> > carefully when we queued the series.  It is an easy mistake to make
> > by anybody new to the API to leave it open while the region is in
> > use.
>
> You are right. I was new when first contributing the commit-graph. It
> was also easier to miss because we only had one commit-graph open at
> the time. Adding in the incremental file format led to multiple file
> descriptors being open.
>
> However, this idea of closing a descriptor after an mmap is new to
> me. So I thought about other situations where I made the same mistake.
> Please see the patch below.

It's new to me, too :). If I had known it beforehand, then I would have
written the fourth patch here myself. But, I didn't, so I am grateful to
Peff for teaching me something new here.
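
For anyone else to whom this is new: POSIX specifies that mmap() adds its
own reference to the underlying file, and that a later close() on the
descriptor does not remove it. The mapping stays valid until munmap(). Below
is a minimal standalone sketch of the pattern, not code from git itself. The
names map_file_and_close() and demo_read_after_close() are hypothetical, and
the demo assumes a writable /tmp.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of git): map a whole file read-only and
 * close the descriptor immediately. Returns MAP_FAILED on any error,
 * including an empty file, which cannot be mapped.
 */
void *map_file_and_close(const char *path, size_t *len_out)
{
	struct stat st;
	void *map;
	int fd;

	assert(path && len_out);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return MAP_FAILED;
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return MAP_FAILED;
	}

	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	/* The mapping holds its own reference to the file; fd is done. */
	close(fd);

	if (map != MAP_FAILED)
		*len_out = (size_t)st.st_size;
	return map;
}

/* Returns 0 if the data is still readable after close(), -1 otherwise. */
int demo_read_after_close(void)
{
	static const char payload[] = "commit-graph";
	char path[] = "/tmp/mmap-demo-XXXXXX"; /* assumes /tmp is writable */
	size_t len = 0;
	void *map;
	int fd, ret;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	if (write(fd, payload, sizeof(payload)) != (ssize_t)sizeof(payload)) {
		close(fd);
		unlink(path);
		return -1;
	}
	close(fd);

	map = map_file_and_close(path, &len);
	if (map == MAP_FAILED) {
		unlink(path);
		return -1;
	}

	/* No descriptor is open here, yet the mapped bytes are readable. */
	ret = (len == sizeof(payload) && !memcmp(map, payload, len)) ? 0 : -1;

	munmap(map, len);
	unlink(path);
	return ret;
}
```

With this shape, the caller only has to remember the pointer and the length,
so a struct holding the mapping needs no descriptor field at all.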

> > With this fix, with or without the other topics still in flight, I
> > do not think any code touches graph_fd.  Should we remove the
> > graph_fd field from the structure as well?
>
> I agree that this should be done.
>
> Thanks,
> -Stolee
>
> -->8--

For what it's worth, this didn't apply quite right with 'git am -3 -c',
since it didn't seem to recognize that this was your scissors line. If I
edit your mail myself by replacing this line with '-- >8 --', then 'git
am' applies it just fine.

> From: Derrick Stolee <dstolee@microsoft•com>
> Date: Fri, 24 Apr 2020 13:11:13 +0000
> Subject: [PATCH] multi-pack-index: close file descriptor after mmap
>
> We recently discovered that the commit-graph was not closing its
> file descriptor after memory-mapping the file contents. After this
> mmap() succeeds, there is no need to keep the file descriptor open.
> In fact, there is significant reason to close it so we do not run
> out of descriptors.
>
> This is entirely my fault for not knowing that we can have an open
> mmap while also closing the descriptor. Some could blame this on
> being a new contributor when the series was first submitted, but
> even years later this is still new information to me.
>
> That made me realize that I used the same pattern when opening a
> multi-pack-index. Since this file is not (yet) incremental, there
> will be at most one descriptor open for this reason. It is still
> worth fixing, especially if we extend the format to be incremental
> like the commit-graph.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft•com>
> ---
>  midx.c | 4 +---
>  midx.h | 2 --
>  2 files changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/midx.c b/midx.c
> index 1527e464a7..60d30e873b 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -72,9 +72,9 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local
>  	FREE_AND_NULL(midx_name);
>
>  	midx_map = xmmap(NULL, midx_size, PROT_READ, MAP_PRIVATE, fd, 0);
> +	close(fd);

Right, we want to close as soon as we have mmaped the file.

>
>  	FLEX_ALLOC_STR(m, object_dir, object_dir);
> -	m->fd = fd;
>  	m->data = midx_map;
>  	m->data_len = midx_size;
>  	m->local = local;
> @@ -190,8 +190,6 @@ void close_midx(struct multi_pack_index *m)
>  		return;
>
>  	munmap((unsigned char *)m->data, m->data_len);
> -	close(m->fd);
> -	m->fd = -1;

...and not down here. Thanks.

>  	for (i = 0; i < m->num_packs; i++) {
>  		if (m->packs[i])
> diff --git a/midx.h b/midx.h
> index e6fa356b5c..b18cf53bc4 100644
> --- a/midx.h
> +++ b/midx.h
> @@ -12,8 +12,6 @@ struct repository;
>  struct multi_pack_index {
>  	struct multi_pack_index *next;
>
> -	int fd;
> -

:). Even better!

>  	const unsigned char *data;
>  	size_t data_len;
>
> --
> 2.26.2

This looks great to me, and thanks for being proactive about the fix.

  Reviewed-by: Taylor Blau <me@ttaylorr•com>

Thanks,
Taylor
