From: Taylor Blau <me@ttaylorr•com>
To: Derrick Stolee <stolee@gmail•com>
Cc: Junio C Hamano <gitster@pobox•com>, Taylor Blau <me@ttaylorr•com>,
git@vger•kernel.org, peff@peff•net, szeder.dev@gmail•com,
dstolee@microsoft•com
Subject: Re: [PATCH 4/4] commit-graph: close descriptors after mmap
Date: Fri, 24 Apr 2020 10:35:53 -0600 [thread overview]
Message-ID: <20200424163553.GA58219@syl.local> (raw)
In-Reply-To: <2232c379-d0ec-0b52-96b4-379438642785@gmail.com>
On Fri, Apr 24, 2020 at 09:17:16AM -0400, Derrick Stolee wrote:
> On 4/23/2020 6:04 PM, Junio C Hamano wrote:
> > Taylor Blau <me@ttaylorr•com> writes:
> >
> >> From: Jeff King <peff@peff•net>
> >>
> >> We don't ever refer to the descriptor after mmap-ing it. And keeping it
> >> open means we can run out of descriptors in degenerate cases (e.g.,
> >> thousands of split chain files). Let's close it as soon as possible.
> >
> > Yikes.
> >
> > Sorry, I should have looked at the use of mmap in this topioc more
> > carefully when we queued the series. It is an easy mistake to make
> > by anybody new to the API to leave it open while the region is in
> > use.
>
> You are right. I was new when first contributing the commit-graph. It
> was also easier to miss because we only had one commit-graph open at
> the time. Adding in the incremental file format led to multiple file
> descriptors being open.
>
> However, this idea of closing a descriptor after an mmap is new to
> me. So I thought about other situations where I made the same mistake.
> Please see the patch below.
It's new to me, too :). If I had known it beforehand, then I would have
written the fourth patch here myself. But, I didn't, so I am grateful to
Peff for teaching me something new here.
> > With this fix, with or without the other topics still in flight, I
> > do not think no code touches graph_fd. Should we remove the
> > graph_fd field from the structure as well?
>
> I agree that this should be done.
>
> Thanks,
> -Stolee
>
> -->8--
For what it's worth, this didn't apply quite right with 'git am -3 -c',
since it didn't seem to recognize that this was your scissors line. If I
edit your mail myself by replacing this line with '-- >8 --', then 'git
am' applies it just fine.
> From: Derrick Stolee <dstolee@microsoft•com>
> Date: Fri, 24 Apr 2020 13:11:13 +0000
> Subject: [PATCH] multi-pack-index: close file descriptor after mmap
>
> We recently discovered that the commit-graph was not closing its
> file descriptor after memory-mapping the file contents. After this
> mmap() succeeds, there is no need to keep the file descriptor open.
> In fact, there is signficant reason to close it so we do not run
> out of descriptors.
>
> This is entirely my fault for not knowing that we can have an open
> mmap while also closing the descriptor. Some could blame this on
> being a new contributor when the series was first submitted, but
> even years later this is still new information to me.
>
> That made me realize that I used the same pattern when opening a
> multi-pack-index. Since this file is not (yet) incremental, there
> will be at most one descriptor open for this reason. It is still
> worth fixing, especially if we extend the format to be incremental
> like the commit-graph.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft•com>
> ---
> midx.c | 4 +---
> midx.h | 2 --
> 2 files changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/midx.c b/midx.c
> index 1527e464a7..60d30e873b 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -72,9 +72,9 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local
> FREE_AND_NULL(midx_name);
>
> midx_map = xmmap(NULL, midx_size, PROT_READ, MAP_PRIVATE, fd, 0);
> + close(fd);
Right, we want to close as soon as we have mmaped the file.
>
> FLEX_ALLOC_STR(m, object_dir, object_dir);
> - m->fd = fd;
> m->data = midx_map;
> m->data_len = midx_size;
> m->local = local;
> @@ -190,8 +190,6 @@ void close_midx(struct multi_pack_index *m)
> return;
>
> munmap((unsigned char *)m->data, m->data_len);
> - close(m->fd);
> - m->fd = -1;
...and not down here. Thanks.
> for (i = 0; i < m->num_packs; i++) {
> if (m->packs[i])
> diff --git a/midx.h b/midx.h
> index e6fa356b5c..b18cf53bc4 100644
> --- a/midx.h
> +++ b/midx.h
> @@ -12,8 +12,6 @@ struct repository;
> struct multi_pack_index {
> struct multi_pack_index *next;
>
> - int fd;
> -
:). Even better!
> const unsigned char *data;
> size_t data_len;
>
> --
> 2.26.2
This looks great to me, and thanks for being proactive about the fix.
Reviewed-by: Taylor Blau <me@ttaylorr•com>
Thanks,
Taylor
next prev parent reply other threads:[~2020-04-24 16:36 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-23 21:40 [PATCH 0/4] commit-graph: handle file descriptor exhaustion Taylor Blau
2020-04-23 21:41 ` [PATCH 1/4] commit-graph.c: don't use discarded graph_name in error Taylor Blau
2020-04-23 21:41 ` [PATCH 2/4] t/test-lib.sh: make ULIMIT_FILE_DESCRIPTORS available to tests Taylor Blau
2020-04-23 21:41 ` [PATCH 3/4] commit-graph.c: gracefully handle file descriptor exhaustion Taylor Blau
2021-06-24 9:51 ` t5324-split-commit-graph.sh flaky due to assumptions about ulimit behavior Ævar Arnfjörð Bjarmason
2021-06-24 15:52 ` Jeff King
2020-04-23 21:41 ` [PATCH 4/4] commit-graph: close descriptors after mmap Taylor Blau
2020-04-23 22:04 ` Junio C Hamano
2020-04-24 3:56 ` Jeff King
2020-04-24 13:17 ` Derrick Stolee
2020-04-24 16:35 ` Taylor Blau [this message]
2020-04-24 20:02 ` Junio C Hamano
2020-04-27 10:57 ` Derrick Stolee
2020-04-23 21:43 ` [PATCH 0/4] commit-graph: handle file descriptor exhaustion Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200424163553.GA58219@syl.local \
--to=me@ttaylorr$(echo .)com \
--cc=dstolee@microsoft$(echo .)com \
--cc=git@vger$(echo .)kernel.org \
--cc=gitster@pobox$(echo .)com \
--cc=peff@peff$(echo .)net \
--cc=stolee@gmail$(echo .)com \
--cc=szeder.dev@gmail$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox