From: Patrick Steinhardt <ps@pks•im>
To: Karthik Nayak <karthik.188@gmail•com>
Cc: git@vger•kernel.org, gitster@pobox•com, shejialuo@gmail•com
Subject: Re: [PATCH v3 7/8] reftable: add code to facilitate consistency checks
Date: Wed, 24 Sep 2025 07:54:48 +0200 [thread overview]
Message-ID: <aNOHqEq5qxXrOCX7@pks.im> (raw)
In-Reply-To: <20250918-228-reftable-introduce-consistency-checks-v3-7-271af03eb34d@gmail.com>
On Thu, Sep 18, 2025 at 10:11:48AM +0200, Karthik Nayak wrote:
> diff --git a/reftable/fsck.c b/reftable/fsck.c
> new file mode 100644
> index 0000000000..785e4b43e8
> --- /dev/null
> +++ b/reftable/fsck.c
> @@ -0,0 +1,112 @@
> +#include "basics.h"
> +#include "reftable-fsck.h"
> +#include "stack.h"
> +
> +static bool valid_table_name(const char *name, uint64_t *min_update_index,
> + uint64_t *max_update_index)
> +{
> + const char *ptr = name;
> + char *endptr;
> +
> + /* strtoull doesn't set errno on success */
> + errno = 0;
> +
> + *min_update_index = strtoull(ptr, &endptr, 16);
> + if (errno == EINVAL)
> + return false;
strtoull may also return ERANGE. In general, shouldn't we abort whenever
errno is non-zero here?
> + ptr = endptr;
> +
> + if (strncmp(ptr, "-", 1))
> + return false;
Better:
if (*ptr != '-')
return false;
> + ptr++;
> +
> + *max_update_index = strtoull(ptr, &endptr, 16);
> + if (errno == EINVAL)
> + return false;
> + ptr = endptr;
> +
> + if (*ptr != '-')
> + return false;
> + ptr++;
> +
> + strtoul(ptr, &endptr, 16);
> + if (errno == EINVAL)
> + return false;
> + ptr = endptr;
> +
> + if (strcmp(ptr, ".ref") && strcmp(ptr, ".log"))
> + return false;
Yup, makes sense. We don't do so ourselves, but in theory it is possible
for tables to have a ".log" suffix. If so, they are expected to only
contain reflog records.
> + return true;
> +}
> +
> +static int stack_check_all_files_in_dir(struct reftable_stack *stack,
> + reftable_fsck_report_fn report_fn,
> + void *cb_data)
> +{
> + DIR *dir = opendir(stack->reftable_dir);
I think it would make sense to move this function call close to the
conditional.
> + struct reftable_fsck_info info;
> + struct dirent *d = NULL;
> + uint64_t min, max;
> + int err = 0;
> +
> + if (!dir)
> + return 0;
> +
> + while ((d = readdir(dir))) {
> + if (!strcmp(d->d_name, "tables.list"))
> + continue;
> +
> + if ((d->d_name[0] == '.' &&
> + (d->d_name[1] == '\0' ||
> + (d->d_name[1] == '.' && d->d_name[2] == '\0'))))
> + continue;
> +
> + if (d->d_type == DT_REG) {
> + if (!valid_table_name(d->d_name, &min, &max)) {
> + info.error = REFTABLE_FSCK_ERROR_TABLE_NAME;
> + info.msg = "file with invalid table name";
> + info.path = d->d_name;
> +
> + err |= report_fn(&info, cb_data);
> + }
One problem with this is that this is racy with concurrent writers. We
don't recognize the "tables.list.lock" file, and neither do we recognize
"0x*-0x*.{ref,log}.temp.XXXXXX"-style files.
Would it be a better approach be to instead go through table names as
loaded by the stack? The reftable code already knows to prune unknown
files anyway, so I don't think we should scan for any other files.
> + } else {
> + info.error = REFTABLE_FSCK_ERROR_INVALID_FILE_TYPE;
> + info.msg = "file with unexpected type";
> + info.path = d->d_name;
> +
> + err |= report_fn(&info, cb_data);
> + }
> + }
> +
> + closedir(dir);
> + return err;
> +}
> +
> +static int stack_checks(struct reftable_stack *stack,
> + reftable_fsck_report_fn report_fn,
> + void *cb_data)
> +{
> + struct reftable_buf msg = REFTABLE_BUF_INIT;
> + char **names = NULL;
This variable is unused.
> + int err = 0;
> +
> + if (stack == NULL)
> + goto out;
Why should someone ever pass a `NULL` stack?
> + err |= stack_check_all_files_in_dir(stack, report_fn, cb_data);
> +
> +out:
> + free_names(names);
> + reftable_buf_release(&msg);
> + return err;
> +}
> +
> +int reftable_fsck_check(struct reftable_stack *stack,
> + reftable_fsck_report_fn report_fn,
> + reftable_fsck_verbose_fn verbose_fn,
> + void *cb_data)
> +{
> + verbose_fn("Checking reftable: stack checks", cb_data);
> + return stack_checks(stack, report_fn, cb_data);
Nit: having this extra function call to `stack_checks()` feels a bit
weird as it could just as well be inlined. Is this preparing for a
future change?
> +}
> diff --git a/reftable/reftable-fsck.h b/reftable/reftable-fsck.h
> new file mode 100644
> index 0000000000..5e13ac9f02
> --- /dev/null
> +++ b/reftable/reftable-fsck.h
> @@ -0,0 +1,42 @@
> +#ifndef REFTABLE_FSCK_H
> +#define REFTABLE_FSCK_H
> +
> +#include "reftable-stack.h"
> +
> +enum reftable_fsck_error {
> + /* Non regular file in the reftable directory */
> + REFTABLE_FSCK_ERROR_INVALID_FILE_TYPE = 0,
> + /* Invalid table name */
> + REFTABLE_FSCK_ERROR_TABLE_NAME,
> + /* Used for bounds checking, must be last */
> + REFTABLE_FSCK_MAX_VALUE
Let's add a trailing comma here.
Patrick
next prev parent reply other threads:[~2025-09-24 5:54 UTC|newest]
Thread overview: 96+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-19 12:20 [PATCH 0/5] refs/reftable: add fsck checks Karthik Nayak
2025-08-19 12:21 ` [PATCH 1/5] fsck: order 'fsck_msg_type' alphabetically Karthik Nayak
2025-08-19 12:21 ` [PATCH 2/5] refs/reftable: add fsck check for checking the table name Karthik Nayak
2025-08-26 16:21 ` shejialuo
2025-09-01 13:33 ` Karthik Nayak
2025-09-03 13:39 ` shejialuo
2025-08-19 12:21 ` [PATCH 3/5] refs/reftable: add fsck check for number of tables Karthik Nayak
2025-08-26 16:33 ` shejialuo
2025-09-01 13:40 ` Karthik Nayak
2025-08-26 16:44 ` shejialuo
2025-09-01 13:52 ` Karthik Nayak
2025-08-19 12:21 ` [PATCH 4/5] refs/reftable: add fsck check for trailing newline Karthik Nayak
2025-08-19 12:21 ` [PATCH 5/5] refs/reftable: add fsck check for incorrect update index Karthik Nayak
2025-08-26 16:39 ` [PATCH 0/5] refs/reftable: add fsck checks shejialuo
2025-09-01 13:52 ` Karthik Nayak
2025-09-02 7:05 ` [PATCH v2 " Karthik Nayak
2025-09-02 7:05 ` [PATCH v2 1/5] fsck: order 'fsck_msg_type' alphabetically Karthik Nayak
2025-09-02 22:25 ` Junio C Hamano
2025-09-08 13:00 ` Karthik Nayak
2025-09-02 7:05 ` [PATCH v2 2/5] refs/reftable: add fsck check for checking the table name Karthik Nayak
2025-09-03 8:07 ` Patrick Steinhardt
2025-09-03 16:51 ` shejialuo
2025-09-09 13:49 ` Karthik Nayak
2025-09-09 8:42 ` Karthik Nayak
2025-09-02 7:05 ` [PATCH v2 3/5] refs/reftable: add fsck check for number of tables Karthik Nayak
2025-09-03 8:07 ` Patrick Steinhardt
2025-09-15 9:27 ` Karthik Nayak
2025-09-02 7:05 ` [PATCH v2 4/5] refs/reftable: add fsck check for trailing newline Karthik Nayak
2025-09-02 22:38 ` Junio C Hamano
2025-09-03 8:07 ` Patrick Steinhardt
2025-09-02 7:05 ` [PATCH v2 5/5] refs/reftable: add fsck check for incorrect update index Karthik Nayak
2025-09-02 22:42 ` Junio C Hamano
2025-09-18 8:11 ` Karthik Nayak
2025-09-18 8:11 ` [PATCH v3 0/8] refs/reftable: add consistency checks Karthik Nayak
2025-09-18 8:11 ` [PATCH v3 1/8] refs: remove unused headers Karthik Nayak
2025-09-18 8:11 ` [PATCH v3 2/8] refs: move consistency check msg to generic layer Karthik Nayak
2025-09-18 8:11 ` [PATCH v3 3/8] reftable: check for trailing newline in 'tables.list' Karthik Nayak
2025-09-18 15:36 ` Junio C Hamano
2025-09-23 15:42 ` Karthik Nayak
2025-09-24 5:54 ` Patrick Steinhardt
2025-09-24 10:02 ` Karthik Nayak
2025-09-24 7:24 ` Kristoffer Haugsbakk
2025-09-24 11:06 ` Karthik Nayak
2025-09-18 8:11 ` [PATCH v3 4/8] reftable: ensure tables in a stack use sequential update indices Karthik Nayak
2025-09-24 5:54 ` Patrick Steinhardt
2025-09-24 11:20 ` Karthik Nayak
2025-09-24 18:04 ` Junio C Hamano
2025-09-24 20:13 ` Karthik Nayak
2025-09-25 6:12 ` Patrick Steinhardt
2025-09-25 16:22 ` Junio C Hamano
2025-09-18 8:11 ` [PATCH v3 5/8] Documentation/fsck-msgids: remove duplicate msg id Karthik Nayak
2025-09-18 8:11 ` [PATCH v3 6/8] fsck: order 'fsck_msg_type' alphabetically Karthik Nayak
2025-09-18 8:11 ` [PATCH v3 7/8] reftable: add code to facilitate consistency checks Karthik Nayak
2025-09-24 5:54 ` Patrick Steinhardt [this message]
2025-09-24 18:40 ` Karthik Nayak
2025-09-25 6:14 ` Patrick Steinhardt
2025-09-18 8:11 ` [PATCH v3 8/8] refs/reftable: add fsck check for checking the table name Karthik Nayak
2025-09-24 5:54 ` Patrick Steinhardt
2025-09-24 18:44 ` Karthik Nayak
2025-09-26 7:25 ` [PATCH v4 0/7] refs/reftable: add consistency checks Karthik Nayak
2025-09-26 7:25 ` [PATCH v4 1/7] refs: remove unused headers Karthik Nayak
2025-09-26 7:25 ` [PATCH v4 2/7] refs: move consistency check msg to generic layer Karthik Nayak
2025-09-26 7:25 ` [PATCH v4 3/7] reftable: check for trailing newline in 'tables.list' Karthik Nayak
2025-10-02 11:44 ` Patrick Steinhardt
2025-10-06 12:02 ` Karthik Nayak
2025-09-26 7:25 ` [PATCH v4 4/7] Documentation/fsck-msgids: remove duplicate msg id Karthik Nayak
2025-09-26 7:25 ` [PATCH v4 5/7] fsck: order 'fsck_msg_type' alphabetically Karthik Nayak
2025-09-26 7:25 ` [PATCH v4 6/7] reftable: add code to facilitate consistency checks Karthik Nayak
2025-10-02 11:44 ` Patrick Steinhardt
2025-09-26 7:25 ` [PATCH v4 7/7] refs/reftable: add fsck check for checking the table name Karthik Nayak
2025-10-02 11:44 ` Patrick Steinhardt
2025-10-06 12:05 ` Karthik Nayak
2025-09-26 21:08 ` [PATCH v4 0/7] refs/reftable: add consistency checks Junio C Hamano
2025-10-06 14:22 ` [PATCH v5 " Karthik Nayak
2025-10-06 14:22 ` [PATCH v5 1/7] refs: remove unused headers Karthik Nayak
2025-10-06 14:23 ` [PATCH v5 2/7] refs: move consistency check msg to generic layer Karthik Nayak
2025-10-06 14:23 ` [PATCH v5 3/7] reftable: check for trailing newline in 'tables.list' Karthik Nayak
2025-10-06 14:23 ` [PATCH v5 4/7] Documentation/fsck-msgids: remove duplicate msg id Karthik Nayak
2025-10-06 14:23 ` [PATCH v5 5/7] fsck: order 'fsck_msg_type' alphabetically Karthik Nayak
2025-10-06 14:23 ` [PATCH v5 6/7] reftable: add code to facilitate consistency checks Karthik Nayak
2025-10-06 14:23 ` [PATCH v5 7/7] refs/reftable: add fsck check for checking the table name Karthik Nayak
2025-10-07 2:32 ` Jeff King
2025-10-07 8:45 ` Karthik Nayak
2025-10-06 22:08 ` [PATCH v5 0/7] refs/reftable: add consistency checks Junio C Hamano
2025-10-07 8:47 ` Karthik Nayak
2025-10-07 15:11 ` Junio C Hamano
2025-10-07 12:11 ` [PATCH v6 " Karthik Nayak
2025-10-07 12:11 ` [PATCH v6 1/7] refs: remove unused headers Karthik Nayak
2025-10-07 12:11 ` [PATCH v6 2/7] refs: move consistency check msg to generic layer Karthik Nayak
2025-10-07 12:11 ` [PATCH v6 3/7] reftable: check for trailing newline in 'tables.list' Karthik Nayak
2025-10-07 12:11 ` [PATCH v6 4/7] Documentation/fsck-msgids: remove duplicate msg id Karthik Nayak
2025-10-07 12:11 ` [PATCH v6 5/7] fsck: order 'fsck_msg_type' alphabetically Karthik Nayak
2025-10-07 12:11 ` [PATCH v6 6/7] reftable: add code to facilitate consistency checks Karthik Nayak
2025-10-07 12:11 ` [PATCH v6 7/7] refs/reftable: add fsck check for checking the table name Karthik Nayak
2025-10-07 13:26 ` [PATCH v6 0/7] refs/reftable: add consistency checks Patrick Steinhardt
2025-10-07 16:25 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aNOHqEq5qxXrOCX7@pks.im \
--to=ps@pks$(echo .)im \
--cc=git@vger$(echo .)kernel.org \
--cc=gitster@pobox$(echo .)com \
--cc=karthik.188@gmail$(echo .)com \
--cc=shejialuo@gmail$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox