From: Junio C Hamano <gitster@pobox•com>
To: Patrick Steinhardt <ps@pks•im>
Cc: git@vger•kernel.org, Johannes Schindelin <Johannes.Schindelin@gmx•de>
Subject: Re: [PATCH 1/4] filter-branch: stop depending on Perl
Date: Tue, 15 Apr 2025 08:50:34 -0700 [thread overview]
Message-ID: <xmqq1ptto245.fsf@gitster.g> (raw)
In-Reply-To: <20250415-b4-pks-drop-perl-v1-1-c6addf175858@pks.im> (Patrick Steinhardt's message of "Tue, 15 Apr 2025 11:57:08 +0200")
Patrick Steinhardt <ps@pks•im> writes:
> While git-filter-branch(1) is written as a shell script, the
> `--state-branch` feature depends on Perl to save and extract the object
> ID mappings. This can lead to subtle breakage though:
>
> - We execute `perl` directly without respecting the `PERL_PATH`
> configured by the distribution. As such, it may happen that we use
> the wrong version of Perl.
>
> - We install the script unchanged even if Perl isn't available at all
> on the system, so using `--state-branch` would lead to failure
> altogether in that case.
>
> Fix this by dropping Perl and instead implementing the feature with
> shell scripting exclusively.
Excellent.
I just realized that the first bullet point above is a good argument
against "#!/usr/bin/env perl" as well. If we were to go the
"#!/usr/bin/env perl" direction, we should stop pretending with
$PERL_PATH that we'd consistently use the same version of Perl
chosen at build-install time, and we just trust users' $PATH, for
consistency.
> Signed-off-by: Patrick Steinhardt <ps@pks•im>
> ---
> git-filter-branch.sh | 37 +++++++++++++++++++------------------
> 1 file changed, 19 insertions(+), 18 deletions(-)
>
> diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> index 3a51d4507c7..24fa317aaaa 100755
> --- a/git-filter-branch.sh
> +++ b/git-filter-branch.sh
> @@ -295,15 +295,18 @@ then
> if test -n "$state_commit"
> then
> echo "Populating map from $state_branch ($state_commit)" 1>&2
> - perl -e'open(MAP, "-|", "git show $ARGV[0]:filter.map") or die;
> - while (<MAP>) {
> - m/(.*):(.*)/ or die;
> - open F, ">../map/$1" or die;
> - print F "$2" or die;
> - close(F) or die;
> - }
> - close(MAP) or die;' "$state_commit" \
> - || die "Unable to load state from $state_branch:filter.map"
> +
> + git show "$state_commit:filter.map" >"$tempdir"/filter-map ||
> + die "Unable to load state from $state_branch:filter.map"
> + while read line
> + do
> + case "$line" in
> + *:*)
> + echo "${line%:*}" >../map/"${line#*:}";;
> + *)
> + die "Unable to load state from $state_branch:filter.map";;
> + esac
> + done <"$tempdir"/filter-map
Quite straight-forward. Do problematic bytes like backslashes
appear in the blob that Perl m/(.*):(.*)/ would have passed intact
without problems but may cause problems to the reimplementation? I
doubt it.
Not in a scope of this change, but use of "git show" in place of
"git cat-file blob" makes my skin somewhat tingle.
> else
> echo "Branch $state_branch does not exist. Will create" 1>&2
> fi
> @@ -633,15 +636,13 @@ if test -n "$state_branch"
> then
> echo "Saving rewrite state to $state_branch" 1>&2
> state_blob=$(
> - perl -e'opendir D, "../map" or die;
> - open H, "|-", "git hash-object -w --stdin" or die;
> - foreach (sort readdir(D)) {
> - next if m/^\.\.?$/;
> - open F, "<../map/$_" or die;
> - chomp($f = <F>);
> - print H "$_:$f\n" or die;
> - }
> - close(H) or die;' || die "Unable to save state")
> + for file in ../map/*
OK. * won't match . or .. so we no longer need to filter. Like the
original, if we had any random cruft file, we may copy such garbage
to the output, but that is not a new problem at all.
Nicely done.
> + do
> + from_commit=$(basename "$file")
> + to_commit=$(cat "$file")
> + echo "$from_commit:$to_commit"
> + done | git hash-object -w --stdin || die "Unable to save state"
> + )
> state_tree=$(printf '100644 blob %s\tfilter.map\n' "$state_blob" | git mktree)
> if test -n "$state_commit"
> then
next prev parent reply other threads:[~2025-04-15 15:50 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-15 9:57 [PATCH 0/4] Drop Perl dependency in a couple of subsystems Patrick Steinhardt
2025-04-15 9:57 ` [PATCH 1/4] filter-branch: stop depending on Perl Patrick Steinhardt
2025-04-15 15:50 ` Junio C Hamano [this message]
2025-04-15 9:57 ` [PATCH 2/4] request-pull: " Patrick Steinhardt
2025-04-15 16:16 ` Junio C Hamano
2025-04-16 15:07 ` Patrick Steinhardt
2025-04-15 9:57 ` [PATCH 3/4] Documentation: stop depending on Perl to massage user manual Patrick Steinhardt
2025-04-15 9:57 ` [PATCH 4/4] Documentation: stop depending on Perl to generate command list Patrick Steinhardt
2025-04-15 16:32 ` Junio C Hamano
2025-04-16 12:16 ` [PATCH v2 0/4] Drop Perl dependency in a couple of subsystems Patrick Steinhardt
2025-04-16 12:16 ` [PATCH v2 1/4] filter-branch: stop depending on Perl Patrick Steinhardt
2025-04-16 12:16 ` [PATCH v2 2/4] request-pull: " Patrick Steinhardt
2025-04-16 12:16 ` [PATCH v2 3/4] Documentation: stop depending on Perl to massage user manual Patrick Steinhardt
2025-04-16 12:16 ` [PATCH v2 4/4] Documentation: stop depending on Perl to generate command list Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqq1ptto245.fsf@gitster.g \
--to=gitster@pobox$(echo .)com \
--cc=Johannes.Schindelin@gmx$(echo .)de \
--cc=git@vger$(echo .)kernel.org \
--cc=ps@pks$(echo .)im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox