public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail•com>
To: git@vger•kernel.org
Cc: ben.knoble@gmail•com, gitster@pobox•com, philipoakley@iee•email,
	Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail•com>
Subject: [PATCH v3] send-email: validate charset name in 8bit encoding prompt
Date: Thu, 26 Feb 2026 21:46:34 +0530	[thread overview]
Message-ID: <20260226165559.187261-1-shreyanshpaliwalcmsmn@gmail.com> (raw)
In-Reply-To: <20260224143624.23678-1-shreyanshpaliwalcmsmn@gmail.com>

When a non-ASCII character is detected in the body or subject of the email
the user is prompted with,

        Which 8bit encoding should I declare [UTF-8]? foo

After this the input string is validated by the regex, based on the fact
that the charset string will be minimum 4 characters [1]. If the string is
more than 4 letters the email is sent, if not then a second prompt to
confirm is asked to the user,

        Are you sure you want to use <foo> [y/N]? y

This relies on a length based regex heuristic check to validate the user
input, and can allow clearly invalid charset names to pass if the input is
greater than 4 characters.

Add a semantic validation of the charset name using the
Encode::find_encoding() module of perl. If the encoding is not recognized,
warn the user and ask for confirmation before proceeding. After this
validation the lenght based validation becomes redundant and also breaks
flow, so change the regex of valid input to any non blank string.

Introduce a dedicated helper for confirmation handling that can be reused
both by ask() and the custom 8bit prompt flow. Make the encoding warning
logic specific to the 8bit prompt, this reduces the load on ask(), and
improves maintainability.

Additionally, the wording of the first prompt can confuse the user if not
read properly or under any default assumptions for a yes/no prompt. Change
the wording to make it explicitly clear to the user that the prompt needs a
string input, UTF-8 being the default.

The intended flow is,

        Declare which 8bit encoding to use [default: UTF-8]? foobar
        warning: 'foobar' does not appear to be a valid charset name.
        Are you sure you want to use <foobar> [y/N]?

[1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b

Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail•com>
---
Changes in v2:
 - Added a helper function confirm_ask() to handle yes/no confirmation prompts.
 - Added the validation and warning logic in the 8bit prompt instead of ask().

 git-send-email.perl   | 36 +++++++++++++++++++++++++++++-------
 t/t9001-send-email.sh |  2 +-
 2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index cd4b316ddc..3230b80701 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -23,6 +23,7 @@
 use Git::LoadCPAN::Error qw(:try);
 use Git;
 use Git::I18N;
+use Encode qw(find_encoding);

 Getopt::Long::Configure qw/ pass_through /;

@@ -984,6 +985,18 @@ sub get_patch_subject {
 	}
 }

+sub confirm_ask {
+	my ($resp) = @_;
+	my $term = term();
+	return 0
+		unless defined $term->IN and defined fileno($term->IN) and
+		       defined $term->OUT and defined fileno($term->OUT);
+	my $yesno = $term->readline(
+		# TRANSLATORS: please keep [y/N] as is.
+		sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
+	return defined $yesno && $yesno =~ /y/i;
+}
+
 sub ask {
 	my ($prompt, %arg) = @_;
 	my $valid_re = $arg{valid_re};
@@ -1008,10 +1021,7 @@ sub ask {
 			return $resp;
 		}
 		if ($confirm_only) {
-			my $yesno = $term->readline(
-				# TRANSLATORS: please keep [y/N] as is.
-				sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
-			if (defined $yesno && $yesno =~ /y/i) {
+			if (confirm_ask($resp)) {
 				return $resp;
 			}
 		}
@@ -1044,9 +1054,21 @@ sub file_declares_8bit_cte {
 	foreach my $f (sort keys %broken_encoding) {
 		print "    $f\n";
 	}
-	$auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
-				  valid_re => qr/.{4}/, confirm_only => 1,
-				  default => "UTF-8");
+	while(1) {
+		my $encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
+		valid_re => qr/^\S+$/,
+		default  => "UTF-8");
+		next unless defined $encoding;
+		if (find_encoding($encoding)) {
+			$auto_8bit_encoding = $encoding;
+			last;
+		}
+		printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
+		if (confirm_ask($encoding)) {
+			$auto_8bit_encoding = $encoding;
+			last;
+		}
+	}
 }

 if (!$force) {
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index e56e0c8d77..24f6c76aee 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -1691,7 +1691,7 @@ test_expect_success $PREREQ 'asks about and fixes 8bit encodings' '
 			email-using-8bit >stdout &&
 	grep "do not declare a Content-Transfer-Encoding" stdout &&
 	grep email-using-8bit stdout &&
-	grep "Which 8bit encoding" stdout &&
+	grep "Declare which 8bit encoding to use" stdout &&
 	grep -E "Content|MIME" msgtxt1 >actual &&
 	test_cmp content-type-decl actual
 '

Range-diff against v2:
1:  954c1dae9f ! 1:  748bb03a00 send-email: validate charset name in 8bit encoding prompt
    @@ Commit message
         When a non-ASCII character is detected in the body or subject of the email
         the user is prompted with,

    -      Which 8bit encoding should I declare [UTF-8]? foo
    +            Which 8bit encoding should I declare [UTF-8]? foo

         After this the input string is validated by the regex, based on the fact
         that the charset string will be minimum 4 characters [1]. If the string is
         more than 4 letters the email is sent, if not then a second prompt to
         confirm is asked to the user,

    -      Are you sure you want to use <foo> [y/N]? y
    +            Are you sure you want to use <foo> [y/N]? y

         This relies on a length based regex heuristic check to validate the user
         input, and can allow clearly invalid charset names to pass if the input is
    @@ Commit message
         validation the lenght based validation becomes redundant and also breaks
         flow, so change the regex of valid input to any non blank string.

    +    Introduce a dedicated helper for confirmation handling that can be reused
    +    both by ask() and the custom 8bit prompt flow. This makes the encoding
    +    warning logic specific to the 8bit prompt, reduces the load on ask(), and
    +    improves maintainability.
    +
         Additionally, the wording of the first prompt can confuse the user if not
         read properly or under any default assumptions for a yes/no prompt. Change
         the wording to make it explicitly clear to the user that the prompt needs a
         string input, UTF-8 being the default.

    +    The intended flow is,
    +
    +            Declare which 8bit encoding to use [default: UTF-8]? foobar
    +            warning: 'foobar' does not appear to be a valid charset name.
    +            Are you sure you want to use <foobar> [y/N]?
    +
    +    [1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
    +
         Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail•com>

      ## git-send-email.perl ##
    @@ git-send-email.perl
      Getopt::Long::Configure qw/ pass_through /;

     @@ git-send-email.perl: sub get_patch_subject {
    + 	}
    + }
    +
    ++sub confirm_ask {
    ++	my ($resp) = @_;
    ++	my $term = term();
    ++	return 0
    ++		unless defined $term->IN and defined fileno($term->IN) and
    ++		       defined $term->OUT and defined fileno($term->OUT);
    ++	my $yesno = $term->readline(
    ++		# TRANSLATORS: please keep [y/N] as is.
    ++		sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
    ++	return defined $yesno && $yesno =~ /y/i;
    ++}
    ++
      sub ask {
      	my ($prompt, %arg) = @_;
      	my $valid_re = $arg{valid_re};
    -+	my $warn_invalid = $arg{warn_invalid};
    - 	my $default = $arg{default};
    - 	my $confirm_only = $arg{confirm_only};
    - 	my $resp;
     @@ git-send-email.perl: sub ask {
    - 			return $default;
    - 		}
    - 		if (!defined $valid_re or $resp =~ /$valid_re/) {
    --			return $resp;
    -+			if ($warn_invalid) {
    -+				if (find_encoding($resp)) {
    -+					return $resp;
    -+				} else {
    -+					printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $resp;
    -+				}
    -+			} else {
    -+				return $resp;
    -+			}
    + 			return $resp;
      		}
      		if ($confirm_only) {
    - 			my $yesno = $term->readline(
    +-			my $yesno = $term->readline(
    +-				# TRANSLATORS: please keep [y/N] as is.
    +-				sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
    +-			if (defined $yesno && $yesno =~ /y/i) {
    ++			if (confirm_ask($resp)) {
    + 				return $resp;
    + 			}
    + 		}
     @@ git-send-email.perl: sub file_declares_8bit_cte {
      	foreach my $f (sort keys %broken_encoding) {
      		print "    $f\n";
      	}
     -	$auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
     -				  valid_re => qr/.{4}/, confirm_only => 1,
    -+	$auto_8bit_encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
    -+				  valid_re => qr/^\S+$/, confirm_only => 1,
    -+				  warn_invalid => 1,
    - 				  default => "UTF-8");
    +-				  default => "UTF-8");
    ++	while(1) {
    ++		my $encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
    ++		valid_re => qr/^\S+$/,
    ++		default  => "UTF-8");
    ++		next unless defined $encoding;
    ++		if (find_encoding($encoding)) {
    ++			$auto_8bit_encoding = $encoding;
    ++			last;
    ++		}
    ++		printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
    ++		if (confirm_ask($encoding)) {
    ++			$auto_8bit_encoding = $encoding;
    ++			last;
    ++		}
    ++	}
      }

    + if (!$force) {

      ## t/t9001-send-email.sh ##
     @@ t/t9001-send-email.sh: test_expect_success $PREREQ 'asks about and fixes 8bit encodings' '
--
2.53.0.154.g7c02d39fc2.dirty

  parent reply	other threads:[~2026-02-26 16:56 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-20 14:50 [RFC] send-email: UTF-8 encoding in subject line Shreyansh Paliwal
2026-02-21  2:28 ` Ben Knoble
2026-02-21 13:38   ` Shreyansh Paliwal
2026-02-21 17:30     ` Junio C Hamano
2026-02-22 14:03       ` Shreyansh Paliwal
2026-02-22 14:53         ` Philip Oakley
2026-02-22 15:00         ` D. Ben Knoble
2026-02-22 15:52           ` Shreyansh Paliwal
2026-02-23 21:38             ` Ben Knoble
2026-02-24  7:55               ` [GSOC] Discuss: Refactoring in order to reduce global state Shreyansh Paliwal
2026-02-22 14:53       ` [RFC] send-email: UTF-8 encoding in subject line D. Ben Knoble
2026-02-24 14:33 ` [PATCH] send-email: validate charset name in 8bit encoding prompt Shreyansh Paliwal
2026-02-24 21:11   ` Junio C Hamano
2026-02-24 21:37   ` [PATCH v2] " Shreyansh Paliwal
2026-02-24 22:06     ` Junio C Hamano
2026-02-24 22:20       ` Shreyansh Paliwal
2026-02-25 16:37     ` D. Ben Knoble
2026-02-26 17:32       ` Shreyansh Paliwal
2026-02-26 16:16   ` Shreyansh Paliwal [this message]
2026-02-26 18:45     ` [PATCH v3] " Junio C Hamano
2026-02-26 19:06       ` Junio C Hamano
2026-02-28  8:41         ` Shreyansh Paliwal
2026-02-28  8:36       ` Shreyansh Paliwal
2026-02-28 11:20   ` [PATCH v4] " Shreyansh Paliwal
2026-02-28 21:16     ` D. Ben Knoble
2026-03-02 16:10     ` Junio C Hamano
2026-03-03 19:06       ` Shreyansh Paliwal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260226165559.187261-1-shreyanshpaliwalcmsmn@gmail.com \
    --to=shreyanshpaliwalcmsmn@gmail$(echo .)com \
    --cc=ben.knoble@gmail$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=philipoakley@iee$(echo .)email \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox