public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox•com>
To: Jeff King <peff@peff•net>
Cc: Git Mailing List <git@vger•kernel.org>,
	Dan Carpenter <dan.carpenter@oracle•com>,
	Mark Einon <mark.einon@gmail•com>,
	Greg KH <gregkh@linuxfoundation•org>
Subject: Re: [RFC/PATCH] mailinfo: do not treat ">From" lines as in-body headers
Date: Tue, 16 Sep 2014 11:41:08 -0700	[thread overview]
Message-ID: <xmqq38br2yt7.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <xmqqy4tj30ny.fsf@gitster.dls.corp.google.com> (Junio C. Hamano's message of "Tue, 16 Sep 2014 11:01:05 -0700")

Junio C Hamano <gitster@pobox•com> writes:

>> I think you forgot to "git add" mbox.h. That being said, if we did go
>> this route, I do not see any reason to share the code at all. This can
>> be purely a mailinfo.c thing.
>
> OK.  A reroll coming today when I find time.

-- >8 --
From: Jeff King <peff@peff•net>
Date: Sat, 13 Sep 2014 21:30:38 -0400
Subject: [PATCH] mailinfo: make ">From" in-body header check more robust

Since commit 81c5cf7 (mailinfo: skip bogus UNIX From line inside
body, 2006-05-21), we have treated lines like ">From" in the body as
headers. This makes "git am" work for people who erroneously paste
the whole output from format-patch:

  From 12345abcd...fedcba543210 Mon Sep 17 00:00:00 2001
  From: them
  Subject: [PATCH] whatever

into their email body (assuming that an mbox writer then quotes
"From" as ">From", as otherwise we would actually mailsplit on the
in-body line).

However, this has false positives if somebody actually has a commit
body that starts with "From "; in this case we erroneously remove
the line entirely from the commit message. We can make this check
more robust by making sure the line actually looks like a real mbox
"From" line.

Inspect the line that begins with ">From " a more carefully to only
skip lines that match the expected pattern (note that the datestamp
part of the format-patch output is designed to be kept constant to
help those who write magic(5) entries).

Signed-off-by: Jeff King <peff@peff•net>
Signed-off-by: Junio C Hamano <gitster@pobox•com>
---
 builtin/mailinfo.c         | 17 ++++++++++++++++-
 t/t5100-mailinfo.sh        | 18 ++++++++++++++++++
 t/t5100/embed-from.expect  |  5 +++++
 t/t5100/embed-from.in      | 13 +++++++++++++
 t/t5100/quoted-from.expect |  3 +++
 t/t5100/quoted-from.in     | 10 ++++++++++
 6 files changed, 65 insertions(+), 1 deletion(-)
 create mode 100644 t/t5100/embed-from.expect
 create mode 100644 t/t5100/embed-from.in
 create mode 100644 t/t5100/quoted-from.expect
 create mode 100644 t/t5100/quoted-from.in

diff --git a/builtin/mailinfo.c b/builtin/mailinfo.c
index cf11c8d..2632fb0 100644
--- a/builtin/mailinfo.c
+++ b/builtin/mailinfo.c
@@ -288,6 +288,21 @@ static inline int cmp_header(const struct strbuf *line, const char *hdr)
 			line->buf[len] == ':' && isspace(line->buf[len + 1]);
 }
 
+#define SAMPLE "From e6807f3efca28b30decfecb1732a56c7db1137ee Mon Sep 17 00:00:00 2001\n"
+static int is_format_patch_separator(const char *line, int len)
+{
+	const char *cp;
+
+	if (len != strlen(SAMPLE))
+		return 0;
+	if (!skip_prefix(line, "From ", &cp))
+		return 0;
+	if (strspn(cp, "0123456789abcdef") != 40)
+		return 0;
+	cp += 40;
+	return !memcmp(SAMPLE + (cp - line), cp, strlen(SAMPLE) - (cp - line));
+}
+
 static int check_header(const struct strbuf *line,
 				struct strbuf *hdr_data[], int overwrite)
 {
@@ -329,7 +344,7 @@ static int check_header(const struct strbuf *line,
 
 	/* for inbody stuff */
 	if (starts_with(line->buf, ">From") && isspace(line->buf[5])) {
-		ret = 1; /* Should this return 0? */
+		ret = is_format_patch_separator(line->buf + 1, line->len - 1);
 		goto check_header_out;
 	}
 	if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index 3e64a7a..9e1ad1c 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -89,4 +89,22 @@ test_expect_success 'mailinfo on from header without name works' '
 
 '
 
+test_expect_success 'mailinfo finds headers after embedded From line' '
+	mkdir embed-from &&
+	git mailsplit -oembed-from "$TEST_DIRECTORY"/t5100/embed-from.in &&
+	test_cmp "$TEST_DIRECTORY"/t5100/embed-from.in embed-from/0001 &&
+	git mailinfo embed-from/msg embed-from/patch \
+	  <embed-from/0001 >embed-from/out &&
+	test_cmp "$TEST_DIRECTORY"/t5100/embed-from.expect embed-from/out
+'
+
+test_expect_success 'mailinfo on message with quoted >From' '
+	mkdir quoted-from &&
+	git mailsplit -oquoted-from "$TEST_DIRECTORY"/t5100/quoted-from.in &&
+	test_cmp "$TEST_DIRECTORY"/t5100/quoted-from.in quoted-from/0001 &&
+	git mailinfo quoted-from/msg quoted-from/patch \
+	  <quoted-from/0001 >quoted-from/out &&
+	test_cmp "$TEST_DIRECTORY"/t5100/quoted-from.expect quoted-from/msg
+'
+
 test_done
diff --git a/t/t5100/embed-from.expect b/t/t5100/embed-from.expect
new file mode 100644
index 0000000..06a3a38
--- /dev/null
+++ b/t/t5100/embed-from.expect
@@ -0,0 +1,5 @@
+Author: Commit Author
+Email: commit@example•com
+Subject: patch subject
+Date: Sat, 13 Sep 2014 21:13:23 -0400 
+
diff --git a/t/t5100/embed-from.in b/t/t5100/embed-from.in
new file mode 100644
index 0000000..5f3f84e
--- /dev/null
+++ b/t/t5100/embed-from.in
@@ -0,0 +1,13 @@
+From 1234567890123456789012345678901234567890 Mon Sep 17 00:00:00 2001
+From: Email Author <email@example•com>
+Date: Sun, 25 May 2008 00:38:18 -0700
+Subject: [PATCH] email subject
+
+>From 1234567890123456789012345678901234567890 Mon Sep 17 00:00:00 2001
+From: Commit Author <commit@example•com>
+Date: Sat, 13 Sep 2014 21:13:23 -0400
+Subject: patch subject
+
+patch body
+---
+patch
diff --git a/t/t5100/quoted-from.expect b/t/t5100/quoted-from.expect
new file mode 100644
index 0000000..8c9d48c
--- /dev/null
+++ b/t/t5100/quoted-from.expect
@@ -0,0 +1,3 @@
+>From the depths of history, we are stuck with the
+flaky mbox format.
+
diff --git a/t/t5100/quoted-from.in b/t/t5100/quoted-from.in
new file mode 100644
index 0000000..847e1c4
--- /dev/null
+++ b/t/t5100/quoted-from.in
@@ -0,0 +1,10 @@
+From 1234567890123456789012345678901234567890 Mon Sep 17 00:00:00 2001
+From: Author Name <somebody@example•com>
+Date: Sun, 25 May 2008 00:38:18 -0700
+Subject: [PATCH] testing quoted >From
+
+>From the depths of history, we are stuck with the
+flaky mbox format.
+
+---
+patch
-- 
2.1.0-420-g23b5121

  reply	other threads:[~2014-09-16 18:41 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1410472786-14552-1-git-send-email-mark.einon@gmail.com>
     [not found] ` <1410472786-14552-5-git-send-email-mark.einon@gmail.com>
2014-09-13  9:37   ` [PATCH 4/8] staging: et131x: Remove ununsed statistics Dan Carpenter
2014-09-13 15:45     ` Greg KH
2014-09-13 19:41       ` Dan Carpenter
2014-09-13 20:36       ` Jeff King
2014-09-13 20:47         ` Mark Einon
2014-09-13 20:57           ` Dan Carpenter
2014-09-13 21:06             ` Mark Einon
2014-09-13 21:09             ` Dan Carpenter
2014-09-13 21:25               ` [RFC/PATCH] mailinfo: do not treat ">From" lines as in-body headers Jeff King
2014-09-13 22:57                 ` brian m. carlson
2014-09-14  0:47                   ` Jeff King
2014-09-14  0:55                     ` Junio C Hamano
2014-09-14  1:01                       ` Jeff King
2014-09-14  1:30                         ` Jeff King
2014-09-15 18:56                           ` Junio C Hamano
2014-09-15 20:15                             ` Junio C Hamano
2014-09-16  0:19                               ` Jeff King
2014-09-16 18:01                                 ` Junio C Hamano
2014-09-16 18:41                                   ` Junio C Hamano [this message]
2014-09-16 20:29                                     ` Jeff King
2014-09-16  0:12                             ` Jeff King
2014-09-15 17:55                         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq38br2yt7.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox$(echo .)com \
    --cc=dan.carpenter@oracle$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gregkh@linuxfoundation$(echo .)org \
    --cc=mark.einon@gmail$(echo .)com \
    --cc=peff@peff$(echo .)net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox