* imap-send badly handles commit bodies beginning with "From <"
@ 2011-10-28 18:00 Andrew Eikum
2011-10-28 20:32 ` Jeff King
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Eikum @ 2011-10-28 18:00 UTC (permalink / raw)
To: git
Ran into this today. I had a commit message that looked like:
---
Do something
>From <http://url>:
Words
---
I put it through imap-send to email it to my project, and ended up
with this output:
sending 1 messages
200% (2/1) done
On the server side, it was split into two mails on either side of that
commit message's From line with neither mail actually containing the
From line. To fix it, I just changed it to "Copied from <url>:" :-P
Ain't mbox grand?
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: imap-send badly handles commit bodies beginning with "From <" 2011-10-28 18:00 imap-send badly handles commit bodies beginning with "From <" Andrew Eikum @ 2011-10-28 20:32 ` Jeff King 2011-10-28 21:21 ` Andrew Eikum 2011-10-30 9:01 ` Magnus Bäck 0 siblings, 2 replies; 8+ messages in thread From: Jeff King @ 2011-10-28 20:32 UTC (permalink / raw) To: Andrew Eikum; +Cc: git On Fri, Oct 28, 2011 at 01:00:44PM -0500, Andrew Eikum wrote: > On the server side, it was split into two mails on either side of that > commit message's From line with neither mail actually containing the > From line. To fix it, I just changed it to "Copied from <url>:" :-P > > Ain't mbox grand? Mbox does have this problem, but I think in this case it is a particularly crappy implementation of mbox in imap-send. Look at imap-send.c:split_msg; it just looks for "From ". It should at least check for something that looks like a timestamp, like git-mailsplit does. Maybe mailsplit's is_from_line should be factored out so that it can be reused in imap-send. Want to work on a patch? -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: imap-send badly handles commit bodies beginning with "From <" 2011-10-28 20:32 ` Jeff King @ 2011-10-28 21:21 ` Andrew Eikum 2011-10-28 21:37 ` Jeff King 2011-10-30 9:01 ` Magnus Bäck 1 sibling, 1 reply; 8+ messages in thread From: Andrew Eikum @ 2011-10-28 21:21 UTC (permalink / raw) To: Jeff King; +Cc: Andrew Eikum, git On Fri, Oct 28, 2011 at 01:32:57PM -0700, Jeff King wrote: > Mbox does have this problem, but I think in this case it is a > particularly crappy implementation of mbox in imap-send. Look at > imap-send.c:split_msg; it just looks for "From ". > > It should at least check for something that looks like a timestamp, like > git-mailsplit does. Maybe mailsplit's is_from_line should be factored > out so that it can be reused in imap-send. Since we have a program called "mailsplit," wouldn't it make more sense to have imap-send use its implementation to split mail instead of sharing just the From line detection? > Want to work on a patch? I was hoping it'd be a quick matter of pulling mailsplit's implementation out of builtin and into the top level, but I see it's got some global variables that are tangled enough that I actually have to understand the code before I can pull it apart :) If no one beats me to it, I'll work on this next week. It's late on Friday and I'm moving house this weekend. Quick question, since I'm not intimately familiar with Git's code: I was thinking of creating a new compilation unit at the top level, mailutils.{c,h}, and referencing it from both imap-send.c and builtin/splitmail.c. Does that seem like the right approach? Is there an existing compilation unit I should be placing splitmail's guts into instead? Andrew ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: imap-send badly handles commit bodies beginning with "From <" 2011-10-28 21:21 ` Andrew Eikum @ 2011-10-28 21:37 ` Jeff King 0 siblings, 0 replies; 8+ messages in thread From: Jeff King @ 2011-10-28 21:37 UTC (permalink / raw) To: Andrew Eikum; +Cc: git On Fri, Oct 28, 2011 at 04:21:22PM -0500, Andrew Eikum wrote: > Since we have a program called "mailsplit," wouldn't it make more > sense to have imap-send use its implementation to split mail instead > of sharing just the From line detection? Potentially, yeah. I was thinking of just pulling over the from line detection (which is the real black magic bit), but it looks like imap-send's mbox handling could use some general attention (maybe it would be possible to not read the entire mbox into memory, for example). > I was hoping it'd be a quick matter of pulling mailsplit's > implementation out of builtin and into the top level, but I see it's > got some global variables that are tangled enough that I actually have > to understand the code before I can pull it apart :) > > If no one beats me to it, I'll work on this next week. It's late on > Friday and I'm moving house this weekend. No rush. Let us know if you have questions. > Quick question, since I'm not intimately familiar with Git's code: I > was thinking of creating a new compilation unit at the top level, > mailutils.{c,h}, and referencing it from both imap-send.c and > builtin/splitmail.c. Does that seem like the right approach? Is there > an existing compilation unit I should be placing splitmail's guts into > instead? Yes, I think a new file makes sense here. Make sure to update LIB_H and LIB_OBJS in the Makefile. -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: imap-send badly handles commit bodies beginning with "From <" 2011-10-28 20:32 ` Jeff King 2011-10-28 21:21 ` Andrew Eikum @ 2011-10-30 9:01 ` Magnus Bäck 2011-11-01 15:38 ` Jeff King 1 sibling, 1 reply; 8+ messages in thread From: Magnus Bäck @ 2011-10-30 9:01 UTC (permalink / raw) To: git; +Cc: Andrew Eikum, Jeff King On Friday, October 28, 2011 at 22:32 CEST, Jeff King <peff@peff•net> wrote: > On Fri, Oct 28, 2011 at 01:00:44PM -0500, Andrew Eikum wrote: > > > On the server side, it was split into two mails on either side > > of that commit message's From line with neither mail actually > > containing the From line. To fix it, I just changed it to "Copied > > from <url>:" :-P > > > > Ain't mbox grand? > > Mbox does have this problem, but I think in this case it is a > particularly crappy implementation of mbox in imap-send. Look at > imap-send.c:split_msg; it just looks for "From ". While there seems to be about a million different implementations of mbox creation and parsing, the relevant RFC[0] points to [1] as an authoritative source. The latter claims that lines matching "^From " denote a message boundary and that lines within a message that match the same pattern should be quoted with ">". That would suggest that the problem isn't imap-send.c but whatever code produces the mbox file in the first place. Of course, if that software isn't part of Git I guess we'll have to deal with the situation anyway. And whatever the RFCs say, we still need to be as compatible is possible with whatever software is out there. > It should at least check for something that looks like a timestamp, > like git-mailsplit does. Maybe mailsplit's is_from_line should be > factored out so that it can be reused in imap-send. I guess that's a reasonable "liberal in what you accept" mitigation. (As a sidenote, I'm getting the ">From" quoting in my maildir message files where no such quoting is expected, so "From" lines are shown as ">From" in my MUA. I don't know if it's Procmail screwing things up or what's going on.) [0] http://tools.ietf.org/html/rfc4155 [1] http://qmail.org./man/man5/mbox.html -- Magnus Bäck Opinions are my own and do not necessarily SW Configuration Manager represent the ones of my employer, etc. Sony Ericsson ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: imap-send badly handles commit bodies beginning with "From <" 2011-10-30 9:01 ` Magnus Bäck @ 2011-11-01 15:38 ` Jeff King 2011-11-01 16:06 ` Michael Haggerty 0 siblings, 1 reply; 8+ messages in thread From: Jeff King @ 2011-11-01 15:38 UTC (permalink / raw) To: Magnus Bäck; +Cc: git, Andrew Eikum On Sun, Oct 30, 2011 at 10:01:11AM +0100, Magnus Bäck wrote: > > Mbox does have this problem, but I think in this case it is a > > particularly crappy implementation of mbox in imap-send. Look at > > imap-send.c:split_msg; it just looks for "From ". > > While there seems to be about a million different implementations of > mbox creation and parsing, the relevant RFC[0] points to [1] as an > authoritative source. The latter claims that lines matching "^From " > denote a message boundary and that lines within a message that match > the same pattern should be quoted with ">". That would suggest that > the problem isn't imap-send.c but whatever code produces the mbox > file in the first place. Of course, if that software isn't part of > Git I guess we'll have to deal with the situation anyway. And whatever > the RFCs say, we still need to be as compatible is possible with > whatever software is out there. Right. If you properly quote and unquote "From " lines, then mbox can be unambiguous. But many pieces of software don't quote them (including git, I think, but I didn't check), so it's prudent when reading to look for something that actually appears to be a "From" line. If somebody wants to tackle >From quoting of commit messages in git-format-patch, they can certainly do so. In practice, it doesn't tend to come up (because sane readers expect there to be a date at the end of the line), so nobody has put forth the effort. -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: imap-send badly handles commit bodies beginning with "From <" 2011-11-01 15:38 ` Jeff King @ 2011-11-01 16:06 ` Michael Haggerty 2011-11-01 16:14 ` Jeff King 0 siblings, 1 reply; 8+ messages in thread From: Michael Haggerty @ 2011-11-01 16:06 UTC (permalink / raw) To: Jeff King; +Cc: Magnus Bäck, git, Andrew Eikum On 11/01/2011 04:38 PM, Jeff King wrote: > Right. If you properly quote and unquote "From " lines, then mbox can be > unambiguous. That is not quite true. The RFC says only that lines matching "^From " should be quoted, not lines matching "^>From " (or, generally, "^>*From "). So the quoting is lossy; it is *not* possible to tell whether a line starting with ">From " should be unquoted (it could have been ">From " in the original). Michael -- Michael Haggerty mhagger@alum•mit.edu http://softwareswirl.blogspot.com/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: imap-send badly handles commit bodies beginning with "From <" 2011-11-01 16:06 ` Michael Haggerty @ 2011-11-01 16:14 ` Jeff King 0 siblings, 0 replies; 8+ messages in thread From: Jeff King @ 2011-11-01 16:14 UTC (permalink / raw) To: Michael Haggerty; +Cc: Magnus Bäck, git, Andrew Eikum On Tue, Nov 01, 2011 at 05:06:48PM +0100, Michael Haggerty wrote: > On 11/01/2011 04:38 PM, Jeff King wrote: > > Right. If you properly quote and unquote "From " lines, then mbox can be > > unambiguous. > > That is not quite true. The RFC says only that lines matching "^From " > should be quoted, not lines matching "^>From " (or, generally, "^>*From > "). So the quoting is lossy; it is *not* possible to tell whether a > line starting with ">From " should be unquoted (it could have been > ">From " in the original). That was what I meant by "properly". Note that the second link Magnus mentioned (and which is referred to in the RFC in the paragraph immediately following the discussion of "from" quoting) discusses this explicitly. The real issue with mbox is not that it can't be done well, but that you have no clue which variant the writing end used. In practice, it works OK because it's simple and those corner cases just don't come up much (at least for a reasonably defensive reader). -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-11-01 16:14 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-10-28 18:00 imap-send badly handles commit bodies beginning with "From <" Andrew Eikum 2011-10-28 20:32 ` Jeff King 2011-10-28 21:21 ` Andrew Eikum 2011-10-28 21:37 ` Jeff King 2011-10-30 9:01 ` Magnus Bäck 2011-11-01 15:38 ` Jeff King 2011-11-01 16:06 ` Michael Haggerty 2011-11-01 16:14 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox