public inbox for git@vger.kernel.org 
From: "Torsten Bögershausen" <tboegi@web•de>
To: "Jakub Narębski" <jnareb@gmail•com>, git@vger•kernel.org
Subject: Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
Date: Fri, 31 Mar 2017 14:38:05 +0200
Message-ID: <264c72d0-9558-fa0d-e5ee-eaca894538be@web.de>
In-Reply-To: <feaeade7-aeb5-fa67-ab29-9106aeadb2a6@gmail.com>

On 30.03.17 21:35, Jakub Narębski wrote:
> Hello,
> 
> Recently I had to work on a project which uses a legacy 8-bit encoding
> (namely cp1250) instead of utf-8 for its text files (LaTeX
> documents).  My terminal, that is Git Bash from Git for Windows, is set
> up for utf-8.
> 
> I wanted "git diff" and friends to return something sane on said
> utf-8 terminal, instead of mojibake.  There is the 'encoding'
> gitattribute... but it works only for the GUI ('git gui', that is).
> 
> Therefore I have (ab)used the textconv facility to convert from the
> cp1250 encoding of the files to the utf-8 encoding of the console.
> 
> I have set the following in .gitattributes file:
> 
>   ## LaTeX documents in cp1250 encoding
>   *.tex text diff=mylatex
> 
> The 'mylatex' driver is defined as:
> 
>   [diff "mylatex"]
>         xfuncname = "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$"
>         wordRegex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{}[:space:]]+"
>         textconv  = \"C:/Program Files/Git/usr/bin/iconv.exe\" -f cp1250 -t utf-8
>         cachetextconv = true
> 
> And everything would be all right... if not for the fact that Git appends
> a spurious ^M to added lines in the `git diff` output.  The files use the
> CRLF end-of-line convention (the native MS Windows one).
> 
>   $ git diff test.tex
>   diff --git a/test.tex b/test.tex
>   index 029646e..250ab16 100644
>   --- a/test.tex
>   +++ b/test.tex
>   @@ -1,4 +1,4 @@
>   -\documentclass{article}
>   +\documentclass{mwart}^M
>   
>    \usepackage[cp1250]{inputenc}
>    \usepackage{polski}
> 
> What gives?  Why is this ^M tacked on the end of added lines,
> while it is not present in deleted lines, nor in context lines?
> 
> Puzzled.
> 
> P.S. Git has `i18n.commitEncoding` and `i18n.logOutputEncoding`; it is a
> pity that it doesn't support the `encoding` attribute in core, together
> with having `i18n.outputEncoding`.
> --
> Jakub Narębski
> 
> 
Is there a chance you could give us a recipe to reproduce it?
A complete test script, perhaps?
(I don't want to speculate whether the invocation of iconv is the problem,
 e.g. its stdout not being in "binary mode", or whatever this is called under Windows.)
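Something along these lines would do as a starting point. It is only a
sketch under assumptions: the driver name "cp1250", the commit identity and
the temporary directory are placeholders, the xfuncname/wordRegex settings
from the report are left out, and core.autocrlf is left at whatever the
installation defaults to. Whether it actually shows the ^M has not been
verified here.

```shell
#!/bin/sh
# Sketch of a reproduction recipe. All names (driver "cp1250", the tester
# identity, the temp directory) are placeholders, not taken from the report.

# Write a small LaTeX file with explicit CRLF line endings, so the result
# does not depend on any text-mode translation done by the platform.
write_tex() {   # usage: write_tex <documentclass> <file>
    printf '\\documentclass{%s}\r\n\r\n\\usepackage[cp1250]{inputenc}\r\n\\usepackage{polski}\r\n' \
        "$1" >"$2"
}

# Count the CR bytes in a file, to check what is really on disk.
count_cr() {    # usage: count_cr <file>
    tr -cd '\r' <"$1" | wc -c | tr -d ' '
}

# Build a throwaway repository with an iconv textconv driver and show the
# diff with control characters made visible (cat -v prints CR as ^M).
# Runs in a subshell so the caller's working directory is untouched.
repro() (
    set -e
    dir=$(mktemp -d)
    cd "$dir"
    git init -q .
    git config diff.cp1250.textconv "iconv -f cp1250 -t utf-8"
    echo '*.tex text diff=cp1250' >.gitattributes
    write_tex article test.tex
    git add .gitattributes test.tex
    git -c user.name=tester -c user.email=tester@example.invalid \
        commit -q -m initial
    write_tex mwart test.tex
    git --no-pager diff test.tex | cat -v
)

# Only touch git when explicitly asked to: sh repro.sh run
if [ "${1:-}" = run ]; then
    repro
fi
```

Running it as "sh repro.sh run" in Git Bash, once with and once without the
textconv line, should help tell apart whether the ^M comes from iconv's
output or from somewhere else.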




