public inbox for git@vger.kernel.org 
From: Junio C Hamano <gitster@pobox•com>
To: Karsten Blees <karsten.blees@gmail•com>
Cc: Git List <git@vger•kernel.org>
Subject: Re: [PATCH] Documentation/i18n.txt: clarify character encoding support
Date: Wed, 17 Jun 2015 13:45:42 -0700
Message-ID: <xmqqr3pa5aix.fsf@gitster.dls.corp.google.com>
In-Reply-To: <557EA421.5050706@gmail.com> (Karsten Blees's message of "Mon, 15 Jun 2015 12:08:33 +0200")

Karsten Blees <karsten.blees@gmail•com> writes:

>> I do not think the removal of the text makes much sense here unless
>> you add the equivalent to the new text below.
>> 
>>>   - The contents of the blob objects are uninterpreted sequences
>>>     of bytes.  There is no encoding translation at the core
>>>     level.
>>>  
>>> - - The commit log messages are uninterpreted sequences of non-NUL
>>> -   bytes.
>>> + - Pathnames are encoded in UTF-8 normalization form C. This
>> 
>> That is true only on some systems like OSX (with HFS+) and Windows,
>> no?  BSDs in general and Linux do not do any such mangling IIRC.
>
> Modern Unices don't need any such mangling because UTF-8 NFC should
> be the default system encoding. I'm not sure for BSDs, but it has
> been the default on all major Linux distros for more than 10 years.

So?  All major distros do not have to worry (and do not even need to
know).  As I said,...

>> I
>> am OK with mangling described as a notable oddball to warn users,
>> though; i.e. not as a norm as your new text suggests but as an
>> exception.

... I am OK to describe "pathnames are mangled into UTF-8 NFC on
certain filesystems" as a warning.  I am OK if we encourage the use
of UTF-8, especially if a project wants to be forward looking
(i.e. it may currently be a monoculture but may become cross
platform in the future).  I just do not want to see us saying "you
*must* encode your path in UTF-8 NFC".
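
A minimal sketch of what "mangled into UTF-8 NFC" means in practice,
assuming Python's standard unicodedata module. The file name is made
up for illustration, and the decomposed spelling stands in for what a
filesystem such as HFS+ may hand back:

```python
import unicodedata

# Hypothetical file name with each "é" spelled as "e" + U+0301
# (combining acute accent), i.e. in decomposed form.
decomposed = "re\u0301sume\u0301.txt"

# NFC composes each base letter and accent into one code point (U+00E9).
composed = unicodedata.normalize("NFC", decomposed)

nfd_bytes = decomposed.encode("utf-8")
nfc_bytes = composed.encode("utf-8")

# Both spellings render identically, but the byte sequences differ,
# so a byte-wise comparison treats them as two different pathnames.
same_name_bytewise = nfd_bytes == nfc_bytes
```

On a system that does no such mangling, both byte sequences are
simply distinct, valid pathnames.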

> ISO-8859-x file names may be fine if you won't ever need to:
> - use git-web, JGit, gitk, git-gui...
> - exchange repos with "normal" (UTF-8) Unices, Mac and Windows systems
> - publish your work on a git hosting service (and expect file and
>   ref names to show up correctly in the web interface)
> - store the repo on Unicode-based file systems (JFS, Joliet, UDF,
>   exFat, NTFS, HFS, CIFS...)

Yes, that is exactly what I said, isn't it?  "Use whatever works for
your project, we do not dictate."
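
The interoperability concern in the quoted list can be sketched as
follows, assuming Python and a made-up file name. The helper
is_valid_utf8 is hypothetical, not anything git provides:

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# Hypothetical file name containing "ï" (U+00EF).
name = "na\u00efve.txt"

latin1_bytes = name.encode("iso-8859-1")  # "ï" becomes the single byte 0xEF
utf8_bytes = name.encode("utf-8")         # "ï" becomes the two bytes 0xC3 0xAF

# A tool that assumes UTF-8 pathnames can read the second form
# but not the first.
latin1_ok = is_valid_utf8(latin1_bytes)
utf8_ok = is_valid_utf8(utf8_bytes)
```

Both byte strings are acceptable pathnames to git itself; only tools
that insist on UTF-8 reject the first.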

> These restrictions are not that obvious when you start a new git
> project,...

Or any project for that matter, not limited to "git project", no?
Perhaps that is a moot point by now, as everything in the world
seems to be a "git project" these days.

Thread overview: 6+ messages
2015-06-13 20:24 [PATCH] Documentation/i18n.txt: clarify character encoding support Karsten Blees
2015-06-15  0:12 ` Junio C Hamano
2015-06-15 10:08   ` Karsten Blees
2015-06-17 20:45     ` Junio C Hamano [this message]
2015-07-01 19:10       ` [PATCH v2] " Karsten Blees
2015-07-02  5:25         ` Torsten Bögershausen
