public inbox for git@vger.kernel.org 
From: Junio C Hamano <gitster@pobox•com>
To: Karsten Blees <karsten.blees@gmail•com>
Cc: Git List <git@vger•kernel.org>
Subject: Re: [PATCH] Documentation/i18n.txt: clarify character encoding support
Date: Sun, 14 Jun 2015 17:12:10 -0700
Message-ID: <xmqqmw01ltid.fsf@gitster.dls.corp.google.com>
In-Reply-To: <557C9161.6020703@gmail.com> (Karsten Blees's message of "Sat, 13 Jun 2015 22:24:01 +0200")

Karsten Blees <karsten.blees@gmail•com> writes:

> diff --git a/Documentation/i18n.txt b/Documentation/i18n.txt
> index e9a1d5d..e5f6233 100644
> --- a/Documentation/i18n.txt
> +++ b/Documentation/i18n.txt
> @@ -1,18 +1,28 @@
> -At the core level, Git is character encoding agnostic.
> -
> - - The pathnames recorded in the index and in the tree objects
> -   are treated as uninterpreted sequences of non-NUL bytes.
> -   What readdir(2) returns are what are recorded and compared
> -   with the data Git keeps track of, which in turn are expected
> -   to be what lstat(2) and creat(2) accepts.  There is no such
> -   thing as pathname encoding translation.
> +Git is to some extent character encoding agnostic.

I do not think the removal of the text makes much sense here unless
you add the equivalent to the new text below.

>   - The contents of the blob objects are uninterpreted sequences
>     of bytes.  There is no encoding translation at the core
>     level.
>  
> - - The commit log messages are uninterpreted sequences of non-NUL
> -   bytes.
> + - Pathnames are encoded in UTF-8 normalization form C. This

That is true only on some systems like OSX (with HFS+) and Windows,
no?  BSDs in general and Linux do not do any such mangling IIRC.  I
am OK with mangling described as a notable oddball to warn users,
though; i.e. not as a norm as your new text suggests but as an
exception.
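
To illustrate why normalization matters on systems that do not mangle
pathnames, here is a minimal Python sketch. The file name is made up for
illustration; the point is only that the NFC and NFD spellings of the same
visible name are different byte sequences, so anything that treats
pathnames as uninterpreted bytes sees two distinct names.

```python
import unicodedata

# Hypothetical file name containing U+00E9 (e with acute accent).
name = "caf\u00e9.txt"

# Precomposed form (NFC): a single code point for the accented letter.
nfc = unicodedata.normalize("NFC", name).encode("utf-8")

# Decomposed form (NFD): plain "e" followed by combining acute U+0301.
nfd = unicodedata.normalize("NFD", name).encode("utf-8")

print(nfc)         # b'caf\xc3\xa9.txt'
print(nfd)         # b'cafe\xcc\x81.txt'
print(nfc == nfd)  # False: a byte-wise comparison sees two different names
```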

> +   platforms. If file system APIs don't use UTF-8 (which may be
> +   file system specific), it is recommended to stick to pure
> +   ASCII file names.

Hmph, who endorsed such a recommendation?  It is recommended to
stick to whatever naming scheme would not cause trouble for
project participants.  If your participants all want to (and can)
use ISO-8859-1, we do not discourage them from doing so.
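
As a rough sketch of the ISO-8859-1 case (Python, with a made-up file
name and a hypothetical helper is_valid_utf8): such a name is a
perfectly good sequence of non-NUL bytes even though it is not valid
UTF-8, which is why a project whose participants all agree on that
encoding can use it.

```python
# Hypothetical file name; the encodings are the only thing being compared.
name = "caf\u00e9.txt"

latin1 = name.encode("iso-8859-1")  # b'caf\xe9.txt'
utf8 = name.encode("utf-8")         # b'caf\xc3\xa9.txt'


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (illustrative helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(utf8))    # True
print(is_valid_utf8(latin1))  # False: 0xE9 is not followed by continuation bytes
print(b"\0" in latin1)        # False: still a sequence of non-NUL bytes
```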
