From: "Shawn O. Pearce" <spearce@spearce•org>
To: EGit developer discussion <egit-dev@eclipse•org>
Cc: Marc Strapetz <marc.strapetz@syntevo•com>, git@vger•kernel.org
Subject: Re: [egit-dev] Re: jgit problems for file paths with non-ASCII characters
Date: Wed, 25 Nov 2009 16:54:23 -0800
Message-ID: <20091126005423.GM11919@spearce.org>
In-Reply-To: <200911252211.55137.robin.rosenberg@dewire.com>
Robin Rosenberg <robin.rosenberg@dewire•com> wrote:
> onsdag 25 november 2009 14:47:25 skrev Marc Strapetz:
> > I have noticed that jgit converts file paths to UTF-8 when querying the
> > repository.
...
> > Is this a bug or a misconfiguration of my repository? I'm using jgit
> > (commit e16af839e8a0cc01c52d3648d2d28e4cb915f80f) on Windows.
>
> A bug.
>
> The problem here is that we need to allow multiple encodings since there
> is no reliable encoding specified anywhere.
This is a design fault of both Linux and git. git gets a byte
sequence from readdir and stores that as-is into the repository.
We have no way of knowing what that encoding is. So now everyone
touching a Git repository is screwed.
> The approach I advocate is
> the one we use for handling encoding in general, i.e. if it looks like UTF-8,
> treat it as UTF-8, else fall back. This is expensive, however,
We should try to work harder with the git-core folks to get character
set encoding for file names worked out. We might be able to use a
configuration setting in the repository to tell us what the proper
encoding should be, and if not set, assume UTF-8.
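For illustration, a rough sketch of the "looks like UTF-8, else fall back" heuristic. It is not taken from jgit; the class name, the method name, and the choice of fallback charset are all hypothetical. The fallback could be fed from a repository configuration setting if one ever exists.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of jgit.
public class PathDecoder {
    /**
     * Decode raw path bytes as stored in a tree or index entry.
     * Strict UTF-8 is tried first; if the bytes are not valid UTF-8,
     * the caller-supplied fallback charset is used instead.
     */
    static String decodePath(byte[] raw, Charset fallback) {
        CharsetDecoder utf8 = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return utf8.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException notUtf8) {
            // Assumption: the fallback is whatever the platform or a
            // (so far nonexistent) config setting says it should be.
            return new String(raw, fallback);
        }
    }

    public static void main(String[] args) {
        String name = "\u00e4.txt"; // a-umlaut, written as an escape
        byte[] asUtf8 = name.getBytes(StandardCharsets.UTF_8);
        byte[] asLatin1 = name.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(name.equals(decodePath(asUtf8, StandardCharsets.ISO_8859_1)));
        System.out.println(name.equals(decodePath(asLatin1, StandardCharsets.ISO_8859_1)));
    }
}
```

The strict decoder is what makes this expensive: every path is validated on every read, and a byte sequence that happens to be valid UTF-8 is still taken as UTF-8 even when the author meant something else.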
> and then we have
> all the other issues with case-insensitive names and the funny property that
> Unicode has when it allows characters to be encoded using multiple sequences
> of code points, as employed by Apple.
But as you said, this still doesn't make the Apple normal form
any easier. Though if we know we are on such a strange filesystem
we might be able to assume the paths in the repository are equally
damaged. Or not.
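To make the normal form problem concrete, a small sketch using only the JDK's java.text.Normalizer. The class and method names are made up for the example, and comparing in NFC is just one possible choice, not something jgit is known to do.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of jgit.
public class PathNormalForm {
    /** Compare two path names after normalizing both to NFC. */
    static boolean samePath(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String composed = "\u00e4.txt";    // single code point U+00E4
        String decomposed = "a\u0308.txt"; // 'a' followed by combining U+0308
        System.out.println(composed.equals(decomposed));   // false
        System.out.println(samePath(composed, decomposed)); // true
    }
}
```

Both strings display as the same file name, yet they are different byte sequences once encoded, so a repository that stores one form will not match a filesystem that hands back the other.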
--
Shawn.