From: Phillip Wood <phillip.wood123@gmail•com>
To: Ezekiel Newren <ezekielnewren@gmail•com>, phillip.wood@dunelm•org.uk
Cc: Ezekiel Newren via GitGitGadget <gitgitgadget@gmail•com>,
git@vger•kernel.org,
Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail•com>,
Patrick Steinhardt <ps@pks•im>,
Chris Torek <chris.torek@gmail•com>
Subject: Re: [PATCH v2 01/10] doc: define unambiguous type mappings across C and Rust
Date: Sun, 9 Nov 2025 14:14:11 +0000
Message-ID: <fa95b29a-077c-4df5-9c59-34e0c1447e70@gmail.com>
In-Reply-To: <CAH=ZcbA25eyMhQpvK7eh=ydZkg5RdzbdRFEdj-22T+d1VuTazA@mail.gmail.com>
On 06/11/2025 22:52, Ezekiel Newren wrote:
> On Thu, Nov 6, 2025 at 2:55 AM Phillip Wood <phillip.wood123@gmail•com> wrote:
>> On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
>>> From: Ezekiel Newren <ezekielnewren@gmail•com>
>>>
>>> Document other nuances with crossing the FFI boundary. Other language
>>> mappings may be added in the future.
>>
>> Thanks for adding this, I've left a few comments below. Overall I
>> thought it was very well written.
>
> Thanks.
>
> I felt it was necessary since C vs Rust types keep coming up over and
> over again. I'm flexible with the wording of this document. I was just
> trying to convey a firm and clear stance on what is and isn't proper
> in Git.

That will definitely be useful as we add more Rust code. In the future
we may want to add a summary of which types to use to
Documentation/CodingGuidelines, but that doesn't need to be done in this
series.

>> I tried building an html version of
>> this but even after adding it to the list of TECH_DOCS in
>> Documentation/Makefile with
>>
>> diff --git a/Documentation/Makefile b/Documentation/Makefile
>> index 47208269a2e..2699f0b24af 100644
>> --- a/Documentation/Makefile
>> +++ b/Documentation/Makefile
>> @@ -143,6 +143,7 @@ TECH_DOCS += technical/shallow
>> TECH_DOCS += technical/sparse-checkout
>> TECH_DOCS += technical/sparse-index
>> TECH_DOCS += technical/trivial-merge
>> +TECH_DOCS += technical/unambiguous-types
>> TECH_DOCS += technical/unit-tests
>> SP_ARTICLES += $(TECH_DOCS)
>> SP_ARTICLES += technical/api-index
>>
>> it fails with
>>
>> $ make -C Documentation/ technical/unambiguous-types.html
>> Merge branch
>> 'ps/object-source-loose' into seen
>> make: Entering directory '/home/phil/src/git/Documentation'
>> GEN asciidoc.conf
>> * new asciidoc flags
>> ASCIIDOC technical/unambiguous-types.html
>> asciidoc: ERROR: unambiguous-types.adoc: line 139: undefined filter
>> attribute in command: source-highlight --gen-version -f xhtml -s
>> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
>> {args=}
>> asciidoc: ERROR: unambiguous-types.adoc: line 162: undefined filter
>> attribute in command: source-highlight --gen-version -f xhtml -s
>> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
>> {args=}
>> asciidoc: ERROR: unambiguous-types.adoc: line 177: undefined filter
>> attribute in command: source-highlight --gen-version -f xhtml -s
>> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
>> {args=}
>> asciidoc: ERROR: unambiguous-types.adoc: line 187: undefined filter
>> attribute in command: source-highlight --gen-version -f xhtml -s
>> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
>> {args=}
>> asciidoc: ERROR: unambiguous-types.adoc: line 199: undefined filter
>> attribute in command: source-highlight --gen-version -f xhtml -s
>> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
>> {args=}
>> asciidoc: ERROR: unambiguous-types.adoc: line 213: undefined filter
>> attribute in command: source-highlight --gen-version -f xhtml -s
>> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
>> {args=}
>> asciidoc: ERROR: unambiguous-types.adoc: line 224: undefined filter
>> attribute in command: source-highlight --gen-version -f xhtml -s
>> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
>> {args=}
>> make: *** [Makefile:396: technical/unambiguous-types.html] Error 1
>> make: *** Deleting file 'technical/unambiguous-types.html'
>> make: Leaving directory '/home/phil/src/git/Documentation'
>
> I've never created documentation for Git before, so this helps. I'll
> incorporate your suggestions.

We should also add this file to Documentation/technical/meson.build. It
seems the errors above are due to some incompatibility between
asciidoc and asciidoctor, as I just tried running

    make -C Documentation/ USE_ASCIIDOCTOR=1 technical/unambiguous-types.html

and it worked just fine. I'm afraid I don't know enough asciidoc to make
any helpful suggestions on how to fix it.

>>> +== Character types
>>> +
>>> +This is where C and Rust don't have a clean one-to-one mapping. A C `char` is
>>> +an 8-bit type that is signless (neither signed nor unsigned)
>>
>> I found this a bit confusing. Isn't the signedness of "char"
>> implementation defined rather than it being "signless"?
>>
>>> which causes
>>> +problems with e.g. `make DEVELOPER=1`.
>>
>> I'm not sure what this is referring to - maybe -Wsign-compare?
>
> When I build Git with `make DEVELOPER=1` and I compare uint8_t with
> char it complains about a difference in signedness. When I compare
> int8_t with char it also complains about a difference in signedness.
> So it is implementation defined, but it's also neither signed nor
> unsigned according to DEVELOPER=1 since it complains either way.

Oh, I see - this is saying that mixing "char" and "uint8_t" causes
problems. I agree; perhaps we could expand this slightly to mention
comparison with uint8_t to make it clearer.
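
The same ambiguity is visible from the Rust side. As a minimal sketch,
assuming only the standard library: `std::os::raw::c_char` is an alias
for `i8` on some targets and for `u8` on others, so code that handles
bytes would convert to `u8` at the boundary. The helper names
`byte_from_c_char` and `c_char_is_signed` are hypothetical and exist
only for this illustration.

```rust
use std::os::raw::c_char;

/// Reinterpret one C `char` as the byte it occupies in memory.
/// The cast keeps the bit pattern whether `c_char` is `i8` or `u8`.
pub fn byte_from_c_char(c: c_char) -> u8 {
    c as u8
}

/// Report whether `c_char` is a signed type on the target this is
/// compiled for (hypothetical helper, for illustration only).
pub fn c_char_is_signed() -> bool {
    // `MIN` is 0 for `u8` and -128 for `i8`.
    c_char::MIN != 0
}
```

Converting through `byte_from_c_char` gives the same `u8` value on
every target, which is the property the document is after when it asks
for `u8`/`uint8_t` for bytes in memory.
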
>>> Rust's `char` type is an unsigned 32-bit
>>> +integer that is used to describe Unicode code points. Even though a C `char`
>>> +is the same width as `u8`, `char` should be converted to u8 where it is
>>> +describing bytes in memory.
>>
>> I'm dreading the point where we start sharing "struct strbuf" with rust
>> and have to change the "buf" member from "char*" to "uint8_t*". While it
>> is not used in the xdiff code it is ubiquitous everywhere else and there
>> are lots of places where we pass the "buf" member to functions expecting
>> a "char*".
>>
>> git grep -E '(\.|->)buf\W'
>>
>> has over 4000 matches
>
> This is why I started in Xdiff since its code is mostly isolated.

Good plan!

> I
> think that we might have to bite the bullet and deal with the ugly
> mapping of char on the C side and u8 on the Rust side when dealing
> with strbuf. Maybe as we translate more of C into Rust someone will
> have a better suggestion. I think my ivec type would be better since
> strbuf is almost a special case of my ivec type, but dealing with
> strbuf is outside the scope of this patch series.

Yes, hopefully it will become clearer what the least painful route
forward is as we get more experience with Rust <=> C interop.

>>> +While you could specify `char` in the C code and `u8` in Rust code, it's not as
>>> +clear what the appropriate type is, but it would work across the FFI boundary.
>>> +However the bigger problem comes from code generation tools like cbindgen and
>>> +bindgen. When cbindgen sees u8 in Rust it will generate uint8_t on the C side
>>> +which will cause "differ in signedness" warnings/errors. Similarly if bindgen
>>> +sees `char` on the C side it will generate `std::ffi::c_char` which has its own
>>> +problems.
>>
>> Yeah, we definitely don't want to be using "std::ffi::c_char" in our
>> rust implementations. I do wonder if we might want to use it (or CStr)
>> judiciously in function parameters and immediately convert it to u8 in
>> the function body where the function is called from C though.
>
> That's basically the design pattern I've been using.
>
> In many of my translations from C to Rust I create a Rust stub
> function that takes pointer types and wraps them into safe types which
> then get handed off to a safe Rust function. I think that in the cases
> where CString/CStr is required the Rust stub function would create a
> &[u8] slice for the safe function to operate on.

That sounds like a good pattern - we get a nice interface for the C code
and the Rust implementation uses the idiomatic Rust types.
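
A minimal sketch of that pattern, under stated assumptions: the
function names `example_count_newlines` and `count_newlines` are
hypothetical and are not taken from the series, and the stub takes an
explicit length rather than a NUL-terminated string. A real export to C
would also need a `no_mangle` attribute, which is left out here and
only mentioned in a comment.

```rust
use std::os::raw::c_char;
use std::slice;

/// Safe implementation: it only ever sees idiomatic Rust types.
fn count_newlines(buf: &[u8]) -> usize {
    buf.iter().filter(|&&b| b == b'\n').count()
}

/// Stub intended to be called from C (hypothetical name). It accepts
/// the C-facing pointer type and immediately converts it to `&[u8]`.
/// A real export would additionally carry a `no_mangle` attribute.
///
/// # Safety
/// `ptr` must be null, or point to `len` readable bytes that stay
/// valid and unmodified for the duration of the call.
pub unsafe extern "C" fn example_count_newlines(ptr: *const c_char, len: usize) -> usize {
    if ptr.is_null() {
        return 0;
    }
    // SAFETY: the caller guarantees `len` readable bytes at `ptr`;
    // `c_char` and `u8` have the same size and alignment.
    let buf = unsafe { slice::from_raw_parts(ptr.cast::<u8>(), len) };
    count_newlines(buf)
}
```

Only the stub contains `unsafe`; everything past the conversion works
on `&[u8]`, so the C-facing character type never reaches the safe code.
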
Thanks

Phillip

>>> +=== Notes
>>> +^1^ This is only true if stdbool.h (or equivalent) is used. +
>>> +^2^ C does not enforce IEEE-754 compatibility, but Rust expects it. If the
>>> +platform/arch for C does not follow IEEE-754 then this equivalence does not
>>> +hold. Also, it's assumed that `float` is 32 bits and `double` is 64, but
>>> +there may be a strange platform/arch where even this isn't true. +
>>> +^3^ C also defines uintptr_t, but this should not be used in Git. +
>>> +^4^ C also defines ssize_t and intptr_t, but these should not be used in Git. +
>>
>> [u]intptr_t and ssize_t are used in git already. As Junio has pointed
>> out there are sane uses for these types but we don't want to use them in
>> structs or function parameters where the struct or function is shared
>> with rust.
>
> You're right, I should update the phrasing. Something like: "These
> types shouldn't be used if their explicit purpose is for FFI. Whether
> as a field in a struct or part of a function signature." I'll update
> the wording.
>
>>> +
>>> +== Problems with std::ffi::c_* types in Rust
>>> +TL;DR: They're not guaranteed to match C types for all possible C
>>> +compilers/platforms/architectures.
>>
>> Is this official policy of the rust project?
>
> No, this is a personal inference based on logical deduction. The c_*
> definitions have changed over time with new Rust version releases, and
> Git targets more platforms/architectures than what Rust officially
> supports. While it's not guaranteed to fail anywhere, it's also not
> guaranteed to work everywhere. On top of that, we're targeting 1.63.0,
> whose c_* definitions differ from those in 1.89.0, which I show an
> example of with c_long_definition. Can anyone say with
> certainty that Rust got these mappings right or wrong for all possible
> C compilers/architectures/platforms? If so (which I highly doubt)
> could someone provide a link?
>