On 2025-10-07 at 12:36:31, Patrick Steinhardt wrote: > +/// # Safety > +/// > +/// The provided buffer must be large enough to store the encoded varint. Callers may either provide > +/// a `[u8; 16]` here, which is guaranteed to satisfy all encodable numbers. Or they can call this > +/// function with a `NULL` pointer first to figure out array size. > #[no_mangle] > pub unsafe extern "C" fn encode_varint(value: u64, buf: *mut u8) -> u8 { > let mut varint: [u8; 16] = [0; 16]; I'm planning to do something a little different with this code by refactoring it out into a Rust function, so at that point it will no longer be possible to provide a buffer smaller than 16 bytes. Note that all callers of this function pass a 16-byte buffer, so that should be safe. That doesn't mean that you can't send this patch (and I think your patch is good), just that we shouldn't tell people we can use a buffer smaller than 16 bytes, since that will at some point no longer be true. Here's the current version of the patch I'm planning on sending for reference. I can rebase onto your series once Junio picks it up. -- >% -- From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 From: "brian m. carlson" Date: Wed, 8 Oct 2025 00:27:56 +0000 Subject: [PATCH] varint: write a safe Rust version of encode_varint Our original version of encode_varint in Rust used pointers much like the C version did. However, if we end up using this function elsewhere in Rust, it would be better to have a safe version that we can use. In addition, writing our unsafe C-compatible version in terms of a safe Rust version makes it obvious what our requirements are. For instance, we do not need buf to actually point anywhere and can accept a null pointer if we just want the length, and we can clearly indicate that we require 16 bytes worth of memory to encode data by creating an appropriate slice. All of our existing callers always pass a 16-byte buffer, so we can safely assume that. We can then improve our Rust version by performing normal bounds checking to make sure that we don't exceed the buffer size and use the standard usize return for lengths, converting as necessary in the C-compatible caller. Move the C-compatible code to a mod c to keep things tidy and allow us to have a different Rust version. Signed-off-by: brian m. carlson --- src/varint.rs | 34 ++++++++++++++++++++++++++++------ 1 file changed, 28 insertions(+), 6 deletions(-) diff --git a/src/varint.rs b/src/varint.rs index 6e610bdd8e..83990afe7a 100644 --- a/src/varint.rs +++ b/src/varint.rs @@ -22,8 +22,7 @@ pub unsafe extern "C" fn decode_varint(bufp: *mut *const u8) -> u64 { val } -#[no_mangle] -pub unsafe extern "C" fn encode_varint(value: u64, buf: *mut u8) -> u8 { +pub fn encode_varint(value: u64, buf: Option<&mut [u8]>) -> usize { let mut varint: [u8; 16] = [0; 16]; let mut pos = varint.len() - 1; @@ -37,16 +36,19 @@ pub unsafe extern "C" fn encode_varint(value: u64, buf: *mut u8) -> u8 { value >>= 7; } - if !buf.is_null() { - std::ptr::copy_nonoverlapping(varint.as_ptr().add(pos), buf, varint.len() - pos); + let len = varint.len() - pos; + + if let Some(buf) = buf { + buf[0..len].copy_from_slice(&varint[pos..pos + len]); } - (varint.len() - pos) as u8 + len } #[cfg(test)] mod tests { - use super::*; + use super::c::encode_varint; + use super::decode_varint; #[test] fn test_decode_varint() { @@ -90,3 +92,23 @@ mod tests { } } } + +mod c { + /// Encode `value` into `buf` as a variable-length integer unless `buf` is null. + /// + /// Returns the number of bytes written, or, if `buf` is null, the number of bytes that would be + /// used to encode the integer. + /// + /// # Safety + /// + /// `buf` must either be null or point to at least 16 bytes of memory. + #[no_mangle] + pub unsafe extern "C" fn encode_varint(value: u64, buf: *mut u8) -> u8 { + let buffer = if buf.is_null() { + None + } else { + Some(std::slice::from_raw_parts_mut(buf, 16)) + }; + super::encode_varint(value, buffer) as u8 + } +} -- >% -- -- brian m. carlson (they/them) Toronto, Ontario, CA