From: Kyle Moffett <mrlinuxman@mac•com>
To: "Karl Hasselström" <kha@treskal•com>
Cc: Petr Baudis <pasky@suse•cz>, Erik Mouw <mouw@nl•linux.org>,
git@vger•kernel.org
Subject: Re: Git string manipulation functions wrong?
Date: Tue, 22 May 2007 23:22:35 -0400 [thread overview]
Message-ID: <20070522232235.1cecb880.mrlinuxman@mac.com> (raw)
In-Reply-To: <20070521145925.GA6474@diana.vm.bytemark.co.uk>
On Mon, 21 May 2007 16:59:25 +0200 Karl Hasselström <kha@treskal•com> wrote:
> On 2007-05-21 16:36:16 +0200, Petr Baudis wrote:
> > It's the opposite for me - we don't properly set the NUL byte for
> > smoe of our strncpy() calls, but I don't really see his problem with
> > snprintf(), we seem to handle its return value correctly everywhere
> > (except diff.c, but there the buffer sizes should be designed in
> > such a way that an overflow should be impossible).
>
> I think this kind of detailed case-by-case analysis defeats Timo's
> point, though: that the C library functions make it too easy to write
> bugs. If it's necessary to do non-trivial bounds checking etc. at
> every call site, it doesn't really matter if we currently do get them
> all right; at some point, we _are_ going to miss one. Instead of using
> our collective C-fu to get difficult calls right, we should be using
> it to construct string routines that have low enough overhead that
> it's lost in the noise, and are dead simple to use (and, of course,
> that can be cleanly bypassed in the 1% of cases where it's necessary).
That would be mostly true, except for the fact that without snprintf()
returning how many bytes _would_ have been written, it's much harder to
reliably allocate buffers for the result on the first pass. For
example, this is a trivial implementation of an function which returns
a freshly-allocated formatted string:
char *data;
unsigned long len;
len = snprintf(NULL, 0, some_fmt, arg1, arg2, arg3);
if (!len)
return NULL;
data = malloc(len+1);
if (!data)
return NULL;
data[len] = '\0';
snprintf(data, len, some_fmt, arg1, arg2, arg3);
return data;
You can't do that without a loop if it returns how many bytes were
actually written (although some braindead platforms do that already).
Here's a function which handles both use-cases in an optimal way:
char *data = NULL;
unsigned long datalen = 0, len;
do {
len = snprintf(data, datalen, some_fmt, arg1, arg2, arg3);
if (!datalen) {
datalen = len ? len : 16;
data = malloc(datalen);
if (!data)
return NULL;
} else if (len >= datalen) {
void *newmem;
datalen = (len > datalen)?(len + 1):(datalen +16);
newmem = realloc(data, datalen);
if (!newmem) {
free(data);
return NULL
}
}
} while (len >= datalen);
data[len] = '\0';
return data;
Hopefully, on a nice modern platform, the first iteration will have len
equal to the ideal actual required length and so it will hit the first
case and carefully allocate exactly enough bytes, then on the second
loop through it will fill in exactly the required bytes and return
success. On one of the abovementioned dain-bramaged systems, this will
loop until snprintf doesn't use all the space in the buffer,
incrementing by some fixed value each time (in this implementation,
16). It should be obvious that correctly-implemented systems will be
significantly more performant than ones without the useful "feature" of
POSIX-compliance. :-D
Cheers,
Kyle Moffett
prev parent reply other threads:[~2007-05-23 3:38 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-05-21 13:11 Git string manipulation functions wrong? Erik Mouw
2007-05-21 14:36 ` Petr Baudis
2007-05-21 14:59 ` Karl Hasselström
2007-05-23 3:22 ` Kyle Moffett [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070522232235.1cecb880.mrlinuxman@mac.com \
--to=mrlinuxman@mac$(echo .)com \
--cc=git@vger$(echo .)kernel.org \
--cc=kha@treskal$(echo .)com \
--cc=mouw@nl$(echo .)linux.org \
--cc=pasky@suse$(echo .)cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox