public inbox for git@vger.kernel.org 
From: "Torsten Bögershausen" <tboegi@web•de>
To: Jeff King <peff@peff•net>, Junio C Hamano <gitster@pobox•com>
Cc: "Johannes Sixt" <j6t@kdbg•org>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx•de>,
	git@vger•kernel.org, "René Scharfe" <l.s.r@web•de>
Subject: Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers
Date: Fri, 15 Jul 2016 15:37:32 +0200
Message-ID: <5b99a4bb-9b8e-e8c6-e214-e041209cb6e6@web.de>
In-Reply-To: <20160714223843.GA22196@sigill.intra.peff.net>



On 07/15/2016 12:38 AM, Jeff King wrote:
> On Thu, Jul 14, 2016 at 03:30:58PM -0700, Junio C Hamano wrote:
>
>>> If we move to time_t everywhere, I think we'll need an extra
>>> TIME_T_IS_64BIT, but we can cross that bridge when we come to it.
>>>
>>> Likewise I think we'll need SIZE_T_IS_64BIT eventually (for real 32-bit
>>> systems; LLP64 systems like Windows will then be able to run the test).
>>
>> I guess I wrote essentially the same thing before refreshing my
>> Inbox.
>>
>> I am a bit fuzzy between off_t and size_t; the former is for the
>> size of things you see on the filesystem, while the latter is for
>> you to give malloc(3).  I would have thought that off_t is the type
>> we would want at the end of the raw object header, denoting the size
>> of a blob object when deflated, which could be larger than the size
>> of a region of memory we can get from malloc(3), in which case we
>> would use the streaming interface.
>
> Yeah, your understanding is right (s/deflated/inflated/). I agree that
> off_t is probably a better size for blobs. Traditionally git assumed any
> object could fit in memory. The streaming interface helps that somewhat,
> but I think there are cases where we still must load a blob (e.g., if it
> is stored as a delta). In theory that never happens because of
> core.bigfilethreshold, but you may get a packfile from somebody with a
> higher threshold than you.
>
> I wouldn't be surprised if there are other cases that aren't smart
> enough to use the streaming interface yet, but the solution there is to
> make them smarter. :)
>
> So off_t is probably better. We do need to be careful, though, when
> allocating objects. E.g., this:
>
>   off_t size;
>   struct git_istream *stream;
>   void *buf;
>
>   stream = open_istream(sha1, &type, &size, NULL);
>   buf = xmalloc(size);
>   while (1) {
> 	/* read stream into buf */
>   }
>
> is a security hole when size_t is less than off_t (it gets truncated in
> the call to xmalloc, which allocates too few bytes). This is a toy
> example, obviously, but it's something to watch out for.
>
> -Peff
That code is "illegal", it should be
  buf = xmalloc(xsize_t(size));

And the transition from off_t to size_t
should always go via xsize_t():

static inline size_t xsize_t(off_t len)
{
	if (len > (size_t) len)
		die("Cannot handle files this big");
	return (size_t)len;
}
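
As a rough illustration of why the check matters, here is a minimal
sketch that simulates a platform where size_t is narrower than off_t.
The names demo_off_t, demo_size_t, demo_truncate() and
demo_fits_size_t() are hypothetical stand-ins chosen so the effect is
visible on a 64-bit machine too; they are not part of git. Unlike
xsize_t(), the checked helper reports failure instead of calling die().

```c
#include <stdint.h>

/* Hypothetical stand-ins: a 64-bit "off_t" and a 32-bit "size_t". */
typedef int64_t  demo_off_t;
typedef uint32_t demo_size_t;

/* Unchecked conversion: silently keeps only the low 32 bits. */
static demo_size_t demo_truncate(demo_off_t len)
{
	return (demo_size_t)len;
}

/*
 * Checked conversion in the spirit of xsize_t(): returns 1 and stores
 * the converted value if len is representable, 0 otherwise.
 */
static int demo_fits_size_t(demo_off_t len, demo_size_t *out)
{
	if (len < 0 || (demo_off_t)(demo_size_t)len != len)
		return 0;
	*out = (demo_size_t)len;
	return 1;
}
```

With these definitions, a length of 4 GiB + 16 bytes passed to
demo_truncate() comes back as 16, which is the "allocates too few
bytes" case from the quoted example, while demo_fits_size_t() rejects
the same value.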

There are some more things to be done in the long run:
- convert "unsigned long" into either off_t or size_t in e.g. convert.c
- Use the streaming interface to analyze if blobs are binary
   (That is already on my list, the old "stream and early out"
   from the olc 10/10, gmane/$293010 or so can be reused)
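
The "stream and early out" idea could look roughly like the sketch
below. It is only an assumption about the general shape, not the code
from the referenced patch: the demo_* names are hypothetical, and
"contains a NUL byte" is used here as a simplified stand-in for
whatever binary heuristic the real code applies.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical streaming detector state; not a git data structure. */
struct demo_bin_state {
	int is_binary;
};

/*
 * Feed one chunk. Returns 1 once the content is known to be binary,
 * so the caller can stop reading; returns 0 to ask for more data.
 */
static int demo_feed_chunk(struct demo_bin_state *st,
			   const char *buf, size_t len)
{
	if (!st->is_binary && len > 0 && memchr(buf, '\0', len))
		st->is_binary = 1;
	return st->is_binary;
}

/*
 * Scan a sequence of chunks and stop at the first binary verdict.
 * Returns the number of chunks that were actually examined.
 */
static size_t demo_scan(struct demo_bin_state *st,
			const char *const *chunks,
			const size_t *lens, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (demo_feed_chunk(st, chunks[i], lens[i]))
			return i + 1;
	return nr;
}
```

The point of the shape is that no chunk after the first binary one is
ever touched, so a large blob never has to be held in memory as a
whole just to classify it.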
