public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: "Joachim Schmitz" <jojo@schmitz-digital•de>
To: "'Junio C Hamano'" <gitster@pobox•com>
Cc: "'Johannes Sixt'" <j.sixt@viscovery•net>,
	"'Jan Engelhardt'" <jengelh@inai•de>, <git@vger•kernel.org>
Subject: RE: git on HP NonStop
Date: Wed, 22 Aug 2012 18:38:09 +0200	[thread overview]
Message-ID: <002b01cd8084$8459abc0$8d0d0340$@schmitz-digital.de> (raw)
In-Reply-To: <7vy5l9f186.fsf@alter.siamese.dyndns.org>



> -----Original Message-----
> From: Junio C Hamano [mailto:gitster@pobox•com]
> Sent: Tuesday, August 21, 2012 4:06 AM
> To: Joachim Schmitz
> Cc: 'Johannes Sixt'; 'Jan Engelhardt'; git@vger•kernel.org
> Subject: Re: git on HP NonStop
> 
> "Joachim Schmitz" <jojo@schmitz-digital•de> writes:
> 
> > OK, so let's have a look at code, current git, builtin/cat-file.c,
> > line 196:
> >         void *contents = contents;
> >
> > This variable is set later in an if branch (if (print_contents ==
> > BATCH), but not in the else branch. It is later used always under the
> > same condition as the one under which it is set.
> > Apparently is is malloc_d storage (there a "free(content);"), so
> > there's no harm al all in initializing it with NULL, even if it only
> > appeases a stupid compiler.
> 
> It actually is harmful.  See below.

Harmful to initialize with NULL or to use that undefined behavoir?

I checked what our compiler does here: after having warned about "vlues us
used before it is set: it actually dies seem to have initializes the value
to 0 resp. NULL.
So here there's no harm done in avoiding undefined behavior and set it to 0
resp NULL in the first place.

> > The next one, builtin/fast-export.c, line 486:
> >                 struct commit *commit = commit; it is set in a switch
> > statement, but not in every case, as far as I can see.
> > Hmm, maybe it is, and I just get lost in the code And it is used
> > directly after the switch, hopefully set to something reasonable.
> > Why take the risk and not set it to NULL?
> 
> Ditto.
> 
> > Next one, builtin/rev-list.c, line 390:
> >                 int reaches = reaches, all = all;
> >
> >                 revs.commits = find_bisection(revs.commits, &reaches,
&all,
> >                                               bisect_find_all);
> >
> > Seem pretty pointless to initialize them, provided find_bisection
> > doesn't read them. Does it?
> 
> That is why they are not initializations but marks to the compiler to
signal "you
> may be stupid enough to think they are used before initialized or
assigned, but
> that is not the case".  Initializing them would be pointless.
> 
> > Next one, fast-import.c, line 2268:
> >         struct object_entry *oe = oe;
> >
> > os gets set in en if and an else branch, but not in then intermediate
> > else if branch!
> 
> Look again.  If the recent code is too complex for you to understand, go
back to
> 10e8d68 (Correct compiler warnings in fast-import., 2007-02-06) and read
the
> function.
> 
> The control flow of the early part of that function dictates that either
oe is
> assigned *or* inline_data is set to 1.  When inline_data is false, oe is
always
> set.
> 
> The compiler was too stupid to read that, and that is why the
> (confusing) idiom to mark it for the stupid compiler was used.
> 
> There are a few reasons why I do not think this self-assignment idiom or
> initializing the variable to an innocuous-looking random value is a
particularly
> good thing to do when you see compiler warnings.
> 
> If the compiler suspects the variable might be unused, you should always
look
> at it and follow the flow yourself.  Once you know it is a false alarm,
you can
> use the idiom to squelch the warning, and it at the same serves as a note
to
> others that you verified the flow and made sure it is a false warning.
> 
> When the next person wants to touch the code, if the person knows the use
of
> the idiom, it only serves as a warning to be extra careful not to
introduce a new
> codepath that reads the variable without setting, as the compiler no
longer
> helps him.  If the person who touches the code is as clueless as the
compiler
> and cannot follow the codepath to see the variable is never used
uninitialized,
> the result will be a lot worse.
> 
> That is the reason why I do not think the idiom to squelch the compiler is
such a
> good thing.  Careless people touch the code, so "oe = oe" initialization
carefully
> placed in the original version does not necessarily stay as a useful
> documentation.
> 
> But if you use "oe = NULL" as a way to squelch the warning in the first
place, it
> is no better than "oe = oe".  In a sense, it is even worse, as it just
looks like any
> other initialization and gives a false impression that the remainder of
the code
> is written in such a way that it tolerates oe being NULL in any codepath,
or
> there is some assignment before that before the code reaches places where
oe
> cannot be NULL.  That is different from what "oe = oe" initializaion
documents--
> -in the codepath protected by "if (inline_data)", it isn't just "oe can
safely be
> NULL there"; instead it is "oe can safely be *any* value there, because we
don't
> use it".  Of course, if you explicitly initialized oe to NULL, even if you
introduce
> a codepath where oe cannot be NULL later, you won't get a warning from the
> compiler, so it is no better than "oe = oe".
> 
> And that is the reason why I do not think initialization to an
innocuous-looking
> random value (e.g. NULL) is a good answer, either.
> 
> When both are not good, replacing "oe = oe" with "oe = NULL" didn't make
> much sense, especially when the former _could_ be used better by more
> careful people.  And that is the resistance J6t remembers.
> 
> But when recent compilers started to warn "oe = oe" that itself is
undefined,
> the equation changed.  The idiom ceased to be a way to squelch the
incorrect
> compiler warning (which was the primary point of its use---the
documentation
> value is secondary, as what we document is "we are squelching a false
alarm",
> but we no longer are squelching anything).
> 
> See 4a5de8d (vcs-svn: avoid self-assignment in dummy initialization of
pre_off,
> 2012-06-01), and 58ebd98 (vcs-svn/svndiff.c: squelch false "unused"
warning
> from gcc, 2012-01-27) that it updated, for an example of how we are
migrating
> away from "oe = oe" idiom.
> 
> In any case, I already said initializing to 0 is OK in my previous
message, I think,
> so if you are seeing warning from the compiler on uninitialized variables
in
> your new code, you know you should
> 
>  (1) first follow the codepath to make sure it is a false alarm (if
>      it isn't, fix the code);
> 
>  (2) then see if you can simplify your code so that it is crystal
>      clear even to the stupid compiler that you are not using an
>      uninitialized variable; and
> 
>  (3) if you can't, then initialize it to some known bad value,
>      avoiding the "oe = oe" idiom.
> 
> Is there anything more to discuss on this topic?

Which of the ones I told you about should get fixed?

Bye, Jojo

  reply	other threads:[~2012-08-22 16:38 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <001101cd79f2$f21b3bd0$d651b370$@schmitz-digital.de>
     [not found] ` <7vr4r98rfd.fsf@alter.siamese.dyndns.org>
2012-08-14 15:52   ` git on HP NonStop Joachim Schmitz
2012-08-19 16:25     ` Jan Engelhardt
2012-08-20 10:36       ` Joachim Schmitz
2012-08-20 10:56         ` Johannes Sixt
2012-08-20 11:27           ` Joachim Schmitz
2012-08-20 16:29           ` Junio C Hamano
2012-08-20 20:51             ` Joachim Schmitz
2012-08-21  2:06               ` Junio C Hamano
2012-08-22 16:38                 ` Joachim Schmitz [this message]
2012-08-23  8:23                   ` Andreas Ericsson
2012-08-23  9:23                     ` Joachim Schmitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='002b01cd8084$8459abc0$8d0d0340$@schmitz-digital.de' \
    --to=jojo@schmitz-digital$(echo .)de \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=j.sixt@viscovery$(echo .)net \
    --cc=jengelh@inai$(echo .)de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox