From: Jon Loeliger <jdl@jdl•com>
To: David Gibson <david@gibson•dropbear.id.au>
Cc: linuxppc-dev@ozlabs•org
Subject: Re: [PATCH 4/4] DTC: Begin the path to sane literals and expressions.
Date: Fri, 26 Oct 2007 08:07:49 -0500
Message-ID: <E1IlOuj-0000lw-7E@jdl.com>
In-Reply-To: Your message of "Fri, 26 Oct 2007 11:28:32 +1000." <20071026012832.GC457@localhost.localdomain>
So, like, the other day David Gibson mumbled:
>
> Ah... I think I see the source of our misunderstanding. Sorry if I
> was unclear. I'm not saying that the version token would be
> invisible to the parser, just that it would be recognized by the lexer
> first.
Ah! Right. OK, I see what you are saying now.
> The nice thing about having a token, is that if necessary we can
> completely change the grammar for each version, without having to have
> tangled rules that have to generate yyerror()s in some circumstances
> depending on the version variable. The alternate grammars can be
> encoded directly into the yacc rules:
> startsymbol : version0_file
>             | V1_TOKEN version1_file
>             | V2_TOKEN version2_file
>             ;
Hmmm... Now that I see that your symbol is still in the grammar,
I can see this part as well. OK. I'll buy it.
> > > I'm also inclined to leave the syntax for bytestrings as it is, in
> >
> > Why? Why not be allowed to form up a series of expressions
> > that make up a byte string? Am I missing something obvious here?
>
> Because part of the point of bytestrings is to provide representation
> for binary data. For a MAC address, say
> [0x00 0x0a 0xe4 0x2c 0x23 0x1f]
> is way bulkier than
> [000ae42c231f]
No, I think you misunderstand what I was after. I'm not after
the latter [000ae4...]. In that case, there would be multiple
expressions, each no bigger than 8 bits wide:
[ expr expr expr expr expr expr ]
[ 0x00 10 0x4 0x20+12 '0'+3 0x20 - 1 ]
or whatever seemed appropriate. It would not be one giant value.
> And in bytestring context, I suspect having every expression result be
> truncated to bytesize will be way more of a gotcha than in cell
> context.
Which is why we run semantic checking as well and warn on
values not fitting in container sizes.
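A minimal sketch of such a container-size check, for illustration only.
The names fits_in_bits() and check_container() are hypothetical and are
not taken from the DTC sources; values are treated as unsigned, and
signed handling is left out.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: 1 if value fits in an unsigned field of `bits` bits. */
int fits_in_bits(uint64_t value, unsigned bits)
{
	if (bits >= 64)
		return 1;	/* a uint64_t always fits in 64 bits */
	return (value >> bits) == 0;
}

/*
 * Hypothetical semantic check: warn on stderr and return 0 when storing
 * `value` in a `bits`-wide container would truncate it; return 1 otherwise.
 */
int check_container(uint64_t value, unsigned bits, const char *what)
{
	if (fits_in_bits(value, bits))
		return 1;
	fprintf(stderr, "Warning: %s value 0x%llx does not fit in %u bits\n",
		what, (unsigned long long)value, bits);
	return 0;
}
```

With this, each byte expression such as 0x20+12 or '0'+3 passes an
8-bit check, while something like 0x100 in a bytestring would draw
a warning.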
> I suspect we can get the expression flexibility we want here by
> providing the right operators to act *on* bytestrings, rather than
> within bytestrings.
That too. No problem. I suspect some may be functional, though.
Haven't thought about that a bunch yet. I just want to get
basic stuff in first.
> Hrm. I think just exprval or intval would be better. Actually
> probably intval, since last we spoke I thought we were planning on
> having expressions of string and bytestring types as well.
Except I think we want something more general than that.
> Incidentally, there's another problem here: we haven't solved the
> problem about having to allow property names with initial digits.
I know.
> That's a particular problem here, because although we can make
> literals scanned in preference to propnames of the same length, in
> this case
> 0x1234..0xabcd
> Will be scanned as one huge propname.
I know. White space is mandatory right now.
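To illustrate why the white space matters, here is a rough sketch of
the longest-match rule a lex-style scanner applies. The property-name
character class below is an assumption (an approximation, not copied
from the DTC lexer), and all function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed propname character class; the real lexer's class may differ. */
int is_propchar(int c)
{
	return c != '\0' &&
	       (isalnum((unsigned char)c) || strchr(",._+*#?-", c) != NULL);
}

/* Length of the longest run of propname characters starting at s. */
size_t propname_len(const char *s)
{
	size_t n = 0;

	while (is_propchar((unsigned char)s[n]))
		n++;
	return n;
}

/* Length of a 0x-prefixed hex literal starting at s, or 0 if there is none. */
size_t hexlit_len(const char *s)
{
	size_t n = 2;

	if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X'))
		return 0;
	while (isxdigit((unsigned char)s[n]))
		n++;
	return n > 2 ? n : 0;
}

/* Longest match wins; the literal is preferred only when lengths tie. */
int scans_as_literal(const char *s)
{
	size_t lit = hexlit_len(s);

	return lit > 0 && lit >= propname_len(s);
}
```

Under these assumptions "0x1234..0xabcd" gives a 14-character propname
match against a 6-character literal match, so the propname wins. With
spaces around the "..", both matches are 6 characters long and the
literal is chosen.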
> This might work for you at the moment, if you've still got all the
> lexer states, but I was really hoping we could ditch most of them with
> the new literals.
Which is really why they are all still there. Longer term,
I want to _quit_ supporting "version 0" and remove the cruft...
> But you haven't actually addressed my concern about this. Actually
> it's worse than I said then, because
> <0x10000000 -999>
> is ambiguous. Is it a single subtraction expression, or one literal
> cell followed by an expression cell with a unary '-'?
Gah.
Paren'ed expressions may be the thing to do.
How do you feel about comma separation?
Anyone else care to chime in?
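For concreteness, the two readings of <0x10000000 -999> produce
different cell contents. A small sketch in 32-bit cell arithmetic
(function names are hypothetical, for illustration only):

```c
#include <stdint.h>

/* Reading 1: a single cell holding the binary subtraction a - b (mod 2^32). */
uint32_t as_subtraction(uint32_t a, uint32_t b)
{
	return a - b;
}

/* Reading 2: the second of two cells, holding unary minus of b (mod 2^32). */
uint32_t as_unary_minus(uint32_t b)
{
	return (uint32_t)0 - b;
}
```

The first reading yields one cell, 0x0ffffc19. The second yields two
cells, 0x10000000 followed by 0xfffffc19. Parentheses or a separator
would let the author say which one is meant.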
> > > > +unsigned int dts_version = 0;
> Yeah, I figured this out after. Youch, an even tighter and harder to
> follow coupling between lexer and parser execution order. I can think
> of at least two better ways to do this.
I'm listening... :-)
> 1) handle d# b# etc. at the lexer level, with a regex like
> (d#{WS}*[0-9]+). Strictly speaking that changes the language, but I
> don't think anyone's been insane enough to do something like "d#
> /*stupid comment*/ 999". That would remove the whole ugly
> opt_cell_base tangle from the grammar.
That seems like it could work...
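A rough sketch of what the lexer action for such a token might do,
assuming the whole "d#<ws>digits" text arrives as one string. The
function name is hypothetical, only the d# and b# prefixes named above
are handled, and overflow checking is omitted.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical: convert a token shaped like (d#{WS}*[0-9]+) or its b#
 * counterpart. Returns 0 and stores the value in *out on success,
 * -1 if the text does not have that shape.
 */
int parse_based_literal(const char *tok, unsigned long *out)
{
	int base;
	char *end;

	if (tok[0] == 'd')
		base = 10;
	else if (tok[0] == 'b')
		base = 2;
	else
		return -1;
	if (tok[1] != '#')
		return -1;
	tok += 2;

	/* Optional white space between the prefix and the digits. */
	while (isspace((unsigned char)*tok))
		tok++;
	if (!isdigit((unsigned char)*tok))
		return -1;

	*out = strtoul(tok, &end, base);
	/* Anything left over (e.g. a '2' in a binary literal) is an error. */
	return *end == '\0' ? 0 : -1;
}
```

Note that "d# /*stupid comment*/ 999" is rejected here, which is the
language change mentioned above.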
> 2) Have the lexer just pass up literals as strings, and let the parser
> do the conversion to integer, based on the grammatical context. I
> think this is preferable because it has other advantages: we can do
> the distinction between 64-bit values for memreserve and 32-bit values
> for cell at the grammatical level. It can also be used to handle the
> propname/literal ambiguity without lexer states (I had a patch a while
> back which removed the MEMRESERVE and CELLDATA lex states using this
> technique).
I'm not so keen on that approach, I don't think.
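For comparison, a minimal sketch of what that second option could look
like: the lexer hands up the literal's text, and the parser action
converts it once the context (and so the width) is known. The function
name is hypothetical, and using strtoull() with base 0 (so a leading 0
means octal) is an assumption, not DTC's actual convention.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical: convert literal text to an integer of at most `bits`
 * bits, e.g. 64 in memreserve context and 32 in cell context.
 * Returns 0 on success, -1 on malformed text or a value that is too wide.
 */
int literal_to_int(const char *text, unsigned bits, uint64_t *out)
{
	char *end;
	unsigned long long v;

	/* Reject signs and leading white space, which strtoull() would accept. */
	if (!isdigit((unsigned char)text[0]))
		return -1;

	errno = 0;
	v = strtoull(text, &end, 0);
	if (errno != 0 || end == text || *end != '\0')
		return -1;
	if (bits < 64 && (v >> bits) != 0)
		return -1;

	*out = v;
	return 0;
}
```

The same text "0x100000000" is then accepted with bits = 64 and
rejected with bits = 32, with no lexer state involved.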
> > The same call to set_dts_version() as any other case.
>
> Erm... which same call to set_dts_version()? Surely not the one in
> the parser..
I'm clearly not understanding your point, I'm afraid. There are
static default values here:
	/*
	 * DTS sourcefile version.
	 */
	unsigned int dts_version = 0;
	unsigned int expr_default_base = 10;
And there is a call to set_dts_version() made when any DTS file
is parsed, which happens before any -O option is even handled.
What am I missing?
jdl