Now that the code contains explicit markers for starting and ending an
initialization, and having the guarantee that an assertion fails
whenever some code accesses the state of the "current initialization"
even though there is no ongoing initialization gives me much more
confidence in the correctness of the code. The calls to
begin_initialization and end_initialization always appear in pairs,
enclosing the minimal amount of code necessary for initialization.
In a nutshell, global modifiable state is error-prone and hard to
understand.
A nice side effect is that the grammar no longer needs a special rule
for the outermost initializer since the functions for the debug logging
are now called explicitly.
The code that misuses the initialization state just because it needs to
temporarily store a sym_t somewhere is now clearly marked as such. A
GCC statement expression can appear anywhere and is therefore
independent of the initialization. Most probably the code can simply
refer to the local variable in the grammar rule itself, or this variable
needs to be encoded in the grammar %union. For sure there is a better
way to handle this.
There is no longer a need that the function 'declare' initializes the
initialization state, it was just the wrong place to do this.
This indirection will be needed to handle nested initializations, which
are a new feature of C99. These are currently not handled correctly,
see msg_171.c.
No functional change.
While here, reword the message, avoiding operators and parentheses.
Since 2021-01-02, providing the precise type name is as easy as the
broad type classification (just replace tspec_name with type_name), and
it's definitely more useful to the human readers.
When determining the reachability of a statement, the idea was that
whenever 'reached' was set to false, 'rchflg' (the abbreviation for "do
not warn about unreachable statements") would be reset as well.
In some (trivial) cases, this was done, but many more interesting cases
simply forgot to set this second variable. To prevent this in the
future, encapsulate this in a simple helper function.
Now even if a statement is reachable, 'rchflg' gets reset. This does
not hurt since as long as the current statement is reachable, the value
of 'rchflg' does not matter.
No functional change. There would be quite a big functional change
though if check_statement_reachable were to reset 'rchflg' instead of
'reached', as the comment already suggests. In that case, with the
current code, many legitimate warnings about unreachable statements
would be skipped, especially those involving 'if' statements, since
these didn't reset 'rchflg' properly before.
Previously, only loop statements were considered for reachability. This
ignored the possibility of an early return in an if statement, or
unreachable branches.
This makes it easy to click on the location in the IDE instead of having
to manually parse the location and navigate to it.
No functional change outside debug mode.
Several tokens can only ever map to a single operator and thus do not
need to encode the operator. Indeed, they already encoded it as NOOP,
and it was not used by any grammar rule.
No functional change.
It's enough to have modtab, which describes the properties of the
various operators. There is no need to have a second table imods that
holds the same content. Rather make modtab constant as well.
The only possible functional change is that the names of the internal
operators 'no-op', '++', '--', 'real', 'imag' and 'case' may appear in
diagnostics, where previously lint invoked undefined behavior by passing
a null pointer for a '%s' conversion specifier.
The abbreviations in the table of operator properties had been wrong
since ops.def 1.10 from 2021-01-12, when strict bool mode was added. In
an earlier working draft, I had named that column 'takes_others' instead
of 'requires_bool', that's where the 'o' came from.
The names of the macro arguments had been wrong since op.h 1.11 from
2021-01-09, when the order of the columns changed and the macros were
not adjusted accordingly. Since all the properties of the operator
table are uniform, this didn't result in any bugs, it was just confusing
for human readers.
Clang-tidy suggests to enclose the macro arguments in oper.c in
parentheses but that is not possible since the arguments are either
empty or 1, and the syntactical ambiguity of the '+ 0' being either a
unary or a binary operator is needed here.
No change to the resulting binary.
Since today, lint's strict bool mode requires initializers to have the
correct type. The flags in kwtab are of type bool and were initialized
with an int, for brevity. Keep the brevity and do the conversion from
int to bool in a macro.
By defining several macros for the different kinds of keywords, reduce
the clutter of having 2 additional zeroes per line. The macros also
remove the redundancy and thereby the possible inconsistency of filling
the wrong fields since these depend on the token type.
No functional change. The only change to the binary is due to the
changed line numbers in the calls to lint_assert.
In lint's strict bool mode, initialization must be of the correct type.
This affects the bool fields in ttab_t, which are initialized with int.
To keep the code brief, preserve these ints and let a macro do the
actual work of converting them to bool.
No change to the generated binary.