C99 is too far away from traditional C to make this warning useful.
There are 3 different situations in which this warning is generated:
For '1 << (unsigned char)1', the result type is 'unsigned int' in
traditional C. The result type is unsigned because at least 1 of the
operands is unsigned, and it is 'unsigned int' because the usual
arithmetic promotions are applied.
For '1 >> (long)1', as well as for '1 << (long)1', the result type is
'long' in traditional C since the usual arithmetic promotions are
applied.
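For comparison, since C90 the result type of a shift is the type of the
promoted left operand alone, so all three expressions above have type
'int' there. A minimal sketch that checks this with C11 _Generic; the
function names are made up for this illustration:

```c
/* Each function returns 1 if the shift expression has type 'int'. */

int
shl_uchar_is_int(void)
{
	/* Only the promoted left operand '1' determines the type. */
	return _Generic(1 << (unsigned char)1, int: 1, default: 0);
}

int
shr_long_is_int(void)
{
	/* The 'long' right operand does not widen the result. */
	return _Generic(1 >> (long)1, int: 1, default: 0);
}

int
shl_long_is_int(void)
{
	return _Generic(1 << (long)1, int: 1, default: 0);
}
```

The controlling expression of _Generic is not evaluated, only its type
is inspected.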
Omitting this warning in C99 mode reduces the number of lint warnings in
a typical NetBSD release build by approximately 6800 of 107000 total.
The previous message 'shift greater than size of object' was too short
to give reasonable hints, especially when the expressions involve
typedefs or macros.
The previous "table" was an insult to any reader. It was unsorted,
listed the functions shuffled, and was not even formatted consistently.
No functional change.
The argument to most of the functions from <ctype.h> "shall either be
representable as an 'unsigned char' or shall equal the value of the
macro EOF".
When confronted with the infamous warning 'array subscript has type
char', there are enough programmers who don't know the background of
that warning and thus fix it in a wrong way. Neither GCC nor Clang
explains its warning in a way that reaches these programmers.
Both GCC and Clang warn about 'array subscript has type char', but they
ignore the other requirements of the <ctype.h> functions, even though
these are in the C standard library.
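A minimal sketch of the usual correct fix, converting the argument to
'unsigned char' before the call. The function count_spaces is made up
for this illustration:

```c
#include <ctype.h>
#include <stddef.h>

/* Count the whitespace characters in a string. */
size_t
count_spaces(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++) {
		/*
		 * On platforms where plain 'char' is signed, *s may be
		 * negative.  Casting to 'int' would silence the warning
		 * but still pass a negative value, so convert to
		 * 'unsigned char' instead.
		 */
		if (isspace((unsigned char)*s))
			n++;
	}
	return n;
}
```

With the cast, every argument is representable as an 'unsigned char',
which is what the quoted requirement asks for.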
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94182
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95177
https://stackoverflow.com/a/60696378
were merged...
http://www.nerv.org/netbsd/?q=id:20200620T075016Z.3584036ccf31f69ee76ea4a02e9be30ff081df21
> Fix false positive for mvscanw tests on big endian machines.
>
> When conversion specifier is not a derivative form of "%s", retrieve
> input as 32bit integer, and then convert to string literal. Then we
> can avoid interpretation from ASCII code to integer, which is
> apparently byte-order depended.
This allows lint to process lib/libc/gen/sysctl.c 1.38 from 2021-03-30,
as well as its predecessor 1.37, which had a workaround just for lint.
While unusual, C99 allows these.
This check is not strictly necessary since any C99 compiler must
diagnose them as well; it is rather meant for demonstrating how to do
the check in lint, and for symmetry with the 'unknown member' error
message. These provide insight into how the data structures in init.c
are meant to be accessed.
The previous implementation had a wrong model of how initialization
happens in C99. Its assertions failed in all kinds of edge cases, and it
was not possible to fix the remaining bugs one at a time without running
into even more obscure assertion failures.
The debug logging was detailed but did not help to clarify the
situation. After about 20 failed attempts at fixing the small details I
decided to start all over and rewrite the initialization code from
scratch. I left the low-level parts of handling designators, the code
that is independent of brace_level and the high-level parts of how the
parser calls into this module. Everything else is completely new.
The concept of a brace level stays since that is how C99 describes
initialization. The previous code could not handle multi-level
designations (see d_init_pop_member.c). There are no more assertion
failures in the initialization code.
Some TODO comments have been left in the tests to keep the line numbers
the same in this commit. These will be cleaned up in a follow-up
commit.
The new implementation does not handle initialization with "missing"
braces. This is an edge case that both GCC and Clang warn about, so it
is not widely used. If necessary, it may be added later.
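For readers unfamiliar with the term, a small sketch of what "missing"
braces means. The type and variable names are made up for this
illustration:

```c
struct point {
	int x;
	int y;
};

/* Fully braced: one inner pair of braces per struct point. */
static const struct point braced[2] = { { 1, 2 }, { 3, 4 } };

/*
 * "Missing" braces: the inner braces are left out, and the values are
 * assigned to the members in declaration order.
 */
static const struct point flat[2] = { 1, 2, 3, 4 };

/* Return 1 if both arrays were initialized to the same values. */
int
same_contents(void)
{
	int i;

	for (i = 0; i < 2; i++)
		if (braced[i].x != flat[i].x || braced[i].y != flat[i].y)
			return 0;
	return 1;
}
```

Both forms are valid C and initialize the array to the same values.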
The new implementation does not use any global variables in the vast
majority of the functions, to make all dependencies and possible
modifications obvious.
I had not expected to trigger another assertion; I just wanted to make
sure my latest ongoing refactoring will not break this case. Apparently
there is no need to worry about that.
In my not yet published rewrite of lint's init.c, I forgot to copy the
array type. Guard against this bug, which would have been hard to find.
Given that in C, the declaration 'int a[], b[]' creates two different
type objects anyway, it's not easy to come up with a test case that
actually triggers this possible bug. I'm not sure whether this test
indeed catches this bug. If not, I'll add another test.
The 'cnt = level->bl_type->t_tspec == STRUCT ? 2 : 1;' in
initialization_push_struct_or_union is obviously wrong since not every
struct has exactly 1 remaining member after the first member that has an
initializer with designation.
This bug started its life in init.c 1.12 from 2002-10-21, a little over
18 years ago.
This removes 7 wrong warnings when running lint in -t mode.
Surprisingly, this added a warning that had not been there before in
msg_189.c. This is because check_variable_usage skips the checks when
an error occurred before. All diagnostics that happened were warnings,
but the -w option treats them as errors, see vwarning.
The following code is valid:
int valid = {{{ 3 }}};
C90 3.5.7 and C99 6.7.8 both say that the "initializer for a scalar
shall be a single expression, optionally enclosed in braces". They
don't put any upper bound on the number of braces, not even in the
"Translation limits" section.
Using only parts of the test name files in t_integration.sh made it
unnecessarily difficult to find a test based on its filename. The tests
for the individual messages already have a different prefix.
No functional change.
Before: error: expected undefined [99]
After: error: 'expected' undefined [99]
Seen in external/mpl/bind, which for Clang defines in stdatomic.h:
> #define atomic_exchange_explicit(obj, desired, order) \
> __c11_atomic_exchange_explicit(obj, expected, order)
Note the mismatch between 'desired' and 'expected'.
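A reduced sketch of this kind of macro typo, using made-up macros rather
than the real ones from stdatomic.h. The body refers to a name that is
not a parameter, so it picks up whatever happens to be in scope at the
place where the macro is used:

```c
/* Hypothetical macro with the typo: 'expected' is not a parameter. */
#define STORE_BUGGY(obj, desired) (*(obj) = (expected))

/* The intended definition. */
#define STORE_FIXED(obj, desired) (*(obj) = (desired))

int
store_buggy(void)
{
	int expected = 7;	/* accidentally captured by the macro body */
	int value = 0;

	STORE_BUGGY(&value, 42);
	return value;
}

int
store_fixed(void)
{
	int value = 0;

	STORE_FIXED(&value, 42);
	return value;
}
```

Without a variable named 'expected' in scope, using STORE_BUGGY results
in an "undefined" error like the one shown above.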
From the previous commit, there was an off-by-one error left, which was
due to the interaction between designation_add_subscript and
extend_if_array_of_unknown_size.
The other crucial point was to call initstack_pop_nobrace before
accessing the "current initialization stack element". Without this
call, in msg_168.c the "current element" would point to the initializer
level for 'const char *' instead of the one for 'array of const char *'.
One more step towards supporting C99.
Initialization is still buggy but better than before. The remaining bug
is that only the first designator determines the array size, and after
that, the array is no longer considered of unknown size. This
contradicts C99. More improvements to come.
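A small sketch of the C99 rule, with made-up names. The size of the
array is determined by the largest index that gets an initializer, not
by the first designator:

```c
#include <stddef.h>

/*
 * The first designator alone would suggest 3 elements.  The largest
 * initialized index is 6 (the 60 follows index 5), so the array has
 * 7 elements.
 */
static const int sparse[] = { [2] = 20, [5] = 50, 60 };

size_t
sparse_len(void)
{
	return sizeof(sparse) / sizeof(sparse[0]);
}
```

Elements without an explicit initializer, such as sparse[0], are
initialized to 0.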
This has been a long-standing limitation of lint. Now it is almost
ready for C99, see the list of "major changes" in the foreword of C99.
One known remaining bug in the area of initialization is designators
with several levels, such as '.member[2].member.member'. Oh, and
designators for arrays are only supported in the parser but not in the
type checker. There's still some work to do.
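A sketch of such a multi-level designator. The struct and member names
are made up for this illustration:

```c
struct inner {
	int value;
};

struct middle {
	struct inner in;
};

struct outer {
	struct middle items[3];
	int tail;
};

static const struct outer example = {
	/* One designator that descends through three levels. */
	.items[2].in.value = 42,
	/* A designator at the outermost level again. */
	.tail = 7,
};
```

All members that are not mentioned, such as example.items[0].in.value,
are initialized to 0.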
Missing braces after 'if', since init.c 1.68 from 2021-02-20.
GCC 10 doesn't complain about this even with -Wmisleading-indentation
since at least one of the involved lines is a macro invocation (in this
case both lines). GCC 11 will warn about this.
Clang warns about this, but the regular Clang build currently fails for
other reasons, so this problem didn't show up there either.
The errors in lines 74 and 75 of the test are wrong. Everything is fine
there. The bug lies in init_array_using_string. Try to see if you can
spot it; neither GCC 9.3.0 nor Clang 8.0.1 could.
The code now contains explicit markers for starting and ending an
initialization, and an assertion is guaranteed to fail whenever some
code accesses the state of the "current initialization" even though
there is no ongoing initialization. This gives me much more confidence
in the correctness of the code. The calls to
begin_initialization and end_initialization always appear in pairs,
enclosing the minimal amount of code necessary for initialization.
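The idea can be sketched as follows. This is not the code from init.c;
the type and all function names are hypothetical, chosen only to show
the begin/end pairing and the guarding assertion:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-initialization state. */
struct init_sketch {
	int brace_depth;
};

/* NULL whenever no initialization is in progress. */
static struct init_sketch *current_init;

void
sketch_begin(struct init_sketch *in)
{
	assert(current_init == NULL);	/* no begin without a prior end */
	in->brace_depth = 0;
	current_init = in;
}

void
sketch_end(void)
{
	assert(current_init != NULL);	/* must be paired with a begin */
	current_init = NULL;
}

/* All accesses go through this function, which guards the state. */
static struct init_sketch *
sketch_current(void)
{
	assert(current_init != NULL);
	return current_init;
}

void
sketch_open_brace(void)
{
	sketch_current()->brace_depth++;
}

/* Begin and end enclose only the code that needs the state. */
int
sketch_demo(void)
{
	struct init_sketch in;
	int depth;

	sketch_begin(&in);
	sketch_open_brace();
	sketch_open_brace();
	depth = sketch_current()->brace_depth;
	sketch_end();
	return depth;
}
```

Calling sketch_open_brace outside of a begin/end pair would trigger the
assertion in sketch_current instead of silently modifying stale state.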
In a nutshell, global modifiable state is error-prone and hard to
understand.
A nice side effect is that the grammar no longer needs a special rule
for the outermost initializer since the functions for the debug logging
are now called explicitly.
The code that misuses the initialization state just because it needs to
temporarily store a sym_t somewhere is now clearly marked as such. A
GCC statement expression can appear anywhere and is therefore
independent of the initialization. Most probably the code can simply
refer to the local variable in the grammar rule itself, or this variable
needs to be encoded in the grammar %union. For sure there is a better
way to handle this.
The function 'declare' no longer needs to initialize the
initialization state; it was just the wrong place to do this.
When a pointer to a compound literal is used as an initializer, lint
reports a wrong type mismatch. The details of what happens are now
documented, which allows this problem to be fixed properly.
While here, reword the message, avoiding operators and parentheses.
Since 2021-01-02, providing the precise type name is as easy as the
broad type classification (just replace tspec_name with type_name), and
it's definitely more useful to the human readers.
Previously, only loop statements were considered for reachability. This
ignored the possibility of an early return in an if statement, or
unreachable branches.
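Two made-up examples of the situations mentioned above. They only
illustrate the control flow; which statements lint actually flags is
not claimed here:

```c
int
sign_nonzero(int x)
{
	if (x < 0)
		return -1;
	else
		return 1;
	return 0;	/* not reachable: both branches of the 'if' return */
}

int
after_dead_branch(void)
{
	if (0)
		return 1;	/* not reachable: the condition is always false */
	return 2;
}
```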
C99 6.7.8p11 says for initialization that "the same type constraints and
conversions as for simple assignments apply", so actually apply them.
(I had just forgotten this "operator" when I first implemented strict
bool mode.)
The new code may not be the most beautiful, but it fixes all bugs that
occurred while testing message 327. The grammar rules are taken from
C99 6.8.2, so it's no surprise they work well.
The '%prec T_COMMA' is necessary to avoid lots of parse errors in the
lint1 unit tests. Curiously, further down in the grammar, for compound
literals, the '%prec T_COMMA' is not necessary, even though the context
looks very similar.
No functional change.
In d_c99_init.c, the initialization of array_with_designator failed.
The designator '.member' from that initialization was not cleaned up
before starting the next initialization.
The manual page says that the default maximum length of a comment line
is 78. The test 'comments.0' wrongly assumed that this 78 would refer
to the maximum _column_ allowed, which is off by one.
Fix the wording in the test 'comments.0' and remove the (now satisfied)
expectation comments in the test 'token-comment.0'.
Several other tests just happened to hit that limit, fix these as well.
Previously, the '/*' in the string literal had been interpreted as the
beginning of a comment, which was wrong. Because of that, the variable
declaration in the following line was still interpreted as part of the
comment. The comment even continued until the end of the file.
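A reduced example of such input, with made-up variable names:

```c
/*
 * Inside a string literal, the characters '/' and '*' are ordinary
 * data and do not start a comment.
 */
static const char comment_start[] = "/*";

/* This declaration is therefore real code, not part of a comment. */
static const int after_literal = 1;
```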
Due to indent's forgiving nature, it neither complained nor even
mentioned that anything had gone wrong. The decision to produce wrong
output rather than fail early is a dangerous one.
At the very least, there should have been an error message that at the
end of the file, the parser was still in a comment, expecting the
closing '*/'.
In process_preprocessing, the variable 'quote' is not used, which makes
the code suspicious of not handling the combination of string literals
and comments properly.
The basic idea of indent is to split the input into tokens and then
reassemble them, reformatting them on the way. These tokens determine
how the output is formatted, therefore add tests for each of the
terminal tokens and nonterminal parser symbols, to cover more common
cases, and edge cases as well.
This check was added too hastily and broke the lint build. Among others,
lib/libpuffs has -w included in LINTFLAGS, which means that the build
can fail even for new warnings, not only for errors.
libpuffs compares a uint16_t with constants from an unnamed enum type.
Since the enum type is completely unnamed (neither a tag nor a typedef),
there is no way to define a struct member having this type. This was a
scenario that I just didn't consider when I added the check to lint.
For now, disable the new check completely. The previously existing lint
checks stay enabled, including the one that warns about mismatched
anonymous enum types in the '==' operator, which is very similar to the
now disabled check.
Since unnamed enum types cannot be used in type casts, there is no
sensible way that this type mismatch could be resolved, without changing
the definition of the enum type itself, but that may be in a
non-modifiable header.
Therefore, comparisons with enum constants of unnamed types cannot be
sensibly warned about.
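A reduced sketch of the pattern, with made-up names rather than the
actual code from libpuffs:

```c
#include <stdint.h>

/* An enum with neither a tag nor a typedef: the type has no name. */
enum {
	KIND_FILE,
	KIND_DIR
};

struct node {
	/* There is no way to declare this member with the enum type. */
	uint16_t kind;
};

int
is_dir(const struct node *n)
{
	/* A cast to the enum type is impossible as well. */
	return n->kind == KIND_DIR;
}
```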
When indent runs in filter mode, it may output messages to stderr.
Allow tests with non-empty expected stderr.
In the ATF output, the filename 'output_file.parsed' was not helpful for
casual readers of diff output since they expect the filenames to be
meaningful. Embed the name of the test case in that filename.
Fix quoting of the shell variables.
Remove the repetition of the regular expression to clean up the test
files.
before: array of unsigned int[4]
now: array[4] of unsigned int
Listing the array dimension first keeps it in contact with the keyword
'array'. This reduces confusion, especially for nested arrays.
Discovering one bug in 17 command line options is acceptable. This
compensates for the bad first impression I got from the previous batch of
tests, which consisted of 11 tests and found 8 bugs.
Given that indent "has even more switches than ls(1)", there are far too
few tests. To make it easier to add meaningful tests for each of the
options, add the templates for the tests right now, ready to be filled
in.
This is something that neither GCC 10 nor Clang 8 does, even though it
seems useful. Lint didn't do it up to now, but that was probably an
oversight since it is easy to miss the implicit '==' operator in the
switch statement.
Neither lint nor GCC 10 nor Clang 8 has a warning for an enum type
mismatch in a switch statement.
GCC 10 issues a warning but completely misses the point of the
mismatched enum types. It only warns because in this test, EVENING has
the numeric value 3, which is out of bounds for enum color, where the
valid range is from 0 to 2. It says:
> msg_130.c:45:2: warning:
> case value ‘3’ not in enumerated type ‘enum color’ [-Wswitch]
Clang 8 behaves almost the same, it just doesn't mention the value of
the constant, saying merely 'case value not in enumerated type'.
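A sketch of the situation, not the actual content of msg_130.c. Only
'enum color', its range from 0 to 2, and EVENING with the value 3 are
taken from the text above; all other names are assumptions:

```c
enum color { RED, GREEN, BLUE };		/* values 0 to 2 */
enum daytime { NIGHT, MORNING, NOON, EVENING };	/* EVENING == 3 */

int
classify(enum color c)
{
	switch (c) {
	case RED:
		return 1;
	case GREEN:
		return 2;
	case BLUE:
		return 3;
	case EVENING:
		/* Wrong enum type: implicitly compares c == EVENING. */
		return 4;
	}
	return 0;
}
```

If the mismatched constant were MORNING instead, its value 1 would be in
range for 'enum color', so a check based only on the numeric value would
have nothing to say about the type mismatch. It would merely collide
with 'case GREEN'.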
Lint can warn about narrowing conversions; it just doesn't do so by
default.
The option -a (which is included in the default LINTFLAGS in sys.mk)
only reports narrowing conversions from 'long' or larger. To get
warnings about all possible narrowing conversions, the option -a has to
be given more than once.
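Two made-up examples of narrowing conversions, annotated according to
the description above. The first one is only narrowing on platforms
where 'long' is wider than 'int':

```c
int
from_long(long l)
{
	int i = l;		/* from 'long': covered by a single -a */

	return i;
}

signed char
from_int(int i)
{
	signed char c = i;	/* from 'int': needs -a more than once */

	return c;
}
```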
PR bin/14531
Each of the tests named msg_*.c repeats the template of the message, to
make the test somewhat self-contained when viewed in isolation.
This creates a redundancy, and keeping track of this manually is next to
impossible. I tried it and failed in 9 cases, even though it has just
been 2 months since I myself created the initial files and I knew all
the time that this redundancy exists.
Be fool-proof for the future by checking this automatically.
These expressions are indeed constant for a specific platform, but on
another platform their value may change. This makes them unsuspicious
and legitimate for portable code.
Seen in rump_syscalls.c, as 'sizeof(int) > sizeof(register_t)'.
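A sketch of such a comparison. Since register_t is platform-specific,
the typedef below is a hypothetical stand-in that is assumed to be
'long' for this example:

```c
/* Hypothetical stand-in for the platform's register_t. */
typedef long sketch_register_t;

int
int_wider_than_register(void)
{
	/*
	 * Constant on any given platform, but the value depends on the
	 * sizes that the platform chooses for the two types.
	 */
	return sizeof(int) > sizeof(sketch_register_t);
}
```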
Previously, 'typedef enum { E } name' was output as 'name', which
omitted the information that this was an enum type. Now it is output as
'enum typedef name'.
Previously, 'typedef struct { int member; } name' was output as 'struct
<unnamed>', which omitted the typedef name. Now it is output as 'struct
typedef name'.
Message 153 didn't clearly state which of the pointer types was the
one before conversion (or cast) and which was the resulting type.
Message 229 didn't have any type information at all.
This warning occurs more than 7400 times in a regular NetBSD build, and
without giving any type information, leaves the reader clueless about
what the underlying issue might be. Add type information since that is
a no-brainer to implement.