Writing *(p+1) is needlessly confusing, even though it adds a little
symmetry between *p and *(p+1). Still, one of these expressions has
parentheses, the other doesn't, which breaks the symmetry.
Wrap overly long code line.
It's confusing to refer to the digits after the backslash once with
index 0 to 2, and the other time with index 1 to 3.
Before, '\b' was interpreted as a simple 'b', which is confusing for C
programmers. Same for '\a'. There is absolutely no reason to escape
letters, so fail early in these cases.
The '\h' in the test addchstr was obviously a typo that was easy to
detect, if only the compiler had been strict enough from the very
beginning.
The code is wider than 80 characters, same as the code that parses octal
escape sequences a few lines above it. This code will be refactored to
use less indentation in a follow-up commit.
indent cannot handle line-end comments.
The indent test suite requires each test file to have both a NetBSD and
a FreeBSD RCS ID. If the FreeBSD RCS ID is missing, the test will
silently pass since in that case, an empty file is compared with an
empty file. See the /start/,/end/ operator in t_indent.sh.
It is possible that the type name 'array[unknown_size]' may spill into
the user-visible diagnostics. The current test suite does not cover
such a case. Anyway, saying 'array[unknown_size]' is still better than
saying 'array[0]', which would be misleading.
By listing the expected diagnostics directly at the code that triggers
the diagnostics, it is easier to cross-check whether the diagnostics
make sense.
No functional change to lint itself.
Several C99 tests do not actually test C99 features but instead GCC
features. All these tests should be double-checked again.
In some other tests, split the initializers into more lines, which makes
it easier to read the debug log corresponding to these tests. This will
be necessary for reworking the initializer code to actually conform to
C99.
It's too easy to forget one of them when adding or removing some lines.
This would make it more difficult to locate the lines referenced in the
error messages.
Right now, this variant of the popular macro pattern is flagged as
needing a /*CONSTCOND*/ annotation. As with 'do { ... } while (0)',
there is nothing wrong with this pattern, therefore there should be no
warning.
This is a preparation for allowing 'do { ... } while (false)', in
addition to the commonly used 'do { ... } while (0)', without declaring
the controlling expression /*CONSTCOND*/.
The intention of the getopt check was to analyze only those while loops
whose condition consists of the usual getopt call. For all other while
loops, ck.while_level was intended to stay at 0.
This was not the case in ckgetopt.c 1.2 and has been fixed in ckgetopt.c
1.3. The code did not document the intended invariants though, which it
should have done. This will be done in a follow-up commit.
Add a sh ATF test to demonstrate a bug in the way that \0 characters
are dropped from scripts. This test will eventually be extended to
test other potential sh script input related issues.
When initially committed, this test should fail. It should succeed
when the fix for the PR is committed (soon).
Nb: this tests only the \0 related issues from the PR, the MSAN
detected uninitialised variable (struct field) can only be detected
by MSAN, as it has no visible impact on the operation of the shell
when running on any real (or even emulated) hardware.
(It will, however, also be fixed).
The previous code only errored out if a write failed completely. If it
was partially written, the program continued without writing the rest of
it.
Extract the common code into a few functions that write raw data to the
parent process.
In sysinst, the installation screen is indented with tabs. Sysinst uses
msgc, which brings its own text layout engine. This engine does not use
addbytes but addch. In addch, the x position for each tab was advanced
twice as much as needed. The menu items were thus not indented by 8
spaces but by 16, which caused an ugly line break in the German
translation.
This bug largely went unnoticed because most other applications use
addbytes instead, which worked fine all the time. It had been
introduced somewhere between NetBSD 8.0 and NetBSD 9.0.
The code around this bug used aliased variables for win->curx and
win->cury a lot. Getting this right is difficult and needs a thorough
test suite. Even though libcurses has 201 tests, that is not nearly
enough to cover all the relations between the various functions in
libcurses that call each other, crossing API boundaries from internal
to external, doing character conversions on the way and juggling around
4 different types of characters (char, wchar_t, chtype, cchar_t).
The simplest fix was to remove all this aliasing, while keeping the
API the same. If _cursesi_waddbytes is not considered part of the API,
it would be possible to replace px with win->curx in all places, same
for py and win->cury.
The complicated code with the aliasing may have been meant for
performance reasons, but it's hard to see any advantage if both points
of truth need to be synchronized all the time.
Libcurses can be built in 2 modes: with wide character support or
without (-DDISABLE_WCHAR). The test suite only covers the variant with
wide characters. The single-byte variant has to be tested manually.
Running sysinst with the single-byte libcurses produces the correct
layout.
Having code indented so far to the right that each word gets its own
line is ridiculous. Fix that.
While here, remove the cargo-cult realloc pattern, which is not needed
if the process exits immediately on error.
While here, reduce the indentation of the code by returning early.
No functional change.
The child process needs to be properly controlled by the parent process.
Otherwise it is not possible to get code coverage data from it using
gcov since that requires the child process to exit normally. Previously
the child process had been killed because its parent had exited.
The test mvwin previously expected an endless stream of bytes, by
comparing the actual output with /dev/zero. This didn't make sense as
the curses output does not contain '\0' in any of the test cases.
Compare with /dev/null instead. This is as wrong as before, but the
curses test framework currently ignores this situation, as for many
other test cases. See the numerous "Excess" messages in atf-run.log.
All include commands in the current test suite use relative paths.
Instead of a fixed include path, interpret the included filename
relative to the including file.
The parent process, like the child process, needs only 2 of the 4 pipe
ends.
In verbose mode (now at testlang_parse.y:1151 and :1154), both ends of the
pipe_from_slave were examined. This looked like a typo and has been fixed
to those pipe ends that are relevant to the parent process.
By providing declarative syntax for accessing the arguments, the
unnecessarily detailed boilerplate code is hidden. This allows easy
inspection by tools and humans, to check for typos and other mistakes.
All uses of the previous macro did not treat the argument as a string or
array of chtype, but as a single chtype. It's strange that the previous
code arbitrarily split the access to the argument by first storing it as
a pointer and then dereferencing it.
No functional change.
I had broken this in testlang_parse.y 1.22 from 2021-02-07, when I
extracted the common 'eol' from the statements. Extracting 'eol' had
the effect that the action for the statement was run before the line
number increased.
Now the line numbers in the diagnostics are the same again as before.
For lines that end with a backslash, the reported line number is the one
from the last of these lines, not the first one, also as before. This
feature is not used by any of the current tests though.
It looks as if the original author just didn't know how to declare the
type of non-terminals. The explicit types in the '$' expressions were
all consistent.
No functional change.
The libcurses framework is not strictly typed and thus provides plenty
of ways to shoot yourself in the foot. It's a waste of time debugging
things that a proper programming language can easily prevent.
The function addch expects an argument of type 'chtype'. Passing a
"double-quoted" string does not match this, as 'chtype' is completely
different from a plain 'char'. Instead, functions taking a 'chtype'
must be passed a `backtick-quoted` string.
Previously, there were several concurring styles:
$msg in line %zu of file %s
$msg line %zu of file %s
%s, %zu: $msg
All these are now replaced with "%s:%zu: $msg".
Previously, commas were completely ignored by the grammar. Erroring out
on invalid characters made some of the tests fail since the comma was
not recognized anymore. Add it back, but only for defining arrays. It
would have been possible to leave out the commas or make them optional,
but since the current tests do not make use of that, keep the grammar as
strict as possible.
Fix an unclosed string literal in a test. This had been wrongly
accepted before by the grammar.
Be strict when parsing the tests. Any unknown character is an error.
This avoids an endless loop when running "./director /dev/zero". There
is no point in silently ignoring other invalid characters as well, as
this would only leave potential test writers in an unclear state,
without any benefit.
The function getyx is not a function but a macro. It does not return
int, but void. Since these changes destroy the simplicity of the
example, combining a regular return value and pass-by-reference return
values, I rewrote the whole section and added more examples.
There is no need to write the keywords in upper case or mixed case. The
only case where a keyword did not have the canonical form yet was a
single lowercase 'ok' in the test case 'innstr'.
The test framework doesn't check the files strictly, it only checks
whether the expected output is a prefix of the actual output, or vice
versa. This allows several deviations to pass unnoticed, which is
wrong.
Up to now, the test command "compare /dev/null" was a no-op since the
command was only parsed but not run at all. Now run it.
This makes the test mvwin fail. That test will have to be fixed.
Comparing to /dev/null is certainly possible and may make sense,
comparing to /dev/zero is nonsense since the actual stream can never be
endless. Some tests do that nevertheless, for whatever reason.
In order to have the expected test output closer to the curses commands
that cause it, it may be a good idea to add another command
'compare_str' that would work independently of an external file and at
the same time allow the expected output to be commented and explained.
This is not possible right now since the .chk files are read exactly
as-is.
The grammar rule 'args' has been left as-is since it needs to be split
into 'args' and 'arg' first, to avoid the redundancy.
The braces in "if (create_check_file)" were misleading. It's strange that
GCC didn't reject this.
Previously, each statement ended with 'eol'. This was unnecessarily
verbose since the 'eol' is not really part of the statement, it's part
of a line.
No functional change.
This makes it possible to write small remarks directly in the affected
line, which not only makes for a clean visual appearance but also shows
up prominently in "cvs annotate" or "git blame", showing when such a
remark has been modified.
Several of the braces were misaligned. For the simple keywords, there
is no need to write these braces at all, they only made the code look
more complicated than it really is.
I stumbled upon this because syntax errors in the test cases currently
let the test case succeed instead of fail, which is another ingredient
for unreliable tests, besides the loose output matching.
When adding "\t" via addch, win.curx advances by twice the spaces as
intended. This bug was introduced somewhere between NetBSD 8.0 and 9.0.
Adding "\t" via addstr does not have this bug.
This bug causes the installation menu of sysinst to be have its menu
items indented by 16 characters instead of only 8. This in turn
produces an ugly line break in the German translation.
The test framework for libcurses is not well integrated into ATF.
Whenever the expected output is longer than the actual output, or vice
versa, the test passes nevertheless. This makes it necessary to
constantly look into atf-run.log to see whether the actual output is
indeed equal to the expected output, which is crucial, especially for
telling the difference between addstr and addnstr.
Reusing the .chk files for several tests is not a good idea either. For
example, addstr and waddstr are supposed to produce the same result for
ASCII-only text, so it was tempting to use the same file. But waddstr
seems to have a bug (maybe undefined behavior), at least waddstr returns
ERR in one case where it shouldn't. This means that currently the
expected output (acknowledging the bug) must be different.
The "expected" test output in waddstr.chk looks completely broken, but
that's exactly what the test produces right now.
That message is only supposed to warn about compatibility to traditional
C, in case the function should ever be compiled without its prototype
being in effect. All other type checks are supposed to be in another
function, as documented, but that type check misses to report a few
error-prone type combinations (long to char, long to int).
30 years after the introduction of prototypes with C90, almost all
existing code uses prototypes. The warning has thus lost most of its
usefulness and can rather be confusing since a conversion from 'char' to
'long' is not problematic with prototypes in action, and the probability
of the code being backported to a pre-C90 compiler is diminishingly
small.
The words "due to prototype" now serve as a hint again. The proper fix
could be to suppress this warning in C99 mode since that's far enough
from traditional C.
The lint tests do not focus on the whitespace since that is the most
boring part of code style. Therefore, format the tests to be readable
by following share/misc/style as close as possible.
For those tests that didn't use GCC-style line markers such as "# 2",
the line numbers of the diagnostics stay the same. This is purely
conincidental. Before, the 3 lines came from lint's built-in
definitions (see 'builtins' in main1.c), and line number counting
continued as if nothing had happened, making the first line of the
actual file line 4. These 3 built-in lines are now replaced with 3
lines of file header.
The words "due to prototype" are an anachronism from the 1990s.
Nowadays every function is defined using a prototype, which makes these
words redundant.
For every conversion it is useful to know both the source and the target
type since these are not always obvious from the code.
The only surprise is the warning in d_gcc_extension. The conversion
there is from 'double' to 'long double', which is a lossless conversion.
This may be a bug in lint.
If one of the nested subexpressions is parenthesized, the author
probably knew how these expressions are evaluated. Therefore don't warn
in such a situation.
Maybe the original author once made a typo and tried to initialize
variables but instead compared them, like this:
int a, b, c;
a == b == c;
This would explain the text of the message, which still sounds strange.
At least it doesn't show up as often anymore.
Message 189 would have applied to traditional C and was supposed to
detect assignments between struct and union types. The corresponding
check had never been implemented though.
Traditional C has been superseded for 30 years now, therefore there is no
point in adding this check retroactively.
Since rt is an alias for rn->tn_type->t_tspec, it cannot be PTR and VOID
at the same time. This makes the condition unsatisfiable. Removing
that part of the code didn't show any change in behavior, as expected.
It may even be that fixing this obvious bug doesn't show any change in
behavior since that function is only used in a single place and
check_pointer_comparison performs its own checks before issuing any
warning.
At least the test cases added to msg_124.c all run as expected.
There are 2 remaining expressions:
In line 244, !(kw->kw_deco & deco) is a bitwise and. This is already
allowed for enums, it needs to be allowed for arbitrary integer
expressions as well. This covers the many places where plain integers
are used for bit fields, together with #define. This pattern is not as
typesafe as using enums, still it is common practice.
In line 769, the expression !finite(f) is a legitimate use of a function
that has return type int for traditional reasons. It's the same as for
ferror.
There are several other functions like unlink, open or strcmp that have
return type int as well, but with a different meaning. It is not yet
clear what the best way is to handle these different meanings. Having
to write finite(f) == 0 in strict bool mode doesn't look idiomatic, on
the other hand, !strcmp(s1, s2) is exactly the pattern that strict bool
mode wants to avoid.
In strict mode, allowing 1 as bool constant expression is probably not
needed in practice since most comparisons are != 0 instead of == 0.
Furthermore, in the expression (flags & 0x0002) == true, comparing with
true is misleading since the '==' operator can never evaluate to true in
this case.
The strict bool mode gets complicated because for system headers the
rules need to be relaxed since they cannot be changed easily, often not at all.
Still, if lint validates a program in strict bool mode, that program
must run with equal behavior regarding boolean expressions even on a
pre-C99 platform.
Seen in usr.bin/make/meta.c:1670: FD_ZERO(&readfds). These macros
cannot be fixed since system headers must not include <stdbool.h>.
Therefore INT constants should be accepted as controlling expressions as
well.
At that stage of analysis, the ampersand is no longer ambiguous, it has
already been resolved as the address-of operator, instead of the
bitwise-and operator.
Previously, lint1 allowed integer constants such as 0 and 1 to be used
as bool constants. This was only half-baked since after fixing all
error messages from that strict mode, there may still be integer
literals in the code that should be replaced with true or false. This
would stop a migration from int to bool in the middle, leaving
inconsistent code around.
To find the remaining type inconsistencies, treat integers and bool as
completely incompatible, even for compile time constants.
Currently, strict bool mode still allows integer constant expressions to
be converted implicitly to bool. This is something that other languages
such as Go, Java, C#, Pascal don't allow.
By providing a custom implementation of <stdbool.h> that defines false
and true to custom bool constant identifiers, lint will cover these
cases as well.
To prepare for this, reword the rules and restructure the tests in
d_c99_bool_strict.c.
Before December 2020, it was cumbersome to add type information to a
message since the caller had to explicitly allocate buffers for the type
names. That's probably the reason why this crucial detail had been left
out of the warning.
In strict bool mode, bool is considered incompatible with all other
scalar types, just as in Java, C#, Pascal.
The controlling expressions in if statements, while loops, for loops and
the '?:' operator must be of type bool. The logical operators work on
bool instead of int, the bitwise operators accept both integer and bool.
The arithmetic operators don't accept bool.
Since <stdbool.h> implements bool using C preprocessor macros instead of
predefining the identifiers "true" and "false", the integer constants 0
and 1 may be used in all contexts that require a bool expression.
Except from these, no implicit conversion between bool and scalar types
is allowed.
See usr.bin/tests/xlint/lint1/d_c99_bool_strict.c for more details.
The command line option -T has been chosen because all obvious choices
(-b or -B for bool, -s or -S for strict) are already in use. The -T may
stand for "types are checked strictly".
The default behavior of lint doesn't change. The strict bool check is
purely optional.
An example program for strict bool mode is usr.bin/make, which has been
using explicit comparisons such as p != NULL, ch != '\0' or n > 0 in
most places for a long time now, even before the refactoring in 2020.
It's GXemul that has the bug! Unfortunately, there's no way (currently) to
detect if we're running under GXemul emulation, so disable for all mips
for now. Hopefully, GXemul will get fixed soon.
There is no danger in allowing (flags & FLAG) as a controlling
expression, provided that it is immediately compared to zero, such as in
an if statement or as the operand of a logical operator.
This strict mode is not yet implemented. The plan is to use it for
usr.bin/make, to get rid of the many possible variants of defining the
Boolean type in make.h. These variants did find some bugs, but not
reliably so. Using static analysis seems more promising for this.
In an early stage of developing this test, lint1 crashed in the enum
definition in line 213, where the node for the '?:' had been NULL. This
can happen in other situations as well, such as with syntax errors, but
these should be rare, as lint is usually only run if the compiler has
accepted the source code. Still, there should not be any assertion
failures while running lint1.
Several of the tests only need to add the -p flag. Mentioning the
(current) default flags in each of these tests is redundant. Therefore,
allow them to specify "lint1-extra-flags: -p" instead of the current
"lint1-flags: -g -S -w -p".
It's perfectly valid to directly use a function name as the controlling
expression of an if statement. That function name is converted
implicitly to a pointer to that function, and that is a scalar value
then.
Spotted by christos in lib/libpthread/pthread.c:634.
In func.c 1.39 from 2020-12-31 18:51:28, the check that controlling
expressions are indeed scalar was extended from while and for loops to
if statements as well. It just seemed to have been an oversight.
This revealed a bug in lint, which didn't accept the following valid
code snippet from lib/libpthread/pthread.c:634:
void _malloc_thread_cleanup(void) __weak;
...
if (_malloc_thread_cleanup)
_malloc_thread_cleanup();
Testing a function (instead of a function pointer) for truthiness is
probably rare since most functions are defined unconditionally. For
weak functions it comes in handy though.
Clang-Tidy suggests to prefix the function with '&' to silence its
warning. Doing that revealed a non-obvious behavior in build_ampersand,
which does not add the AMPER node to the expression even though it is
clearly mentioned in the code. That is left for further research.
Once the original bug is fixed, it probably doesn't matter whether the
AMPER is discarded or retained since check_controlling_expression would
add it back. There's probably a reason though to sometimes discard the
AMPER and sometimes retain it.
- add fudge mode which gives a slightly slower hash function, but works
almost always in the first iteration by avoiding degenerate edges
- avoid keeping incidence lists around reducing the memory foot print by
30%
- split edge processing from hashing as in the non-fudge case it is a
reasonable costly part that often gets thrown away
- merge graph2 and graph3 routines now that they are mostly the same
i386 is an ILP32 platform (arch/i386/targparam.h). On these platforms,
int and long have the same size, and even with the -p option for
portability checks, INT_RSIZE in inittyp.c is defined to 4, not 3.
Because of this, in check_integer_conversion, psize(nt) was not greater
than psize(ot), and the warning was not issued.
To make the test behave the same on all platforms, changed the long
variables to long long, since long long is 64-bit on all platforms, and
int is 32-bit.
These symbolic names for INCBEF, INCAFT, DECBEF and DECAFT were
non-standard and thus confusing. All other operators were as expected.
Now that the operator names from ops.def are very similar, there is no
need to keep to almost identical lists around.
No change to the user-visible messages since the only place where these
operator names were used was in 324, and that message was restricted to
PLUS, MINUS, MULT and SHL.
Including the "p" in the symbolic operator names was questionable, for
several reasons:
1. The "p" could be taken to mean an actual variable name, which is
confusing if the function doesn't have such a variable, or even more
so if the line contains an unrelated variable called "p".
2. For the binary operators, having the "p" mentioned on both sides of
the operator (such as in "p + p") wrongly suggested that both
operands of the expression were the same.
3. The name "p" often stands for a pointer. Most of the operators
don't accept pointers, therefore the name was misleading.
For these reasons, the "p" was removed from the symbolic name of all
operators. This makes several pairs of operators indistinguishable:
INCBEF == INCAFT
DECBEF == DECAFT
UPLUS == PLUS
UMINUS == MINUS
STAR == MULT
AMPER == AND
This is not expected to create any confusion since C programmers are
expected to know these double meanings.
The symbolic names for SHLASS and SHRASS were missing the '=' before.
This was added since omitting it was probably an oversight.
This warning is the only one that calls print_tnode, which in turn uses
the redundant operator names in str_op_t.
There is another list of operator names in ops.c, but those names
include more clutter, for example "p + p" instead of a simple "+".
Using those operator names would therefore rather be confusing. These
two lists should be merged, to remove unnecessary redundancy.
It took quite a while to get to the correct interpretation of this small
piece of code and to draw the right conclusions from it. Now the bug is
finally ready to be fixed, as already announced in the test.
The message may be correct, but it is not helpful in any way. There are
just too many function pointers that may differ in a very small detail.
Before tyname.c 1.20 from 2021-01-02, the string representation of type
names was often limited to only 63 characters. Because of this, it made
sense to omit any detail that could need more space than this. Now that
this limitation is gone, it's reasonable to add more detail to the type
information, especially since that information is readily available.
The body of an ATF test must never return 1 but instead report failure
via atf_fail. Otherwise the following error message appears:
Failed: Test case body returned a non-ok exit code, but this is
not allowed
The test program t_integration intentionally bypasses the official ATF
API for performance reasons. But even then, it should stick to the API
as close as possible.
Bug: _Bool is not accepted as a bit-field, but it should be.
Bug: lint aborts in a controlled manner with message "common/tyname.c,
190: tspec_name(0)" when it sees a declaration of a _Complex bit-field.
(Not that a _Complex bit-field would make any sense.)
Since main1.c from 2014-04-18, running lint in -t mode produces strange
warnings in lines 1 to 3 of no file at all.
This is caused by the builtins that are parsed in main(). These
builtins are incompatible with traditional mode because they use long
double, which had not been known at that time.
Having a test for each message ensures that upcoming refactorings don't
break the basic functionality. Adding the tests will also discover
previously unknown bugs in lint.
The tests ensure that every lint message can actually be triggered, and
they demonstrate how to do so. Having a separate file for each test
leaves enough space for documenting historical anecdotes, rationale or
edge cases, keeping them away from the source code.
The interesting details of this commit are in Makefile and
t_integration.sh. All other files are just auto-generated.
When running the tests as part of ATF, they are packed together as a
single test case. Conceptually, it would have been better to have each
test as a separate test case, but ATF quickly becomes very slow as soon
as a test program defines too many test cases, and 50 is already too
many. The time complexity is O(n^2), not O(n) as one would expect.
It's the same problem as in tests/usr.bin/make, which has over 300 test
cases as well.
The variable namemem is supposed to be a circular list, which is
"documented" implicitly in push_member.
The implementation was buggy though. In pop_member, the circular list
was destroyed though. Given the list (capital, major, favorite_color,
green), removing capital made major point to itself in the forward
direction, even though it should not have been modified at all.
In the test, I had been too optimistic to quickly understand the code
around variable initialization. I was wrong though, so I had to adjust
the comments there to reality.