This removes the redundancy of mentioning the function name in the error
message. This redundancy had been correct in all but 2 cases:
build_real_imag and tsize.
Having 2 lines of source code per assertion is too much, especially
since most of this code is redundant anyway. Extract the common code
and the additional negation into a simple function instead.
Even with -DDEBUG and -DYYDEBUG, the debug output is not detailed enough
to clearly see what happens.
Add some custom debug logging to the parser, mainly for demonstration
purposes, and also to find out how to fix the test d_struct_init_nested.
C99 guarantees that the first 31 characters of an identifier with
external linkage are significant. This removes the need to use
abbreviations for common words.
Exporting the variables at the right time and with the correct values is
a subtle issue. The current implementation carefully marks variables as
ready to be exported, then exports them and at the same time tries to
export as few variables as possible, to avoid memory leaks. This test
describes and explains how all this works in detail.
This test also justifies that the call to Var_ReexportVars happens in
the make process itself, not in the child processes, no matter whether
these are created with vfork or (only theoretically) with plain fork.
This has changed in compat.c 1.217, job.c 1.390 and main.c 1.504 from
2020-12-27.
The previous function name suggested that it would set the complete flag
of the type, but it was the exact opposite. To reduce confusion, negate
the meaning of the parameter.
Using mmap is beneficial if the loaded data is read-only, or if it is
accessed in random order. Neither of these applies here. When loading
a file, make reads it strictly from top to bottom, once. During
parsing, the loaded data is modified in-place to insert '\0' and '\n'
for terminating strings and lines. Because of all of this, there is no
benefit in using mmap.
Reading the file using 2 calls to read(2) (one for the data, one for
checking for EOF) loads the data in a single pass, instead of producing
a page fault whenever the parser passes another page boundary.
Use a Buffer for loading the file data, to avoid calling bmake_realloc
directly.
Do not resize the loaded buffer at the end. Each loaded file is
short-lived anyway, and only a few files are loaded at the same time, so
there is no point in optimizing this part for low memory usage.
Controlling the expansion of variable expressions using a global
variable and a VARE flag was inconsistent.
Converting the global variable into a flag had to prerequisites:
1. The unintended duplicate variable assignment had to be fixed, as
done in parse.c 1.520 from 2020-12-27. Without this fix, it would have
been necessary to add more flags to Var_Exists and Var_SetWithFlags, and
this would have become too complex.
2. There had to be a unit test demonstrating that VARE_KEEP_DOLLAR only
applies to the top-level expression and is not passed to the
subexpressions, while VARE_KEEP_UNDEF applies to all subexpressions as
well. This test is in var-op-expand.mk 1.10 from 2020-12-28, at least
for the ':@word@' modifier. In ParseModifierPartSubst, VARE_KEEP_UNDEF
is not passed down either, in the same way.
At that point, the expression can never be varUndefined. At the
beginning of ParseVarnameLong, the expression is initialized to a simple
empty string, and that string is only ever converted to varUndefined at
the very end of Var_Parse.
Back in 1993, the variables in a context were stored in a linked list.
Searching such a list indeed required literally thousands of calls to
strcmp. In make.h 1.22 from 1999-09-15, the linked list was replaced
with a hash table, requiring much fewer string comparisons. Since then,
the rationale doesn't apply anymore.
This allows the -q option to distinguish errors from out-of-date
targets. Granted, it's an edge case but it should be solved
consistently anyway.
The majority of cases in which make exits with exit status 1, even in -q
mode, is when there are parse errors. These have been kept as-is for
now as they affect many of the unit tests.
The technical errors, on the other hand, occur so rarely that it's hard
to write reliable tests for them that fail consistently on all platforms
supported by make.
Extracting the character-level details makes the essence of Var_Subst
visible in the code, which is to iterate over the given text, handling a
few types of tokens.
The many constants were invented because at that time I didn't quite
understand the actual outcomes of Var_Parse that need to be
distinguished. There are only a few:
(1) Errors, whether they are parse errors, or evaluation errors or
undefined variables. The old constants VPR_PARSE_MSG and
VPR_UNDEF_MSG are merged into VPR_ERR.
(2) Undefined expressions in a situation in which they are allowed.
Previously the documentation for VPR_UNDEF_SILENT talked about
undefined expressions in situations where they were not allowed.
That case is fully covered by VPR_ERR instead.
(3) Errors that are silently ignored. These are probably bugs.
(4) Everything went fine, the expression has a defined value.
Right now, Var_Subst always returns VPR_OK, even if there had been parse
errors or evaluation errors. If that is no longer true, the errors will
be reported properly.
Since make uses vfork if available, re-exporting the variables happens
in the address space of the main process anyway, so there is no point in
mentioning anything about "our client process" anywhere.
Previously, running lint mode didn't define MAKE_RCSID at all, which
resulted in a syntax error.
While here, reduced the indentation and nesting of the preprocessor
directives.
When running lint(1) on the code, it defines the preprocessor macro
"lint" to 1, which generated a syntax error in the declaration "Boolean
lint", as that became "Boolean 1".
This macro was supposed to return a boolean expression all the time, it
just hadn't been implemented this way. This resulted in wrong output
for the test sh-flags, in compilation modes -DUSE_UCHAR_BOOLEAN and
-DUSE_CHAR_BOOLEAN, since in ParseCommandFlags, the expression
DEBUG(LOUD) didn't fit into a boolean.
Since make doesn't support variable names containing spaces, this edge
case is not enough reason to stop this feature. Having multiple
variable names as arguments nicely aligns with other directives such as
.for and .export.
Previously, mmapped files didn't always have the final newline added.
Only those that ended at a page boundary did.
This confused ParseRawLine, which assumed (and since parse.c 1.510 from
moments ago also asserted) that every line ends with a newline, which
allows the code to assume that after a backslash, there is at least one
other character in the buffer, thereby preventing an out-of-bounds read.
This bug had been there at least since parse.c 1.170 from 2010-12-25
04:57:07, maybe even earlier, I didn't check.
Now line_end always points to the trailing newline, which allows
ParseGetLine to overwrite that character to end the string.
For the modifiers :gmtime and :localtime, the excess newline had been
added in var.c 1.631 from 2020-10-31 21:40:20.
For the modifiers :range and :ts, the excess newline had been added in
var.c 1.635 from 2020-11-01 14:36:25.
It's easier to have both the expressions and the expected values in a
single file. This also allows for flexible handling of multiple
acceptable outputs, in this case for 32-bit time_t.
Passing a struct as printf argument for the %s conversion doesn't work.
On NetBSD-8.0-x86_64, the output looks normal, but on SunOS-5.9, the
output is garbled, containing bytes 0xFF and 0xFE.
This bug had been introduced in parse.c 1.507 from 2020-12-20 14:52:16.
Thanks to sjg for finding this bug so quickly.
This makes all intermediate strings constant. For this simple
search-and-replace refactoring, all intermediate locations where the
"current value of the expression" was stored had to be of the type
MFStr.
Using FStr instead of MFStr allows to save a few memory allocations,
which will be done in the follow-up commits.
Previously, memory management had been split among several variables.
The general idea was very simple though. The current value of the
expression needs to be kept in memory, and each modifier either keeps
that value or replaces it with its own newly allocated result, or
var_Error or varUndefined.
Using MFStr, it does not matter anymore that var_Error and varUndefined
are statically allocated since these are assigned using MFStr_InitRefer.
The complexity of the implementation is now closer to the actual
complexity. Most probably the code can be simplified even more.
Memory management is still complicated in this area. To clean this up,
the previous value of the expression needs to be converted to an MFStr
first, and later to an FStr.
Do not increment a null pointer.
Do not assign to a variable twice in the same statement. To be fair,
this may be safe because of the sequence point when the function is
called, but anyway, it looks too close to undefined behavior.
Before, make printed an "error message" that did not include the word
error and thus was not easily identified as such. This "error message"
also did not influence the exit status in the default mode but only in
-dL mode. The error message also didn't include any line number
information and was thus rude.
The next commit will error out on unknown modifiers and influence the
exit status. The test modmisc.mk contains both parse time tests and run
time tests. To prevent the latter from being run, the parse error is
moved to varmod-indirect.mk, which only contains parse time tests.
Previously, the parameter out_freeIt was not guaranteed to be
initialized in every case, at least when looking only at EvalUndefined.
This contradicted the variable name.
Replace the two parameters with a single FStr to ensure that these
variables are always initialized together.
In GetVarnamesToUnexport, there is no need to free the local FStr since
the only place where it is assigned an allocated string is at the very
end.
Having separate functions for the two main use cases of a possibly
allocated string makes the calling code simpler. This is a preparatory
commit for making the memory allocation in ApplyModifiers easier to
understand.
Put the simple tests at the top, demonstrating that there are already
some cases in which the misspelled directive is detected. It's not
detected though if the surrounding conditional branch is skipped.