For performance reasons, the implementation of the simple rule "cmdline
overrides global" grew into code that is much more complicated than a
straight-forward implementation. This added complexity made it easy for
bugs to sneak in.
The extra condition had been necessary before FStr made memory
management simpler.
The Coverity annotation got out-of-date when the parameter was converted
to FStr since that type is not allocated on the heap, only its inner
members are.
No functional change.
Breaking the loop once for depth == 0 and once for depth == 1 was
unnecessarily confusing, as was the nested 'if'. Start counting with 0
since there is no reason to start at 1.
Evaluating the common subexpression '*p == endc' is left as an exercise
to the compiler.
No functional change.
In the expression ${:U}, the variable name is empty. Since these
expressions are generated by .for loops, the error messages for them
must not end with a trailing space. Putting the variable name in quotes
helps against that.
Replace "variable specification" with the more modern "variable
expression", reduce the number of parentheses, output more than a single
character for modifiers, make it obvious that in expressions such as
${:Serror}, the "" means a variable name.
Back in 1995, the modifiers were all single-character, and it made sense
to print only the first character. Nowadays, with ':S', ':@var@...@',
'::=' and several others, a little more context is useful to see where
the exact error is. The actual modifier is still guessed, and the guess
may be wrong as soon as backslashes get involved, but it is still better
than before.
For a very long time now, I had thought that it would be impossible to
undefine global variables during the evaluation of variable expressions.
This is something that the memory management in Var_Parse relies upon,
see the comment 'the value of the variable must not change'.
After several unsuccessful attempts at referring to an already freed
previous value of a variable, today I discovered how to unset a global
variable while evaluating an expression, which has the same effect. To
demonstrate that this use-after-free can reliably crash make, it would
need a memory allocator with a debug mode that never re-allocates the
same memory block after it has been used once. This is something that
jemalloc cannot do at the moment. Valgrind would be another idea, but
that has not been ported to NetBSD.
Undefining a global variable while evaluating an expression is made
possible by an implementation detail of the modifier ':@'. That
modifier undefines the loop variable, without restoring its previous
value, see ApplyModifier_Loop.
By the very old conventions of ODE Make, these loop variables are named
'.V.' and thus do not conflict with variables from other naming
conventions. In NetBSD and pkgsrc, these loop variables are typically
called 'var', sometimes '_var' with a leading underscore, which also
doesn't conflict with the typical form 'VAR' of variables in the global
namespace. Therefore, in practice these loop variables don't interfere
with other variables.
One case that can practically arise is when an outer variable has a
modifier ':@word@${VAR.${word}}@' and one of the referenced variables
uses the same variable name in the modifier, see varmod-loop.mk 1.10
line 91 for a detailed explanation.
By using the ${:@VAR@@} modifier in a place that is evaluated with
cmdline scope, it is not only possible to undefine global variables, it
is possible to undefine cmdline variables as well. When evaluated in a
specific make target, the expression ${:@\@@@} can even be used to
undefine the variable '.TARGET', which will probably crash make with an
assertion failure.
The variable name 'arg' was misleading since after a successful
TryParseTime, it would no longer point to the argument of the variable
modifier, but to the _end_ of the argument. To reduce confusion, use p
instead, like everywhere else. This name is less specific, which is
still better than a wrong name.
Addition is easier than subtraction, and the expression 'word + wordLen'
obviously means 'the end of the word', which was not as easy to spot
before.
No functional change.
These variables all belong to a string variable. Connect them using
FStr, which reduces the number of variables to keep track of.
No functional change.
There was only a single case where this parameter was false. Inline
that case. That was the only case that needed the return value, so
remove that as well.
When
1. there is a global variable containing a dollar in its expanded name
(very unlikely since there are lots of undocumented edge cases that
make variable names containing dollar signs fragile), and
2. after that (unlikely since that requires .MAKEFLAGS instead of a
normal command line)
3. there is a command line variable of the same name (again very
unlikely since that variable name would contain a dollar sign as
well in the expanded form),
the global variable would not be undefined as promised by the comments
since its name was expanded once more than intended.
Because of the two 'very unlikely' above, this edge case hopefully does
not affect any practical use cases.
Note that this is not about VAR.${param} (which has a dollar sign in its
unexpanded form), but about the case where param itself would expand to
a dollar sign, such as after param=$$.
This is a preparation to extract the code for exporting a cmdline
variable. That code differs in several details from the other code in
ExportVar.
No functional change.
Make prevents global variables from being or becoming visible when a
command line variable of the same name is already defined.
There is a double safety net here. Even if the call to Var_DeleteExpand
were removed, there would be no noticeable effect, other than one less
line in the debug log.
No functional change.
On NetBSD 8.0 it still worked. Maybe gcov doesn't support .c files as
arguments anymore. Using the .gcda files works and is more reliable
anyway since it covers the inline functions in the headers as well.
When the point that "isn't going to end well" is reached, the stacktrace
is quite long but still reasonable:
main
main_ReadFiles
ReadAllMakefiles
ReadMakefile directive-export.mk
Parse_File
ParseLine # line 3 has: _!= :;:
ParseVarassign
Parse_DoVar
VarAssign_Eval
VarAssign_EvalShell # because of the '!='
Cmd_Exec :;:
Var_ReexportVars # before starting the subprocess
ExportVar EMPTY_SHELL
ExportVarEnv # was only marked for export
Var_Subst ${EMPTY_SHELL} # to get the value to export
VarSubstExpr ${EMPTY_SHELL}
Var_Parse ${EMPTY_SHELL}
Var_Subst ${:sh} # since EMPTY_SHELL= ${:sh}
VarSubstExpr ${:sh}
Var_Parse ${:sh}
ApplyModifiers
ApplySingleModifier :sh
ApplyModifier
ApplyModifier_SunShell :sh
Cmd_Exec "" # empty command
Var_ReexportVars
ExportVar EMPTY_SHELL
ExportVarEnv # skipping this edge case
During the refactorings of the last months, several comments have become
outdated, some are now redundant since the code is as clear as the
comment, and some code benefits from a bit of explanation.
These are easier to read than hex constants.
There was no need to skip bits 2 and 3 (there were no constants for 0x04
and 0x08). Close this gap, to avoid confusing future readers. Keep the
relative order of the flags since that affects the debug output of -dv.
No functional change.
There are only very few places in var.c that contain really duplicate
code anymore.
There is still lots of _almost_ duplicate, for example the code for
parsing variable modifiers. It differs subtly in behavior:
* The modifiers ':M' and ':N' use '$$' to escape a '$' sign, while
almost all other modifiers use '\$' for this purpose.
* The modifiers ':M', ':N', ':S', ':@' and several others parse
balanced parentheses and braces, allowing '(' to '}' to match.
The modifiers ':D' and ':U' only treat the end character special but
not the other 3 of '(){}'.
* When parsing the modifier ':S' but not evaluating it, the code for
nested variable expressions is parsed differently from when it is in
evaluation mode (VARE_WANTRES). This applies to an outer ':S'
modifier and an inner ':D' or ':M' modifier.
Since these inconsistencies affect the behavior in edge cases and some
users of make might depend on it, they cannot be fixed by
behavior-preserving refactorings.
The type describes the definedness of an expression, not a general
status, therefore the new name is more precise.
The constants are renamed as well since their prefix 'VES' does not
match the type name anymore, it was correct 3 days ago when the type was
still named VarExprStatus. The name VES_NONE was misleading since
'none' does not describe its actual effect. That name came from the
time when the status was a bit set, and 'none' simply meant 'none of the
bits are set'.
The names used in debug logging will be renamed in a follow-up commit,
to demonstrate that the changes in this commit indeed have no functional
change, especially not the change from '!=' to '==' in line 4304.
No functional change.
The details of how variable expressions are evaluated is controlled by
several parameters: startc and endc differ for $(VAR) and ${VAR}, the
value of the expression can be interpreted as a single big word, and
when joining several words (such as with ':M' or ':S'), there may be a
custom word separator (defined with ':ts*').
The scope of half of these parameters is the whole variable expression,
the other half of the parameters are reset after each chain of indirect
modifiers. To make this distinction obvious in the code, extract Expr
from ApplyModifiersState. Previously, these details were hidden in how
parameters are passed and restored among ApplyModifiersIndirect and
ApplyModifiers.
The changes in the individual ApplyModifier functions are numerous but
straight-forward. They mostly replace 'st' with 'expr'.
The changes in ApplyModifiers and ApplyModifiersIndirect are more
subtle. The value of the expression is no longer passed around but is
stored in a fixed location, in Expr, which makes it easier to reason
about memory management.
The code in ApplyModifiers after 'cleanup' looks quite different but
preserves the existing behavior. Expr_SetValueRefer is nothing else
than the combination of FStr_Done followed by FStr_InitRefer. Storing
exprStatus back at the end was responsible for passing the definedness
of the expression after applying the indirect modifiers back to the
outer ApplyModifiersState. The same effect is now achieved by having
Expr.status with a wider scope.
No functional change.
In ModifyWords, there is no "passed string" anymore since that function
now directly operates on the expression value.
While here, improve the documentation of ModifyWordsCallback and rename
it to ModifyWordProc, focusing on its purpose instead of where it is
used.
The condition "st->newValue.str != val" in ApplySingleModifier made the
memory management look more complicated than it really was. Freeing an
object based on another object's value is harder to understand than
necessary.
To fix this, the "current value" of the expression is now stored in
ApplyModifiersState, and it gets updated in-place by the ApplyModifier
functions. This reduces the number of parameters for the ApplyModifier
functions.
Accessing the current value of the expression is now more verbose than
before (st->value.str instead of the simple val). To compensate for
this verbosity, ApplyModifiersIndirect is now much easier to understand
since there is no extra "current value" floating around.
There is still room for improvement. In ApplyModifiers, passing an FStr
in and returning another (or possibly the same) makes it difficult to
understand memory management. Adding a separate Expr type that outlives
the ApplyModifiersState will make this easier, in a follow-up commit.