Correct an issue found by Oguz <oguzismailuysal@gmail.com> and reported
in e-mail (on the bug-bash list initially!) with the code changed to deal
with PR bin/48875
With:
sh -c 'echo start at $SECONDS;
(sleep 3 & (sleep 1& wait) );
echo end at $SECONDS'
The shell should say "start at 0\nend at 1\n", but instead (before
this fix, in -9 and HEAD, but not -8) does "start at 0\nend at 3\n"
(Not in -8 as the 48875 changes were never pulled up)
There was an old problem, fixed years ago, which caused the same symptom,
related to the way the jobs table was cleared (or not) in subshells, and
it seemed like that might have resurfaced.
But not so, the issue here is the sub-shell elimination, which was part
of the 48875 "fix" (not really, it wasn't really a bug, just sub-optimal
and unexpected behaviour).
What the shell actually has been running in this case is:
sh -c 'echo start at $SECONDS;
(sleep 3 & sleep 1& wait );
echo end at $SECONDS'
as the inner subshell was deemed unnecessary - all its parent would
do is wait for its exit status, and then exit with that status - we
may as well simply replace the current sub-shell with the new one,
let it do its thing, and we're done...
But not here, the running "sleep 3" will remain a child of that merged
sub-shell, and the "wait" will thus wait for it, along with the sleep 1
which is all it should be seeing.
For now, fix this by not eliminating a sub-shell if there are existing
unwaited upon children in the current one. It might be possible to
simply disregard the old child for the purposes of wait (and "jobs", etc,
all cmds which look at the jobs table) but the bookkeeping required to
make that work reliably is likely to take some time to get correct...
Along with this fix comes a fix to DEBUG mode shells, which, in situations
like this, could dump core in the debug code if the relevant tracing was
enabled, and add a new trace for when the jobs table is cleared (which was
added predating the discovery of the actual cause of this issue, but seems
worth keeping.) Neither of these changes has any effect on shells
compiled normally.
XXX pullup -9
Correctly handle (ie: ignore completely) \0 chars (nuls) in the
shell command input stream (script, dot file, or stdin).
Previously nul chars were ignored correctly in the line in which
they occurred, but would cause trailing chars of that line to reappear
as the start of the following line. If there was just one \0 skipped,
this would generally result in an extra \n in the sh input, which in
most cases has no effect. With multiple \0's in a single line, more
of the end of that line was duplicated into the following one. This
usually manifested as a weird "command not found" error.
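A rough way to observe the old symptom (a sketch only; it assumes a
printf(1) that emits a NUL byte for the \0 escape, as NetBSD's does):
printf 'echo one\0\0oops\necho two\n' > nulscript
sh nulscript
Before the fix, characters from the tail of the first line could reappear
at the start of the second line, giving that "command not found" noise;
with the fix the NUL bytes are simply skipped and both lines run normally.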
Note that any \0 chars in the sh input make the script non-conforming,
so fixing this is not crucial (no \0's should really ever be seen) but
it was an obvious bug in the code, which was attempting to ignore nul
chars (as do many other shells), so let it be fixed.
XXX pullup -9
This fixes the MSAN detected reference to an uninitialised variable
(an uninitialised field in a struct) which happens when a command is
not found after a PATH search.
Aside from skipping some exec*() calls that are known to be doomed
to fail in some cases, the setting of the relevant field is irrelevant,
so this problem makes no practical difference to the shell, or any
shell script.
XXX (maybe) pullup -9
the "local" built-in command description (pointed out by mrg@ via uwe@ in
private e-mail).
Add a description to the export command of why this quoting is required,
and then refer to it from local and readonly (explained in export as that
one comes first).
Note that some shells parse export/local/readonly (and often more) as
"declarative" commands, and this quoting isn't needed (provided the
command name is literal and not the result of an expansion) making
X=$Y type args not require quoting, as they often don't in a regular
variable assignment (preceding, or not part of, another command).
POSIX is going to allow, but not require, that behaviour. We do not
implement it.
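A quick illustration of the difference (not from the original log entry):
Y='a b'
X=$Y              # a plain assignment: no field splitting, X is "a b"
export X="$Y"     # here the quotes matter; unquoted this becomes: export X=a b
In a shell that parses export declaratively the unquoted form would also
work; in this shell it does not, hence the added explanation.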
instead of simply assuming that the pid of the first (leftmost) process
in a pipeline is the pgrp - someday we may switch things around and
create pipelines right to left instead, which has several advantages,
but which would invalidate the assumption which was being made here.
abs(pid)) and indicate that -- is (strictly) needed if the first pid arg
(there often is only one) is negative - though this implementation works
without it if a signal to send has been explicitly given, but whereas
"kill 1234" is valid (send SIGTERM to pid 1234), "kill -1234" will generate
a usage error from the attempt to send signal 1234 to nothing; to send
SIGTERM to pgrp 1234 it needs to be "kill -- -1234" (or "kill -s term -1234").
While here do a couple of markup improvements, and allow for the
possibility that users might be running the builtin kill from some
shell other than csh or sh.
That means we cannot use (pid_t)-1 as an error indicator, as that's a
valid pid to use (described as working in kill(1)) - yet it wasn't working
in /bin/kill (nor sh's builtin kill, which is essentially the same code).
This is even required to work by POSIX.
So change processnum() (the parser/validator for the pid args) to take
a pointer to a pid_t and return the pid that way, leaving the return value
of the (now int) function to indicate just ok/error. While here, fix
the validation a little ('' is no longer an accepted alias for 0) and in
case of an error from kill(2) have the error message indicate whether the
kill was targeted at a pid or a pgrp.
environment, rather than the nicer layout that is normally used.
Note this applies to /bin/kill only, the builtin kill in sh uses its
"posix" option for the same purpose, the one in csh only ever uses
POSIX format.
Better describe the command search procedure.
Document "trap -P"
Describe what works as a function name.
More accurate description of reserved word recognition.
Be more accurate about when field splitting happens after
expansions (and in particular note that tilde expansions are
not subject to field splitting). Be clear that "$@" is
not field split, it simply produces multiple fields as part
of its expansion (hence IFS is irrelevant to this), but if
used as $@ (unquoted) each field produced is potentially subject
to field splitting. Other minor wording changes.
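For example (illustrative only):
set -- 'a b' c
for f in "$@"; do echo "$f"; done   # two fields, "a b" and "c"; IFS plays no part
for f in $@; do echo "$f"; done     # unquoted: each field is then IFS split as usual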
traps_invalid (that is, when we actually nuke the parent shell's
caught traps in a subshell). This allows more reasonable use of
"trap -p" (and similar) in subshells than existed before (and in
particular, that command can be in a function now - there can also
be several related commands like
traps=$(trap -p INT; trap -p QUIT; trap -p HUP)
A side effect of all of this is that
(eval "$(trap -p)"; ...)
now allows copying caught traps into a subshell environment, if desired.
Also add the ksh93 variant (the one not picked by POSIX as it isn't
generally as useful) of "trap -p" (but call it "trap -P") which extracts
just the trap action for the named signals (giving more than one is usually
undesirable). This allows
eval "$(trap -P INT)"
to run the action for SIGINT traps, without needing to attempt to parse
the "trap -p" output.
Also enhance some of the DEBUG mode trace output (nothing visible
in a normal shell build).
A couple of very minor code changes that no-one should ever notice
(eg: one less wait() call in the case that there is nothing pending).
functions being defined (they can still be included if quoted).
If we parsed the way POSIX specifies (leaving the exact input text of
$ and ` expansions unaltered, until required to be expanded) this would
not be needed, as the name of a function being defined does not undergo
parameter, command, or arith expansions, so xxx$3() { : ; } would just
work. But for many reasons we don't do that (and are unlikely to ever,
though maintaining both forms might be an option someday) - which led to
very obscure behaviour (if sh were compiled in DEBUG mode, even an abort())
and certainly nothing useful. So just prohibit these uses for now.
(A portable function name must be a "name" so this makes no difference
at all to posix compat applications/scripts).
A doc update is pending (the updated sh.1 also contains updates in other
areas not yet appropriate to commit).
When allocating for a Char **, it should use sizeof(Char *), not
sizeof(Char **). This doesn't actually affect the results except
on DS9000 though :-)
(part 2, the instance in this file was as far as I can tell
inexplicably missed by CVS on the first go...)
extra && or || or something ... forgotten now) as part of a failed attempt
to fix an earlier bug (later fixed a better way) - when the extra
test (never committed) was removed, the now-redundant parentheses got
forgotten...
NFC.
Fix a bug that has existed since the "command" command was added in
2003. "command foo" would cause the definition of a function "foo"
to be lost (not freed, simply discarded) if "foo" is (in addition to
being a function) a filesystem command. The case where "foo" is
a builtin was handled.
For now, when a function exists with the same name as a filesystem
command, the latter can never appear in the command hash table, and
when used (which can only be via "command foo", just "foo" finds
the function) will always result in a full PATH search.
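A sketch of the affected case (assuming an external "foo" also exists
somewhere in PATH):
foo() { echo 'the function'; }
foo              # runs the function, as always
command foo      # runs the filesystem foo, now via a full PATH search
Previously that "command foo" could silently discard the function
definition; now the function survives.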
XXX pullup everything (from NetBSD 2.0 onwards). (really -8 and -9)
It is not enough to avoid displaying the contents of the directory,
we need to set FTS_SKIP to avoid descending into any subdirs too.
Otherwise, if a ".foo" directory has a subdirectory "bar", ls will
descend into bar and display its contents. From Todd Miller
`mv -h source target' just issues rename(source, target) without
discriminating on whether target resolves to a directory; this way
you can atomically replace a symlink to a directory.
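The intended use is the usual atomic symlink switch idiom (names here are
purely illustrative):
ln -s /data/release-2 current.new
mv -h current.new current     # a single rename(2): "current" now points at /data/release-2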
would lead to a desynchronization of the protocol and further files or
directories to be ignored or corrupted.
Reported by Daniel Goujot, Georges-Axel Jaloyan, Ryan Lahfa, and David Naccache.
The other BSDs all have a note reminding that many shells have their
own internal echo implementations which may vary from this utility, so
add one. (Much of the wording is borrowed from FreeBSD's man page.)
(The other BSDs also have notes about the -n option not really being
portable, and printf(1) being preferable, we might want to add
something about that, too.)
the output will not be further processed (at all) so there is no need
to escape magic chars in the output, and doing so leaves stray CTLESC
chars in the here doc text. Not good. So don't do that...
To save a strlen() of the result, to determine the size of the here doc,
make rmescapes() return the length of the resulting string (this isn't
needed for other uses, so didn't happen previously).
Reported on current-users@ (2020-02-06) by Jun Ebihara
XXX pullup -9
children happens to exit while we are waiting for another child
to exit.
This can happen with code like
sh -c '
sleep 5 &
exec sh -c "sleep 10 & wait $!"
'
when the inner "sh" is waiting for the 10 second sleep to be
done, the 5 second sleep started earlier terminates. It is
a child of our process, as the inner shell is the same process
as the outer one, but not a known child (the inner shell has no
idea what the outer one did before it started).
This was observed in the wild by Martijn Dekker (where the outer
shell was bash but that's irrelevant).
XXX pullup -9
that we can always wait(2) for our children, and an ignored SIGCHLD
prevents that. Recent versions of bash can be convinced (due to a
bug most likely) to invoke us that way. Always return SIGCHLD to
SIG_DFL during init - we already prevent scripts from fiddling it.
All ash derived shells apparently have this problem (observed by
Martijn Dekker, and notified on the bash-bug list). Actual issue
diagnosed by Harald van Dijk (same list).
Mark the variable as volatile as it can be clobbered when a vfork occurs.
Error was reported when build.sh was run with MKLIBCSANITIZER=yes flag.
Reviewed by: kamil@
https://austingroupbugs.net/view.php?id=252
the Austin Group decided to require processing of "--" by the "."
and "exec" commands to solve a problem where some shells did
option processing for those commands (permitted) and others did
not (also permitted) which left no safe way to process a file
with a name beginning with "-".
This has finally made its way into what will be the next version of
the POSIX standard.
Since this shell did no option processing at all for those commands,
we need to update. This is that update.
The sole effect is that a "--" 'option' (to "." or "exec") is ignored.
This means that if you want to use "--" as the arg to one of those
commands, it needs to be given twice ". -- --". Apart from that there
should be no difference at all (though the "--" can now be used in other
situations, where we did not require it before, and still do not).
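For example (file names invented for illustration):
. -- -funcs.sh      # read a dot script whose name begins with '-'
. -- --             # while a file literally named "--" now needs the extra "--"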
process with redirects. If we use vfork() and a redirect hangs
(eg: opening a fifo) which the parent was intended to unhang,
then the parent never gets to continue to unhang the child.
eg: mkfifo f; cat <f & echo foo >f
The parent should not be waiting for a background process, even
just for its exec() to complete. if there are no redirects there
is (should be) nothing left that might be done that will cause any
noticeable delay, so vfork() should be safe in all other cases.
If a builtin command or function is the final command intended to be
executed, and is interrupted by a caught signal, the trap handler for
that signal was not executed - the shell simply exited (an exit trap
handler would still have been run - if there was one the handler
for the signal may have been invoked during the execution of the
exit trap handler, which, if it happened, is incorrect sequencing).
Now, if we're exiting, and there are pending signals, run their handlers
just before running the EXIT trap handler, if any.
There are almost certainly plenty more issues with traps that need
solving. Later.
XXX pullup -9
(-8 is too different in this area, and this problem suitably obscure,
that we won't bother) (the -7 sh is simply obsolete).
Having traps set should not enforce a fork for the next command,
whatever that command happens to be, only for commands which would
normally fork if they weren't the last command expected to be
executed (ie: builtins and functions shouldn't be executed in a
sub-shell merely because a trap is set).
As it was (for example)
trap 'whatever' SIGANY; wait $anypid
was guaranteed to fail the wait, as the subshell it was executed
in could not have any children.
XXX pullup -9
GCC_NO_FORMAT_TRUNCATION -Wno-format-truncation (GCC 7/8)
GCC_NO_STRINGOP_TRUNCATION -Wno-stringop-truncation (GCC 8)
GCC_NO_STRINGOP_OVERFLOW -Wno-stringop-overflow (GCC 8)
GCC_NO_CAST_FUNCTION_TYPE -Wno-cast-function-type (GCC 8)
use these to turn off warnings for most GCC-8 complaints. Many
of these are false positives; most of the real bugs are already
committed, or are yet to come.
we plan to introduce versions of (some?) of these that use the
"-Wno-error=" form, which still displays the warnings but does
not make it an error, and all of the above will be re-considered
as either being "fix me" (warning still displayed) or "warning
is wrong."
a bracket expression in a pattern (ie: [[:THISNAME:]]). Previously
the code used strspn() to look for invalid chars in the name, and
then memcpy(), now we do the test and copy a character at a time.
This might, or might not, be faster, but it now correctly handles
\ quoted characters in the name (' and " quoting were already
dealt with, \ was too in an earlier version, but when the \ handling
changes were made, this piece of code broke).
Not exactly a vital bug fix (who writes [[:\alpha:]] or similar?)
but it should work correctly regardless of how obscure the usage is.
Problem noted by Harald van Dijk
XXX pullup -9
our implementation was fine, but the restrict marker is problematic
as gcc 8 is now more strict about checking for restrict issues.
this is the only actual consumer of swab(3) in our tree, though,
besides the test for it. oh well.
The new member is called f_mntfromlabel and it is the dkw_wname
of the corresponding wedge. This is now used by df -W to display
the mountpoint name as NAME=
like the other tty ioctls that we only warn about). Do the main
ioctl (tcgetattr) first, since that provides a better error message
(ENOTTY instead of EINVAL).
The RSS related statistics are now back in the NetBSD kernel.
These values were disabled since day0 until today.
libkvm(3) users will still receive inappropriate values as RSS statistics
are updated upon sysctl(3) call.
Patch submitted by <Krzysztof Lasocki>
(inside a function or dot script) the exit status of that return
statement should become the exit status of the function (or dot
script) - we were ignoring it.
That is
fn() { while return 7; do return 9; done; return 11; }
should exit with status 7. It was exiting 0.
This is apparently another old ash bug that has been fixed
everywhere else in the past.
Issue pointed out by Martijn Dekker, (fairly obvious) fix borrowed
from FreeBSD, due for return sometime next century.
in the past, but managed to re-surface...
The expression "${0+\}}" should expand to "}" not "\}"
Almost all other shells handle it that way (incl FreeBSD & dash).
Issue pointed out by Martijn Dekker.
Add ATF sub-tests for the 4 old var expand operators (${var+word}
${var-word} ${var=word} and ${var?word} - including the forms
with the ':' included) and amongst those tests include test cases
for this issue, so if the bug tries to appear again, we can squash
it quicker. (The newer pattern matching operators are already
well tested as part of testing patterns.)
cleanups to the trap code. No longer silently ignore attempts to
do anything other than set SIGKILL or SIGSTOP to the default ('-')
state. Don't include those in trap or trap -p output (the former
because they cannot be other than in default state, so simply aren't
included, the latter because it is pointless) but do list them
when requested with trap -p SIG.
Interactive mode SIGINT traps are now run ASAP, rather than after
a command has been entered (so the sequence ^C \n is no longer needed
to generate one). Further, when trapped, in interactive mode,
while waiting for a user command, a SIGINT acts (aside from the
trap being run) just like when not trapped, aborts the command being
entered (rather than leaving it, which it did when libedit was in use)
prints a new prompt, and starts again (which is what should happen.)
Traps other than SIGINT (which has always been handled special in
interactive mode) are unaffected by this change, as are SIGINT traps
in non-interactive shells. Or that is the intent anyway.
Fix an in_dotrap ref count bug (was never being decremented... that
was inserted in a place never executed) (relatively harmless) and
add/improve some trap/signal related DEBUG mode tracing.
Update the description of the <& and >& redirection operators
(as indicated would happen in a message appended to the PR a week ago,
which received no opposition - no feedback).
Some rewriting of the section on redirects (including how the word
expansion of the "file" works) to make this simpler & more accurate.
Fix handling of "$@" (that is, double quoted dollar at), when it
appears in a string which will be subject to field splitting.
Eg:
${0+"$@" }
More common usages, like the simple "$@" or ${0+"$@"} end up
being entirely quoted, so no field splitting happens, and the
problem was avoided.
See the PR for more details.
This ends up making a bunch of old hack code (and some that was
relatively new) vanish - for now it is just #if 0'd or commented out.
Cleanups of that stuff will happen later.
That some of the worst $@ hacks are now gone does not mean that processing
of "$@" does not retain a very special place in every hacker's heart.
RIP extreme ugliness - long live the merely ordinary ugly.
Added a new bin/sh ATF test case to verify that all this remains fixed.
finding a job that had previously terminated.
Now in that case JOBWANTED is set on all jobs (since any will do)
which then simplifies a later test which no longer needs to special
case "wait -n". Further, we always look to see if any wanted
job has already terminated, even if there are still running jobs
we can wait upon - if anything is already ready, that's where we start
harvesting (and finish, if -n is specified).
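For example (a sketch):
sleep 5 &
sleep 1 &
wait -n        # returns, with that job's status, as soon as the first one finishes
wait           # then collect whatever is left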
Stamp out "greengrocers' apostrophes" in various places (arguably there
are still more present, but style guides vary on that, and my energies
spent corralling wayward punctuation marks could be spent elsewhere).
As a somewhat pedantic clarification, "-s" does not accept backslashes
as delimiters. (While here, also make the macro use of an expression
shared between pax.1 and tar.1 consistent.)
Note the "s" option has an "s" flag that "prevents substitutions from
being performed on symbolic link destinations". Carry over r. 1.25 from
christos@ and part of r. 1.26 from wiz@ from tar.1, since this
functionality is available in pax as well as tar.
integer (previously it was just clamped at the max possible value).
This would have caused
sleep 10000000000000000000
(or anything bigger) to have only actually slept for 9223372036854775807
secs. Someone would have noticed that happen, one day, in some other
universe.
This is now an error, as it was previously if this had been entered as
sleep 1e19
Also detect an attempt to sleep for so long that a time_t will no longer
be able to represent the current time when the sleep is done.
Undo the attempts to work around a broken kernel nanosleep()
implementation (by only ever issuing shortish sleep requests,
and looping). That code was broken (idiot botch of mine) though
you would have had to wait a month to observe it happen. I was going
to just fix it, but sanity prevailed, and the kernel got fixed instead.
That allows this to be much simplified, only looping as needed to
handle dealing with SIGINFO. Switch to using clock_nanosleep()
to implement the delay, as while our nanosleep() uses CLOCK_MONOTONIC
the standards say it should use CLOCK_REALTIME, and if that we
ever changed that, the old way would alter "sleep 5" from
"sleep for 5 seconds" to "sleep until now + 5 secs", which is
subtly different.
Always use %g format to print the original sleep duration in reports of how
much time remains - this works best for both long and short durations.
A couple of other minor (frill) mods to the SIGINFO report message as well.
with an IQ that underflows when one attempts to enter it as an
unnormalised 160 bit long long double...
Whoever would believe that (~0 & anything) was a meaningful thing
to write? And three times in one #define. That could not possibly
have been me, could it?
Simplify, simplify, simplify. NFC.
to allow bash to build fdflags on Solaris 10, here are some mods that
fix that, and some other similar issues in the NetBSD version of fdflags.
The bash implementation of fdflags is based upon the one Christos did for
the NetBSD sh, so the issues are similar ... the NetBSD sh cannot yet
(easily anyway) build on anything except NetBSD, so this change makes
no current difference at all (just adds some compile time tests (#ifdef)
which always work out the way things did before, when built on NetBSD).
However, there is no system on which any modern shell can hope to work
which does not support close on exec, or fcntl(F_SETFD,...) to set it.
The O_CLOEXEC and FD_CLOEXEC definitions might not exist, but close on
exec can still be manipulated. Since the primary rationale for
the fdflags builtin was to be able to manipulate that state bit from
scripts, it would be annoying to lose that one, and keep all the (less
important) others, just because O_CLOEXEC is not defined, so do the
fix (workaround) a different way than was done in the bash patch.
Further, more than fdflags() will fail if O_CLOEXEC is not defined,
so handle that as well.
Also fix another oddity ... (noticed by reading the code) - if
fcntl(F_GETFL,...) returned any bits set that we don't understand,
the code was supposed to simply print their values as a hex constant,
when fdflags is run with -v. However, the getflags() function was
clearing all bits that the code did not know about ... so there is
no way any unknown bit could ever make it out to be printed. Handle
that a different way - instead of clearing unknown bits, clear any
bits that get returned which we understand, but do not want to deal
with (stuff like O_WRONLY, which should not be returned from the
fcntl(), but who knows...) Leave any unknown bits that happen to be
set, set, so that printone() can display them if appropriate.
(This is most likely to happen when running an older shell on a new
kernel where the kernel supports some new flag that the shell has
not been taught to understand).
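For context, the kind of script level use the builtin exists for looks
roughly like this (fd and file invented; syntax as documented for the
sh fdflags builtin):
exec 3> /tmp/log
fdflags -s cloexec 3    # set close-on-exec on fd 3 from the script
fdflags -v 3            # verbose display; an unknown bit would now appear as a hex constant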
NFCI that anyone should notice anytime soon.
matches the internal CTL* chars.
The earlier fixes handled CTL* char values in var expansions,
but not in various other places they can occur (positional
parameters, $@ $* -- even potentially $0 and ~ expansions,
as well as byte strings generated from a \u in a $'' string).
These should all be correctly handled now. There is a new
ISCTL() macro to make the test, rather than using the old
BASESYNTAX[c]==CCTL form (which is still a viable alternative)
as the new way allows compiler optimisations, and less mem
references, so it should be smaller and faster.
Also, be sure in all cases to remove any CTLESC (or other)
CTL* chars from all strings before they are made available
for any external use (there was one case missed - which didn't
matter when we weren't bothering to escape the CTL* chars at
all.)
XXX pullup-8 (will need to be via a patch) along with the Feb 4 fixes.
tree, don't display a CTLESC which is there only to protect a CTL*
char (a data char that happens to have the same value). No actual
CTL* chars are printed as data, so no escaping is needed to protect
data which just happens to look the same. Dropping this avoids the
possibility of confusion/ambiguity in what the word actually contains.
NFC for any normal shell build (very little of this file gets compiled there)
anyway) on tech-userlevel with no adverse response.
This allows the magic of vars like HOSTNAME SECONDS, ToD (etc) to be
restored should it be lost - perhaps by having a var of the same name
imported from the environment (which needs to remove the magic in case
a set of scripts are using the env to pass data, and the var name chosen
happens to be one of our magic ones).
No change to SMALL shells (or smaller) - none of the magic vars (except
LINENO, which is exempt from all of this) exist in those, hence such a
shell has no need for this command either.
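If, as it appears, this refers to the sh "specialvar" builtin (an assumption,
the entry does not name the command), its use looks like:
HOSTNAME=imported          # an ordinary value, perhaps from the environment: magic lost
specialvar HOSTNAME        # restore the variable's special behaviour
echo "$HOSTNAME"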
redirect operator is within range of what the code tree node can
hold. Currently this is a no-op change (the new error can never
occur) as the code already checks that N is in range for an int
(and errors if not) and the field in the node in which we store N
is also an int, so we cannot overflow - but fd's do not really need
to be that big (the max a typical kernel supports is < 10000) so
this just adds validation in case it ever happens that we decide we
can save some node size (ie: sh memory) by making that field smaller.
Note this is parse time error detection, and has no bearing upon
the execution time error that will occur if a script attempts to use
an fd that exceeds the process's max fd limit.
NFCI (for now anyway.)
a value. There are none which do that at the minute, so this is a NFCI
change, which is just making the code correct even though nothing
currently triggers any bugs.
%x commands) generate the most useful error message (from errno value)
rather than whichever happened last.
In posix mode, cause the "jobs" command to delete records of completed
jobs it reports on (as posix requires) as is done in interactive shells.
We don't (won't) do this in !posix mode, as the ability to throw in a
"jobs" command in a script to debug what is happening is too useful to
lose -- and any script that is relying on "jobs" instead of "wait" to
cleanup background processes (from the sh jobs table, sh always collects
zombies from the kernel) is absurd and not worth considering (besides
which I've never seen one).
No visible differences expected - there is a remote chance that
some internal lossage may no longer occur in interactive shells
that receive SIGINT (untrapped) at inopportune times, but you would
have had to have been very unlucky to have ever suffered from that.
Suppress shell error messages while expanding $ENV (which also causes
errors while expanding $PS1 $PS2 and $PS4 to be suppressed as well).
This allows any random garbage that happens to be in ENV to not
cause noise when the shell starts (which is effectively all it did).
On a parse error (for any of those vars) we also use "" as the result,
which will be a null prompt, and avoid attempting to open any file for ENV.
This does not in any way change what happens for a correctly parsed command
substitution (either when it is executed when permitted for one of the
prompts, or when it is not (which is always for ENV)) and commands run
from those can still produce error output (but shell errors remain suppressed).
fixes) where a variable containing a CTL char (the only possibility used
to be CTLESC (0x81)) would lose that character if the variable was expanded
when "set -f" (noglob) was in effect.
1.128 made this worse by adding more 0x8z values (a couple more) which would
see the same behaviour, and one of those was noticed by Martijn Dekker.
The reasoning was that when noglob is on, when a var is expanded, there are
no magic chars, so (apparently) no need to escape anything. Hence nothing
was escaped .. including any CTL chars that happened to be present. When
we later rmescapes() the CTL chars that we expect might occur are summarily
removed - even if they weren't really CTL chars, but just data masquerading.
We must *always* escape any CTL char clones that are in the var value,
no matter what other conditions apply, and what we expect to happen next.
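A hedged reproduction of the symptom (0x81 being the CTLESC value mentioned
above):
v=$(printf '\201')              # a data byte that happens to match CTLESC
set -f
printf %s "$v" | od -An -tx1    # before the fix the 0x81 byte was silently dropped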
While here, fix rmescapes() (and its $(()) clone, rmescapes_nl()) to
be more robust, less likely to forget to delete anything (which was
not the issue here, just the reverse) and in a DEBUG shell, have the
shell abort() if it encounters something in rmescapes() it is not
anticipating, so the code can be made to handle it, or if it should
not happen, we can find out why it did.
XXX pullup -8 (but will need to be via patch, code is quite different).
After all, a system might want to sleep for several
thousand years on a spaceship headed to a distant
solar system...
So, remove the pause() code, deal with limits on the
range (it is just an int) that can be passed to sleep()
by looping, and do a much better job of checking for
out of range input values.
With this change sleep(1) should work for durations
up to something more than 250 billion years. It
fails (at startup, with an error) if the requested
duration is beyond what can be handled.
Here no changes at all related to locales and arg
parsing. Still for another day.
integer" case, so we avoid adjusting the locale of sleep,
and generally be more reliable and simpler.
In addition, deal with weirdness in nanosleep() when the
interval gets long, by avoiding using it. In this version
when the sleep interval < 10000 seconds, we use nanosleep()
as before, for delays longer than that we use sleep() instead,
and ignore any fractional seconds.
We avoid overflow problems here by not bothering to sleep
at all for delays longer than 135 years (approx) and simply
pause() instead. That the sleep never terminates in such a
case is unlikely to ever be observed.
This commit makes no decision on the question of whether
the arg should be interpreted in the locale of the user,
or always in the C locale. That is for another day.
src/tests/bin/sh/t_here.sh
The "magicq" magic was all wrong - it cannot be simply a parameter
to readtoken1() as its value needs to alter during that routine
(eg: when magicq is set - processing here doc text, or whatever)
and we encountered ${var%pattern} "magicq" needs to be off for
"pattern" - and it wasn't.
To handle this magicq needs to be included in the token stack struct,
and simply init'd from the arg to readtoken1 (which we rename).
Then it can be manipulated as required.
Once we no longer have that problem, some other issues can be cleaned
up as well (some of this unbelievably fragile code was attempting to
cope with this in various ad-hoc - and mostly broken - ways).
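A small illustration of the kind of construct involved (the real failing
cases are in the new t_here.sh tests):
x=file.tar.gz
cat <<EOF
${x%.*}
EOF
This should print "file.tar"; the pattern after '%' has to be parsed as a
pattern, not as more literal here document text.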
Also, remove the magicq parameter from parsebackq() - it was not
used (at all) and should never be, a command substitution, wherever
it appears, always starts a new parsing context. How that applies
to old style command substitutions is less clear, but until we see
some real examples where we're not doing the right thing (slightly
less likely now than before ... nothing has changed here in the
way command substitutions are parsed, but quoting in general is
slightly better) I don't plan on worrying about it.
There are a couple of other minor cleanups, which make no actual
difference (like adding () around the use of the parameter in the
RETURN macro ... which is generally better, but makes no difference
here as the param is always a simple constant).
All the current ATF tests pass.
Add tracing of lexical analyser operations. This is deliberately
kept out of the normal "all on" set as it makes a *lot* of noise
when enabled (especially in verbose mode) - but when needed, it
helps (evidence for which is coming soon).
As usual, no doc, you need the sources (and of course, a specially
built sh to even be able to enable it.)
Add an error DEBUG trace in exraise() (when the shell has detected
some error or signal, and is aborting what it is doing)
Fix an arith error in DEBUG bit assignments (harmless as we haven't
reached the limit of flags yet), and add some missing (recently added)
debug flags so they are turned on when the user (ie: me) asks for
"everything".
(PS1 etc) which, if the shell were already exiting, and a prompt
were to be expanded (which only really happens if -x is enabled,
and an exit trap is set, so the commands in the trap need PS4
expanded and written, last thing, before the shell exits) the shell
would instead simply exit when it finished expanding PS4 (before
even writing it, or the xtrace output).
There were more conditions required to set up the environment for
this to actually occur (it seems to only happen when the exit trap
is set in a function, called in a command substitution) but that's
unimportant, the code was nonsense.
Problem noticed by Martijn Dekker.
XXX pullup -8
(which means this is the very first execution in a new subshell)
clear the traps completely, unless the command is "trap". We were
allowing any special builtin, which was probably harmless, but not
intended.
Also (though not required) permit "command trap" and "eval trap"
and combinations thereof, because they might be useful, and there is
no particular reason why not. This is all a part of making t=$(trap)
work as POSIX requires, but almost nothing beyond that. The "trap"
command must be alone (modulo eval and command) in the subshell for
the exception to apply, no t=$(trap; echo) or anything like that.
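That is:
trap 'echo interrupted' INT
t=$(trap)             # the special case: reports the parent shell's traps
t=$(trap; echo hi)    # not the special case: the subshell's traps are cleared first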
Martijn Dekker asked for "command trap" to work (no idea why though,
it converts "trap" from being a special builtin, to a normal one,
which means an error won't cause the shell to exit ... if there's
an error, the "trap" command won't do anything useful, and as we
permit no more commands (for this special treatment) the shell is
going to exit anyway, this difference is not really significant).
RANDOM initialisation failed
when the shell might print after RANDOM has been reseeded
(which includes at sh startup) the next time RANDOM is accessed.
It indicates that /dev/urandom was not available or did not
provide data - in that case, sh uses a (weak) seed made out of
the pid and time (but otherwise nothing else changes).
one in the "safe" way (it was ensuring the buffer always ended in 2 \0
characters ... one is enough.) This could affect the expansions of
LINENO RANDOM and SECONDS, though only if they have at least 8 digits
(and then, only sometimes). RANDOM thus is safe, as it never produces
a number with more than 5 digits, you'd need a script with 10000000
lines before there might be an issue with LINENO (and even autoconf
generated scripts don't generally get that big) and a shell would need
to be running for almost 4 months for SECONDS to climb that high.
Nevertheless: XXX pullup -8.
includes typing ^D) make sure LINENO is set to indicate the last
(actually one past last) line in the input file, rather than
whatever it was set to by the last command that was actually
executed (which could be some line in a function defined in
some other file).
No effect on exit via an explicit exit command - that would already
set the line number correctly.
what the current locale's radix character happens to be,
while still allowing locale specific entry of fractional
seconds (ie: if you're in locale where the radix character
is ',' you can use "sleep 2.5" or "sleep 2,5" and they
accomplish the same thing).
This avoids issues with the "sleep 0.05" in rc.subr which
generated usage messages when a locale that does not use
'.' as its radix character was in use.
Reported on netbsd-users by Dima Veselov, with the problem
diagnosed by Martin Husemann
While here, tighten the arg validity checking (3+4 is
no longer permitted as a synonym of 3) and allow 0.0
to mean the same thing as 0 rather than being an error.
Also, make the SIGINFO reports a little nicer (IMO).
The ATF tests for sleep all pass (not that that means a lot).
will necessarily contain. Allow defined nodes to use any
intN_t or uintN_t (as well as plain old int) data types
in fields (along with the others that are permitted).
Note: this script is a part of the build procedure for /bin/sh,
the modified version generates the exact same output files
(for the unaltered input specifications) as the previous one
did, hence no visible change is expected (or even possible).
While there is a tiny chance that some host shell will fail
to be able to run this script while building, the script still
uses nothing even slightly exotic, and is much more conservative
than other scripts used during the build process, so there should
be no issues there either.
that when traps are marked as invalid, we never use them
for anything except output from the trap command.
Fixes issues where sub-shells of shells which use traps
(eg: to trap SIGPIPE) can end up looping forever if the
signal occurs in a sub-shell (where the trap is supposed
to be reset to its default). Reported, and mostly
analyzed by Martijn Dekker.
that is #if 0'd and (still) has never been compiled (most likely
never will be.)
While here, in the same uncompiled code, deal with line number
counting. Whether this is correct depends upon how this code
is used, and as it never is (and never has been since line numbers
first started being counted), this is somewhat speculative, but
it seems likely to be the correct way to handle things.
NFC (this code is still all #if 0).
should be looked up as a potential following alias - if the first
expands to a string that ends with a space (any space, quoted or
not) then the next word is to be treated as an alias candidate.
(POSIX was to specify only unquoted spaces, but is now going to
leave that unspecified, and the "any space" version turns out to
be more useful.)
And besides, the quoteflag test didn't work properly, and would
have been very messy to fix ... if in a word (as if we have a
quoted space) it means that the word has been quoted, which meant
that quoted spaces were correctly detected, but if outside a word,
it just means that the previous word was quoted, so it would sometimes
reject alias lookup on the next word in cases where it unquestionably
should have been done.
so the fake char returned by the latter when an alias ends (which
is there so we can correctly avoid alias recursion) is correctly
ignored where it is not wanted.
aliases, to actually do what it was supposed to do, and not just
come close by accident. (How broken this was, while still seeming
to work perfectly most of the time was truly amazing!)
This corrects the behaviour of an alias defined with a blank char
as the last of its value, to correctly do an alias lookup on the
word that follows the alias.
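The classic use of that trailing blank (assuming a sudo command exists):
alias sudo='sudo '    # trailing space: the word after "sudo" is also alias checked
alias ll='ls -l'
sudo ll               # now expands to: sudo ls -l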
cope with the changes made in the previous revision, in an
attempt to avoid bit rot.
Untested (uncompiled) - though it should work.
NFC: this change doesn't get compiled, let alone used.
(a normal builtin, though those are not generally an issue for
this problem, or a special builtin that has been prefixed by "command")
make sure that we discard any pending input that might have been
queued up, but not yet processed.
We had the mechanism to fix this from when expansion of PS1 etc
was added (which has a similar problem to deal with) - all taken
from FreeBSD - but did not bother to use it here until now...
This fixes an error detected by newly added ATF tests of the eval
builtin, where
eval 'syntax error
another command'
would go ahead and evaluate "another command" which should not
happen (note: only when there was a \n between the two).
Standard C says that free should be a no-op for a NULL pointer, so
we don't need an extra function to do this.
While here, add an XXX about a wrong sounding comment
kill and sh were merged so that the shell (for trap -l) and
kill (for kill -l) can use the same routine, and site that function
in the shell, rather than in kill (use the code that is in kill as
the basis for that routine). This allows access to sh internals,
and in particular to the posix option, so the builtin kill can
operate in posix mode where the standard requires just a single
character (space or newline) between successive signal names (and
we prefer nicely aligned columns instead).
In a SMALL shell, use the ancient sh printsignals routine instead,
it is smaller (and very much dumber).
/bin/kill still uses the routine that is in its source, and is
not posix compliant. A task for some other day...
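For example (a sketch):
kill -l               # builtin kill: signal names in aligned columns
set -o posix
kill -l               # now a plain space/newline separated list, as POSIX requires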
The SYNOPSIS for "readonly -q" cannot have the -q be
optional ... Also harmonise the output appearance with
that of the export command.
wiz: have at it...
readonly -q VAR...
readonly -p VAR...
export -q [-x] VAR...
export -p [-x] VAR...
all available only in !SMALL shells - and while here, limit
"export -x" to full sized shells as well.
Also, do a better job of arg checking and validating of the
export and readonly commands (which is really just one built-in)
and of issuing error messages when something bogus is detected.
Since these commands are special builtin commands, any error
causes shell exit (for non-interactive shells).
var=foo; readonly var=new
now fails.
If var was already set, an attempt to make it readonly, and assign it
a new value at the same time, failed - the readonly flag was set too soon.
Pointed out by Martijn Dekker (thanks).
Also, while here, add a couple of comments.
Implement parameter and arithmetic expansion of $ENV
before using it as the name of a file from which to
read startup commands for the shell. This continues
to happen for all interactive shells, and non-interactive
shells for which the posix option is not set (-o posix).
On any actual error, or if an attempt is made to use
command substitution, then the value of ENV is used
unchanged as the file name.
The expansion complies with POSIX XCU 2.5.3, though that
only requires parameter expansion - arithmetic expansion
is an extension (but for us, it is much easier to do, than
not to do, and it allows some weird stuff, if you're so
inclined....) Note that there is no ~ expansion (use $HOME).
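For example:
ENV='${HOME}/.shrc' sh -i        # parameter expansion is applied before the file is opened
ENV='$(hostname).shrc' sh -i     # command substitution: the value is used literally as the name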
update to mkinit.sh (to 1.10).
Or more correctly, revert & fix - turns out that there was an off by one
(failure to adjust for other changes -- in a value printed by debug mode
trace output).
NFC.
uses printf for simplicity).
This script runs using the build host's shell, and echo, and so must
deal with all of the absurdity that different versions of echo dump
upon us.
This is the underlying cause of the linux build failure that gson@ reported.
some versions of linux) - DEBUG mode change: Delete a (relatively new)
trace point (temporarily anyway) which mkinit (a script run using the
host's /bin/sh) apparently cannot handle correctly on (some release of)
linux (it is fine with the NetBSD shell).
I don't know which linux version has a shell with this problem
(or whether it is a mkinit issue that only works by fluke on NetBSD)
Problem reported by gson@
were added to sh ... in other shells, setting such a variable
(for most of them) causes it to lose its special properties,
and act the same as any other variable. I had assumed that
was just implementor laziness... I was wrong.
From now on the NetBSD shell will act like the others, and if vars
like HOSTNAME (and SECONDS, etc) are used as variables in a script
or whatever, they will act just like normal variables (and unless
this happens when they have been made local, or as a variable-assignment
as a prefix to a command, the special properties they would have had
otherwise are lost for the remainder of the life of the (sub-)shell
in which the variables were set).
Importing a value from the environment counts as setting the
value for this purpose (so if HOSTNAME is set in the environment,
the value there will be the value $HOSTNAME expands to).
The two exceptions to this are LINENO and RANDOM. RANDOM
needs to be able to be set to (re-)set its seed. LINENO needs to
be able to be set (at least in the "local" command) to achieve
the desired functionality. It is unlikely that any (sane) script
is going to want to use those two as normal vars however.
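So, for example:
echo "$HOSTNAME"      # the magic value, the current host name
HOSTNAME=elsewhere
echo "$HOSTNAME"      # just "elsewhere"; the special behaviour is gone for this (sub-)shell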
While here, fix a minor bug in popping local vars (fn return) that need
to notify the shell of changes in value (like PATH).
Change sh(1) to reflect this alteration. Also add doc of the
(forgotten) magic var EUSER (which has been there since the others
were added), and add a few more vars (which are documented
in other places in sh(1) - like ENV) into the defined or used
variable list (as well as wherever else they appear).
XXX pullup -8
as traps that issue break/continue/return to cause the loop/function
executing when the trap occurred to break/continue/return, and
generating the correct exit code from the shell including when a
signal is caught, but the trap handler for it exits.
All that from FreeBSD.
Also make
T=$(trap)
work as it is supposed to (also trap -p).
For now this is handled by the same technique as $(jobs) - rather
than clearing the traps in subshells, just mark them invalid, and
then whenever they're invalid, clear them before executing anything
other than the special blessed "trap" command. Eventually we will
handle these using non-subshell command substitution instead (not
creating a subshell environ when the commands in a command-sub alter
nothing in the environment).
intended (and as documented) rather than how it has been behaving
(which was not very rational.) Since it is unlikely that anyone
is using this, the change should be mostly invisible.
While here, a couple of other minor cleanups:
. One call of geteuid() is enough in choose_ps1()
. Fix a typo in a comment
. Improve appearance (whitespace changes) in find_var()
to fix the (unusual) idiom "${1+$@}" (the quotes are part of it).
This seems to have broken about 5 or 6 years ago (somewhere
between -6 and -7), I believe.
Note this is not the same as "$@" and also not the same as ${1+"$@"}
(much more common idioms) which both worked.
Also attempt to deal with "" more correctly, especially when it
appears adjacent to "$@" (or one of the similar constructs.)
This stuff is still all as ugly and hackish (and fragile) as is
possible to imagine, but in an effort to allow some of the weirdness
to eventually go away, the parser output has been made more
regular and all quoted (parts of) words always now start with
CTLQUOTEMARK and end with CTLQUOTEEND regardless of where the
quotes appear.
This allows us to tell the difference between """$@" and "$@"
which was impossible before - yet they are required to generate
different output when there are no args (when "$@" simply vanishes).
Needless to say that change had ramifications all over the place.
To simplify any similar change in the future, there are some new
macros that can generally be used to detect the "noise" data when
processing words, rather than open coding that every time (which
meant that there would *always* be one which missed getting
updated...)
Several other bugs (of my making, and older ones) are also fixed.
The aim is that (aside from anything that is detecting the cases
that were broken before - which were all unlikely uses of sh
syntax) these changes should have no external visible impact.
Sure...
to have them, they should work as documented, not cause core
dumps, reference after free, incorrect replacements, failing
to implement alias after alias, ...
The big comment that ended:
This is a good idea ------- ***NOT***
and the hack it was describing are gone.
Note that most of this was from original CVS version 1.1
code (ie: came from the original import, even before 4.4-Lite
was merged. That is, May 1994. And no-one in 24.5 years
noticed (or at least complained about) all the bugs (or at
least, most of them)).
With these changes, aliases ought to work (if you can call it
that) as they are expected to by POSIX. Now if only we could
get POSIX to delete them (or make them optional)...
Changes partly inspired by similar changes made by FreeBSD,
(as was the previous change to alias.c, forgot ack in commit
log for that one, apologies) but done a little differently,
and perhaps with a slightly better outcome.
to the main handler, rather than wherever the parent shell would go.
nb: not needed for vfork(), after vfork() we never go that path - which
is good or we'd be corrupting the parent's handler.
This allows the child to always exit (when it should) rather than being
caught up doing something else (and while it would eventually exit, the
status would be incorrect in some cases).
One test is:
sh -c 'trap "(! :) && echo BUG || echo nobug" EXIT'
from Martijn Dekker
Fix from FreeBSD (missed earlier).
XXX - 2b part of the 48875 pullup to -8
since this code was first imported (May 1994) (ie: before 4.4-Lite)
There is (much) more coming soon (the big ugly comment is going away).
This one has been separated out, as it can easily cause sh
core dumps, so needs:
XXX pullup-8
(the other changes to aliases probably will not get that.)
what it actually does (makearg would have been an alternative).
While here, notice the one remaining place where it should have been
used, but was left open coded, and consume that one.
NFCI.
for the purposes of any redirects it might have -- ie: as posix
requires, make the redirects appear to have been executed in a subshell
environment, so if one fails, aside from a diagnostic msg, all the
running script sees is a command that failed ($? != 0), rather
than having the shell exit which used to happen (the empty command was
being treated as a special builtin).
Continue to treat the empty command as special for the purposes of
var assigns it might contain (those are not executed in a sub-shell
and persist) - an error there (eg: assigning to a readonly var) will
continue to cause the shell (non-interactive shell) to exit.
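That is, something like:
> /no/such/dir/file
echo "still running, status $?"    # the failed redirect no longer exits the shell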
This makes the NetBSD shell behave like all other (reasonably modern)
shells - fix method (not the implementation, details differ) taken from
FreeBSD who fixed this back in early 2010. Problem pointed out
in (non-list) mail by Martijn Dekker.
first implemented in response to PR bin/4966 (PR Feb 1998, fix Feb 1999).
The file named should not be truncated.
No other shell truncates the file (<> was added to FreeBSD sh in Oct 2000,
and did not include O_TRUNC) and POSIX certainly does not suggest that
should happen (just that the file is to be created if it does not exist.)
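For example:
printf data > file
: <> file          # open the file read-write, without truncating it
cat file           # "data" must still be there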
Bug pointed out in off-list e-mail by Martijn Dekker
"command" should not be executed. (The issue affects multi-line
eval strings only - ie: commands after the next \n are not skipped).
Bug noted by Martijn Dekker in off-list e-mail.
Fix from FreeBSD:
src/bin/sh/eval.c: Revision 272983 Sun Oct 12 13:12:06 2014 UTC by jilles
to hide meta-characters in the result when the expansion was
in (double) quotes, and so should not be further processed.
Most of this has been OK for a long while, but \ needs hiding
as well, which complicates things, as \ cannot simply be hidden
in the syntax tables as one of the group of random special characters.
This was fixed earlier for simple variable expansions, but
every variety has its own code path ($var uses different code
than $n which is different than $(...), which is different
again from ~ expansions, and also from what $'...' produces).
This could be fixed by moving them all to a common code path,
but that's harder than it seems. The form in which the data
is made available differs, so one common routine would need
a whole bunch of different "get the next char or indicate end"
methods - probably via passing in an accessor function.
That's all a lot of churn, and would probably slow the shell.
Instead, just make macros for doing the standard tests, and
use those instead of open coding (differently) each time.
This way some of the code paths don't end up forgetting to
handle '\' (which is different than all the others).
This removes one optimisation ... when no escaping is needed
(like just $var (unquoted) where magic chars (think '*') in
the value are intended to remain magic), the code avoided doing
two tests for each char ("do we need escapes" and "is this char
one that needs escaping") by choosing two different syntax
tables (choice made outside the loop) - one of which never
returns the magic "needs escaping" result, and the other does
when appropriate, and then just avoiding the "do we need escapes"
test for each character processed. Then when '\' was fixed,
there needed to be another test for it, as it cannot (for other
reasons) be the same as all the others for which "this char
need escaping" is true. So that added a 2nd test for each char...
Not all the code paths were updated. Hence the bugs...
nb: this is all rarely seen in the wild, so it is no big
surprise that no-one ever noticed.
Now the "use two different syntax tables" is gone (the two
returned the same for '\' which is why '\' needed special
processing) - and in order to avoid two tests for each
char (plus the \ test) we duplicate the loops, one of which
tests each char to see if it needs an escape, the 2nd just
copies them. This should be faster in the "no escapes"
code path (though that is not the point) and perhaps also
in the "escapes needed" path (no indirect reference to
the syntax table - though that would probably be in a
register) but makes the code slightly bigger. For /bin/sh
the text segment (on amd64) has grown by 48 bytes. But
it still uses the same number of 512 byte pages (and hence
also any bigger page size). The resulting file size
(/bin/sh) is identical before and after. So is /rescue/sh
(or /rescue/anything-else).
prompts to exit when they're done, rather than forcing them to
turn into interactive shells and start reading input ...
Completes a part of the previous changes (just 10+ weeks late...)
Should fix the prompt expansion issue reported by Caóc on
current-users.
and one in (the included from bin/kill) kill.c and use just
the one in kill.c (which is amended slightly so it can work
the way that trap.c needs it to work). This one is chosen as
it was a much nicer implementation, and because while kill is
always built into the shell, kill also exists without the shell.
Leave the old implementation #if 0'd in trap.c (but updated to
match the calling convention of the one in kill.c) - for now.
Delete references of sys_signame[] from sh/trap.c and along with
that several uses of NSIG (unfortunately, there are still more)
and replace them with the newer libc functional interfaces.
definitions (just to avoid any temptation to ever use them again).
Update a comment which would make no sense without following the
preceding comment which is being deleted with the macros it describes.
While here, remove another comment that referred to events that have
long past as if they were still to come. Also a grammatical comment
correction - paragraphs start with capital letters...
NFC (even with DEBUG defined).
the ancient DEBUG TRACE() method, to the newer CTRACE() et al.)
that turns out never really needed committing - the mechanism, and
the code that obsoleted it, were committed together (May 2017).
[It was useful to me while getting to that state...]
NFC. Not even with DEBUG shells.
and use whatever works for the sh running this script. Previously
we were using the (broken, and incorrect) method that worked in
old broken NetBSD sh's (and some others) and not the method that
works with the current (fixed) /bin/sh and other correct shells
(like bash). (For an exotic reason, in the particular use case,
both methods work with ksh93, but it is also generally correct).
This hasn't really mattered, as the difference is only significant
(only causes actual issues - the build fails) when compiling with DEBUG
enabled, which is something that most sane humans would never do, if they
want to retain that sanity.
The problem was detected by Patrick Welche when looking for an
unrelated problem, which was once considered to be a possible sh
problem, but turned out to be something entirely different.
XXX pullup -8
(which has redirects, and so is included in -x output) use the -x/+x
setting that existed when the compound started, so if the state of
xtrace changes during the command we don't end up with just half of
the -x output (either the intro, or the conclusion, depending on
which way the change happened). [this also happens to avoid a core
dump in the previous code, but that could have been done other ways,
this way actually simplifies things (less code)]
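For example, with something like
	set -x
	{ set +x; echo hello; } > /dev/null
all of the -x output produced for the compound command (including for
its redirect) now uses the xtrace setting in effect when the compound
started, even though that setting is changed part way through.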
Make test(1) always use the POSIX "number of args" evaluation rules
when they apply.
Only fall back to the old expression evaluation when there are more
than 4 args, or when the args given cannot work as a test expression
using the POSIX rules. That is when the result is unspecified.
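For example, under those rules it is the number of args, not their
spelling, that selects the interpretation:
	x='-f' y='-f'
	test "$x" = "$y" && echo same	# 3 args: always a string comparison
	test -f && echo non-empty	# 1 arg: "-f" is just a non-empty string
	test "" || echo empty		# 1 arg: a null string, so false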
Also fix an old bug where a string of whitespace was considered to be a
valid number (at least one digit is needed amongst it somewhere...)
XXX pullup -8
the option when a pipeline is created that controls the way the exit
status of the pipeline is calculated. Previously it was the state of
the option when the exit status of the pipeline was collected.
This makes no difference at all for foreground pipelines (there is
no way to change the option between starting and completing the
pipeline) but it does for asynchronous (background) pipelines.
This was always the right way to implement it - it was originally
done the other way as I could not find any other shell implemented
this way - they all seemed to do it our previous way, and I could
not see a good reason to be the sole different shell.
However, now I know that ksh93 works as we will now work, and I
am told that if the option is added to the FreeBSD shell (apparently
the code exists, uncommitted) it will be the same.
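Assuming the option in question is the usual pipefail (the name is an
assumption here), the difference can be seen with something like:
	set -o pipefail
	(exit 1) | cat &	# background pipeline created with the option set
	set +o pipefail		# changing the option now no longer matters ...
	wait $!
	echo $?			# ... should print 1 (previously 0, as the option
				# was off when the status was collected)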
Save more characters of command in non-interactive jobs, in case of
core dumps and similar (16 effective chars was a few too little).
Arrange for number to increase if command buffer size increases.
Add a paragraph (briefer than previously posted to mailing lists)
to explain that there is no guarantee that the results of a command
substitution will be available before all commands started by the
cmdsub have completed.
Include the original proposed text (much longer) as *roff comments, so
it will at least be available to those who browse the man page sources.
While here, clean up the existing text about command substitutions to
make it a little more accurate (and to advise against using the `` form).
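The behaviour being documented amounts to (roughly):
	start=$SECONDS
	x=$(sleep 3 & echo now)
	echo "$x after $((SECONDS - start)) seconds"
where there is no promise of "now after 0 seconds" - the substitution
may not complete until the background sleep (which shares the cmdsub's
stdout) has finished, so "now after 3 seconds" is just as conforming.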
Deal with the new shell internal exit reason EXEXIT in the case of
a shell which has vfork()'d. It takes a peculiar set of circumstances
to get into a situation where this is ever relevant, but it can be
done. See the PR for details.
we had incorrect usage of setstackmark()/popstackmark()
There was an ancient idiom (imported from CSRG in 1993) where code
can do:
	setstackmark(&smark);
	loop until whatever condition {
		/* do lots of code */
		popstackmark(&smark);
	}
	popstackmark(&smark);
The 1st (inner) popstackmark() resets the stack, conserving memory.
The 2nd one is needed just in case the "whatever condition" was never
true, and the first one was never executed.
This is (was) safe as all popstackmark() did was reset the stack.
That could be done over and over again with no harm.
That is, until 2000 when a fix from FreeBSD for another problem was
imported. That connected all the stack marks as a list (so they can be
located). That caused the problem, as the idiom was not changed: now
there is this list of marks, and popstackmark() was removing an entry from it.
It rarely (never?) caused any problems as the idiom was rarely used
(the shell used to do loops like above, mostly, without the inner
popstackmark()). Further, the stack mark list is only ever used when
a memory block is realloc'd.
That is, until last weekend - with the recent set of changes.
Part of that copied code from FreeBSD introduced the idiom above
into more functions - functions used much more, and with a greater
possibility of stack marks being set on blocks that are realloc'd
and so cause the problem. In the FreeBSD code, they changed the idiom,
and always do a setstackmark() immediately after the inner popstackmark().
But not for reasons related to a list of stack marks, as in the
intervening period, FreeBSD deleted that, but for another reason.
We do not have their issue, and I did not believe that their
updated idiom was needed (I did some analysis of exactly this issue -
just missed the important part!), and just continued using the old one.
Hence Patrick's core dump....
The solution used here is to split popstackmark() into 2 halves,
popstackmark() continues to do what it has (recently) done,
but is now implemented as a call of (a new func) rststackmark()
which does all the original work of popstackmark - but not removing
the entry from the stack mark list (which remains in popstackmark()).
Then in the idiom above, the inner popstackmark() turns into a call of
rststackmark() so the stack is reset, but the stack mark list is
unchanged. Tail recursion elimination makes this essentially free.
Import a whole set of tree evaluation enhancements from FreeBSD.
With these, before forking, the shell predicts (often) when all it will
have to do after forking (in the parent) is wait for the child and then
exit with the status from the child, and in such a case simply does not
fork, but rather allows the child to take over the parent's role.
This turns out to handle the particular test case from PR bin/48875 in
such a way that it works as hoped, rather than as it did (the delay there
was caused by an extra copy of the shell hanging around waiting for the
background child to complete ... and keeping the command substitution
stdout open, so the "real" parent had to wait in case more output appeared).
As part of doing this, redirection processing for compound commands gets
moved out of evalsubshell() and into a new evalredir(), which allows us
to properly handle errors occurring while performing those redirects,
and not mishandle (as in simply forget) fd's which had been moved out
of the way temporarily.
evaltree() has its degree of recursion reduced by making it loop to
handle the subsequent operation: that is instead of (for any binop
like ';' '&&' (etc)) where it used to
	evaltree(node->left);
	evaltree(node->right);
	return;
it now does (kind of)
	next = node;
	while ((node = next) != NULL) {
		next = NULL;
		if (node is a binary op) {
			evaltree(node->left);
			if appropriate	/* if && test for success, etc */
				next = node->right;
			continue;
		}
		/* similar for loops, etc */
	}
which can be a good saving, as while the left side (now) tends to be
(usually) a simple (or simpleish) command, the right side can be many
commands (in a command sequence like a; b; c; d; ... the node at the
top of the tree will now have "a" as its left node, and the tree for
b; c; d; ... as its right node - until now everything was evaluated
recursively so it made no difference, and the tree was constructed
the other way).
if/while/... statements are done similarly, recurse to evaluate the
condition, then if the (or one of the) body parts is to be evaluated,
set next to that, and loop (previously it recursed).
There is more to do in this area (particularly in the way that case
statements are processed - we can avoid recursion there as well) but
that can wait for another day.
While doing all of this we keep much better track of when the shell is
just going to exit once the current tree is evaluated (with a new
predicate at_eof() to tell us that we have, for sure, reached the end
of the input stream, that is, this shell will, for certain, not be reading
more command input) and use that info to avoid unneeded forks. For that
we also need another new predicate (have_traps()) to determine if there
are any caught traps which might occur - if there are, we need to remain
to (potentially) handle them, so these optimisations will not occur (to
make the issue in PR 48875 appear again, run the same code, but with a
trap set to execute some code when a signal (or EXIT) occurs - note that
the trap must be set in the appropriate level of sub-shell to have this
effect, any caught traps are cleared in a subshell whenever one is created).
There is still work to be done to handle traps properly, whatever
weirdness they do (some of which is related to some of this.)
These changes do not need man page updates, but 48875 does - an update
to sh.1 will be forthcoming once it is decided what it should say...
Once again, all the heavy lifting for this set of changes comes directly
(with thanks) from the FreeBSD shell.
XXX pullup-8 (but not very soon)
Revert the changes that were made 19 May 2016 (principally eval.c 1.125)
and the bug fixes in subsequent days (eval.c 1.126 and 1.127) and also
update some newer code that was added more recently which acted in
accordance with those changes (make that code be as it would have been
if the changes now being reverted had never been made).
While the changes made did solve the problem, in a sense, they were
never correct (see the PR for some discussion) and it had always been
intended that they be reverted. However, in practical sh code, no
issues were reported - until just recently - so nothing was done,
until now...
After this commit, the validate_fn_redirects test case of the sh ATF
test t_redir will fail. In particular, the subtest of that test
case which is described in the source (of the test) as:
This one is the real test for PR bin/48875
will fail.
Alternative changes, not to "fix" the problem in the PR, but to
often avoid it will be coming very soon - after which that ATF
test will succeed again.
XXX pullup-8
previous rev) the two values (node name, and node number) were
arbitrarily printed in different formats and orders (depending
upon my mood at the time I guess...) The new macros will standardise
that usage (in the debug output) once some use of them actually begins.
When the macros were added, I arbitrarily copied the format of one
use I was looking at at that instant (the one which inspired the change),
but after gazing at DEBUG mode output over the intervening time, I
have concluded that I did not pick the easiest to read/follow format.
So, even before they are used, change the style... Also, conform
to standard PRIxxxx macro style by omitting the leading '%'.
NFC (since they aren't used at all, anywhere, yet, not even the
possibility of anything changing!)
This generates nodenames.h which is a file that used to begin
#ifdef DEBUG
(line 1) and end with
#endif
(last line) with no intervening (matching) #else ... ie: for DEBUG use only.
That led to situations where non-debug code which would like to make use
of the info provided (if DEBUG was enabled) needed to add #ifdef DEBUG
at the point of use.
Avoid that by providing new macros that are always defined (DEBUG or not,
so now we have a #else) which allow code to be written to make use of
the extra DEBUG info, if it is available, or not, if not.
While here, add double-include protection on the generated .h file
(just being cautious - nothing is ever going to cause it to get
included anywhere twice - or it shouldn't) and add the traditional
comments on the #else and #endif stuff (which is also really useless
as no-one is really expected to ever read the generated file). Never mind.
Nothing yet (elsewhere in the sh source) uses the new macros, so there's
even less chance of this changing anything than there would otherwise be.
Fix "command not found" handling so that the error message
goes to stderr (after any redirections are applied).
More importantly, in
foo > /tmp/junk
/tmp/junk should be created, before any attempt is made
to execute (the assumed non-existing) "foo".
All this was always true for any (not found) command
containing a / in its name
foo/bar >/tmp/junk 2>>/tmp/errs
would have created /tmp/junk, then complained (in /tmp/errs)
about foo/bar not being found. Now that happens for ordinary
commands as well.
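That is, after this change, for an ordinary (slash free) command name:
	nosuchcmd > /tmp/junk 2> /tmp/errs
	test -f /tmp/junk && echo junk was created	# now succeeds
	cat /tmp/errs	# the "not found" diagnostic, written after the redirects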
The fix (which I found when I saw differences between our
code and FreeBSD's, where, for the benefit of PR 42184,
this has been fixed, sometime in the past 9 years) is
frighteningly simple. Simply do not short circuit execution
(or print any error) when the initial lookup fails to
find the command - it will fail anyway when we actually
try running it. The cost is a (seemingly unnecessary,
except that it really is) fork in this case.
This is what I had been planning, but I expected it would
be much more difficult than it turned out....
XXX pullup-8
1. Make command -pv (and -pV) work (which is not as easy as the PR
suggests it might be (the "check and cause error" was there because
it did not work, not in order to prevent it from working)).
2. Stop -v and -V being both used (that makes no sense).
3. Stop the "type" builtin inheriting the args (-pvV) that "command" has
(which it did, as when -v -or -V is used with command, it and type are
implemented using the same code).
4. make "command -v word" DTRT for sh keywords (was treating them as an error).
5. Require at least one arg for "command -[vV]" or "type" else usage & error.
Strictly this should also apply to "command" and "command -p" (no -v)
but that's handled elsewhere, so perhaps some other time. Perhaps
"command -v" (and -V) should be limited to 1 command name (where "type"
can have many) as in the POSIX definitions, but I don't think that matters.
6. With "command -V alias", (or "type alias" which is the same thing),
(but not "command -v alias") alter the output format, so we get
ll is an alias for: ls -al
instead of the old
ll is an alias for
ls -al
(and note there was a space, for some reason, after "for")
That is, unless the alias value contains any \n characters, in which
case (something approximating) the old multi-line format is retained.
Also note that if code wants to parse/use the value of an alias, it
should be using the output of "alias name", not command or type.
Note that none of the above affects "command [-p] cmd" (no -v or -V options)
only "command -[vV]" and "type".
Note also that the changes to eval.[ch] are merely to make syspath()
visible in exec.c rather than static in eval.c
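As a quick illustration of items 4 and 6 above:
	command -v while	# now prints the keyword name rather than an error
	alias ll='ls -al'
	command -V ll		# ll is an alias for: ls -al   (one line, unless
				# the alias value itself contains newlines)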
Attempt to correctly deal with \ (both when it is a literal,
in appropriate cases, and when it appears as CTLESC when it was
detected as a quoting character during parsing).
In a pattern, in sh, no quoted character can ever be anything other
than a literal character. This is quite different than regular
expressions, and even different than other uses of glob matching,
where shell quoting is not an issue.
In something like
ls ?\*.c
the ? is a meta-character, the * is a literal (it was quoted). This
is nothing new, sh has handled that properly for ever.
But the same happens with
VAR='?\*.c'
and
ls $VAR
which has not always been handled correctly. Of course, in
ls "$VAR"
nothing in VAR is a meta-character (the entire expansion is quoted)
so even the '\' must match literally (or more accurately, no matching
happens - VAR simply contains an "unusual" filename). But if it had
been
ls *"$VAR"
then we would be looking for filenames that end with the literal 5
characters that make up $VAR.
The same kinds of things are required of matching patterns in case
statements, and sub-strings with the % and # operators in variable
expansions.
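For example, using the same VAR as above, the intended behaviour in a
case statement is:
	VAR='?\*.c'
	case 'a*.c' in
	$VAR)	echo matched ;;		# taken: ? is magic, \* makes the * literal
	esac
	case 'abc.c' in
	$VAR)	echo matched ;;		# not taken: the * must match a literal '*'
	esac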
While here, the final remnant of the ancient !! pattern matching
hack has been removed (the code that actually implemented it was
long gone, but one small piece remained, not doing any real harm,
but potentially wasting time - if someone gave a pattern which would
once have invoked that hack.)
This is more or less the same patch as provided in the PR
(just 11 years later, so changed a bit) by woods@...
Since there is no known way to actually cause the reported crash,
we may never know if this change actually fixes anything. But
even if it doesn't it certainly cannot hurt.
There is a potential race which could possibly explain the issue
(see commentary in the PR) which is not easy to avoid - if that is
the actual cause, this should provide a defence, if not really a fix.
possibilities that we do not currently handle all that well.
This mostly means (for now) making sure that quoted pattern
magic characters (as well as quoted sh syntax magic chars)
are properly marked, so they remain known as being quoted,
and do not turn into pattern magic. Also, make sure that an
unquoted \ in a pattern always quotes whatever comes next
(which, unlike in regular expressions, includes inside []
matches).
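For example (the behaviour described above, not a specification):
	case '-' in
	[a\-c])	echo literal dash ;;	# \- is a literal '-', not an a-c range
	esac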
Mostly use number() (no longer implemented using atoi()) when an
unsigned integer is required, but use strtoXXX() when a conversion
is wanted, without the possibility of error (like setting OPTIND
and RANDOM). Always init OPTIND to 1 when sh starts (overriding
anything in environ.)
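For example, with the fixed sh:
	env OPTIND=42 sh -c 'echo $OPTIND'	# prints 1; the imported value is ignored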
that is longer than we can handle the same way we treat an unknown
class name (as a valid char class which contains nothing, so never
matches). Previously a "too long" class name invalidated the
class, so [:very-long-name:] would match any of '[' ':' 'v' ...
(note: "very-long-name" is not long enough to trigger this, but you
get the idea!)
However, the name itself has a restricted syntax ([[:***:]] is not a
character class, it is a match for one of a '[' ':' or '*', followed by
a ']') which we did not implement - check the syntax of the name before
treating it as a character class (but we do add '_' to alphanumerics
as legal class name characters).
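Putting those pieces together (the intended behaviour, per the above):
	case x in
	[[:alpha:]])	echo alphabetic ;;	# a real character class
	esac
	case y in
	[[:nosuchclass:]])	echo never ;;	# unknown (or over long) name: matches nothing
	*)	echo no match ;;
	esac
	case '*]' in
	[[:***:]])	echo matched ;;		# not a class: one of '[' ':' '*', then ']'
	esac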