Correct an issue found by Oguz <oguzismailuysal@gmail.com> and reported
in e-mail (on the bug-bash list initially!) with the code changed to deal
with PR bin/48875.
With:
	sh -c 'echo start at $SECONDS;
		(sleep 3 & (sleep 1& wait) );
		echo end at $SECONDS'
The shell should say "start at 0\nend at 1\n", but instead (before
this fix, in -9 and HEAD, but not -8) does "start at 0\nend at 3\n"
(Not in -8 as the 48875 changes were never pulled up).
There was an old problem, fixed years ago, which caused the same symptom,
related to the way the jobs table was cleared (or not) in subshells, and
it seemed like that might have resurfaced.
But not so, the issue here is the sub-shell elimination, which was part
of the 48875 "fix" (not really, it wasn't really a bug, just sub-optimal
and unexpected behaviour).
What the shell actually has been running in this case is:
	sh -c 'echo start at $SECONDS;
		(sleep 3 & sleep 1& wait );
		echo end at $SECONDS'
as the inner subshell was deemed unnecessary - all its parent would
do is wait for its exit status, and then exit with that status - we
may as well simply replace the current sub-shell with the new one,
let it do its thing, and we're done...
But not here, the running "sleep 3" will remain a child of that merged
sub-shell, and the "wait" will thus wait for it, along with the sleep 1
which is all it should be seeing.
For now, fix this by not eliminating a sub-shell if there are existing
unwaited upon children in the current one. It might be possible to
simply disregard the old child for the purposes of wait (and "jobs", etc,
all cmds which look at the jobs table) but the bookkeeping required to
make that work reliably is likely to take some time to get correct...
Along with this fix comes a fix to DEBUG mode shells, which, in situations
like this, could dump core in the debug code if the relevant tracing was
enabled, and add a new trace for when the jobs table is cleared (which was
added predating the discovery of the actual cause of this issue, but seems
worth keeping.) Neither of these changes has any effect on shells
compiled normally.
XXX pullup -9
Correctly handle (ie: ignore completely) \0 chars (nuls) in the
shell command input stream (script, dot file, or stdin).
Previously nul chars were ignored correctly in the line in which
they occurred, but would cause trailing chars of that line to reappear
as the start of the following line. If there was just one \0 skipped,
this would generally result in an extra \n in the sh input, which in
most cases has no effect. With multiple \0's in a single line, more
of the end of that line was duplicated into the following one. This
usually manifested as a weird "command not found" error.
Note that any \0 chars in the sh input make the script non-conforming,
so fixing this is not crucial (no \0's should really ever be seen) but
it was an obvious bug in the code, which was attempting to ignore nul
chars (as do many other shells), so let it be fixed.
XXX pullup -9
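A minimal sketch of the kind of input involved, assuming a POSIX printf, tr and mktemp are available. It writes a three-line script whose second line contains two \0 chars and feeds it to sh. With the fix, the three echo commands should simply run in order; what other shells print for the second line is not assumed here.

```shell
# Build a small script whose second line contains two NUL bytes.
script=$(mktemp)
printf 'echo first\necho sec\000\000ond\necho third\n' > "$script"

# Confirm the NULs really made it into the file.
nul_count=$(tr -cd '\000' < "$script" | wc -c | tr -d ' ')
echo "NUL bytes in script: $nul_count"

# Run it; the result depends on how the shell treats NULs in its input.
sh "$script" || echo "sh exited with status $?"

rm -f "$script"
```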
This fixes the MSAN detected reference to an uninitialised variable
(an uninitialised field in a struct) which happens when a command is
not found after a PATH search.
Aside from skipping, in some cases, exec*() calls that are known to be
going to fail, the setting of the relevant field is irrelevant,
so this problem makes no practical difference to the shell, or any
shell script.
XXX (maybe) pullup -9
the "local" built-in command description (pointed out by mrg@ via uwe@ in
private e-mail).
Add a description to the export command of why this quoting is required,
and then refer to it from local and readonly (explained in export as that
one comes first).
Note that some shells parse export/local/readonly (and often more) as
"declarative" commands, and this quoting isn't needed (provided the
command name is literal and not the result of an expansion) making
X=$Y type args not require quoting, as they often don't in a regular
variable assignment (preceding, or not part of, another command).
POSIX is going to allow, but not require, that behaviour. We do not
implement it.
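A short sketch of why the quoting matters when export is parsed as an ordinary command. The variable names are only illustrative, and "set --" stands in for any regular command so that the argument count can be inspected.

```shell
Y='one two three'

# What an ordinary command sees when the expansion is left unquoted:
# $Y is field split, so X=one, two and three arrive as separate arguments.
set -- X=$Y
unquoted_args=$#

# Quoted, the whole value stays in a single argument.
set -- X="$Y"
quoted_args=$#

# The portable way to write the export.
export X="$Y"

echo "unquoted: $unquoted_args arguments, quoted: $quoted_args argument"
echo "X=$X"
```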
instead of simply assuming that the pid of the first (leftmost) process
in a pipeline is the pgrp - someday we may switch things around and
create pipelines right to left instead, which has several advantages,
but which would invalidate the assumption which was being made here.
abs(pid)) and indicate that -- is (strictly) needed if the first pid arg
(there often is only one) is negative - though this implementation works
without it if a signal to send has been explicitly given. Whereas
"kill 1234" is valid (send SIGTERM to pid 1234), "kill -1234" will generate
a usage error from the attempt to send signal 1234 to nothing; to send
SIGTERM to pgrp 1234 it needs to be "kill -- -1234" (or "kill -s term -1234").
While here do a couple of markup improvements, and allow for the
possibility that users might be running the builtin kill from some
shell other than csh or sh.
That means we cannot use (pid_t)-1 as an error indicator, as that's a
valid pid to use (described as working in kill(1)) - yet it wasn't working
in /bin/kill (nor sh's builtin kill, which is essentially the same code).
This is even required to work by POSIX.
So change processnum() (the parser/validator for the pid args) to take
a pointer to a pid_t and return the pid that way, leaving the return value
of the (now int) function to indicate just ok/error. While here, fix
the validation a little ('' is no longer an accepted alias for 0) and in
case of an error from kill(2) have the error message indicate whether the
kill was targeted at a pid or a pgrp.
environment, rather than the nicer layout that is normally used.
Note this applies to /bin/kill only; the builtin kill in sh uses its
"posix" option for the same purpose, and the one in csh only ever uses
POSIX format.
Better describe the command search procedure.
Document "trap -P"
Describe what works as a function name.
More accurate description of reserved word recognition.
Be more accurate about when field splitting happens after
expansions (and in particular note that tilde expansions are
not subject to field splitting). Be clear that "$@" is
not field split, it simply produces multiple fields as part
of its expansion (hence IFS is irrelevant to this), but if
used as $@ (unquoted) each field produced is potentially subject
to field splitting. Other minor wording changes.
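The "$@" versus $@ distinction described above can be seen in a small sketch. The helper function count is hypothetical, and IFS is set to ":" only to make the splitting visible.

```shell
# Report how many fields an expansion produced.
count() { echo "$#"; }

set -- "a b" "c:d"      # two positional parameters
IFS=:                   # split on ':' only

quoted=$(count "$@")    # one field per parameter, IFS plays no part
unquoted=$(count $@)    # each field is then split on IFS: "c:d" -> "c" "d"

unset IFS               # back to default splitting

echo "quoted=$quoted unquoted=$unquoted"    # prints: quoted=2 unquoted=3
```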
traps_invalid (that is, when we actually nuke the parent shell's
caught traps in a subshell). This allows more reasonable use of
"trap -p" (and similar) in subshells than existed before (and in
particular, that command can be in a function now). There can also
be several related commands, like
	traps=$(trap -p INT; trap -p QUIT; trap -p HUP)
A side effect of all of this is that
	(eval "$(trap -p)"; ...)
now allows copying caught traps into a subshell environment, if desired.
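A sketch of saving and reinstalling a trap with the command substitution form described above. The fallback to plain "trap" is an addition made here for shells that do not understand -p; it is not part of this change.

```shell
trap 'echo caught' INT

# Capture the current INT trap as a command that can be replayed later.
saved=$(trap -p INT 2>/dev/null) || saved=
# Assumed fallback: plain "trap" lists all traps in the same reusable form.
[ -n "$saved" ] || saved=$(trap)

trap - INT          # reset INT to its default action
eval "$saved"       # reinstall the saved trap

restored=$(trap)    # list the traps now in effect
echo "$restored"

trap - INT          # leave the shell as it was found
```

The same eval step, run at the start of a subshell, is what copies the caught traps into that subshell environment.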
Also add the ksh93 variant (the one not picked by POSIX as it isn't
generally as useful) of "trap -p" (but call it "trap -P"), which extracts
just the trap action for named signals (giving more than one is usually
undesirable). This allows
	eval "$(trap -P INT)"
to run the action for SIGINT traps, without needing to attempt to parse
the "trap -p" output.
Also enhance some of the DEBUG mode trace output (nothing visible
in a normal shell build).
A couple of very minor code changes that no-one should ever notice
(eg: one less wait() call in the case that there is nothing pending).
functions being defined (they can still be included if quoted).
If we parsed the way POSIX specifies (leaving the exact input text of
$ and ` expansions unaltered, until required to be expanded) this would
not be needed, as the name of a function being defined does not undergo
parameter, command, or arith expansions, so xxx$3() { : ; } would just
work. But for many reasons we don't do that (and are unlikely to ever,
though maintaining both forms might be an option someday) - which led to
very obscure behaviour (if sh were compiled in DEBUG mode, even an abort())
and certainly nothing useful. So just prohibit these uses for now.
(A portable function name must be a "name" so this makes no difference
at all to posix compat applications/scripts).
A doc update is pending (the updated sh.1 also contains updates in other
areas not yet appropriate to commit).
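For scripts that really do want a computed function name, one possible workaround (a sketch, not something this change introduces) is eval, which performs the expansion before the definition is parsed. The names greet_1, greet_2 and suffix are hypothetical.

```shell
# A portable function name is a plain "name": letters, digits, underscores.
greet_1() { echo "hello from greet_1"; }

# eval expands ${suffix} first, so the parser only ever sees the literal
# definition: greet_2() { echo "hello from greet_2"; }
suffix=2
eval "greet_${suffix}() { echo \"hello from greet_${suffix}\"; }"

first=$(greet_1)
second=$(greet_2)
echo "$first"
echo "$second"
```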
When allocating for a Char **, it should use sizeof(Char *), not
sizeof(Char **). This doesn't actually affect the results except
on DS9000 though :-)
(part 2, the instance in this file was as far as I can tell
inexplicably missed by CVS on the first go...)
extra && or || or something ... forgotten now) as part of a failed attempt
to fix an earlier bug (later fixed a better way) - when the extra
test (never committed) was removed, the now-redundant parentheses got
forgotten...
NFC.
Fix a bug that has existed since the "command" command was added in
2003. "command foo" would cause the definition of a function "foo"
to be lost (not freed, simply discarded) if "foo" is (in addition to
being a function) a filesystem command. The case where "foo" is
a builtin was handled.
For now, when a function exists with the same name as a filesystem
command, the latter can never appear in the command hash table, and
when used (which can only be via "command foo", just "foo" finds
the function) will always result in a full PATH search.
XXX pullup everything (from NetBSD 2.0 onwards). (really -8 and -9)
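A small sketch of the "command foo" case fixed above, using uname as the filesystem command (any utility found via PATH would do). After "command uname" has run in the current shell, the function of the same name should still be defined.

```shell
# A function that shadows a filesystem command of the same name.
uname() { echo "function uname"; }

before=$(uname)             # plain "uname" finds the function

# Bypass the function in the current shell and run the real utility.
command uname > /dev/null

after=$(uname)              # the function definition must have survived

echo "before=$before"
echo "after=$after"

unset -f uname              # remove the illustrative function again
```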
It is not enough to avoid displaying the contents of the directory,
we need to set FTS_SKIP to avoid descending into any subdirs too.
Otherwise, if a ".foo" directory has a subdirectory "bar", ls will
descend into bar and display its contents. From Todd Miller.
`mv -h source target' just issues rename(source, target) without
discriminating on whether target resolves to a directory; this way
you can atomically replace a symlink to a directory.