1. A serious bug introduced 3 1/2 months ago (approx) (rev 1.116) which
broke all but the simple cases of ~ expansions is fixed (amazingly,
given the magnitude of this problem, no-one noticed!)
2. An ancient bug (probably from when ~ expansion was first addedin 1994, and
certainly is in NetBSD-6 vintage shells) where ${UnSeT:-~} (and similar)
does not expand the ~ is fixed (note that ${UnSeT:-~/} does expand,
this should give a clue to the cause of the problem.
3. A fix/change to make the effects of ~ expansions on ${UnSeT:=whatever}
identical to those in UnSeT=whatever In particular, with HOME=/foo
${UnSeT:=~:~} now assigns, and expands to, /foo:/foo rather than ~:~
just as VAR=~:~ assigns /foo:/foo to VAR. Note this is even after the
previous fix (ie: appending a '/' would not change the results here.)
It is hard to call this one a bug fix for certain (though I believe it is)
as many other shells also produce different results for the ${V:=...}
expansions than they do for V=... (though not all the same as we did).
POSIX is not clear about this, expanding ~ after : in VAR=whatever
assignments is clear, whether ${U:=whatever} assignments should be
treated the same way is not stated, one way or the other.
4. Change to make ':' terminate the user name in a ~ expansion in all cases,
not only in assignments. This makes sense, as ':' is one character that
cannot occur in user names, no matter how otherwise weird they become.
bash (incl in posix mode) ksh93 and bosh all act this way, whereas most
other shells (and POSIX) do not. Because this is clearly an extension
to POSIX, do this one only when not in posix mode (not set -o posix).
causes a core dump in some exotic circumstances (when restoring local
variables when a function returns). ("build.sh makewrapper" exposed it.)
This was introduced in 1.63 - not as part of the substance of that
change (addition) but as an unrelated "must be the right thing to do"
cleanup, which wasn't...
Implementation largely obtained from FreeBSD, with adaptations to meet the
needs and style of this sh, some updates to agree with the current POSIX spec,
and a few other minor changes.
The POSIX spec for this ( http://austingroupbugs.net/view.php?id=249 )
[see note 2809 for the current proposed text] is yet to be approved,
so might change. It currently leaves several aspects as unspecified,
this implementation handles those as:
Where more than 2 hex digits follow \x this implementation processes the
first two as hex, the following characters are processed as if the \x
sequence was not present. The value obtained from a \nnn octal sequence
is truncated to the low 8 bits (if a bigger value is written, eg: \456.)
Invalid escape sequences are errors. Invalid \u (or \U) code points are
errors if known to be invalid, otherwise can generate a '?' character.
Where any escape sequence generates nul ('\0') that char, and the rest of
the $'...' string is discarded, but anything remaining in the word is
processed, ie: aaa$'bbb\0ccc'ddd produces the same as aaa'bbb'ddd.
Differences from FreeBSD:
FreeBSD allows only exactly 4 or 8 hex digits for \u and \U (as does C,
but the current sh proposal differs.) reeBSD also continues consuming
as many hex digits as exist after \x (permitted by the spec, but insane),
and reject \u0000 as invalid). Some of this is possibly because that
their implementation is based upon an earlier proposal, perhaps note 590 -
though that has been updated several times.
Differences from the current POSIX proposal:
We currently always generate UTF-8 for the \u & \U escapes. We should
generate the equivalent character from the current locale's character set
(and UTF8 only if that is what the current locale uses.)
If anyone would like to correct that, go ahead.
We (and FreeBSD) generate (X & 0x1F) for \cX escapes where we should generate
the appropriate control character (SOH for \cA for example) with whatever
value that has in the current character set. Apart from EBCDIC, which
we do not support, I've never seen a case where they differ, so ...
Avoid mangling history when editing is enabled, and the prompt contains a \n
Also, allow empty input lines into history when they are being appended to
a previous (partial) command (but not when they would just make an empty entry).
For all the gory details, see the PR.
Note nothing here actually makes prompts containing \n work correctly
when editing is enabled, that's a libedit issue, which will be addressed
some other time.
Don't ignore unexpected reserved words after ';'
Don't allow any random token type as a case stmt pattern, only a word.
Those are ancient ash bugs and do not affect correct scripts.
Don't ignore redirects in a case stmt list where the list is nothing but
redirects (if the pattern matches, the redirects should be performed).
That was introduced when a redirect only case stmt list was allowed
(older shells had generated a syntax error.)
Random cleanups/refactoring taken from or inspired by the FreeBSD sh
parser ... use makename() consistently to create a NARG node - we
were using it in a couple of places but most NARG node creation was open
coded. Introduce consumetoken() (from FreeBSD) to handle the fairly
common case where exactly one token type must come next, and we need to
check that, and skip past it when found (or error) and linebreak() (new)
to handle places where optional \n's are permitted.
Both previously open coded.
Simplify list() by removing its second arg, which was only ever used when
handling the end of a `` (old style command substitution). Simply move
the code from inside list() to just after its call in the `` case (from
FreeBSD.)
(operators all come first, then TWORD, then keywords), and switch
from using TIF to define KWDOFFSET to using TWORD (the barrier,
rather than the token that happens to be first after it.)
to cause (when set, which it is not by default) the exit status of a
pipe to be 0 iff all commands in the pipe exited with status 0, and
otherwise, the status of the rightmost command to exit with a non-0
status.
In the doc, while describing this, also reword some of the text about
commands in general, how they are structured, and when they are executed.
by rudolf at eq.cz on tech-userlevel (July 15, 2017.)
Also correct a typo, de-correct some entirely proper English so
the doc remains written in American instead. And note that
interactive mode is set when stdin & stderr are terminals, not
stding and stdout.
Absent other information, the shell should be interactive if reading
from stdin, and stdin and stderr are ttys, not stdin and stdout.
So sayeth the great lord posix.
Silence nuisance testing environments - avoid << of a negative number
(a signed char -- in a hash function, the result is irrelevant, as long
as it is repeatable).
ALso, cause exec failures to always cause the shell to exit with
status 126 or 127, whatever the cause. 127 is intended for lookup
failures (and is used that way), 126 is used for anything else that
goes wrong (as in several other shells.) We no longer use 2 (more easily
confused with an exit status of the command exec'd) for shell exec failures.
the new format. Also #if 0 a function definition that is used nowhere.
While here, change the function of pushfile() slightly - it now sets
the buf pointer in the top (new) input descriptor to NULL, instead of
simply leaving it - code that needs a buffer always (before and after)
must malloc() one and assign it after the call. But code which does not
(which will be reading from a string or similar) now does not have to
explicitly set it to NULL (cleaner interface.) NFC intended (or observed.)
configure script, ie: $(( which is intended to be a sub-shell in a
command substitution, but is an arith subst instead, it needs to be
written $( ( to do as intended. Instead of just blindly carrying on to
find the missing )) somewhere, anywhere, give up as soon as we have seen
an unbalanced ')' that isn't immediately followed by another ')' which
in a valid arith subst it always would be.
While here, there has been a comment in the code for quite a while noting a
difference in the standard between the text descr & grammar when it comes to
the syntax of case statements. Add more comments to explain why parsing it
as we do is in fact definitely the correct way (ie: the grammar wins arguments
like this...).
in prompts when expanded at prompt time, but all available for general use.
Many of the new ones are not available in SMALL shells (they work as normal
if assigned, but the shell does not set or use them - and there is no magic
in a SMALL shell (usually for install media.))
parsing the way getopt(3) would, if only it could handle the (required)
-signumber and -signame options. This adds two "features" to kill,
-ssigname and -lstatus now work (ie: one word with all of the '-', the
option letter, and its value) and "--" also now works (kill -- -pid1 pid2
will not attempt to send the pid1 signal to pid2, but rather SIGTERM
to the pid1 process group and pid2). It is still the case that (apart
from --) at most 1 option is permitted (-l, -s, -signame, or -signumber.)
Note that we now have an ambiguity, -sname might mean "-s name" or
send the signal "sname" - if one of those turns out to be valid, that
will be accepted, otherwise the error message will indicate that "sname"
is not a valid signal name, not that "name" is not. Keeping the "-s"
and signal name as separate words avoids this issue.
Also caution: should someone be weird enough to define a new signal
name (as in the part after SIG) which is almost the same name as an
existing name that starts with 'S' by adding an extra 'S' prepended
(eg: adding a SIGSSYS) then the ambiguity problem becomes much worse.
In that case "kill -ssys" will be resolved in favour of the "-s"
flag being used (the more modern syntax) and would send a SIGSYS, rather
that a SIGSSYS. So don't do that.
While here, switch to using signalname(3) (bye bye NSIG, et. al.), add
some constipation, and show a little pride in formatting the signal names
for "kill -l" (and in the usage when appropriate -- same routine.) Respect
COLUMNS (POSIX XBD 8.3) as primary specification of the width (terminal width,
not number of columns to print) for kill -l, a very small value for COLUMNS
will cause kill -l output to list signals one per line, a very large
value will cause them all to be listed on one line.) (eg: "COLUMNS=1 kill -l")
TODO: the signal printing for "trap -l" and that for "kill -l"
should be switched to use a common routine (for the sh builtin versions.)
All changes of relevance here are to bin/kill - the (minor) changes to bin/sh
are only to properly expose the builtin version of getenv(3) so the builtin
version of kill can use it (ie: make its prototype available.)
caused by incorrect macro usage (ie: using the wrong one) which has
been in the sources since version 1.1 (ie: forever).
Like the previous (STACKSTRNUL) bug, the probability of this one
actually occurring has been infinitesimal but the LINENO code increases
that to infinitesimal and a smidgen... (or a few, depending upon usage).
Still, apparently that was enough, Kamil Rytarowski discovered that the
zsh configure script (damn competition!) managed to trigger this problem.
Two bugs here, one benign because of the way the script is used.
The other hidden by NetBSD's sort being stable, and the data not really
requiring sorting at all...
So as it happens these fixes change nothing, but they are needed anyway.
(The contents of the generated file are only used in DEBUG shells, so
this is really even less important than it seems.)
When processing a string (as in eval, trap, or sh -c) don't allow
trailing \n's to destroy the exit status of the last command executed.
That is:
sh -c 'false
'
echo $?
should produce 1, not 0.
(It was inheriting the value from end of profile file processing) - I didn't
notice before as I usually test with empty or no profile files to avoid
complications. Trivial change which should have very limited impact.
purpose) in exposing the bug in its implementation, go back to not using
it when not needed for DEBUG TRACE purposes. This change should have no
practical effect on either a DEBUG shell (where the STACKSTRNUL() calls
remain) or a non DEBUG shell where they are not needed.