output to the stderr which existed when the -X option was (last) enabled.
It also enables tracing by enabling -x (and when reset, +X, also resets
the 'x' flag (+x)). Note that it is still -x/+x which actually
enables/disables the trace output. Hence "apparent variant" - what -X
actually does (aside from setting -x) is just to lock the trace output,
rather than having it follow wherever stderr is later redirected.
to I32 P64 systems - keep nextc first, as that's used in macros,
and nleft next, as that's used (and both are updated) in the same macro,
which is used frequently, this increases the chance they're in the
same cache line (unchanged from before). Beyond that it matters less,
so just shuffle a bit to avoid internal padding when pointers are 64 bits.
Note that there are just 3 of these structs (currently), even if there was
to be a memory saving (there probably won't be, trailing padding will eat it)
it would be of the order of 12 or 24 bytes total, so all this really
just panders to my sense of rightness....
Note to anyone who might be tempted, please don't update the struct
initializers to use newer C forms - eventually sh is planned to become
a host tool, and a separable package, so it wants to remain able to be
compiled using older (though at least ansi) compilers that implement only
older C variants.
output includes a single quote (') then see if using double-quotes
to quote it is reasonable (if no chars that are magic in " also appear).
If so, and if the string is not entirely the ' character, then
use " quoting. This avoids some ugly looking results (occasionally).
Also, fix a bug introduced about 20 months ago where null strings
in xtrace output are dropped, instead of made explicit ('').
To observe this, before you get the fix: set -x; echo '' (or similar.)
Move a comment from the wrong place to the right place.
the same order that option flags with a similar property are sorted.
This corresponds with the change made to the sort order of the short
names made in the previous update (1.4).
Right now, this change makes no difference at all, as there are no
long option names that differ only in char case (yet.)
Correct a (relatively harmless) use after free in prompt expansion
processing [detected by asan.]
Relatively harmless: as (while incorrect) the way the data is (was)
used more or less guaranteed that the buffer contents would be
unaltered until well after they are (were) no longer wanted (this
is the expanded prompt string, it is just output (or copied into
libedit internal storage) and forgotten.
This should make no visible difference to anyone (not using asan or
similar.)
XXX pullup -8
-p var: set var to identifier, from arg list, or PID if no job args)
of the job for which status is returned (becomes $? after wait.)
Note: var is unset if the status returned from wait came from wait
itself rather than from some job exiting (so it is now possible to
tell whether 127 means "no such job" or "job did exit(127)", and
whether $? > 128 means "wait was interrupted" or "job was killed
by a signal or did exit(>128)". ($? is too limited to to allow
indicating whether the job died with a signal, or exited with a
status such that it looks like it did...)
process group (-g), the process leader pid (-p) ($! if the job was &'d)
and the job identifier (-j) (the %n that refers to the job) in addition to
(default) the list of all pids in the job (which it has always done).
No change to the (single) "job" arg, which is a specifier of the job:
the process leader pid, or one of the % forms, and defaults to %% (aka %+).
(This is all now documented in sh(1))
Also document the jobs command properly (no change to the command, just
document what it actually is.)
And while here, a whole new section in sh(1) "Job Control". It probably
needs better wording, but this is (perhaps) better than the nothing that
was there before.
Don't delete jobs from the jobs table merely because they finished,
if they are not the job we are waiting upon. (bin/52640 part 1)
In a sub-shell environment, don't allow wait to find jobs from the
parent shell that had already exited (before the sub-shell was
created) and return status for them as if they are our children.
(bin/52640 part 2)
Don't have the "jobs" command also be an implicit "wait" command
in non-interactive shells. (bin/52641)
Use WCONTINUED (when it exists) so we can report on stopped jobs that
"mysteriously" move back to running state without the user issuing
a "bg" command (eg: kill -CONT <pid>) Previously they would keep
being reported as stopped until they exited.
When a job is detected as having changed status just as we're
issuing a "jobs" command (i.e.: the change occurred between the last
prompt and the jobs command being entered) don't report it twice,
once from the status change, and then again in the jobs command
output. Once is enough (keep the jobs output, suppress the other).
Apply some sanity to the way jobs_invalid is processed - ignore it
in getjob() instead of just ignoring it most of the time there, and
instead always check it before calling getjob() in situations where
we can handle only children of the current shell. This allows the
(totally broken) save/clear/restore of jobs_invalid in jobscmd() to
be done away with (previously an error while in the clear state would
have left jobs_invalid incorrectly cleared - shouldn't have mattered
since jobs_invalid => subshell => error causes exit, but better to be safe).
Add/improve the DEBUG more tracing.
XXX pullup -8
code duplication, and reducing the size of /bin/sh by a trivial amount.
NFCI.
This is being done now as there are two other changes forthcoming, both
of which benefit - one would result in even more code duplication without
this, the other might need to alter how this is done, and doing it after this
means there's just one place to change (if required).
1. A serious bug introduced 3 1/2 months ago (approx) (rev 1.116) which
broke all but the simple cases of ~ expansions is fixed (amazingly,
given the magnitude of this problem, no-one noticed!)
2. An ancient bug (probably from when ~ expansion was first addedin 1994, and
certainly is in NetBSD-6 vintage shells) where ${UnSeT:-~} (and similar)
does not expand the ~ is fixed (note that ${UnSeT:-~/} does expand,
this should give a clue to the cause of the problem.
3. A fix/change to make the effects of ~ expansions on ${UnSeT:=whatever}
identical to those in UnSeT=whatever In particular, with HOME=/foo
${UnSeT:=~:~} now assigns, and expands to, /foo:/foo rather than ~:~
just as VAR=~:~ assigns /foo:/foo to VAR. Note this is even after the
previous fix (ie: appending a '/' would not change the results here.)
It is hard to call this one a bug fix for certain (though I believe it is)
as many other shells also produce different results for the ${V:=...}
expansions than they do for V=... (though not all the same as we did).
POSIX is not clear about this, expanding ~ after : in VAR=whatever
assignments is clear, whether ${U:=whatever} assignments should be
treated the same way is not stated, one way or the other.
4. Change to make ':' terminate the user name in a ~ expansion in all cases,
not only in assignments. This makes sense, as ':' is one character that
cannot occur in user names, no matter how otherwise weird they become.
bash (incl in posix mode) ksh93 and bosh all act this way, whereas most
other shells (and POSIX) do not. Because this is clearly an extension
to POSIX, do this one only when not in posix mode (not set -o posix).
causes a core dump in some exotic circumstances (when restoring local
variables when a function returns). ("build.sh makewrapper" exposed it.)
This was introduced in 1.63 - not as part of the substance of that
change (addition) but as an unrelated "must be the right thing to do"
cleanup, which wasn't...
Implementation largely obtained from FreeBSD, with adaptations to meet the
needs and style of this sh, some updates to agree with the current POSIX spec,
and a few other minor changes.
The POSIX spec for this ( http://austingroupbugs.net/view.php?id=249 )
[see note 2809 for the current proposed text] is yet to be approved,
so might change. It currently leaves several aspects as unspecified,
this implementation handles those as:
Where more than 2 hex digits follow \x this implementation processes the
first two as hex, the following characters are processed as if the \x
sequence was not present. The value obtained from a \nnn octal sequence
is truncated to the low 8 bits (if a bigger value is written, eg: \456.)
Invalid escape sequences are errors. Invalid \u (or \U) code points are
errors if known to be invalid, otherwise can generate a '?' character.
Where any escape sequence generates nul ('\0') that char, and the rest of
the $'...' string is discarded, but anything remaining in the word is
processed, ie: aaa$'bbb\0ccc'ddd produces the same as aaa'bbb'ddd.
Differences from FreeBSD:
FreeBSD allows only exactly 4 or 8 hex digits for \u and \U (as does C,
but the current sh proposal differs.) reeBSD also continues consuming
as many hex digits as exist after \x (permitted by the spec, but insane),
and reject \u0000 as invalid). Some of this is possibly because that
their implementation is based upon an earlier proposal, perhaps note 590 -
though that has been updated several times.
Differences from the current POSIX proposal:
We currently always generate UTF-8 for the \u & \U escapes. We should
generate the equivalent character from the current locale's character set
(and UTF8 only if that is what the current locale uses.)
If anyone would like to correct that, go ahead.
We (and FreeBSD) generate (X & 0x1F) for \cX escapes where we should generate
the appropriate control character (SOH for \cA for example) with whatever
value that has in the current character set. Apart from EBCDIC, which
we do not support, I've never seen a case where they differ, so ...
Avoid mangling history when editing is enabled, and the prompt contains a \n
Also, allow empty input lines into history when they are being appended to
a previous (partial) command (but not when they would just make an empty entry).
For all the gory details, see the PR.
Note nothing here actually makes prompts containing \n work correctly
when editing is enabled, that's a libedit issue, which will be addressed
some other time.
Don't ignore unexpected reserved words after ';'
Don't allow any random token type as a case stmt pattern, only a word.
Those are ancient ash bugs and do not affect correct scripts.
Don't ignore redirects in a case stmt list where the list is nothing but
redirects (if the pattern matches, the redirects should be performed).
That was introduced when a redirect only case stmt list was allowed
(older shells had generated a syntax error.)
Random cleanups/refactoring taken from or inspired by the FreeBSD sh
parser ... use makename() consistently to create a NARG node - we
were using it in a couple of places but most NARG node creation was open
coded. Introduce consumetoken() (from FreeBSD) to handle the fairly
common case where exactly one token type must come next, and we need to
check that, and skip past it when found (or error) and linebreak() (new)
to handle places where optional \n's are permitted.
Both previously open coded.
Simplify list() by removing its second arg, which was only ever used when
handling the end of a `` (old style command substitution). Simply move
the code from inside list() to just after its call in the `` case (from
FreeBSD.)
(operators all come first, then TWORD, then keywords), and switch
from using TIF to define KWDOFFSET to using TWORD (the barrier,
rather than the token that happens to be first after it.)
to cause (when set, which it is not by default) the exit status of a
pipe to be 0 iff all commands in the pipe exited with status 0, and
otherwise, the status of the rightmost command to exit with a non-0
status.
In the doc, while describing this, also reword some of the text about
commands in general, how they are structured, and when they are executed.
by rudolf at eq.cz on tech-userlevel (July 15, 2017.)
Also correct a typo, de-correct some entirely proper English so
the doc remains written in American instead. And note that
interactive mode is set when stdin & stderr are terminals, not
stding and stdout.
Absent other information, the shell should be interactive if reading
from stdin, and stdin and stderr are ttys, not stdin and stdout.
So sayeth the great lord posix.
Silence nuisance testing environments - avoid << of a negative number
(a signed char -- in a hash function, the result is irrelevant, as long
as it is repeatable).
ALso, cause exec failures to always cause the shell to exit with
status 126 or 127, whatever the cause. 127 is intended for lookup
failures (and is used that way), 126 is used for anything else that
goes wrong (as in several other shells.) We no longer use 2 (more easily
confused with an exit status of the command exec'd) for shell exec failures.
the new format. Also #if 0 a function definition that is used nowhere.
While here, change the function of pushfile() slightly - it now sets
the buf pointer in the top (new) input descriptor to NULL, instead of
simply leaving it - code that needs a buffer always (before and after)
must malloc() one and assign it after the call. But code which does not
(which will be reading from a string or similar) now does not have to
explicitly set it to NULL (cleaner interface.) NFC intended (or observed.)
configure script, ie: $(( which is intended to be a sub-shell in a
command substitution, but is an arith subst instead, it needs to be
written $( ( to do as intended. Instead of just blindly carrying on to
find the missing )) somewhere, anywhere, give up as soon as we have seen
an unbalanced ')' that isn't immediately followed by another ')' which
in a valid arith subst it always would be.
While here, there has been a comment in the code for quite a while noting a
difference in the standard between the text descr & grammar when it comes to
the syntax of case statements. Add more comments to explain why parsing it
as we do is in fact definitely the correct way (ie: the grammar wins arguments
like this...).
in prompts when expanded at prompt time, but all available for general use.
Many of the new ones are not available in SMALL shells (they work as normal
if assigned, but the shell does not set or use them - and there is no magic
in a SMALL shell (usually for install media.))
parsing the way getopt(3) would, if only it could handle the (required)
-signumber and -signame options. This adds two "features" to kill,
-ssigname and -lstatus now work (ie: one word with all of the '-', the
option letter, and its value) and "--" also now works (kill -- -pid1 pid2
will not attempt to send the pid1 signal to pid2, but rather SIGTERM
to the pid1 process group and pid2). It is still the case that (apart
from --) at most 1 option is permitted (-l, -s, -signame, or -signumber.)
Note that we now have an ambiguity, -sname might mean "-s name" or
send the signal "sname" - if one of those turns out to be valid, that
will be accepted, otherwise the error message will indicate that "sname"
is not a valid signal name, not that "name" is not. Keeping the "-s"
and signal name as separate words avoids this issue.
Also caution: should someone be weird enough to define a new signal
name (as in the part after SIG) which is almost the same name as an
existing name that starts with 'S' by adding an extra 'S' prepended
(eg: adding a SIGSSYS) then the ambiguity problem becomes much worse.
In that case "kill -ssys" will be resolved in favour of the "-s"
flag being used (the more modern syntax) and would send a SIGSYS, rather
that a SIGSSYS. So don't do that.
While here, switch to using signalname(3) (bye bye NSIG, et. al.), add
some constipation, and show a little pride in formatting the signal names
for "kill -l" (and in the usage when appropriate -- same routine.) Respect
COLUMNS (POSIX XBD 8.3) as primary specification of the width (terminal width,
not number of columns to print) for kill -l, a very small value for COLUMNS
will cause kill -l output to list signals one per line, a very large
value will cause them all to be listed on one line.) (eg: "COLUMNS=1 kill -l")
TODO: the signal printing for "trap -l" and that for "kill -l"
should be switched to use a common routine (for the sh builtin versions.)
All changes of relevance here are to bin/kill - the (minor) changes to bin/sh
are only to properly expose the builtin version of getenv(3) so the builtin
version of kill can use it (ie: make its prototype available.)
caused by incorrect macro usage (ie: using the wrong one) which has
been in the sources since version 1.1 (ie: forever).
Like the previous (STACKSTRNUL) bug, the probability of this one
actually occurring has been infinitesimal but the LINENO code increases
that to infinitesimal and a smidgen... (or a few, depending upon usage).
Still, apparently that was enough, Kamil Rytarowski discovered that the
zsh configure script (damn competition!) managed to trigger this problem.
Two bugs here, one benign because of the way the script is used.
The other hidden by NetBSD's sort being stable, and the data not really
requiring sorting at all...
So as it happens these fixes change nothing, but they are needed anyway.
(The contents of the generated file are only used in DEBUG shells, so
this is really even less important than it seems.)
When processing a string (as in eval, trap, or sh -c) don't allow
trailing \n's to destroy the exit status of the last command executed.
That is:
sh -c 'false
'
echo $?
should produce 1, not 0.
(It was inheriting the value from end of profile file processing) - I didn't
notice before as I usually test with empty or no profile files to avoid
complications. Trivial change which should have very limited impact.
purpose) in exposing the bug in its implementation, go back to not using
it when not needed for DEBUG TRACE purposes. This change should have no
practical effect on either a DEBUG shell (where the STACKSTRNUL() calls
remain) or a non DEBUG shell where they are not needed.
the line number when included in the trace line tag to show whether it
comes from the parser, or the elsewhere as they tend to be quite different).
Initially only one case was changed, while I pondered whether I liked it
or not. Now it is all done... Also when there is a line tag at all,
always include the root/sub-shell indicator character, not only when the
pid is included.
to my delicate sensibilities... (NFC).
Arrange not to barf (ever) if some turkey makes _ readonly. Do this
by adding a VNOERROR flag that causes errors in var setting to be
ignored (intended use is only for internal shell var setting, like of "_").
(nb: invalid var name errors ignore this flag, but those should never
occur on a var set by the shell itself.)
From FreeBSD: don't simply discard memory if a variable is not set for
any reason (including because it is readonly) if the var's value had
been malloc'd. Free it instead...
to local_lineno as the latter seemed to be marginally more popular,
and perhaps more importantly, is the same length as the peviously
existing quietprofile option, which means the man page indentation
for the list of options can return to (about) what it was before...
(That is, less indented, which means more data/line, which means less
lines of man page - a good thing!)
PR bin/52302 (core dump with interactive shell, here doc and error
on same line) is fixed. (An old bug.)
echo "$( echo x; for a in $( seq 1000 ); do printf '%s\n'; done; echo y )"
consistently prints 1002 lines (x, 1000 empty ones, then y) as it should
(And you don't want to know what it did before, or why.) (Another old one.)
(Recently added) Problems with ~ expansion fixed (mem management related).
Proper fix for the cwrappers configure problem (which includes the quick
fix that was done earlier, but extends upon that to be correct). (This was
another newly added problem.)
And the really devious (and rare) old bug - if STACKSTRNUL() needs to
allocate a new buffer in which to store the \0, calculate the size of
the string space remaining correctly, unlike when SPUTC() grows the
buffer, there is no actual data being stored in the STACKSTRNUL()
case - the string space remaining was calculated as one byte too few.
That would be harmless, unless the next buffer also filled, in which
case it was assumed that it was really full, not one byte less, meaning
one junk char (a nul, or anything) was being copied into the next (even
bigger buffer) corrupting the data.
Consistent use of stalloc() to allocate a new block of (stack) memory,
and grabstackstr() to claim a block of (stack) memory that had already
been occupied but not claimed as in use. Since grabstackstr is implemented
as just a call to stalloc() this is a no-op change in practice, but makes
it much easier to comprehend what is really happening. Previous code
sometimes used stalloc() when the use case was really for grabstackstr().
Change grabstackstr() to actually use the arg passed to it, instead of
(not much better than) guessing how much space to claim,
More care when using unstalloc()/ungrabstackstr() to return space, and in
particular when the stack must be returned to its previous state, rather than
just returning no-longer needed space, neither of those work. They also don't
work properly if there have been (really, even might have been) any stack mem
allocations since the last stalloc()/grabstackstr(). (If we know there
cannot have been then the alloc/release sequence is kind of pointless.)
To work correctly in general we must use setstackmark()/popstackmark() so
do that when needed. Have those also save/restore the top of stack string
space remaining.
[Aside: for those reading this, the "stack" mentioned is not
in any way related to the thing used for maintaining the C
function call state, ie: the "stack segment" of the program,
but the shell's internal memory management strategy.]
More comments to better explain what is happening in some cases.
Also cleaned up some hopelessly broken DEBUG mode data that were
recently added (no effect on anyone but the poor semi-human attempting
to make sense of it...).
User visible changes:
Proper counting of line numbers when a here document is delimited
by a multi-line end-delimiter, as in
cat << 'REALLY
END'
here doc line 1
here doc line 2
REALLY
END
(which is an obscure case, but nothing says should not work.) The \n
in the end-delimiter of the here doc (the last one) was not incrementing
the line number, which from that point on in the script would be 1 too
low (or more, for end-delimiters with more than one \n in them.)
With tilde expansion:
unset HOME; echo ~
changed to return getpwuid(getuid())->pw_home instead of failing (returning ~)
POSIX says this is unspecified, which makes it difficult for a script to
compensate for being run without HOME set (as in env -i sh script), so
while not able to be used portably, this seems like a useful extension
(and is implemented the same way by some other shells).
Further, with
HOME=; printf %s ~
we now write nothing (which is required by POSIX - which requires ~ to
expand to the value of $HOME if it is set) previously if $HOME (in this
case) or a user's directory in the passwd file (for ~user) were a null
STRING, We failed the ~ expansion and left behind '~' or '~user'.
being done (one in probably dead code that is never compiled, the other
in a very rare error case.) Since it is stack memory it wasn't lost
in any case, just held longer than needed.
Expanding `` containing \ \n sequences looks to have been giving
problems. I don't think this is the correct fix, but it will do
no worse harm than (perhaps) incorrectly calculating LINENO in this
kind of (rare) circumstance. I'll look and see if there should be
a better fix later.
didn't get removed with v2, and should have. This would have had
(I think, without having tested it) one very minor effect on the way
LINENO worked in the v2 implementation, but my guess is it would have
taken a long time before anyone noticed...
need these changes to be fixed - and these cause problems in another
absurd use case. Either of these issues is unlikely to be seen by
anyone who isn't an idiot masochist...
would have usually been set earlier, this change is mostly an effective
no-op, but it is better this way (just in case) - not observed to have
caused any problems.
amd64 (problem was missing prototype for snprintf witout <stdio.h>)
While here, add some (DEBUG mode only) tracing that proved useful in
solving another problem.
the LINENO hack, and uses the LINENO var for both ${LINENO} and $((LINENO)).
(Code to invert the LINENO hack when required, like when de-compiling the
execution tree to provide the "jobs" command strings, is still included,
that can be deleted when the LINENO hack is completely removed - look for
refs to VSLINENO throughout the code. The var funclinno in parser.c can
also be removed, it is used only for the LINENO hack.)
This version produces accurate results: $((LINENO)) was made as accurate
as the LINENO hack made ${LINENO} which is very good. That's why the
LINENO hack is not yet completely removed, so it can be easily re-enabled.
If you can tell the difference when it is in use, or not in use, then
something has broken (or I managed to miss a case somewhere.)
The way that LINENO works is documented in its own (new) section in the
man page, so nothing more about that, or the new options, etc, here.
This version introduces the possibility of having a "reference" function
associated with a variable, which gets called whenever the value of the
variable is required (that's what implements LINENO). There is just
one function pointer however, so any particular variable gets at most
one of the set function (as used for PATH, etc) or the reference function.
The VFUNCREF bit in the var flags indicates which func the variable in
question uses (if any - the func ptr, as before, can be NULL).
I would not call the results of this perfect yet, but it is close.
Aside from one problem (not too hard to fix if it was ever needed) this version
does about as well as most other shell implementations when expanding
$((LINENO)) and better for ${LINENO} as it retains the "LINENO hack" for the
latter, and that is very accurate.
Unfortunately that means that ${LINENO} and $((LINENO)) do not always produce
the same value when used on the same line (a defect that other shells do not
share - aside from the FreeBSD sh as it is today, where only the LINENO hack
exists and so (like for us before this commit) $((LINENO)) is always either
0, or at least whatever value was last set, perhaps by
LINENO=${LINENO}
which does actually work ... for that one line...)
This could be corrected by simply removing the LINENO hack (look for the string
LINENO in parser.c) in which case ${LINENO} and $((LINENO)) would give the
same (not perfectly accurate) values, as do most other shells.
POSIX requires that LINENO be set before each command, and this implementation
does that fairly literally - except that we only bother before the commands
which actually expand words (for, case and simple commands). Unfortunately
this forgot that expansions also occur in redirects, and the other compound
commands can also have redirects, so if a redirect on one of the other compound
commands wants to use the value of $((LINENO)) as a part of a generated file
name, then it will get an incorrect value. This is the "one problem" above.
(Because the LINENO hack is still enabled, using ${LINENO} works.)
This could be fixed, but as this version of the LINENO implementation is just
for reference purposes (it will be superseded within minutes by a better one)
I won't bother. However should anyone else decide that this is a better choice
(it is probably a smaller implementation, in terms of code & data space then
the replacement, but also I would expect, slower, and definitely less accurate)
this defect is something to bear in mind, and fix.
This version retains the *BSD historical practice that line numbers in functions
(all functions) count from 1 from the start of the function, and elsewhere,
start from 1 from where the shell started reading the input file/stream in
question. In an "eval" expression the line number starts at the line of the
"eval" (and then increases if the input is a multi-line string).
Note: this version is not documented (beyond as much as LINENO was before)
hence this slightly longer than usual commit message.
while doing a half-hearted, broken, partial, version of cd -L instead.
The latter (as the manual says) is not supported, what's more, it is an
abomination, and should never be supported (anywhere.)
Fix the doc so that the pretense that we notice when a path given crosses
a symlink (and turns on printing of the destination directory) is claimed
no more (that used to be true until late Dec 2016, but was changed). Now
the print happens if -o cdprint is set, or if an entry from CDPATH that is
not "" or "." is used (or if the "cd dest repl" cd cmd variant is used.)
Fix CDPATH processing: avoid the magic '%' processing that is used for
PATH and MAILPATH from corrupting CDPATH. The % magic (both variants)
remains undocumented.
Also, don't double the '/' if an entry in PATH or CDPATH ends in '/'
(as in CDPATH=":/usr/src/"). A "cd usr.bin" used to do
chdir("/usr/src//usr.bin"). No more. This is almost invisible,
and relatively harmless, either way....
Also fix a bug where if a plausible destination directory in CDPATH
was located, but the chdir() failed (eg: permission denied) and then
a later "." or "" CDPATH entry succeeded, "print" mode was turned on.
That is:
cd /tmp; mkdir bin
mkdir -p P/bin; chmod 0 P/bin
CDPATH=/tmp/P:
cd bin
would cd to /tmp/bin (correctly) but print it (incorrectly).
Also when in "cd dest replace" mode, if the result of the replacement
generates '-' as the path named, as in:
cd $PWD -
then simply change to '-' (or attempt to, with CDPATH search), rather
than having this being equivalent to "cd -")
Because of these changes, the pwd command (and $PWD) essentially
always acts as pwd -P, even when called as pwd -L (which is still
the default.) That is, even more than it did before.
Also fixed a (kind of minor) mem management error (CDPATH related)
"whosoever shall padvance must stunalloc before repeating" (and the
same for MAILPATH).
negative of a negative number, just add a positive number instead...
(the previous version came about purely as an accident of the way the
relevant piece of code was added and debugged.... that's my story anyway!)
Fixing this fixes a regression introduced earlier today (UTC) where
arithmetic expressions would be split correctly when the arithmetic
started at the beginning of a word:
echo $(( expression ))
where "begin" is 0, and so (begin, length) is the same as (begin, begin+length)
(aka: (begin,end) - and yes, "end" means 1 after last to consider).
but did not work correctly when the usage was
echo XXX$( expression ))
(begin !+ 0) and would only split (some part of) the result of the expression.
This regression was also foung by the new t_fsplit:split_arith
test case added earlier to the ATF tests for sh.
what matters is the quoting state just before we switch into arithmetic
syntax parsing mode, not the state after...
This fixes the regiression introduced earlier today (UTC) where
quoted arithmetic expressions were being subjected to word splitting.
differently...)
In particular ${01} is now $1 not $0 (for ${0any-digits})
${4294967297} is most probably now ""
(unless you have a very large number of params)
it is no longer an alias for $1 (4294967297 & 0xFFFFFFFF) == 1
$(( expr $(( more )) stuff )) is no longer the same as
$(( expr (( more )) stuff )) which was sometimes OK, as in:
$(( 3 + $(( 2 - 1 )) * 3 ))
but not always as in:
$(( 1$((1 + 1))1 ))
which should be 121, but was an arith syntax error as
1((1 + 1))1
is meaningless.
Probably some more. This also sprinkles a little const, splits a big
func that had 2 (kind of unrelated) purposes into two simpler ones,
and avoids some (semi-dubious) modifications (and restores) in the input
string to insert \0's when they were needed.
them whenever the user tries to step on one, we can change our behaviour
back to what the kernel considers to be that of a well behaved shell
(wrt file descriptor usage). If our user causes problems, we will soon
move into recalcitrant process territory, but that should normally be
rare. This should reduce kernel overheads a little.
parser tracing is useful when debugging the parser (which admittedly is
fairly often...) but there is a lot of it, and it gets in the way when
looking at something else. Now we can turn it off when not wanted.
option sorting (no longer required option.list to be manually
sorted by long option name) and properly handles conditional
options. Cleaner output format as well.
This allows option.list to be reordered to group related options
together ... also added more comments to it.
Unless the shell is compiled with the (compilation time) option
BOGUS_NOT_COMMAND (as in CFLAGS+=-DBOGUS_NOT_COMMAND) which it
will not normally be, the ! command (reserved word) will only
be permitted at the start of a pipeline (which includes the
degenerate pipeline with no '|'s in it of course - ie: a simple cmd)
and not in the middle of a pipeline sequence (no "cmd | ! cmd" nonsense.)
If the latter is really required, then "cmd | { ! cmd; }" works as
a standard equivalent.
In POSIX mode, permit only one ! ("! pipeline" is ok. "! ! pipeline" is not).
Again, if needed (and POSIX conformance is wanted) "! { ! pipeline; }"
works as an alternative - and is safer, some shells treat "! ! cmd" as
being identical to "cmd" (this one did until recently.)
inheritance when a variable is declared local, but instead leave
the local var unset (if not given a value) in the function.
Only ash derived shells do inheritance it seems.
So, to compensate for that, and get one step closer to making
"local" part of POSIX, so we can really rely upon it, a compromise
has been suggested, where "local x" is implementation defined
when it comes to this issue, and we add "local -I x" to specify
inheritance, and "local -N x" to specify "not" (something...)
(not inherited, or not set, or whatever you prefer to imagine!)
The option names took a lot of hunting to find something reasonable
that no shell (we know of) had already used for some other purpose...
The I was easy, but 'u' 'U' 'X' ... all in use somewhere.
This implements that (well, semi-) agreement.
While here, add "local -x" (which many other shells already have)
which causes the local variable to be made exported. Not a lot
of gain in that (since "export x" can always be done immediately
after "local x") but it is very cheap to add and allows more other
scripts to work with out shell.
Note that while 'local x="${x}"' always works to specify inheritance
(while making the shell work harder), "local x; unset x" does not
always work to specify the alternative, as some shells have
"re-interpreted" unset of a local variable to mean something that
would best be described as "unlocal" instead - ie: after the unset
you might be back with the variable from the outer scope, rather
than with an unset local variable.
Also add "unset -x" to allow unsetting a variable without removing
any exported status it has.
There are gazillions of other options that are not supported here!
sh +c "command string" no longer works (it must be -c)
sh +o and sh -o no longer work (if you could call what they did
before working.) nb: this is without an option name.
-ooo Opt1 Opt2 Opt3 no longer works (set & cmd line), this should be
-o Opt1 -o Opt2 -o Opt3 (same with +ooo of course).
-oOpt is now supported - option value (name of option in
this case) immediately following -o (or +o).
(as with other commands that use std opt parsing)
Both set comamnd and command line.
In addition, the output from "set +o" has shrunk dramatically, by borrowing
a trick from ksh93 (but implemented in a more traditional syntax).
"set +o" is required to produce a command (or commands) which when executed
later, will return all options to the state they were in when "set +o"
was done. Previously that was done by generating a set command, with
every option listed (set -o opt +o other-opt ...) to set them all back
to their current setings. Now we have a new "magic option" ("default")
which sets all options to their default values, so now set +o output
need only be "set -o default -o changed-opt ..." (only the options that
have been changed from their default values need be explicitly mentioned.)
The definition of "default value" for this is the value the shell set the
option to, after startup, after processing the command line (with any
flags, or -o option type settings), but before beginning processing any
user input (incuding startup files, like $ENV etc).
Anyone can execute "set -o default" of course, but only from a "set"
command (it makes no sense at all as a -o option to sh). This also
causes "set +o" to be slightly more useful as a general command, as
ignoring the "set -o default" part of the result, it lists just those
options that have been altered after sh startup. There is no +o default.
There isn't an option called "default" at all...
This causes some of the commented out text from sh.1 to become uncommented.
levels for debug output. This change accidentally omitted earlier (only
effect is incorrect nesting levels shown in trace output when the option
to show them is enabled.) NFC for any normal shell build.
output from "sh -x" only (tracing execution), not quoting + is better,
as it makes tracing commands with + and - options, or numbers, more
consistent.
Also one minor white space change (excess indentation removed).
compiled for DEBUG.)
Add debug builtin command, and corresponding -D command line option.
As usual, for DEBUG related stuff, read the source for info, that's
all there is about this.
This completes the infrastructure changes for the updated DEBUG TRACE
mechanism, so now converting the rest of the shell's internal tracing
can happen as desired - piecemeal.
upgrade a while ago (this should make no difference to anything
other than a minor - very minor - build time speedup, ld is
smart enough to relaise that nothing from the lex library was
needed, and the executable contains no reference to it, even
befor ethis change.)
Document the (slightly) enhanced NETBSD_SHELL.
Fix a typo (one of my typos...)
Move a commented out section to align with current planned changes
(and fix its commented out markup.)
MKREPRO_TIMESTAMP (as an additional word in the value, with a "BUILD:" prefix)
if it is set during the build. (Trailing 00 pairs in the time are removed).
While here, throw in some extra words that list the compilation
options used which alter sh behaviour (mostly by removing stuff.)
Usually that will only be noticed in a SMALL shell compiled for
install media, or similar - none of the others (not that there
are many) are ever changed from the default in a normal build
(default settings are just omitted.) This also allows scripts
to tell if they are running in a DEBUG shell, which can sometimes
make debugging easier.
First, be aware that the DEBUG spoken of here has nothing whatever to
do with MKDEBUG=true type builds of NetBSD. The only way to get a
DEBUG shell is to build it yourself manually.
That said, for non-DEBUG shells, this change makes only one slight
(trivial really) difference, which should affect nothing.
Previously some code was defined like ...
function(args)
{
#ifdef DEBUG
/* function code goes here */
#endif
}
and called like ...
#ifdef DEBUG
function(params);
#endif
resulting in several empty functions that are never called being
defined in non-DEBUG shells. Those are now gone. If you can detect
the difference any way other than using "nm" or similar, I'd be very
surprised...
For DEBUG shells, this introduces a whole new TRACE() setup to use
to assist in debugging the shell.
I have had this locally (uncommitted) for over a year... it helps.
By itself this change is almost useless, nothing really changes, but
it provides the framework to allow other TRACE() calls to be updated
over time. This is why I had not committed this earlier, my previous
version required a flag day, with all the shell's internal tracing
being updated a once - which I had done, but that shell version has
bit-rotted so badly now it is almost useless...
Future updates will add the mechanism to allow the new stuff to actually
be used in a productive way, and following that, over time, gradual
conversion of all the shell tracing to the updated form (as required,
or when I am bored...)
The one useful change that we do get now is that the fd that the shell
uses for tracing (which was usually 3, but not any more) is now protected
from user/script interference, like all the other shell inernal fds.
There is no doc (nor will there be) on any of this, if you are not reading
the source code it is useless to you, if you are, you know how it works.
arg list processing), and the set command in general.
Also add some (new) commented out text related to options which may
be commented back in sometime soon...
${#VAR:-foo} (or any other modifier on ${#VAR} is a syntax error.
On the other hand ${##} is not, nor is ${##13} though they mean
quite different things (the latter is an idiom everyone should learn,
... $# except we refuse to admit the possibility that it is 13...
Even I cannot explain what ${#-foo} used to do, but it wasn't sane!
(It should be just $# as $# is never unset, but ...)
Shell syntax is truly a wondrous thing!
one of the words happens to contain ${#var}. (This is the command
string shown by the "jobs" command, and when a background job completes)
While here, undo the LINENO hack when building that string.
And one ot two other foibles...
POSIX requires that the output of the "set" command (with no args -- it
gives a list of variables, and their values) be sorted according to
the collating sequence defined by the current locale.
Now I'm not aware of any locale where the collating sequence order of
ascii letters, digits, and '_' are any different than they are in the
C locale (and those are the only characters that can occur in variable
names - unless there is perhaps a locale that defines "dictionary" order
as the sort order) but never mind, that isn't the bug...
What "collating sequence order" does mean however, if not "collating
sequence order, except when we happen to have two variable names, where
one name is a prefix of the other (say X and XY) and the first character
of the 'Y' part of the longer name happens to be a digit..."
"set" is not a frequently used command (particularly in scripts where
it matters - that is, the no args form, nothing here alters anything
about any use of set with args) and is already a bit slow (sluggish...)
because of the sort requirement, so let's make it fractionally even
slower, but correct.
! ! pipeline
(And for now the other places where ! is permitted)
we should at least generate the logically correct exit
status:
! ! (exit 5); echo $?
should print 1, not 5. ksh and bosh do it this way - and it makes sense.
bash and the FreeBSD sh echo "5" (as did we until now.)
dash, zsh, yash all enforce the standard syntax, and prohibit this.
by FreeBSD sh (though different, for other reasons) - but the bug discovered
while searching for why a (nonsense) attempted test of the forthcoming
code to handle "! ! pipeline" properly wasn't working... (it was how I was
testing it that was broken, but until I achieved enlightenment, I was bug
hunting, and found this...)
Most likely the bugs here wouldn't have affected any real code (no bug
reports anyway), but ...
(even if no shell in existence, that I am aware of, does that).
That is, POSIX says ... [of the trap command with no args]
The shell shall format the output, including the proper use of
quoting, so that it is suitable for re-input to the shell as commands
that achieve the same trapping results. For example:
save_traps=$(trap)
...
eval "$save_traps"
It is obvious what the intent is there. But no shell makes it work.
An example using bash (as the NetBSD shell, still does not do the save_traps=
stuff correctly - but that is a problem for a different time and place...)
Given this script
printf 'At start: '; trap
printf '\n'
traps=$(trap)
trap 'echo hello' INT
printf 'inside : '; trap
printf '\n'
eval "${traps}"
printf 'At end : '; trap
printf '\n'
One would expect that (assuming no traps are set at the start, and
there aren't) that the first trap will print nothing, then the inside
trap will show the trap that was set, and then when we get to the
end everything will be back to nothing again.
But:
At start:
inside : trap -- 'echo hello' SIGINT
At end : trap -- 'echo hello' SIGINT
And of course. when you think about it, it is obvious why this happens.
The first "trap" command prints nothing ... nothing has changed when we
get to the "traps=$(trap)" command ... that trap command also prints
nothing. So this does traps=''. When we do eval "${traps}" we are
doing eval "", and it is hardly surprising that this accomplishes nothing!
Now we cannot rationally change the "trap" command without args to
behave in a way that would make it useful for the posix purpose (and
here, what they're aiming for is good, it should be possible to
accomplish that objective) so is there some other way?
I think I have seen some shell (but I do not remember which one) that
actually has "trap -" that resets all traps to the default, so with that,
if we changed the 'eval "${traps}"' line to 'trap -; eval "${traps}"'
then things would actually work - kind of - that version has race conditions,
so is not really safe to use (it will work, most of the time...)
But, both ksh93 and bash have a -p arg to "trap" that allows information
about the current trap status of named signals to be reported. Unfortunately
they don't do quite the same thing, but that's not important right now,
either would be usable, and they are, but it is a lot of effort, not
nearly as simple as the posix example.
First, while "trap -p" (with no signals specified) works, it works just
the same (in both bash and ksh93, aside from output format) as "trap".
That is, that is useless. But we can to
trap_int=$(trap -p int)
trap_hup=$(trap -p hup)
...
and then reset them all, one by one, later...
(bash syntax)
test -n "${trap_int}" && eval "${trap_int}" || trap - int
test -n "${trap_hup}" && eval "${trap_hup}" || trap - hup
(ksh93 syntax)
trap "${trap_int:-}" int
trap "${trap_hup:-}" hup
the test (for bash) and variable with default for ksh93, is needed
because they both still print nothing if the signal action is the default.
So, this modification attempts to fix all of that...
1) we add trap -p, but make it always output something for every signal
listed (all of the signals if none are given) even if the signal
action is the default.
2) choose the bash output format for trap -p, over the ksh93 format,
even though the simpler usage just above makes the ksh93 form seem
better. But it isn't. Consider:
ksh93$ trap -p int hup
echo hello
One of the two traps has "echo hello" as its action, the other is
still at the default, but which?
From bash...
bash$ trap -p int hup
trap -- 'echo hello' SIGINT
And now we know! Given the bash 'trap -p' format, the following function
produces ksh93 format output (for use with named signals only) instead...
ksh93_trap_p() {
for _ARG_ do
_TRAP_=$(trap -p "${_ARG_}") || return 1
eval set -- "${_TRAP_}"
printf '%s' "$3${3:+
}"
done
return 0
}
[ It needs to be entered without the indentation, that '}"' line has to be
at the margin. If the shell running that has local vars (bash does) then
_ARG_ and _TRAP_ should be made local. ]
So the bash format was chosen (except we do not include the "SIG" on the
signal names. That's irrelevant.)
If no traps are set, "trap -p" will say (on NetBSD of course)...
trap -- - EXIT HUP INT QUIT ILL TRAP ABRT EMT FPE KILL BUS SEGV SYS
trap -- - PIPE ALRM TERM URG STOP TSTP CONT CHLD TTIN TTOU IO XCPU XFSZ
trap -- - VTALRM PROF WINCH INFO USR1 USR2 PWR RT0 RT1 RT2 RT3 RT4 RT5
trap -- - RT6 RT7 RT8 RT9 RT10 RT11 RT12 RT13 RT14 RT15 RT16 RT17 RT18
trap -- - RT19 RT20 RT21 RT22 RT23 RT24 RT25 RT26 RT27 RT28 RT29 RT30
Obviously if traps are set, the relevant signal names will be removed from
that list, and additional lines added for the trapped signals.
With args, the signals names are listed, one line each, whatever
the status of the trap for that signal is:
$ trap -p HUP INT QUIT
trap -- - HUP
trap -- 'echo interrupted' INT
trap -- - QUIT
3) we add "trap -" to reset all traps to default. (It is easy, and seems
useful.)
4) While here, lots of generic cleanup. In particular, get rid of the
NSIG+1 nonsense, and anything that ever believes a signo == NSIG
is in any way rational. Before there was a bunch of confusion,
as we need all the signals for traps, plus one more for the EXIT
trap, which looks like we then need NSIG+1. But EXIT is 0, NSIG
includes signals from 0..NSIG-1 but there is no signal 0, EXIT
uses that slot, so we do not need to add and extra one, NSIG is
enough. (To see the effect of this, use a /bin/sh from before
this fix, and compare the output from
trap '' 64
and trap '' 65
both invalid signal numbers.
Then try just "trap" and watch your shell drop core...)
Eventually NSIG needs to go away completely (from user apps), it
is not POSIX, it isn't really useful (unless we make lots of
assumptions about how signals are numbered, which are not guaranteed,
so even if apps, like this sh, work on NetBSD, they're not portable,)
and it isn't necessary (or will not be, soon.)
But that is for another day...
5) As is kind of obvious above, when listing "all" traps, list all the
ones still at their defaults, and all the ignored signals, on as
few lines as possible (it could all be on one line - technically it
would work as well, but it would have made this cvs log message
really ugly...) Signals with a non-null action still get listed
one to a line (even if several do have the exact same action.)
6) Man page updates as well.
After this change, the following script:
printf 'At start: '; trap
printf '\n'
trap -p >/tmp/out.$$
trap 'echo hello' INT
printf 'inside : '; trap
printf '\n'
. /tmp/out.$$; rm /tmp/out.$$
printf 'At end : '; trap
printf '\n'
which is just the example from above,
using "trap -p" instead of just "trap" to save the traps,
and modified to a form that will work with the NetBSD shell today
produces:
At start:
inside : trap -- 'echo hello' INT
At end :
[Do I get a prize for longest commit log message of the year?]
(by which they mean > 0). We were checking for negative numbers, but
not for 0. More by chance of the implementation than any specific design
(I suspect) "break 0" was being treated the same as "break" or "break 1".
Since 3 ways to achieve the same thing is overkill, let's do what posix
wants and forbid "break 0" and "continue 0".
which causes fall through the to command list of the following pattern
(wuthout evaluating that pattern). This has been approved for inclusion
in the next major version of the POSIX standard (Issue 8), and is
implemented by most other shells.
Now all form a circle and together attempt to summon the great wizd
in the hopes that his magic spells can transform the poor attempt
at documenting this feature into something rational...
shells (anything made by build.sh) there is no change at all.
In DEBUG shells, when tree dumping, remember to include NNOT (same
omission as was just corrected in jobs.c :1.81) - of course, here there
are lots of other node types not handled as well.
ALso, avoid a core dump bug when doing a tree dump of a pieline
where the commands are not all simple commands (which can only
happen with a command string like " cmd | ! cmd | ... ". The "!"
in the middle is utter nonsense, and should be forbidden, but
for now, at least avoid a core dump.
"jobs" output (or other places where the cmd string is shown - like
when reporting status when a background job completes.)
Without this fix, try
! sleep 5 &
jobs
wait
and try not to wonder at the '???" that appears instead of "! sleep 5"
fixed (there might be no way) - but it "feels right"!
When popping an (exhausted) input string off the input stack, allow
for the possibility that the previous string might also just happened
to have run out of steam as well, so keep poppin' along until we
run out of pop, or find something to consume.
If there are no arguments, or only null arguments,
eval shall return a zero exit status;
Make it so. Now:
false; eval; echo $?
produces 0 instead of 1.
They can occur anywhere (*anywhere*) not only where it
happens to be convenient to the parser...
This fix from FreeBSD (thanks again folks).
To make this work, pushstring()'s signature needed to change to allow a
const char * as its string arg, which meant sprinkling some const other
places for a brighter appearance (and handling fallout).
All this because I wanted to see what number would come from
echo $\
{\
L\
I\
N\
E\
N\
O\
}
and was surprised at the result! That works now...
The bug would also affect stuff like
true &\
& false
and all kinds of other uses where the \newline occurred in the
"wrong" place.
An ATF test for sh syntax is coming... (sometime.)
arg (struct alias *) rather than using void * and then casting it
when used. For callers, the arg either is a struct alias *, or is NULL,
so nothing to adjust there.
NB: This change untested by itself, it was going to be a part of the next
change (coming in a few minutes) but is logically unrelated, so ...
command it should remain unset afterwards.
Previouly "export VAR" did much the same as:
export VAR="${VAR}"
(but without the side effects if VAR had previously been VAR='~' or similar)
Also stop unset exported variables from actually making it into the
environment. Previously this was impossible - variables get exported
in just one of 3 ways, by being imported from the environ (which means
the var is set) when -a is set, and a var is given a value (so the var
is set), or using "export" which previously always set a null string
if the var was otheriwse unset.
The same happens for "readonly" (readonly and export use the same mechanism)
- except, once marked readonly, it is no longer possible to set the var, so
(assuming VAR is not already readonly)
unset VAR; readonly VAR
is (now) a way to guarantee that "VAR" can never be set.
This conforms with POISX (though it is not particularly clear on this
point) and with bash and ksh93 (and also with the FreeBSD shell, though
they export unset variables that are marked for export as if set to '')
It s not clear whether
unset VAR; readonly VAR; unset VAR; echo $?
should print 0, or non-0, so for now just leave this as it is (prints 1).