Commit Graph

1325 Commits

Author SHA1 Message Date
christos eda85bc164 Be more explicit with sort fields to produce consistent results with gnu
sort (Jan-Benedict Glaw)
2024-04-16 23:30:19 +00:00
kre 30112b560c Edgar Fuß pointed out that sh(1) did not mention comments (at all).
This has been true forever, and no-one else (including me) ever seems
to have noticed this omission.

Correct that.

While in the area, improve the general sections on the Lexical structure
of the shell's input, including some refinements to how quoting is
described.
2024-04-12 19:09:50 +00:00
kre 1568b40160 Redo the mktemp(1) part - some mktemp's (including ours) require the
XXXX's to be at the end of the name (like mk*temp(3)) so however well
it will work with mktemp implementations which allow the X's to be
anywhere in the final component of the name, it will work just as
well on them with the X's at the end.

But we don't normally need all of that mess - knowing which temp
file is which is useful only when debugging the script, and that's
(mostly) long done.   So, in normal uses now just use $(mktemp) and
allow mktemp to pick its own name - we don't need to know what it is.
Every mktemp(1) supports that mode of operation.

But when debugging the script (which for current purposes will be
taken to be when the -x flag is passed to the shell running it, to
trace what it does) then we will make the temp files have names we
can recognise (and in that case, also don't delete them when done).

While here, check for mktemp(1) failing, and abort if that
happens (we assume that if it fails it will write an error
message to stderr, so the script does not need to.)

As for the purpose of the script ... of course the header file
generated (or an equivalent elsewhere) could be generated and
maintained by hand, but why would anyone want to do all that
work when software can do it for us, and do it correctly without
human thought?

This also allows the options in the master list (option.list) to be
arranged in a way that is meaningful for them, unrelated to the order
the shell needs to have them in (or rearrange them to be at run time)
and have that order shuffled however is convenient.   Currently all
the posix standard options are first, then the "hybrid" options, and
finally the local ones for this shell.   Currently "pipefail" is in the
final set, but once the next posix version is published, that will
become a standard option, and get moved in the list - the shell won't
even notice as this script puts the options into shell desired order.
2024-04-06 14:20:27 +00:00
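
The temp file policy described in the entry above could look roughly like the sketch below. It is an illustration under stated assumptions, not the script's actual code: the helper name `make_tmp`, the `optlist` file name prefix and the variable `tmp_sorted` are all hypothetical. Tracing is detected by looking for `x` in `$-`, which matches the "-x flag is passed to the shell" condition the message describes.

```shell
#!/bin/sh
# Sketch only: make_tmp, the "optlist" prefix and tmp_sorted are hypothetical.

# Create one temp file and print its name.
# $1 is a label used only when tracing (sh -x), so the file is recognisable.
make_tmp() {
    case $- in
    *x*) mktemp "${TMPDIR:-/tmp}/optlist-$1.XXXXXX" ;;  # X's at the end of the name
    *)   mktemp ;;                                      # let mktemp pick the name
    esac
}

# mktemp is assumed to write its own error message to stderr, so just abort.
tmp_sorted=$(make_tmp sorted) || exit 1

# Remove the file on exit, unless we are tracing and want to inspect it.
case $- in
*x*) ;;
*)   trap 'rm -f "$tmp_sorted"' EXIT ;;
esac
```

Keeping the X's at the end of the template means the named form should work both with implementations that demand that placement and with those that accept X's anywhere in the final component.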
christos 3200c2817b From Jan-Benedict Glaw:
Fix a redirection and prepare a stable sort for upper-/lowercase
option letters

This script is a mess, I strongly believe that it should be rewritten.
However, I'm not 100% sure why it was invented in the first place
(come on, the generated header file isn't _that_ complicated that
it couldn't be sanely managed by hand!), but let's fix the sorting
order by using LC_ALL=C.

Also add a few 'X' to the `mktemp` template to make non-BSD
implementations happy. As a bonus, actually *use* the initial `sed`
output instead of throwing it away by piping it into `sort` with
also connecting `sort`'s stdin with the original input file...
2024-04-05 22:22:17 +00:00
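
As a small illustration of why LC_ALL=C matters for the upper-/lowercase option letters mentioned above, the following sketch sorts four letters bytewise. The function name `sort_letters` and the sample input are made up for this example and are not taken from the script.

```shell
#!/bin/sh
# Sketch only: sort_letters is a hypothetical helper.

# Sort stdin by byte value, independent of the caller's locale settings.
sort_letters() {
    LC_ALL=C sort
}

# In the C locale all upper case ASCII letters sort before all lower case ones.
sorted=$(printf '%s\n' b A a B | sort_letters | tr -d '\n')
echo "$sorted"   # prints ABab
```

Without the LC_ALL=C prefix, the relative order of `A` and `a` depends on the collation rules of whatever locale the build happens to run in.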
andvar 100a3398b8 fix spelling mistakes, mainly in comments and log messages. 2024-02-09 22:08:30 +00:00
kre f55c8670e1 PR bin/57894
For jobs -p for a non-job-control job, avoid just printing 0 (as
there is no process group pid) and instead output what we used to,
the pid of one of the processes in the job (usually the right one!)

XXX pullup -10 (9 and earlier not affected).
2024-01-30 19:05:07 +00:00
kre acdf05cdcd Remove an ancient incorrect notion which somehow survived intact for ages.
"$@" is (as it is in double quotes) not subject to field splitting.  "$@"
generates (potentially) multiple words, but field splitting has nothing
to do with it.

While here, rename the section from "White Space Splitting (Field Splitting)"
to simply be "Field Splitting" as white space is only relevant if it happens
to occur in IFS (which is the default case, but IFS can be anything, and
isn't required to contain any white space at all).
2024-01-16 14:30:22 +00:00
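
The distinction drawn in the entry above can be observed with a short sketch. The helper `count` and the sample parameters are hypothetical; IFS is deliberately set to a value containing no white space, to show that white space itself is not what matters.

```shell
#!/bin/sh
# Sketch only: count is a hypothetical helper that reports its argument count.
count() { echo "$#"; }

set -- "a b" "c:d"
IFS=:

n_quoted=$(count "$@")    # 2: one word per positional parameter, no field splitting
n_unquoted=$(count $@)    # 3: unquoted, so "c:d" is split at the ':' found in IFS

unset IFS                 # back to default field splitting
echo "$n_quoted $n_unquoted"   # prints 2 3
```

The quoted form yields two words because there are two positional parameters, not because of anything in IFS; only the unquoted form is subject to field splitting.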
kre 6654ff1c5a PR bin/57773
Fix another bug reported by Jarle Fredrik Greipsland and added
to PR bin/57773, which relates to calculating the length of a
positional parameter which contains CTL chars -- yes, this one
really is that specific, though it would also affect the special
param $0 if it were to contain CTL chars, and its length was
requested - that is fixed with the same change.  And note: $0
is not affected because it looks like a positional param (it
isn't; ${00} would be, but is always unset, ${0} isn't); all
special params would be affected the same way, but the only one
that can ever contain a CTL char is $0, I believe.  ($@ and $*
were affected, but just because they're expanding the positional
params ... ${#@} and ${#*} are both technically unspecified
expansions - and different shells produce different results.)

See the PR for the details of this one (and the previous).

Thanks for the PR.

XXX pullup to everything.
2023-12-29 15:49:23 +00:00
kre c6d0f408e8 PR bin/57773
Fix a bug reported by Jarle Fredrik Greipsland in PR bin/57773,
where a substring expansion where the substring to be removed from
a variable expansion is itself a var expansion where the value
contains one (or more) of sh's CTLxxx chars - the pattern had
CTLESC inserted, the string to be matched against did not.  Fail.
We fix that by always inserting CTLESC in var assign expansions.
See the PR for all the gory details.

Thanks for the PR.

XXX pullup to everything.
2023-12-25 04:52:38 +00:00
kre 391d454067 Correct a bizarre piece of source formatting that crept in by
accident several years ago (change a space into newline tab).

NFC
2023-12-25 02:28:47 +00:00
kre 879813d8b2 Work around what is probably a gcc12 bug in detecting "potentially clobbered"
variables after longjmp() for some architectures (sh3 at least).

This should allow the workaround to disable those warnings for this
file to be removed.

In the affected function the extra var & assignment added should simply
be deleted by any good optimiser, but if not, it doesn't matter, as
performance of this function (expandonstack()) is almost irrelevant.
2023-10-20 22:08:52 +00:00
mrg d528393cd3 convert gcc12 -O1 into -Wno-error=clobbered.
parser.c wants all the optimisation, and this is very likely a
false positive.
2023-10-19 04:27:24 +00:00
mrg 08e29b5df3 the parser.c longjmp vs gcc12 issue affects a few ports,
make the workaround global.
2023-10-14 06:53:56 +00:00
uwe ce7bd196f3 sh(1): touch up markup for the ENV example
Don't use Dq in a literal display, ascii quotes are \*q
While here mark up as literal a few things around this example.
2023-10-12 01:45:07 +00:00
kre 28b2f4e3f6 If the read builtin is told to read into IFS, we must avoid doing
that until all current uses of IFS are complete (as we have IFS's
value cached in ifs - if IFS alters, ifs might point anywhere).
Handle this by deferring assignments to IFS until everything is done.
This makes us appear to comply with the (currently) proposed requirement
for read by POSIX that field splitting complete before vars are
assigned.   (Other shells, like dash, ksh93, yash, bosh behave like this)

That might end up being unspecified though, as other shells (bosh,
mksh) assign each field to its var as it is delimited (though bosh
appears to have bugs).   If we wanted to go that route, the issue here
could have been handled by re-doing the init of ifs after every
setvar() that is performed here (except the last, after which it is
no longer needed).

XXX pullup -10
2023-10-05 20:33:31 +00:00
kre 4c31efd22f At the request of bad@ enhance the synopsis of the set built-in
command to include explicit mention of the -o opt and +o opt forms.

Fix the synopsis to have the 4 forms that the description of the
utility discusses, rather than expecting users to understand that
the 3rd and 4th forms of the command were combined into the 3rd
synopsis format.   After doing that, the options in the 3rd format
no longer need to be optional, so now all 4 formats are distinct
(previously, the third, omitting everything that was optional, and
the first, could not be distinguished).

While here, some wording and formatting "improvements" as well (nothing
too serious).
2023-09-01 01:57:54 +00:00
mrg 046f28e789 use -O1 on sh3, GCC 12 and parser.c.
this triggers clobbered vs. longjmp/setjmp warnings with -Os that sh3 uses.
2023-08-14 03:18:14 +00:00
jschauma f2c0b7b228 tyops:
* redicection -> redirection
* escaoed -> escaped

Noted by J. Lewis Muir on netbsd-docs@
2023-08-04 15:31:40 +00:00
msaitoh 418b48633c Fix typo in a debug message. 2023-06-24 05:17:02 +00:00
kre 5409234e44 Remove an end of file trailing blank line that served no purpose.
NFCI
2023-04-07 10:42:28 +00:00
kre 0827e1f954 The great shell trailing whitespace cleanup of 2023...
Inspired by private e-mail comments from mouse@

NFCI.
2023-04-07 10:34:13 +00:00
hannken f438a6a065 Use "sigjmp_buf loc" after switch to sigsetjmp()/siglongjmp().
Fixes errors and aborts on sparc at least.
2023-03-21 08:31:30 +00:00
kre 98b2eb3607 Do a better job handling EACCES errors from exec() calls. If the
EACCES is from the namei(), treat it just like ENOENT or ENOTDIR
(and if that is the final error, the exit status from a failed exec
will be 127).   If the EACCES is from the exec() itself, that indicates
the file to be run exists, but has no 'x' permission.   That's a
meaningful error (as distinct from just "yet another PATH element
search failure").

While here, return the first meaningful error we encountered while
searching PATH, rather than the last (and ENOENT if there are none
of those).

This change results in some failed command executions returning status
127 now, where they returned 126 before - which better reflects the
intent of those values (127 is simply "not found" whereas 126 is "found
but couldn't be executed").

We still do nothing to distinguish errors encountered looking up the
command name given, from errors encountered (by the kernel) attempting to
run an interpreter needed for the exec to succeed (#! line path, or
/libexec/ld.elf_so and similar - or anything else of a similar nature).
2023-03-19 17:55:57 +00:00
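
The basic 126 versus 127 distinction referred to above can be checked with a sketch like the one below. It only exercises the simple case of an explicit path (no PATH search), and the file names `noexec` and `missing` and the variables `st_noexec` and `st_missing` are made up for the example.

```shell
#!/bin/sh
# Sketch only: all names here are hypothetical.
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT

# A script that exists but has no 'x' permission for anyone.
printf '#!/bin/sh\nexit 0\n' > "$dir/noexec"
chmod a-x "$dir/noexec"

st_noexec=0
"$dir/noexec" 2>/dev/null || st_noexec=$?     # found, but cannot be executed

st_missing=0
"$dir/missing" 2>/dev/null || st_missing=$?   # no such file

echo "$st_noexec $st_missing"   # a POSIX shell is expected to print 126 127
```

The change described in the entry concerns which of these two statuses is reported when the EACCES arises during the path lookup rather than from the exec itself; this sketch does not attempt to reproduce that case.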
kre 91522b5b9d Switch from using _setjmp()/_longjmp() (on BSD systems which aren't SVR4)
(and setjmp()/longjmp() elsewhere) to using sigsetjmp()/siglongjmp()
everywhere.

NFCI.
2023-03-19 17:47:48 +00:00
kre d45073cdfd Change a few #defines from octal to hex (pdp11 days are long gone).
Improve the layout of those definitions at the same time.

NFC.
2023-03-19 17:45:29 +00:00
kre 726d188a97 Adjust tilde expansion as will be documented in the forthcoming
version of the POSIX standard (Issue 8).   I believe we were already
compliant with what is to be required, but POSIX is now encouraging
(and will likely require in a later version) that if a tilde expansion
produces a string which ends in a '/' and the '~' that was expanded
is immediately followed by a '/' in the input word, that one of those
two slashes be omitted.   The worst (current) example of this is
when HOME=/ and we expand ~/foo - previously producing //foo which is
(in POSIX) a path with implementation defined semantics, and so not
what we should be generating by accident.   Change that, so now if
the ~ prefix expansion ends in a '/' and there is a '/' following
immediately after, the resulting word contains only one of those
chars (in the example just given, we will now produce /foo instead).

POSIX is also making it clear that the expansion that results from
the tilde expansion is treated as quoted (not subject to pathname
expansion, or field splitting, or any var/arith/command substitutions)
and that if HOME="" the expansion of ~ must generate "" (not nothing).
Our implementation did all of that already (though older versions
used to treat an empty expansion of HOME the same as if HOME was
unset - that was fixed some time ago).

The actual modification made here is probably smaller than this log entry,
and without added comments, certainly is!
2023-03-06 05:54:34 +00:00
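
The slash rule described in the entry above can be modelled in a few lines of shell. This is a model of the rule only, not the shell's C implementation: `tilde_join` is a hypothetical helper whose first argument stands for the expansion of the tilde prefix and whose second argument stands for the rest of the word.

```shell
#!/bin/sh
# Sketch only: tilde_join is a hypothetical helper modelling the rule.
# $1 = what the tilde prefix expanded to, $2 = the remainder of the word.
tilde_join() {
    case $1 in
    */)
        case $2 in
        /*) printf '%s%s\n' "${1%/}" "$2"; return ;;   # keep only one of the two slashes
        esac
        ;;
    esac
    printf '%s%s\n' "$1" "$2"
}

r1=$(tilde_join / /foo)          # models HOME=/ with the word ~/foo
r2=$(tilde_join /home/u /foo)    # ordinary case, nothing to drop
echo "$r1 $r2"                   # prints /foo /home/u/foo
```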
kre 64c7eea502 Allow (but do not require) the magic '--' option terminator in
the builtin 'alias' command.   This allows portability (not that
anyone should really care with aliases) for scripts from other
shells in which the alias command has options, and the -- is
required to allow the first alias name to begin with a '-'.

That is, for us, alias -x='echo x'  works fine, always has,
and still does.   But other shells treat that as an attempt
to use the -x option (and maybe -= etc), and require
alias -- -x='echo x'.   For us that variant used to complain
about the alias -- not existing (as an arg with no '=' is
treated as a request to extract the value of the alias).

Posix also generally requires all standard commands (or
Posix also generally requires all standard commands (of
which "alias" is one, unfortunately) to support '--' even
if they have no options, for precisely this reason.
2023-02-24 19:04:54 +00:00
kre d1668f2837 More markup errors. \+ was intended to be \&+ and .EV .Ev of course.
As best I can tell, the rest of what mandoc -Wall complains about is
incorrect (it could probably be avoided by adding more markup, but
there doesn't seem to be any point).
2022-12-20 17:51:54 +00:00
kre d51e0a2e8a Using .Cm Cm makes no sense at all - no idea what I was thinking there
(perhaps just an editing error).
2022-12-20 16:48:57 +00:00
uwe c14a7046e4 sh(1): Fix markup. -compact must be last. 2022-12-20 01:18:42 +00:00
kre e2731928b1 It appears that POSIX intends to add a -d X option to the read command
in its next version, so it can be used as -d '' (to specify a \0 end
character for the record read, rather than the default \n) to accompany
find -print0 and xargs -0 options (also likely to be added).

Add support for -d now.   While here fix a bug where escaped nul
chars (\ \0) in non-raw mode were not being dropped, as they are
when not escaped (if not dropped, they're still not used in any
useful way, they just ended the value at that point).
2022-12-11 08:23:10 +00:00
kre 1c627bdffa PR bin/57053 is related (peripherally) here.
sh has been remembering the process group of a job for a while now, but
using that for almost nothing.

The old way to resume a job, was to try each pid in the job with a
SIGCONT (using it as the process group identifier via killpg()) until
one worked (or none did, in which case resuming would be impossible,
but that never actually happened).   This wasn't as bad as it seems,
as in practice the first process attempted was *always* the correct
one.  Why the loop was considered necessary I am not sure.  Nothing
but the first could possibly work.

This worked until a fix for an obscure possible bug was added a
while ago - now a process which has already finished, and had its
zombie collected via wait*() is no longer ever considered to have
a pid which is a candidate for use in any system call.  That's
because the kernel might have reassigned that pid for some newly
created process (we have no idea how much time might have passed
since the pid was returned to the kernel for reuse, it might have
happened weeks ago).

This is where the example in bin/57053 revealed a problem.

That PR is really about a quite different problem in zsh (from pkgsrc)
and should be pkg/57053, but as the test case also hit the problem
here, it was assumed (by some) they were the same issue.

The example is (in a small directory)
	ls | less
which is then suspended (^Z), and resumed (fg).   Since the directory
is small, ls will be finished, and reaped by sh - so the code would
now refuse to use its pid for the killpg() call to send the SIGCONT.
The (useless) loop would attempt to use less's pid for this purpose
(it is still alive at this point) but that would fail, as that pid
is not a process group identifier, of anything.   Hence the job
could not be resumed.

Before the PR (or preceding mailing list discussion) the change here
had already been made (part of a much bigger set of changes, some of
which might follow - sometime).   We now actually use the job's
remembered process group identifier when we want the process group
identifier, instead of trying to guess which pid it happens to be
(which actually never took any guessing, it was, and is always the
pid of the first process created for the job).   A couple of minor
fixes to how the pgrp is obtained, and used, accompany the changes
to use it when appropriate.
2022-10-30 01:46:16 +00:00
kre ec520a6a57 Note in the description of "jobs -p" that the process id returned is
also the process group identifier (that's a requirement from POSIX, and
is what we have always done - just not been explicit about in sh.1).
Add a note that this value and $! are not necessarily the same (currently,
and perhaps forever, never the same in a pipeline with 2 or more elements).
2022-10-30 01:19:08 +00:00
kre 9eccf618bb Oops, somehow managed to commit an older version where NBSH_INVOCATION
start char was '@' rather than '!' (which meant not lexically ordered).
This is how it was intended to be (and is documented).
2022-09-18 17:11:33 +00:00
kre e7b0505e69 Add the -l option (aka -o login): be a login shell. Meaningful only on
the command line (with both - and + forms) - overrides the presence (or
otherwise) of a '-' as argv[0][0].

Since this allows any shell to be a login shell (which simply means that
it runs /etc/profile and ~/.profile at shell startup - there are no other
side effects) add a new, always set at startup, variable NBSH_INVOCATION
which has a char string as its value, where each char has a meaning,
more or less related to how the shell was started.   See sh(1).
This is intended to allow those startup scripts to tailor their behaviour
to the nature of this particular login shell (it is possible to detect
whether a shell is a login shell merely because of -l, or whether it would
have been anyway, before the -l option was added - and more).   The
var could also be used to set different values for $ENV for different
uses of the shell.
2022-09-18 06:03:19 +00:00
kre e1f658a30d More wording improvements. There might be more to come. 2022-09-16 19:25:09 +00:00
kre f9baa2eaae Minor wording improvements.
Note these do not alter anything about what the man page specifies,
just say a couple of things in a slightly better way, hence no Dd
update accompanies this change (deliberately).
2022-09-16 17:32:18 +00:00
kre c5917dbfa8 Move a comment that used to be in the correct place, once upon a time,
back where it belongs, and make it stand out more, so other text is
less likely to find itself pushed between the comment and the text
to which it applies.   This change should make no visible difference
to the man page displayed.
2022-09-16 17:29:21 +00:00
kre 65d376a4f6 Whitespace. 2022-09-16 17:25:09 +00:00
kre 258641d776 Correct spelling of terminal (it doesn't have a 2nd m). 2022-09-15 18:00:36 +00:00
kre f5b9e602e2 Add debugging trace points for history and the editline interface.
NFC for any normal shell (not compiled with debugging (sh DEBUG) enabled).

We have had a defined debug mode for this for years, but since I have
not often played in this arena, never used it.  Until recently (relatively).
This (or a small part of it) played a part in discovering the fc -e
bug cause.   I have had it in my tree a while now - recent changes
kept causing merge conflicts (all because I hadn't bothered to commit
this), so I think now is the time...
2022-08-22 17:33:11 +00:00
nia 29c67fe23d sh(1): revert previous because it interferes with custom user bindings 2022-08-21 21:35:36 +00:00
kre 16e67595a2 Improve the description of the read builtin command. 2022-08-19 13:37:03 +00:00
kre a1f6605654 Don't output the error for bad usage (no var name given)
after already writing the prompt (set with the -p option).

That results in nonsense like:

	$ read -p foo
	fooread: arg count

While here, improve the error message so it means something.

Now we will get:

	$ read -p foo
	read: variable name required
	Usage: read [-r] [-p prompt] var...

[Detected by code reading while doing the work for the previous fix]
2022-08-19 12:52:31 +00:00
kre 143e8d551e PR bin/56972 Fix escape ('\') handling in sh read builtin.
In 1.35 (March 2005) (the big read fixup), most escape handling and IFS
processing in the read builtin was corrected.  However 2 cases were missed,
one is a word (something to be assigned to any variable but the last) in
which every character is escaped (the code was relying on a non-escaped char
to set the "in a word" status), and second trailing IFS whitespace at
the end of the line was being deleted, even if the chars had been escaped
(the escape chars are no longer present).

See the PR for more details (including the case that detected the problem).

After fixing this, I looked at the FreeBSD code (normally might do it
before, but these fixes were trivial) to check their implementation.
Their code does similar things to what ours now does, but in a completely
different way, their read builtin is more complex than ours needs to
be (they handle more options).   For anyone tempted to simply incorporate
their code, note that it relies upon infrastructure changes elsewhere
in the shell, so would not be a simple cut and drop in exercise.

This needs pullups to -3 -4 -5 -6 -7 -8 and -9 (fortunately this is
happening before -10 is branched, so will never be broken this way there).
2022-08-19 12:17:18 +00:00
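
For readers unfamiliar with escape handling in read, the sketch below shows the ordinary behaviour that the entry above builds on: without -r, a backslash removes the special meaning of the following character, so an escaped space does not end a field. It does not reproduce the two corner cases fixed by the commit, and the variable names are made up for the example.

```shell
#!/bin/sh
# Sketch only: line, x and y are hypothetical names.
line='one\ two three'

# Feed the line to read through a here-document, so that x and y are set in
# the current shell rather than in a pipeline subshell.
read x y <<EOF
$line
EOF

printf '[%s] [%s]\n' "$x" "$y"   # prints [one two] [three]
```

The escaped space ends up inside the first field, and the backslash itself is removed from the stored value.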
nia 16c8f5ddd4 sh(1): Allow an explicit set -o vi or set -o emacs to override ~/.editrc 2022-08-18 14:10:05 +00:00
nia 635122b300 sh(1): Assign the tab completion key binding last so a user having
"bind -v" or "bind -e" in ~/.editrc doesn't cause tab completion
to no longer function.
2022-08-17 22:27:17 +00:00
andvar ff23aff6ad fix various typos in comments, documentation and messages. 2022-05-31 08:43:13 +00:00
kre 21f0086877 Introduce a new macro JNUM to replace the idiom jp-jobtab+1
(the job number, given jp a pointer to a jobs table entry)
used open coded previously in many places (mostly in DEBUG mode
trace messages, so not included in most shells, but there are
a few others).

Make the type of JNUM() be int rather than the ptrdiff_t the
open coded version became ... which when used in some printf()
type function arg list was cast to some other arbitrary (but not
consistent) int type for which there is a standard %Xd type
format conversion.   Now we can (and do) just use %d for this.

If the number of jobs ever exceeds the range of an int, we would
have far more serious problems than the broken output this would
cause.

While here improve a comment or two, and use JOBRUNNING instead
of 0 where the intent is the former (JOBRUNNING is #defined as 0).

NFCI.
2022-04-18 06:02:27 +00:00
andvar e2710f6fc4 fix various typos in comments. 2022-04-17 21:24:52 +00:00