.\" $NetBSD: script.7,v 1.4 2007/03/01 19:35:47 wiz Exp $ .\" .\" Copyright (c) 2005 The NetBSD Foundation, Inc. .\" All rights reserved. .\" .\" This document was originally contributed to The NetBSD Foundation .\" by Perry E. Metzger of Metzger, Dowdeswell & Co. LLC. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. All advertising materials mentioning features or use of this software .\" must display the following acknowledgement: .\" This product includes software developed by the NetBSD .\" Foundation, Inc. and its contributors. .\" 4. Neither the name of The NetBSD Foundation nor the names of its .\" contributors may be used to endorse or promote products derived .\" from this software without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS .\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED .\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR .\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS .\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR .\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF .\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS .\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN .\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) .\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE .\" POSSIBILITY OF SUCH DAMAGE. .\" .Dd May 6, 2005 .Dt SCRIPT 7 .Os .Sh NAME .Nm script .Nd interpreter script execution .Sh DESCRIPTION The system is capable of treating a text file containing commands intended for an interpreter, such as .Xr sh 1 or .Xr awk 1 , as an executable program. .Pp An .Dq interpreter script is a file which has been set executable (see .Xr chmod 2 ) and which has a first line of the form: .Pp .D1 Li #! Ar pathname Op Ar argument .Pp The .Dq #! must appear as the first two characters of the file. A space between the .Dq #! and .Ar pathname is optional. At most one .Ar argument may follow .Ar pathname , and the length of the entire line is limited (see below). .Pp If such a file is executed (such as via the .Xr execve 2 system call), the interpreter specified by the .Ar pathname is executed by the system. (The .Ar pathname is executed without regard to the .Ev PATH variable, so in general .Ar pathname should be an absolute path.) .Pp The arguments passed to the interpreter will be as follows. .Va argv[0] will be the path to the interpreter itself, as specified on the first line of the script. If there is an .Ar argument following .Ar pathname on the first line of the script, it will be passed as .Va argv[1] . The subsequent elements of .Va argv will be the path to the interpreter script file itself (i.e. the original .Va argv[0] ) followed by any further arguments passed when .Xr execve 2 was invoked to execute the script file. .Pp By convention, it is expected that an interpreter will open the script file passed as an argument and process the commands within it. Typical interpreters treat .Sq # as a comment character, and thus will ignore the initial line of the script because it begins .Dq #! , but there is no requirement for this per se. .Pp On .Nx , the length of the .Dq #! line, excluding the .Dq #! itself, is limited to .Dv PATH_MAX (as defined in .Aq Pa limits.h ) . Other operating systems impose much smaller limits on the length of the .Dq #! line (see below). .Pp Note that the interpreter may not itself be an interpreter script. If .Ar pathname does not point to an executable binary, execution of the interpreter script will fail. .Ss Trampolines and Portable Scripts Different operating systems often have interpreters located in different locations, and the kernel executes the passed interpreter without regard to the setting of environment variables such as .Ev PATH . This makes it somewhat challenging to set the .Dq #! line of a script so that it will run identically on different systems. .Pp Since the .Xr env 1 utility executes a command passed to it on its command line, it is often used as a .Dq trampoline to render scripts portable. If the leading line of a script reads .Dl #! /usr/bin/env interp then the .Xr env 1 command will execute the .Dq interp command it finds in its .Ev PATH , passing on to it all subsequent arguments with which it itself was called. Since .Pa /usr/bin/env is found on almost all .Tn POSIX style systems, this trick is frequently exploited by authors who need a script to execute without change on multiple systems. .Ss Historical Note: Scripts without Dq #! Shell scripts predate the invention of the .Dq #! convention, which is implemented in the kernel. In the days of .At v7 , there was only one interpreter used on the system, .Pa /bin/sh , and the shell treated any file that failed to execute with an .Er ENOEXEC error (see .Xr intro 2 ) as a shell script. .Pp Most shells (such as .Xr sh 1 ) and certain other facilities (including .Xr execlp 3 and .Xr execvp 3 but not other types of .Xr exec 3 calls) still pass interpreter scripts that do not include the .Dq #! (and thus fail to execute with .Er ENOEXEC ) to .Pa /bin/sh . .Pp As this behavior is implemented outside the kernel, there is no mechanism that forces it to be respected by all programs that execute other programs. It is thus not completely reliable. It is therefore important to always include .Dl #!/bin/sh in front of Bourne shell scripts, and to treat the traditional behavior as obsolete. .Sh EXAMPLES Suppose that an executable binary exists in .Pa /bin/interp and that the file .Pa /tmp/script contains: .Bd -literal -offset indent #!/bin/interp -arg [...] .Ed .Pp and that .Pa /tmp/script is set mode 755. .Pp Executing .Pp .Dl $ /tmp/script one two three .Pp at the shell will result in .Pa /bin/interp being executed, receiving the following arguments in .Va argv (numbered from 0): .Pp .Bd -ragged -offset indent .Qq /bin/interp , .Qq "-arg" , .Qq /tmp/script , .Qq one , .Qq two , .Qq three .Ed .Ss Portability Note: Multiple arguments The behavior of multiple arguments on the .Dq #! line is highly non-portable between different systems. In general, only one argument can be assumed to work consistently. .Pp Consider the following variation on the previous example. Suppose that an executable binary exists in .Pa /bin/interp and that the file .Pa /tmp/script contains: .Bd -literal -offset indent #!/bin/interp -x -y [...] .Ed .Pp and that .Pa /tmp/script is set mode 755. .Pp Executing .Pp .Dl $ /tmp/script one two three .Pp at the shell will result in .Pa /bin/interp being executed, receiving the following arguments in .Va argv (numbered from 0): .Pp .Bd -ragged -offset indent .Qq /bin/interp , .Qq "-x -y" , .Qq /tmp/script , .Qq one , .Qq two , .Qq three .Ed .Pp Note that .Qq "-x -y" will be passed on .Nx as a single argument. .Pp Although most .Tn POSIX style operating systems will pass only one .Ar argument , the behavior when multiple arguments are included is not consistent between platforms. Some, such as current releases of .Nx , will concatenate multiple arguments into a single argument (as above), some will truncate them, and at least one will pass them as multiple arguments. .Pp The .Nx behavior is common but not universal. Sun's .Tn Solaris would present the above argument as .Qq -x , dropping the .Qq " -y" entirely. Perhaps uniquely, recent versions of Apple's .Tn OS X will actually pass multiple arguments properly, i.e.: .Bd -ragged -offset indent .Qq /bin/interp , .Qq -x , .Qq -y , .Qq /tmp/script , .Qq one , .Qq two , .Qq three .Ed .Pp The behavior of the system in the face of multiple arguments is thus not currently standardized, should not be relied on, and may be changed in future releases. In general, pass at most one argument, and do not rely on multiple arguments being concatenated. .Sh SEE ALSO .Xr awk 1 , .Xr csh 1 , .Xr ksh 1 , .Xr sh 1 , .Xr chmod 2 , .Xr execve 2 , .Xr intro 2 , .Xr execlp 3 , .Xr execvp 3 , .Xr fd 4 , .Xr options 4 , .Xr setuid 7 .Sh STANDARDS The behavior of interpreter scripts is obliquely referred to, but never actually described in, .St -p1003.1-2004 . .Pp The behavior is partially (but not completely) described in the .St -svid4 . .Pp Although it has never been formally standardized, the behavior described is largely portable across .Tn POSIX style systems, with two significant exceptions: the maximum length of the .Dq #! line, and the behavior if multiple arguments are passed. Please be aware that some operating systems limit the line to 32 or 64 characters, and that (as described above) the behavior in the face of multiple arguments is not consistent across systems. .Sh HISTORY The behavior of the kernel when encountering scripts that start in .Dq #! was not present in .At v7 . A Usenet posting to net.unix by Guy Harris on October 16, 1984 claims that the idea for the .Dq #! behavior was first proposed by Dennis Ritchie but that the first implementation was on .Bx . .Pp Historical manuals (specifically the exec man page) indicate that the behavior was present in .Bx 4 at least as early as April, 1981. Information on precisely when it was first implemented, and in which version of .Ux , is solicited. .Sh SECURITY CONSIDERATIONS Numerous security problems are associated with setuid interpreter scripts. .Pp In addition to the fact that many interpreters (and scripts) are simply not designed to be robust in a setuid context, a race condition exists between the moment that the kernel examines the interpreter script file and the moment that the newly invoked interpreter opens the file itself. .Pp Because of these security issues, .Nx does not allow setuid interpreter scripts by default. In order to turn on setuid interpreter scripts, .D1 Cd options SETUIDSCRIPTS must be set in the configuration of the running kernel. Setting this option implies the .Cd FDSCRIPTS option, which causes the kernel to open the script file on behalf of the interpreter and pass it in .Va argv as .Pa /dev/fd/[fdnum] . (See .Xr fd 4 for an explanation of the .Pa /dev/fd/[fdnum] devices.) This design avoids the race condition, at the cost of denying the interpreter the actual name of the script file. See .Xr options 4 for more information. .Pp However, the .Cd FDSCRIPTS mechanism is not a cure-all for security issues in setuid interpreters and scripts. Subtle techniques can be used to subvert even seemingly well written scripts. Scripts executed by Bourne type shells can be subverted in numerous ways, such as by setting the .Ev IFS variable before executing the script. Other interpreters possess their own vulnerabilities. Turning on .Cd SETUIDSCRIPTS is therefore very dangerous, and should not be done lightly if at all.