% $Id: manual.tex,v 1.56 2002/06/06 12:49:28 roberto Exp roberto $ \documentclass[11pt,twoside,draft]{article} \usepackage{fullpage} \usepackage{bnf} \usepackage{graphicx} % no need for subscripts... \catcode`\_=12 %\newcommand{\See}[1]{Section~\ref{#1}} \newcommand{\See}[1]{\S\ref{#1}} %\newcommand{\see}[1]{(see~\See{#1} on page \pageref{#1})} \newcommand{\see}[1]{(see~\See{#1})} \newcommand{\seepage}[1]{(see page~\pageref{#1})} \newcommand{\M}[1]{{\rm\emph{#1}}} \newcommand{\T}[1]{{\tt #1}} \newcommand{\Math}[1]{$#1$} \newcommand{\nil}{{\bf nil}} \newcommand{\False}{{\bf false}} \newcommand{\True}{{\bf true}} %\def\tecgraf{{\sf TeC\kern-.21em\lower.7ex\hbox{Graf}}} \def\tecgraf{{\sf Tecgraf}} \newcommand{\Index}[1]{#1\index{#1@{\lowercase{#1}}}} \newcommand{\IndexVerb}[1]{\T{#1}\index{#1@{\tt #1}}} \newcommand{\IndexEmph}[1]{\emph{#1}\index{#1@{\lowercase{#1}}}} \newcommand{\IndexTM}[1]{\index{#1 event@{``#1'' event}}\index{tag method!#1}} \newcommand{\Def}[1]{\emph{#1}\index{#1}} \newcommand{\IndexAPI}[1]{\T{#1}\DefAPI{#1}} \newcommand{\IndexLIB}[1]{\T{#1}\DefLIB{#1}} \newcommand{\DefLIB}[1]{\index{#1@{\tt #1}}} \newcommand{\DefAPI}[1]{\index{C API!#1@{\tt #1}}} \newcommand{\IndexKW}[1]{\index{keywords!#1@{\tt #1}}} \newcommand{\ff}{$\bullet$\ } \newcommand{\Version}{5.0 (alpha)} % changes to bnf.sty by LHF \renewcommand{\Or}{$|$ } \renewcommand{\rep}[1]{{\rm\{}\,#1\,{\rm\}}} \renewcommand{\opt}[1]{{\rm [}\,#1\,{\,\rm]}} \renewcommand{\ter}[1]{{\rm`{\tt#1}'}} \newcommand{\Nter}[1]{{\tt#1}} \newcommand{\NOTE}{\par\medskip\noindent\emph{NOTE}: } \makeindex \begin{document} %{=============================================================== \thispagestyle{empty} \pagestyle{empty} { \parindent=0pt \vglue1.5in {\LARGE\bf The Programming Language Lua} \hfill \vskip4pt \hrule height 4pt width \hsize \vskip4pt \hfill Reference Manual for Lua version \Version \\ \null \hfill Last revised on \today \\ \vfill \centering \includegraphics[width=0.7\textwidth]{nolabel.ps} 
\vfill \vskip4pt \hrule height 2pt width \hsize } \newpage \begin{quotation} \parskip=10pt \parindent=0pt \footnotesize \null\vfill \noindent Copyright \copyright\ 2002 Tecgraf, PUC-Rio. All rights reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. Copies of this manual can be obtained at Lua's official web site, \verb|www.lua.org|. \bigskip The Lua logo was designed by A. Nakonechny. Copyright \copyright\ 1998. All rights reserved. 
\end{quotation} %}=============================================================== \newpage \title{\Large\bf Reference Manual of the Programming Language Lua \Version} \author{% Roberto Ierusalimschy\qquad Luiz Henrique de Figueiredo\qquad Waldemar Celes \vspace{1.0ex}\\ \smallskip \small\tt lua@tecgraf.puc-rio.br \vspace{2.0ex}\\ %MCC 08/95 --- \tecgraf\ --- Computer Science Department --- PUC-Rio } %\date{{\small \tt\$Date: 2002/06/06 12:49:28 $ $}} \maketitle \pagestyle{plain} \pagenumbering{roman} \begin{abstract} \noindent Lua is a powerful, light-weight programming language designed for extending applications. Lua is also frequently used as a general-purpose, stand-alone language. Lua combines simple procedural syntax (similar to Pascal) with powerful data description constructs based on associative arrays and extensible semantics. Lua is dynamically typed, interpreted from opcodes, and has automatic memory management with garbage collection, making it ideal for configuration, scripting, and rapid prototyping. This document describes version \Version\ of the Lua programming language and the Application Program Interface (API) that allows interaction between Lua programs and their host C~programs. \end{abstract} \def\abstractname{Resumo} \begin{abstract} \noindent Lua \'e uma linguagem de programa\c{c}\~ao poderosa e leve, projetada para estender aplica\c{c}\~oes. Lua tamb\'em \'e frequentemente usada como uma linguagem de prop\'osito geral. Lua combina programa\c{c}\~ao procedural (com sintaxe semelhante \`a de Pascal) com poderosas constru\c{c}\~oes para descri\c{c}\~ao de dados, baseadas em tabelas associativas e sem\^antica extens\'\i vel. Lua \'e tipada dinamicamente, interpretada a partir de \emph{opcodes}, e tem gerenciamento autom\'atico de mem\'oria com coleta de lixo. Essas caracter\'{\i}sticas fazem de Lua uma linguagem ideal para configura\c{c}\~ao, automa\c{c}\~ao (\emph{scripting}) e prototipagem r\'apida. 
Este documento descreve a vers\~ao \Version\ da linguagem de programa\c{c}\~ao Lua e a Interface de Programa\c{c}\~ao (API) que permite a intera\c{c}\~ao entre programas Lua e programas C~hospedeiros. \end{abstract} \newpage \null \newpage \tableofcontents \newpage \setcounter{page}{1} \pagestyle{plain} \pagenumbering{arabic} %------------------------------------------------------------------------------ \section{Introduction} Lua is an extension programming language designed to support general procedural programming with data description facilities. Lua is intended to be used as a powerful, light-weight configuration language for any program that needs one. Lua is implemented as a library, written in C. Being an extension language, Lua has no notion of a ``main'' program: it only works \emph{embedded} in a host client, called the \emph{embedding program} or simply the \emph{host}. This host program can invoke functions to execute a piece of Lua code, can write and read Lua variables, and can register C~functions to be called by Lua code. Through the use of C~functions, Lua can be augmented to cope with a wide range of different domains, thus creating customized programming languages sharing a syntactical framework. Lua is free software, and is provided as usual with no guarantees, as stated in its copyright notice. The implementation described in this manual is available at Lua's official web site, \verb|www.lua.org|. Like any other reference manual, this document is dry in places. For a discussion of the decisions behind the design of Lua, see the papers below, which are available at Lua's web site. \begin{itemize} \item R.~Ierusalimschy, L.~H.~de Figueiredo, and W.~Celes. Lua---an extensible extension language. \emph{Software: Practice \& Experience} {\bf 26} \#6 (1996) 635--652. \item L.~H.~de Figueiredo, R.~Ierusalimschy, and W.~Celes. The design and implementation of a language for extending applications. 
\emph{Proceedings of XXI Brazilian Seminar on Software and Hardware} (1994) 273--283. \item L.~H.~de Figueiredo, R.~Ierusalimschy, and W.~Celes. Lua: an extensible embedded language. \emph{Dr. Dobb's Journal} {\bf 21} \#12 (Dec 1996) 26--33. \item R.~Ierusalimschy, L.~H.~de Figueiredo, and W.~Celes. The evolution of an extension language: a history of Lua, \emph{Proceedings of V Brazilian Symposium on Programming Languages} (2001) B-14--B-28. \end{itemize} %------------------------------------------------------------------------------ \section{Lua Concepts}\label{concepts} This section describes the main concepts of Lua as a language. The syntax and semantics of Lua are described in \See{language}. The discussion below is not purely conceptual; it includes references to the C~API \see{API}, because Lua is designed to be embedded in host programs. It also includes references to the standard libraries \see{libraries}. \subsection{Environment and Chunks} All statements in Lua are executed in a \Def{global environment}. This environment is initialized with a call from the embedding program to \verb|lua_open| and persists until a call to \verb|lua_close| or the end of the embedding program. If necessary, the host programmer can create multiple independent global environments, and freely switch between them \see{mangstate}. The unit of execution of Lua is called a \Def{chunk}. A chunk is simply a sequence of statements. Statements are described in \See{stats}. A chunk may be stored in a file or in a string inside the host program. When a chunk is executed, first it is pre-compiled into opcodes for a virtual machine, and then the compiled statements are executed by an interpreter for the virtual machine. All modifications a chunk effects on the global environment persist after the chunk ends. Chunks may also be pre-compiled into binary form and stored in files; see program \IndexVerb{luac} for details. 
Text files with chunks and their binary pre-compiled forms are interchangeable; Lua automatically detects the file type and acts accordingly. \index{pre-compilation}

\subsection{\Index{Values and Types}} \label{TypesSec}

Lua is a \emph{dynamically typed language}. That means that variables do not have types; only values do. There are no type definitions in the language. All values carry their own type.

There are seven \Index{basic types} in Lua: \Def{nil}, \Def{boolean}, \Def{number}, \Def{string}, \Def{function}, \Def{userdata}, and \Def{table}. \emph{Nil} is the type of the value \nil, whose main property is to be different from any other value; usually it represents the absence of a useful value. \emph{Boolean} is the type of the values \False{} and \True. In Lua, both \nil{} and \False{} make a condition fail, and any other value makes it succeed. \emph{Number} represents real (double-precision floating-point) numbers. \emph{String} represents arrays of characters. \index{eight-bit clean} Lua is 8-bit clean, and so strings may contain any 8-bit character, including embedded zeros (\verb|'\0'|) \see{lexical}.

Functions are \emph{first-class values} in Lua. That means that functions can be stored in variables, passed as arguments to other functions, and returned as results. Lua can call (and manipulate) functions written in Lua and functions written in C \see{functioncall}.

The type \emph{userdata} is provided to allow arbitrary C data to be stored in Lua variables. This type corresponds to a block of raw memory and has no pre-defined operations in Lua, except assignment and identity test. However, by using \emph{metatables}, the programmer can define operations for userdata values \see{metatables}. Userdata values cannot be created or modified in Lua, only through the C~API. This guarantees the integrity of data owned by the host program.
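
As a minimal sketch of the points above about dynamic typing, first-class functions, and truth values, consider the following fragment. The names \verb|inc|, \verb|twice|, and \verb|truth| are illustrative only and are not part of Lua; the only library function used is \verb|type|.
\begin{verbatim}
       -- values, not variables, carry types
       local v = 10
       local t1 = type(v)          -- "number"
       v = "ten"
       local t2 = type(v)          -- "string"

       -- functions are first-class values: they can be stored and passed
       local inc = function (n) return n + 1 end
       local twice = function (f, x) return f(f(x)) end
       local r = twice(inc, 5)     -- 7

       -- only nil and false make a condition fail
       local truth = function (c)
         if c then return "succeeds" else return "fails" end
       end
       local z = truth(0)          -- "succeeds"
       local e = truth("")         -- "succeeds"
       local n = truth(nil)        -- "fails"
\end{verbatim}
Note that the same variable \verb|v| holds a number and then a string; only the values have types.
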
The type \emph{table} implements \Index{associative arrays}, that is, \Index{arrays} that can be indexed not only with numbers, but with any value (except \nil). Moreover, tables can be \emph{heterogeneous}, that is, they can contain values of all types. Tables are the sole data structuring mechanism in Lua; they may be used not only to represent ordinary arrays, but also symbol tables, sets, records, graphs, trees, etc. To represent \Index{records}, Lua uses the field name as an index. The language supports this representation by providing \verb|a.name| as syntactic sugar for \verb|a["name"]|. There are several convenient ways to create tables in Lua \see{tableconstructor}. Like indices, the value of a table field can be of any type. In particular, because functions are first class values, table fields may contain functions. So, tables may also carry \emph{methods} \see{func-def}. Tables, functions, and userdata values are \emph{objects}: variables do not actually \emph{contain} these values, only \emph{references} to them. Assignment, parameter passing, and returns from functions always manipulate references to these values, and do not imply any kind of copy. The library function \verb|type| returns a string describing the type of a given value \see{pdf-type}. \subsubsection{Metatables} Each table or userdata object in Lua may have a \Index{metatable}. You can change several aspects of the behavior of an object by setting specific fields in its metatable. For instance, when an object is the operand of an addition, Lua checks for a function in the field \verb|"__add"| in its metatable. If it finds one, Lua calls that function to perform the addition. We call the keys in a metatable \Index{events}, and the values \Index{metamethods}. In the previous example, \verb|"add"| is the event, and the metamethod is the function that performs the addition. A metatable controls how an object behaves in arithmetic operations, order comparisons, concatenation, and indexing. 
A metatable can also define a function to be called when a userdata is garbage collected. \See{metatable} gives a detailed description of which events you can control with metatables.

You can query and change the metatable of an object through the \verb|setmetatable| and \verb|getmetatable| functions \see{pdf-getmetatable}.

\subsection{Coercion} \label{coercion}

Lua provides automatic conversion between string and number values at run time. Any arithmetic operation applied to a string tries to convert that string to a number, following the usual rules. Conversely, whenever a number is used where a string is expected, the number is converted to a string, in a reasonable format. The format is chosen so that a conversion from number to string then back to number reproduces the original number \emph{exactly}. For complete control of how numbers are converted to strings, use the \verb|format| function \see{format}.

\subsection{Variables}

There are two kinds of variables in Lua: global variables and local variables. Variables are assumed to be global unless explicitly declared local \see{localvar}. Before the first assignment, the value of a variable is \nil.

All global variables live as fields in ordinary Lua tables. Usually, globals live in a table called the \Index{table of globals}. However, a function can individually change its global table, so that all global variables in that function will refer to that table. This mechanism allows the creation of \Index{namespaces} and other modularization facilities.

\Index{Local variables} are lexically scoped. Therefore, local variables can be freely accessed by functions defined inside their scope \see{visibility}.

\subsection{Garbage Collection}\label{GC}

Lua does automatic memory management. That means that you do not have to worry about allocating memory for new objects and freeing it when the objects are no longer needed.
Lua manages memory automatically by running a \Index{garbage collector} from time to time and collecting all dead objects (all objects that are no longer accessible from Lua). All objects in Lua are subject to automatic management: tables, userdata, functions, and strings.

Using the C~API, you can set garbage-collector metamethods for userdata \see{metatable}. When it is about to free a userdata, Lua calls the metamethod associated with event \verb|gc| in the userdata's metatable. Using this facility, you can coordinate Lua's garbage collection with external resource management (such as closing files, network or database connections, or freeing your own memory).

Lua uses two numbers to control its garbage-collection cycles. One number counts how many bytes of dynamic memory Lua is using, and the other is a threshold. When the number of bytes crosses the threshold, Lua runs the garbage collector, which reclaims the memory of all dead objects. The byte counter is corrected, and then the threshold is reset to twice the value of the byte counter.

Through the C~API, you can query those numbers, and change the threshold \see{GC-API}. Setting the threshold to zero actually forces an immediate garbage-collection cycle, while setting it to a huge number effectively stops the garbage collector. From Lua code, you have more limited control over garbage-collection cycles, through the functions \verb|gcinfo| and \verb|collectgarbage| \see{predefined}.

\subsubsection{Weak Tables}\label{weak-table}

A \IndexEmph{weak table} is a table whose elements are \IndexEmph{weak references}. A weak reference is ignored by the garbage collector. In other words, if the only references to an object are weak references, then the garbage collector will collect that object.

A weak table can have weak keys, weak values, or both. A table with weak keys allows the collection of its keys, but prevents the collection of its values.
A table with both weak keys and weak values allows the collection of both keys and values. In any case, if either the key or the value is collected, the whole pair is removed from the table. The weakness of a table is set with the \verb|setmode| function.

%------------------------------------------------------------------------------
\section{The Language}\label{language}

This section describes the lexis, the syntax, and the semantics of Lua. In other words, this section describes which tokens are valid, how they can be combined, and what their combinations mean.

\subsection{Lexical Conventions} \label{lexical}

\IndexEmph{Identifiers} in Lua can be any string of letters, digits, and underscores, not beginning with a digit. This coincides with the definition of identifiers in most languages. (The definition of letter depends on the current locale: any character considered alphabetic by the current locale can be used in an identifier.) The following \IndexEmph{keywords} are reserved, and cannot be used as identifiers: \index{reserved words}
\begin{verbatim}
       and       break     do        else      elseif
       end       false     for       function  global
       if        in        local     nil       not
       or        repeat    return    then      true
       until     while
\end{verbatim}

Lua is a case-sensitive language: \T{and} is a reserved word, but \T{And} and \T{\'and} (if the locale permits) are two different, valid identifiers. As a convention, identifiers starting with an underscore followed by uppercase letters (such as \verb|_VERSION|) are reserved for internal variables.

The following strings denote other \Index{tokens}:
\begin{verbatim}
       +     -     *     /     ^     %
       ~=    <=    >=    <     >     ==    =
       (     )     {     }     [     ]
       ;     :     ,     .     ..    ...
\end{verbatim}

\IndexEmph{Literal strings} can be delimited by matching single or double quotes, and can contain the C-like escape sequences `\verb|\a|' (bell), `\verb|\b|' (backspace), `\verb|\f|' (form feed), `\verb|\n|' (newline), `\verb|\r|' (carriage return), `\verb|\t|' (horizontal tab), `\verb|\v|' (vertical tab), `\verb|\\|' (backslash), `\verb|\"|' (double quote), `\verb|\'|' (single quote), and `\verb|\|\emph{newline}' (that is, a backslash followed by a real newline, which results in a newline in the string). A character in a string may also be specified by its numerical value, through the escape sequence `\verb|\|\emph{ddd}', where \emph{ddd} is a sequence of up to three \emph{decimal} digits. Strings in Lua may contain any 8-bit value, including embedded zeros, which can be specified as `\verb|\0|'.

Literal strings can also be delimited by matching \verb|[[| $\ldots$ \verb|]]|. Literals in this bracketed form may run for several lines, may contain nested \verb|[[| $\ldots$ \verb|]]| pairs, and do not interpret escape sequences. For convenience, when the opening \verb|[[| is immediately followed by a newline, the newline is not included in the string. That form is especially convenient for writing strings that contain program pieces or other quoted strings.

As an example, in a system using ASCII (in which `\verb|a|' is coded as~97, newline is coded as~10, and `\verb|1|' is coded as~49), the four literals below denote the same string:
\begin{verbatim}
  1)   "alo\n123\""
  2)   '\97lo\10\04923"'
  3)   [[alo
       123"]]
  4)   [[
       alo
       123"]]
\end{verbatim}

\IndexEmph{Numerical constants} may be written with an optional decimal part and an optional decimal exponent.
Examples of valid numerical constants are
\begin{verbatim}
       3     3.0     3.1416  314.16e-2   0.31416E1
\end{verbatim}

\IndexEmph{Comments} start anywhere outside a string with a double hyphen (\verb|--|). If the text immediately after \verb|--| is different from \verb|[[|, the comment is a short comment, which runs until the end of the line. Otherwise, it is a long comment, which runs until the corresponding \verb|]]|. Long comments may run for several lines, and may contain nested \verb|[[| $\ldots$ \verb|]]| pairs.

For convenience, the first line of a chunk is skipped if it starts with \verb|#|. This facility allows the use of Lua as a script interpreter in Unix systems \see{lua-sa}.

\subsection{Variables}\label{variables}

Variables are places that store values.
%In Lua, variables are given by simple identifiers or by table fields.

A single name can denote a global variable, a local variable, or a formal parameter in a function (formal parameters are just local variables):
\begin{Produc}
\produc{var}{\Nter{Name}}
\end{Produc}%
Square brackets are used to index a table:
\begin{Produc}
\produc{var}{prefixexp \ter{[} exp \ter{]}}
\end{Produc}%
The first expression should result in a table value, and the second expression identifies a specific entry inside that table.

The syntax \verb|var.NAME| is just syntactic sugar for \verb|var["NAME"]|:
\begin{Produc}
\produc{var}{prefixexp \ter{.} \Nter{Name}}
\end{Produc}%

The expression denoting the table to be indexed has a restricted syntax; see \See{expressions} for details.

The meaning of assignments and evaluations of global and indexed variables can be changed via metatables. An assignment to a global variable \verb|x = val| is equivalent to the assignment \verb|_glob.x = val|, where \verb|_glob| is the table of globals of the running function (see \See{global-table} for a discussion about the table of globals). An assignment to an indexed variable \verb|t[i] = val| is equivalent to \verb|settable_event(t,i,val)|.
An access to a global variable \verb|x| is equivalent to \verb|_glob.x| (again, see \See{global-table} for a discussion about \verb|_glob|). An access to an indexed variable \verb|t[i]| is equivalent to a call \verb|gettable_event(t,i)|. See \See{metatable} for a complete description of the \verb|settable_event| and \verb|gettable_event| functions. (These functions are not defined in Lua. We use them here only for explanatory purposes.)

\subsection{Statements}\label{stats}

Lua supports an almost conventional set of \Index{statements}, similar to those in Pascal or C. The conventional commands include assignment, control structures, and procedure calls. Non-conventional commands include table constructors and variable declarations.

\subsubsection{Chunks}\label{chunks}

The unit of execution of Lua is called a \Def{chunk}. A chunk is simply a sequence of statements, which are executed sequentially. Each statement can be optionally followed by a semicolon:
\begin{Produc}
\produc{chunk}{\rep{stat \opt{\ter{;}}}}
\end{Produc}%

\subsubsection{Blocks}

A \Index{block} is a list of statements; syntactically, a block is equal to a chunk:
\begin{Produc}
\produc{block}{chunk}
\end{Produc}%

A block may be explicitly delimited to produce a single statement:
\begin{Produc}
\produc{stat}{\rwd{do} block \rwd{end}}
\end{Produc}%
\IndexKW{do}
Explicit blocks are useful to control the scope of variable declarations. Explicit blocks are also sometimes used to add a \rwd{return} or \rwd{break} statement in the middle of another block \see{control}.

\subsubsection{\Index{Assignment}} \label{assignment}

Lua allows \Index{multiple assignment}. Therefore, the syntax for assignment defines a list of variables on the left side and a list of expressions on the right side.
The elements in both lists are separated by commas:
\begin{Produc}
\produc{stat}{varlist1 \ter{=} explist1}
\produc{varlist1}{var \rep{\ter{,} var}}
\produc{explist1}{exp \rep{\ter{,} exp}}
\end{Produc}%
Expressions are discussed in \See{expressions}.

Before the assignment, the list of values is \emph{adjusted} to the length of the list of variables.\index{adjustment} If there are more values than needed, the excess values are thrown away. If there are fewer values than needed, the list is extended with as many \nil's as needed. If the list of expressions ends with a function call, then all values returned by that function call enter in the list of values, before the adjustment (except when the call is enclosed in parentheses; see \See{expressions}).

The assignment statement first evaluates all its expressions, and only then makes the assignments. So, the code
\begin{verbatim}
       i = 3
       i, a[i] = i+1, 20
\end{verbatim}
sets \verb|a[3]| to 20, without affecting \verb|a[4]|, because the \verb|i| in \verb|a[i]| is evaluated before it is assigned 4. Similarly, the line
\begin{verbatim}
       x, y = y, x
\end{verbatim}
exchanges the values of \verb|x| and \verb|y|.

\subsubsection{Control Structures}\label{control}

The control structures \rwd{if}, \rwd{while}, and \rwd{repeat} have the usual meaning and familiar syntax:
\index{while-do statement}\IndexKW{while}
\index{repeat-until statement}\IndexKW{repeat}\IndexKW{until}
\index{if-then-else statement}\IndexKW{if}\IndexKW{else}\IndexKW{elseif}
\begin{Produc}
\produc{stat}{\rwd{while} exp \rwd{do} block \rwd{end}}
\produc{stat}{\rwd{repeat} block \rwd{until} exp}
\produc{stat}{\rwd{if} exp \rwd{then} block \rep{\rwd{elseif} exp \rwd{then} block} \opt{\rwd{else} block} \rwd{end}}
\end{Produc}%
Lua also has a \rwd{for} statement, in two flavors \see{for}.

The \Index{condition expression} \M{exp} of a control structure may return any value.
All values different from \nil{} and \False{} are considered true (in particular, the number 0 and the empty string are also true); both \False{} and \nil{} are considered false.

The \rwd{return} statement is used to return values from a function or from a chunk.\IndexKW{return} \label{return}%
\index{return statement}%
Functions and chunks may return more than one value, and so the syntax for the \rwd{return} statement is
\begin{Produc}
\produc{stat}{\rwd{return} \opt{explist1}}
\end{Produc}%

The \rwd{break} statement can be used to terminate the execution of a \rwd{while}, \rwd{repeat}, or \rwd{for} loop, skipping to the next statement after the loop:\IndexKW{break} \index{break statement}
\begin{Produc}
\produc{stat}{\rwd{break}}
\end{Produc}%
A \rwd{break} ends the innermost enclosing loop.

\NOTE For syntactic reasons, \rwd{return} and \rwd{break} statements can only be written as the \emph{last} statement of a block. If it is really necessary to \rwd{return} or \rwd{break} in the middle of a block, then an explicit inner block can be used, as in the idioms `\verb|do return end|' and `\verb|do break end|', because now \rwd{return} and \rwd{break} are the last statements in their (inner) blocks. In practice, those idioms are only used during debugging. (For instance, a line `\verb|do return end|' can be added at the beginning of a chunk for syntax checking only.)

\subsubsection{For Statement} \label{for}\index{for statement}

The \rwd{for} statement has two forms, one for numbers and one generic. \IndexKW{for}\IndexKW{in}

The numerical \rwd{for} loop repeats a block of code while a control variable runs through an arithmetic progression. It has the following syntax:
\begin{Produc}
\produc{stat}{\rwd{for} \Nter{Name} \ter{=} exp \ter{,} exp \opt{\ter{,} exp} \rwd{do} block \rwd{end}}
\end{Produc}%
The \emph{block} is repeated for \emph{name} starting at the value of the first \emph{exp}, until it reaches the second \emph{exp} by steps of the third \emph{exp}.
More precisely, a \rwd{for} statement like
\begin{verbatim}
       for var = e1, e2, e3 do block end
\end{verbatim}
is equivalent to the code:
\begin{verbatim}
       do
         local var, _limit, _step = tonumber(e1), tonumber(e2), tonumber(e3)
         if not (var and _limit and _step) then error() end
         while (_step>0 and var<=_limit) or (_step<=0 and var>=_limit) do
           block
           var = var+_step
         end
       end
\end{verbatim}
Note the following:
\begin{itemize}\itemsep=0pt
\item Both the limit and the step are evaluated only once, before the loop starts.
\item \verb|_limit| and \verb|_step| are invisible variables. The names are here for explanatory purposes only.
\item The behavior is \emph{undefined} if you assign to \verb|var| inside the block.
\item If the third expression (the step) is absent, then a step of~1 is used.
\item You can use \rwd{break} to exit a \rwd{for} loop.
\item The loop variable \verb|var| is local to the statement; you cannot use its value after the \rwd{for} ends or is broken. If you need the value of the loop variable \verb|var|, then assign it to another variable before breaking or exiting the loop.
\end{itemize}

The generic \rwd{for} statement works over functions, called \Index{generators}. It calls its generator to produce a new value for each iteration, stopping when the new value is \nil. It has the following syntax:
\begin{Produc}
\produc{stat}{\rwd{for} \Nter{Name} \rep{\ter{,} \Nter{Name}} \rwd{in} explist1 \rwd{do} block \rwd{end}}
\end{Produc}%
A \rwd{for} statement like
\begin{verbatim}
       for var_1, ..., var_n in explist do block end
\end{verbatim}
is equivalent to the code:
\begin{verbatim}
       do
         local _f, _s, var_1 = explist
         while 1 do
           local var_2, ..., var_n
           var_1, ..., var_n = _f(_s, var_1)
           if var_1 == nil then break end
           block
         end
       end
\end{verbatim}
Note the following:
\begin{itemize}\itemsep=0pt
\item \verb|explist| is evaluated only once. Its results are a ``generator'' function, a ``state'', and an initial value for the ``iterator variable''.
\item \verb|_f| and \verb|_s| are invisible variables. The names are here for explanatory purposes only.
\item The behavior is \emph{undefined} if you assign to any \verb|var_i| inside the block.
\item You can use \rwd{break} to exit a \rwd{for} loop.
\item The loop variables \verb|var_i| are local to the statement; you cannot use their values after the \rwd{for} ends. If you need these values, then assign them to other variables before breaking or exiting the loop.
\end{itemize}

\subsubsection{Function Calls as Statements} \label{funcstat}

Because of possible side-effects, function calls can be executed as statements:
\begin{Produc}
\produc{stat}{functioncall}
\end{Produc}%
In this case, all returned values are thrown away. Function calls are explained in \See{functioncall}.

\subsubsection{Local Declarations} \label{localvar}

\Index{Local variables} may be declared anywhere inside a block. The declaration may include an initial assignment:\IndexKW{local}
\begin{Produc}
\produc{stat}{\rwd{local} namelist \opt{\ter{=} explist1}}
\produc{namelist}{\Nter{Name} \rep{\ter{,} \Nter{Name}}}
\end{Produc}%
If present, an initial assignment has the same semantics as a multiple assignment \see{assignment}. Otherwise, all variables are initialized with \nil.

A chunk is also a block \see{chunks}, and so local variables can be declared outside any explicit block. Such local variables die when the chunk ends.

Visibility rules for local variables are explained in \See{visibility}.
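
The rules for local declarations can be sketched with the short fragment below. It is only an illustration; the variable names are arbitrary.
\begin{verbatim}
       local a, b = 1        -- a is 1; b is adjusted to nil
       local c               -- no initial assignment: c is nil

       local x = 10
       local seen
       do
         local x = 20        -- a new variable, local to this explicit block
         seen = x            -- 20
       end
       -- the inner x died with its block; here x is 10 again
\end{verbatim}
The explicit \verb|do| block is used here only to limit the scope of the inner declaration.
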
\subsection{\Index{Expressions}}\label{expressions}

%\subsubsection{\Index{Basic Expressions}}
The basic expressions in Lua are the following:
\begin{Produc}
\produc{exp}{prefixexp}
\produc{exp}{\rwd{nil} \Or \rwd{false} \Or \rwd{true}}
\produc{exp}{Number}
\produc{exp}{Literal}
\produc{exp}{function}
\produc{exp}{tableconstructor}
\produc{prefixexp}{var \Or functioncall \Or \ter{(} exp \ter{)}}
\end{Produc}%
\IndexKW{nil}\IndexKW{false}\IndexKW{true}

An expression enclosed in parentheses always results in only one value. Thus, \verb|(f(x,y,z))| is always a single value, even if \verb|f| returns several values. (The value of \verb|(f(x,y,z))| is the first value returned by \verb|f| or \nil{} if \verb|f| does not return any values.)

\emph{Numbers} and \emph{literal strings} are explained in \See{lexical}; variables are explained in \See{variables}; function definitions are explained in \See{func-def}; function calls are explained in \See{functioncall}; table constructors are explained in \See{tableconstructor}.

Expressions can also be built with arithmetic operators, relational operators, and logical operators, all of which are explained below.

\subsubsection{Arithmetic Operators}

Lua supports the usual \Index{arithmetic operators}: the binary \verb|+| (addition), \verb|-| (subtraction), \verb|*| (multiplication), \verb|/| (division), and \verb|^| (exponentiation); and unary \verb|-| (negation). If the operands are numbers, or strings that can be converted to numbers \see{coercion}, then all operations except exponentiation have the usual meaning, while exponentiation calls a global function \verb|pow|; otherwise, an appropriate metamethod is called \see{metatable}. The standard mathematical library defines function \verb|pow|, giving the expected meaning to \Index{exponentiation} \see{mathlib}.
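
As a small illustration of arithmetic on numbers and on strings that can be converted to numbers, consider the fragment below. It is a sketch only: it leaves out exponentiation, which depends on the global function \verb|pow|, and its variable names are arbitrary.
\begin{verbatim}
       local a = "10" + 1      -- 11: the string is converted to a number
       local b = "3" * "4"     -- 12: both operands are converted
       local c = -"2"          -- -2: unary minus also converts
       local d = 10 / 4        -- 2.5
\end{verbatim}
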
\subsubsection{Relational Operators}\label{rel-ops}
The \Index{relational operators} in Lua are
\begin{verbatim}
==    ~=    <     >     <=    >=
\end{verbatim}
These operators always result in \False{} or \True.

Equality (\verb|==|) first compares the type of its operands.
If the types are different, then the result is \False.
Otherwise, the values of the operands are compared.
Numbers and strings are compared in the usual way.
Tables, userdata, and functions are compared \emph{by reference},
that is,
two tables are considered equal only if they are the \emph{same} table.
??eq metamethod??

Every time you create a new table (or userdata, or function),
this new value is different from any previously existing value.

\NOTE
The conversion rules of \See{coercion}
\emph{do not} apply to equality comparisons.
Thus, \verb|"0"==0| evaluates to \False,
and \verb|t[0]| and \verb|t["0"]| denote different
entries in a table.
\medskip

The operator \verb|~=| is exactly the negation of equality (\verb|==|).

The order operators work as follows.
If both arguments are numbers, then they are compared as such.
Otherwise, if both arguments are strings,
then their values are compared according to the current locale.
Otherwise, the ``lt'' or the ``le'' metamethod is called \see{metatable}.

\subsubsection{Logical Operators}
The \Index{logical operators} in Lua are
\index{and}\index{or}\index{not}
\begin{verbatim}
and   or   not
\end{verbatim}
Like the control structures \see{control},
all logical operators consider both \False{} and \nil{} as false
and anything else as true.
\IndexKW{and}\IndexKW{or}\IndexKW{not}

The operator \rwd{not} always returns \False{} or \True.

The conjunction operator \rwd{and} returns its first argument
if its value is \False{} or \nil;
otherwise, \rwd{and} returns its second argument.
The disjunction operator \rwd{or} returns its first argument
if it is different from \nil{} and \False;
otherwise, \rwd{or} returns its second argument.
Both \rwd{and} and \rwd{or} use \Index{short-cut evaluation},
that is,
the second operand is evaluated only if necessary.
For example,
\begin{verbatim}
10 or error()       -> 10
nil or "a"          -> "a"
nil and 10          -> nil
false and error()   -> false
false and nil       -> false
false or nil        -> nil
10 and 20           -> 20
\end{verbatim}

\subsubsection{Concatenation} \label{concat}
The string \Index{concatenation} operator in Lua is
denoted by two dots (`\verb|..|').
If both operands are strings or numbers, then they are converted to
strings according to the rules mentioned in \See{coercion}.
Otherwise, the ``concat'' metamethod is called \see{metatable}.

\subsubsection{Precedence}
\Index{Operator precedence} in Lua follows the table below,
from lower to higher priority:
\begin{verbatim}
or
and
<     >     <=    >=    ~=    ==
..
+     -
*     /
not   - (unary)
^
\end{verbatim}
All binary operators are left associative,
except for \verb|^| (exponentiation),
which is right associative.

\NOTE
The pre-compiler may rearrange the order of evaluation of
associative operators,
and may exchange the operands of commutative operators,
as long as these optimizations do not change normal results.
However, these optimizations may change some results
if you define non-associative (or non-commutative)
metamethods for those operators.

\subsubsection{Table Constructors} \label{tableconstructor}
Table \Index{constructors} are expressions that create tables;
every time a constructor is evaluated, a new table is created.
Constructors can be used to create empty tables,
or to create a table and initialize some of its fields.
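
As a hedged illustration of the statement that each evaluation of a
constructor creates a new table
(the names \verb|new|, \verb|a|, \verb|b|, \verb|c|, and \verb|t|
are chosen only for this sketch, which assumes the standard function
\verb|assert|):
\begin{verbatim}
local function new ()
  return {}          -- the constructor is evaluated on each call
end
local a, b = new(), new()
assert(a ~= b)       -- two distinct tables, although both are empty
local c = a          -- no constructor here: c refers to the same table as a
assert(c == a)
local t = {}         -- an empty table
t.count = 1          -- fields can be added later by assignment
assert(t.count == 1)
\end{verbatim}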
The general syntax for constructors is
\begin{Produc}
\produc{tableconstructor}{\ter{\{} \opt{fieldlist} \ter{\}}}
\produc{fieldlist}{field \rep{fieldsep field} \opt{fieldsep}}
\produc{field}{\ter{[} exp \ter{]} \ter{=} exp \Or \Nter{Name} \ter{=} exp \Or exp}
\produc{fieldsep}{\ter{,} \Or \ter{;}}
\end{Produc}%

Each field of the form \verb|[exp1] = exp2| adds to the new table an entry
with key \verb|exp1| and value \verb|exp2|.
A field of the form \verb|name = exp| is equivalent to
\verb|["name"] = exp|.
Finally, fields of the form \verb|exp| are equivalent to
\verb|[i] = exp|, where \verb|i| takes consecutive integer values,
starting with 1.
Fields in the other formats do not affect this counting.
For example,
\begin{verbatim}
a = {[f(1)] = g; "x", "y"; x = 1, f(x), [30] = 23; 45}
\end{verbatim}
is equivalent to
\begin{verbatim}
do
  local temp = {}
  temp[f(1)] = g
  temp[1] = "x"         -- 1st exp
  temp[2] = "y"         -- 2nd exp
  temp.x = 1            -- temp["x"] = 1
  temp[3] = f(x)        -- 3rd exp
  temp[30] = 23
  temp[4] = 45          -- 4th exp
  a = temp
end
\end{verbatim}

If the last expression in the list is a function call,
then all values returned by the call enter the list consecutively
\see{functioncall}.
If you want to avoid this,
enclose the function call in parentheses.

The field list may have an optional trailing separator,
as a convenience for machine-generated code.

\subsubsection{Function Calls} \label{functioncall}
A \Index{function call} in Lua has the following syntax:
\begin{Produc}
\produc{functioncall}{prefixexp args}
\end{Produc}%
In a function call,
first \M{prefixexp} and \M{args} are evaluated.
If the value of \M{prefixexp} has type \emph{function},
then that function is called,
with the given arguments.
Otherwise, its ``call'' metamethod is called,
having as first parameter the value of \M{prefixexp},
followed by the original call arguments
\see{metatable}.
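
The fallback just described can be sketched as follows.
This is only an illustration:
it assumes that the ``call'' event is stored under the key
\verb|"__call"|, following the two-underscore convention of \See{metatable},
and that \verb|setmetatable| and \verb|assert| are available.
The names \verb|adder| and \verb|base| are hypothetical.
\begin{verbatim}
local adder = {base = 10}      -- a table, not a function
setmetatable(adder, {
  -- the metamethod receives the called value first,
  -- followed by the original call arguments
  __call = function (self, x) return self.base + x end
})
assert(adder(5) == 15)         -- calling the table invokes the metamethod
\end{verbatim}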
The form
\begin{Produc}
\produc{functioncall}{prefixexp \ter{:} \Nter{name} args}
\end{Produc}%
can be used to call ``methods''.
A call \verb|v:name(...)|
is syntactic sugar for \verb|v.name(v, ...)|,
except that \verb|v| is evaluated only once.

Arguments have the following syntax:
\begin{Produc}
\produc{args}{\ter{(} \opt{explist1} \ter{)}}
\produc{args}{tableconstructor}
\produc{args}{Literal}
\end{Produc}%
All argument expressions are evaluated before the call.
A call of the form \verb|f{...}| is syntactic sugar for
\verb|f({...})|, that is,
the argument list is a single new table.
A call of the form \verb|f'...'|
(or \verb|f"..."| or \verb|f[[...]]|) is syntactic sugar for
\verb|f('...')|, that is,
the argument list is a single literal string.

Because a function can return any number of results
\see{return},
the number of results must be adjusted before they are used.
If the function is called as a statement \see{funcstat},
then its return list is adjusted to~0 elements,
thus discarding all returned values.
If the function is called inside another expression,
or in the middle of a list of expressions,
then its return list is adjusted to~1 element,
thus discarding all returned values but the first one.
If the function is called as the last element of a list of expressions,
then no adjustment is made
(unless the call is enclosed in parentheses).
Here are some examples:
\begin{verbatim}
f()                -- adjusted to 0 results
g(f(), x)          -- f() is adjusted to 1 result
g(x, f())          -- g gets x plus all values returned by f()
a,b,c = f(), x     -- f() is adjusted to 1 result (and c gets nil)
a,b,c = x, f()     -- f() is adjusted to 2 results
a,b,c = f()        -- f() is adjusted to 3 results
return f()         -- returns all values returned by f()
return x,y,f()     -- returns x, y, and all values returned by f()
{f()}              -- creates a list with all values returned by f()
{f(), nil}         -- f() is adjusted to 1 result
\end{verbatim}

If you enclose a function call in parentheses,
then it is adjusted to return exactly one value:
\begin{verbatim}
return x,y,(f())   -- returns x, y, and the first value from f()
{(f())}            -- creates a table with exactly one element
\end{verbatim}

As an exception to the format-free syntax of Lua,
you cannot put a line break before the \verb|(| in a function call.
That restriction avoids some ambiguities in the language.
If you write
\begin{verbatim}
a = f
(g).x(a)
\end{verbatim}
Lua would read that as \verb|a = f(g).x(a)|.
So, if you want two statements, you must add a semicolon between them.
If you actually want to call \verb|f|,
you must remove the line break before \verb|(g)|.

\subsubsection{\Index{Function Definitions}} \label{func-def}

The syntax for function definition is\IndexKW{function}
\begin{Produc}
\produc{function}{\rwd{function} funcbody}
\produc{funcbody}{\ter{(} \opt{parlist1} \ter{)} block \rwd{end}}
\end{Produc}%
The following syntactic sugar simplifies function definitions:
\begin{Produc}
\produc{stat}{\rwd{function} funcname funcbody}
\produc{stat}{\rwd{local} \rwd{function} \Nter{name} funcbody}
\produc{funcname}{\Nter{name} \rep{\ter{.} \Nter{name}} \opt{\ter{:} \Nter{name}}}
\end{Produc}%
The statement
\begin{verbatim}
function f ()
  ...
end
\end{verbatim}
translates to
\begin{verbatim}
f = function ()
  ...
end
\end{verbatim}
The statement
\begin{verbatim}
function t.a.b.c.f ()
  ...
end
\end{verbatim}
translates to
\begin{verbatim}
t.a.b.c.f = function ()
  ...
end
\end{verbatim}
The statement
\begin{verbatim}
local function f ()
  ...
end
\end{verbatim}
translates to
\begin{verbatim}
local f; f = function ()
  ...
end
\end{verbatim}

A function definition is an executable expression,
whose value has type \emph{function}.
When Lua pre-compiles a chunk,
all its function bodies are pre-compiled too.
Then, whenever Lua executes the function definition,
the function is \emph{instantiated} (or \emph{closed}).
This function instance (or \emph{closure})
is the final value of the expression.
Different instances of the same function
may refer to different non-local variables \see{visibility}
and may have different tables of globals \see{global-table}.

Parameters act as local variables,
initialized with the argument values:
\begin{Produc}
\produc{parlist1}{namelist \opt{\ter{,} \ter{\ldots}}}
\produc{parlist1}{\ter{\ldots}}
\end{Produc}%
\label{vararg}%
When a function is called,
the list of \Index{arguments} is adjusted to
the length of the list of parameters,
unless the function is a \Def{vararg function},
which is indicated by three dots (`\verb|...|')
at the end of its parameter list.
A vararg function does not adjust its argument list;
instead, it collects all extra arguments into an implicit parameter,
called \IndexLIB{arg}.
The value of \verb|arg| is a table,
with a field~\verb|n| whose value is the number of extra arguments,
and the extra arguments at positions 1,~2,~\ldots,~\verb|n|.

As an example, consider the following definitions:
\begin{verbatim}
function f(a, b) end
function g(a, b, ...)
end
function r() return 1,2,3 end
\end{verbatim}
Then, we have the following mapping from arguments to parameters:
\begin{verbatim}
CALL            PARAMETERS

f(3)             a=3, b=nil
f(3, 4)          a=3, b=4
f(3, 4, 5)       a=3, b=4
f(r(), 10)       a=1, b=10
f(r())           a=1, b=2

g(3)             a=3, b=nil, arg={n=0}
g(3, 4)          a=3, b=4, arg={n=0}
g(3, 4, 5, 8)    a=3, b=4, arg={5, 8; n=2}
g(5, r())        a=5, b=1, arg={2, 3; n=2}
\end{verbatim}

Results are returned using the \rwd{return} statement \see{return}.
If control reaches the end of a function
without encountering a \rwd{return} statement,
then the function returns with no results.

The \emph{colon} syntax
is used for defining \IndexEmph{methods},
that is, functions that have an implicit extra parameter \IndexVerb{self}.
Thus, the statement
\begin{verbatim}
function t.a.b.c:f (...)
  ...
end
\end{verbatim}
is syntactic sugar for
\begin{verbatim}
t.a.b.c.f = function (self, ...)
  ...
end
\end{verbatim}

\subsection{Visibility Rules} \label{visibility}
\index{visibility}
Lua is a lexically scoped language.
The scope of variables begins at the first statement \emph{after}
their declaration and lasts until the end of the innermost block that
includes the declaration.
For instance:
\begin{verbatim}
x = 10                -- global variable
do                    -- new block
  local x = x         -- new `x', with value 10
  print(x)            --> 10
  x = x+1
  do                  -- another block
    local x = x+1     -- another `x'
    print(x)          --> 12
  end
  print(x)            --> 11
end
print(x)              --> 10  (the global one)
\end{verbatim}
Notice that, in a declaration like \verb|local x = x|,
the new \verb|x| being declared is not in scope yet,
so the second \verb|x| refers to the ``outside'' variable.

Because of those \Index{lexical scoping} rules,
local variables can be freely accessed by functions
defined inside their scope.
For instance:
\begin{verbatim}
local counter = 0
function inc (x)
  counter = counter + x
  return counter
end
\end{verbatim}

Notice that each execution of a \rwd{local} statement
``creates'' new local variables.
Consider the following example:
\begin{verbatim}
a = {}
local x = 20
for i=1,10 do
  local y = 0
  a[i] = function () y=y+1; return x+y end
end
\end{verbatim}
In that code,
each function uses a different \verb|y| variable,
while all of them share the same \verb|x|.

\subsection{Error Handling} \label{error}

%% TODO Must be rewritten!!!

Because Lua is an extension language,
all Lua actions start from C~code in the host program
calling a function from the Lua library.
Whenever an error occurs during Lua compilation or execution,
the function \verb|_ERRORMESSAGE| is called \DefLIB{_ERRORMESSAGE}
(provided it is different from \nil),
and then the corresponding function from the library
(\verb|lua_dofile|, \verb|lua_dostring|,
\verb|lua_dobuffer|, or \verb|lua_call|)
is terminated, returning an error condition.

Memory allocation errors are an exception to the previous rule.
When memory allocation fails, Lua may not be able to execute the
\verb|_ERRORMESSAGE| function.
So, for this kind of error, Lua does not call
the \verb|_ERRORMESSAGE| function;
instead, the corresponding function from the library
returns immediately with a special error code (\verb|LUA_ERRMEM|).
This and other error codes are defined in \verb|lua.h|
\see{luado}.

The only argument to \verb|_ERRORMESSAGE| is a string
describing the error.
The default definition for
this function calls \verb|_ALERT|, \DefLIB{_ALERT}
which prints the message to \verb|stderr| \see{alert}.
The standard I/O library redefines \verb|_ERRORMESSAGE|
and uses the debug interface \see{debugI}
to print some extra information,
such as a call-stack traceback.

Lua code can explicitly generate an error by calling the
function \verb|error| \see{pdf-error}.
Lua code can ``catch'' an error using the function
\verb|call| \see{pdf-call}.

\subsection{Metatables} \label{metatable}

Every table and userdata value in Lua may have a \emph{metatable}.
This \IndexEmph{metatable} is a table that defines the behavior of
the original table and userdata under some operations.
You can query and change the metatable of an object with
functions \verb|setmetatable| and \verb|getmetatable| \see{pdf-getmetatable}.

For each of those operations Lua associates a specific key,
called an \emph{event}.
When Lua performs one of those operations over a table or a userdata,
it checks whether that object has a metatable with the corresponding event.
If so, the value associated with that key (the \IndexEmph{metamethod})
controls how Lua will perform the operation.

Metatables control the operations listed next.
Each operation is identified by its corresponding name.
The key for each operation is a string with its name prefixed by
two underscores;
for instance, the key for operation ``add'' is the
string \verb|"__add"|.
The semantics of these operations is better explained by a Lua function
describing how the interpreter executes that operation.
%Each function shows how a handler is called,
%its arguments (that is, its signature),
%its results,
%and the default behavior in the absence of a handler.

The code shown here in Lua is only illustrative;
the real behavior is hard coded in the interpreter,
and it is much more efficient than this simulation.
All functions used in these descriptions
(\verb|rawget|, \verb|tonumber|, etc.)
are described in \See{predefined}.

\begin{description}

\item[``add'':]\IndexTM{add}
the \verb|+| operation.

The function \verb|getbinhandler| below defines how Lua chooses a handler
for a binary operation.
First, Lua tries the first operand.
If its type does not define a handler for the operation,
then Lua tries the second operand.
\begin{verbatim}
function getbinhandler (op1, op2, event)
  return metatable(op1)[event] or metatable(op2)[event]
end
\end{verbatim}
Using that function,
the behavior of the ``add'' operation is
\begin{verbatim}
function add_event (op1, op2)
  local o1, o2 = tonumber(op1), tonumber(op2)
  if o1 and o2 then  -- both operands are numeric
    return o1+o2  -- '+' here is the primitive 'add'
  else  -- at least one of the operands is not numeric
    local h = getbinhandler(op1, op2, "__add")
    if h then
      -- call the handler with both operands
      return h(op1, op2)
    else  -- no handler available: default behavior
      error("unexpected type at arithmetic operation")
    end
  end
end
\end{verbatim}

\item[``sub'':]\IndexTM{sub}
the \verb|-| operation.
Behavior similar to the ``add'' operation.

\item[``mul'':]\IndexTM{mul}
the \verb|*| operation.
Behavior similar to the ``add'' operation.

\item[``div'':]\IndexTM{div}
the \verb|/| operation.
Behavior similar to the ``add'' operation.

\item[``pow'':]\IndexTM{pow}
the \verb|^| (exponentiation) operation.
\begin{verbatim}
??
function pow_event (op1, op2)
  local h = getbinhandler(op1, op2, "__pow")
???
  if h then
    -- call the handler with both operands
    return h(op1, op2)
  else  -- no handler available: default behavior
    error("unexpected type at arithmetic operation")
  end
end
\end{verbatim}

\item[``unm'':]\IndexTM{unm}
the unary \verb|-| operation.
\begin{verbatim}
function unm_event (op)
  local o = tonumber(op)
  if o then  -- operand is numeric
    return -o  -- '-' here is the primitive 'unm'
  else  -- the operand is not numeric.
    -- Try to get a handler from the operand;
    local h = metatable(op).__unm
    if h then
      -- call the handler with the operand and nil
      return h(op, nil)
    else  -- no handler available: default behavior
      error("unexpected type at arithmetic operation")
    end
  end
end
\end{verbatim}

\item[``lt'':]\IndexTM{lt}
the \verb|<| operation.
\begin{verbatim}
function lt_event (op1, op2)
  if type(op1) == "number" and type(op2) == "number" then
    return op1 < op2   -- numeric comparison
  elseif type(op1) == "string" and type(op2) == "string" then
    return op1 < op2   -- lexicographic comparison
  else
    local h = getbinhandler(op1, op2, "__lt")
    if h then
      return h(op1, op2)
    else
      error("unexpected type at comparison");
    end
  end
end
\end{verbatim}
\verb|a>b| is equivalent to \verb|b<a|.

\item[``le'':]\IndexTM{le}
the \verb|<=| operation.
\verb|a>=b| is equivalent to \verb|b<=a|.
Notice that, in the absence of a ``le'' metamethod,
Lua tries the ``lt'', assuming that \verb|a<=b| is
equivalent to \verb|not (b<a)|.