% $Id: manual.tex,v 1.53 2001/10/31 18:06:05 roberto Exp roberto $ \documentclass[11pt,twoside,draft]{article} \usepackage{fullpage} \usepackage{bnf} \usepackage{graphicx} % no need for subscripts... \catcode`\_=12 %\newcommand{\See}[1]{Section~\ref{#1}} \newcommand{\See}[1]{\S\ref{#1}} %\newcommand{\see}[1]{(see~\See{#1} on page \pageref{#1})} \newcommand{\see}[1]{(see~\See{#1})} \newcommand{\seepage}[1]{(see page~\pageref{#1})} \newcommand{\M}[1]{{\rm\emph{#1}}} \newcommand{\T}[1]{{\tt #1}} \newcommand{\Math}[1]{$#1$} \newcommand{\nil}{{\bf nil}} %\def\tecgraf{{\sf TeC\kern-.21em\lower.7ex\hbox{Graf}}} \def\tecgraf{{\sf TeCGraf}} \newcommand{\Index}[1]{#1\index{#1@{\lowercase{#1}}}} \newcommand{\IndexVerb}[1]{\T{#1}\index{#1@{\tt #1}}} \newcommand{\IndexEmph}[1]{\emph{#1}\index{#1@{\lowercase{#1}}}} \newcommand{\IndexTM}[1]{\index{#1 event@{``#1'' event}}\index{tag method!#1}} \newcommand{\Def}[1]{\emph{#1}\index{#1}} \newcommand{\IndexAPI}[1]{\T{#1}\DefAPI{#1}} \newcommand{\IndexLIB}[1]{\T{#1}\DefLIB{#1}} \newcommand{\DefLIB}[1]{\index{#1@{\tt #1}}} \newcommand{\DefAPI}[1]{\index{C API!#1@{\tt #1}}} \newcommand{\IndexKW}[1]{\index{keywords!#1@{\tt #1}}} \newcommand{\ff}{$\bullet$\ } \newcommand{\Version}{4.1 (alpha)} % changes to bnf.sty by LHF \renewcommand{\Or}{$|$ } \renewcommand{\rep}[1]{{\rm\{}\,#1\,{\rm\}}} \renewcommand{\opt}[1]{{\rm [}\,#1\,{\,\rm]}} \renewcommand{\ter}[1]{{\rm`{\tt#1}'}} \newcommand{\NOTE}{\par\medskip\noindent\emph{NOTE}: } \makeindex \begin{document} %{=============================================================== \thispagestyle{empty} \pagestyle{empty} { \parindent=0pt \vglue1.5in {\LARGE\bf The Programming Language Lua} \hfill \vskip4pt \hrule height 4pt width \hsize \vskip4pt \hfill Reference Manual for Lua version \Version \\ \null \hfill Last revised on \today \\ \vfill \centering \includegraphics[width=0.7\textwidth]{nolabel.ps} \vfill \vskip4pt \hrule height 2pt width \hsize } \newpage \begin{quotation} \parskip=10pt 
\parindent=0pt \footnotesize \null\vfill \noindent Copyright \copyright\ 1994--2001 TeCGraf, PUC-Rio. All rights reserved. Permission is hereby granted, without written agreement and without license or royalty fees, to use, copy, modify, translate, and distribute this software and its documentation (hereby called the ``package'') for any purpose, including commercial applications, subject to the following conditions: \begin{itemize} \item The above copyright notice and this permission notice shall appear in all copies or substantial portions of this package. \item The origin of this package must not be misrepresented; you must not claim that you wrote the original package. If you use this package in a product, an acknowledgment in the product documentation would be greatly appreciated (but it is not required). \item Altered source versions must be plainly marked as such, and must not be misrepresented as being the original package. \end{itemize} The authors specifically disclaim any warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The package provided hereunder is on an ``as~is'' basis, and the authors have no obligation to provide maintenance, support, updates, enhancements, or modifications. In no event shall TeCGraf, PUC-Rio, or the authors be held liable to any party for direct, indirect, special, incidental, or consequential damages arising out of the use of this package and its documentation. The Lua language and this implementation have been entirely designed and written by Waldemar Celes, Roberto Ierusalimschy, and Luiz Henrique de Figueiredo at TeCGraf, PUC-Rio in Brazil. This implementation contains no third-party code. Copies of this manual can be obtained at Lua's official web site, \verb|www.lua.org|. \bigskip The Lua logo was designed by A. Nakonechny. Copyright \copyright\ 1998. All rights reserved. 
\end{quotation} %}=============================================================== \newpage \title{\Large\bf Reference Manual of the Programming Language Lua \Version} \author{% Roberto Ierusalimschy\qquad Luiz Henrique de Figueiredo\qquad Waldemar Celes \vspace{1.0ex}\\ \smallskip \small\tt lua@tecgraf.puc-rio.br \vspace{2.0ex}\\ %MCC 08/95 --- \tecgraf\ --- Computer Science Department --- PUC-Rio } %\date{{\small \tt\$Date: 2001/10/31 18:06:05 $ $}} \maketitle \pagestyle{plain} \pagenumbering{roman} \begin{abstract} \noindent Lua is a powerful, light-weight programming language designed for extending applications. Lua is also frequently used as a general-purpose, stand-alone language. Lua combines simple procedural syntax (similar to Pascal) with powerful data description constructs based on associative arrays and extensible semantics. Lua is dynamically typed, interpreted from bytecodes, and has automatic memory management with garbage collection, making it ideal for configuration, scripting, and rapid prototyping. This document describes version \Version\ of the Lua programming language and the Application Program Interface (API) that allows interaction between Lua programs and their host C~programs. \end{abstract} \def\abstractname{Resumo} \begin{abstract} \noindent Lua \'e uma linguagem de programa\c{c}\~ao poderosa e leve, projetada para estender aplica\c{c}\~oes. Lua tamb\'em \'e frequentemente usada como uma linguagem de prop\'osito geral. Lua combina programa\c{c}\~ao procedural (com sintaxe semelhante \`a de Pascal) com poderosas constru\c{c}\~oes para descri\c{c}\~ao de dados, baseadas em tabelas associativas e sem\^antica extens\'\i vel. Lua \'e tipada dinamicamente, interpretada a partir de \emph{bytecodes}, e tem gerenciamento autom\'atico de mem\'oria com coleta de lixo. Essas caracter\'{\i}sticas fazem de Lua uma linguagem ideal para configura\c{c}\~ao, automa\c{c}\~ao (\emph{scripting}) e prototipagem r\'apida. 
Este documento descreve a vers\~ao \Version\ da linguagem de programa\c{c}\~ao Lua e a Interface de Programa\c{c}\~ao (API) que permite a intera\c{c}\~ao entre programas Lua e programas C~hospedeiros. \end{abstract} \newpage \null \newpage \tableofcontents \newpage \setcounter{page}{1} \pagestyle{plain} \pagenumbering{arabic} %------------------------------------------------------------------------------ \section{Introduction} Lua is an extension programming language designed to support general procedural programming with data description facilities. Lua is intended to be used as a powerful, light-weight configuration language for any program that needs one. Lua is implemented as a library, written in C. Being an extension language, Lua has no notion of a ``main'' program: it only works \emph{embedded} in a host client, called the \emph{embedding program} or simply the \emph{host}. This host program can invoke functions to execute a piece of Lua code, can write and read Lua variables, and can register C~functions to be called by Lua code. Through the use of C~functions, Lua can be augmented to cope with a wide range of different domains, thus creating customized programming languages sharing a syntactical framework. Lua is free software, and is provided as usual with no guarantees, as stated in its copyright notice. The implementation described in this manual is available at Lua's official web site, \verb|www.lua.org|. Like any other reference manual, this document is dry in places. For a discussion of the decisions behind the design of Lua, see the papers below, which are available at Lua's web site. \begin{itemize} \item R.~Ierusalimschy, L.~H.~de Figueiredo, and W.~Celes. Lua---an extensible extension language. \emph{Software: Practice \& Experience} {\bf 26} \#6 (1996) 635--652. \item L.~H.~de Figueiredo, R.~Ierusalimschy, and W.~Celes. The design and implementation of a language for extending applications. 
\emph{Proceedings of XXI Brazilian Seminar on Software and Hardware} (1994) 273--283. \item L.~H.~de Figueiredo, R.~Ierusalimschy, and W.~Celes. Lua: an extensible embedded language. \emph{Dr. Dobb's Journal} {\bf 21} \#12 (Dec 1996) 26--33. \item R.~Ierusalimschy, L.~H.~de Figueiredo, and W.~Celes. The evolution of an extension language: a history of Lua, \emph{Proceedings of V Brazilian Symposium on Programming Languages} (2001) B-14--B-28. \end{itemize} %------------------------------------------------------------------------------ \section{Lua Concepts}\label{concepts} This section describes the main concepts of Lua as a language. The syntax and semantics of Lua are described in \See{language}. The discussion below is not purely conceptual; it includes references to the C~API \see{API}, because Lua is designed to be embedded in host programs. It also includes references to the standard libraries \see{libraries}. \subsection{Environment and Chunks} All statements in Lua are executed in a \Def{global environment}. This environment is initialized with a call from the embedding program to \verb|lua_open| and persists until a call to \verb|lua_close| or the end of the embedding program. If necessary, the host programmer can create multiple independent global environments, and freely switch between them \see{mangstate}. The global environment can be manipulated by Lua code or by the embedding program, which can read and write global variables using the API functions from the library that implements Lua. The unit of execution of Lua is called a \Def{chunk}. A chunk is simply a sequence of statements, which are executed sequentially. Statements are described in \See{stats}. A chunk may be stored in a file or in a string inside the host program. When a chunk is executed, first it is pre-compiled into bytecodes for a virtual machine, and then the compiled statements are executed by an interpreter for the virtual machine. 
All modifications a chunk effects on the global environment persist after the chunk ends. Chunks may also be pre-compiled into binary form and stored in files; see program \IndexVerb{luac} for details. Text files with chunks and their binary pre-compiled forms are interchangeable; Lua automatically detects the file type and acts accordingly. \index{pre-compilation} \subsection{\Index{Values and Types}} \label{TypesSec} Lua is a \emph{dynamically typed language}. This means that variables do not have types; only values do. Therefore, there are no type definitions in the language. All values carry their own type. There are six \Index{basic types} in Lua: \Def{nil}, \Def{number}, \Def{string}, \Def{function}, \Def{userdata}, and \Def{table}. \emph{Nil} is the type of the value \nil, whose main property is to be different from any other value. \emph{Number} represents real (double-precision floating-point) numbers. \emph{String} represents arrays of characters. \index{eight-bit clean} Lua is 8-bit clean, and so strings may contain any 8-bit character, including embedded zeros (\verb|'\0'|) \see{lexical}. Functions are considered \emph{first-class values} in Lua. This means that functions can be stored in variables, passed as arguments to other functions, and returned as results. Lua can call (and manipulate) functions written in Lua and functions written in C \see{functioncall}. The type \emph{userdata} is provided to allow arbitrary \Index{C~pointers} to be stored in Lua variables. This type corresponds to a \verb|void*| and has no pre-defined operations in Lua, except assignment and equality test. However, by using \emph{tag methods}, the programmer can define operations for userdata values \see{tag-method}. Userdata values cannot be created or modified in Lua, only through the C~API. This guarantees the integrity of data owned by the host program. 
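
As an informal sketch (not part of the language definition), the fragment below
illustrates two points made above: types belong to values rather than to variables,
and a function is a value like any other.
The variable name \verb|v| is arbitrary; the results in the comments assume the
standard \verb|type| and \verb|print| functions.

\begin{verbatim}
       local v = 10
       print(type(v))                     -- number
       v = "ten"                          -- the same variable now holds a string
       print(type(v))                     -- string
       v = function (x) return x + 1 end  -- functions are first-class values
       print(type(v))                     -- function
       print(v(41))                       -- 42
       print(type({}), type(nil))         -- table   nil
\end{verbatim}

No declaration ever mentions a type; each call to \verb|type| inspects
the value currently stored in \verb|v|.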
The type \emph{table} implements \Index{associative arrays}, that is, \Index{arrays} that can be indexed not only with numbers, but with any value (except \nil). Moreover, tables are \emph{heterogeneous}, that is, they can contain values of all types. Tables are the main data structuring mechanism in Lua; they may be used not only to represent ordinary arrays, but also symbol tables, sets, records, graphs, trees, etc. To represent \Index{records}, Lua uses the field name as an index. The language supports this representation by providing \verb|a.name| as syntactic sugar for \verb|a["name"]|. There are several convenient ways to create tables in Lua \see{tableconstructor}. Like indices, the value of a table field can be of any type. In particular, because functions are first class values, table fields may contain functions. So, tables may also carry \emph{methods}. %The form \verb|t:f(x)| is syntactic sugar for \verb|t.f(t,x)|, %which calls the method \verb|f| from the table \verb|t| passing %the table itself as the first parameter \see{func-def}. Strings, tables, functions, and userdata values are \emph{objects}: variables do not actually \emph{contain} these values, only \emph{references} to them. Assignment, parameter passing, and returns from functions always manipulate references to these values, and do not imply any kind of copy. The library function \verb|type| returns a string describing the type of a given value \see{pdf-type}. \subsubsection{Tags}\label{tags} Each type is denoted both by a \emph{name}, which is a string, and a \IndexEmph{tag}, which is an integer. Tags are mainly used by C~code, to avoid the manipulation of strings. In the C~API, most operations over types require a tag to identify the type. In Lua, all operations over types work transparently with both type names and tags. The \verb|tag| function returns the tag of a given value \see{pdf-tag}. 
\subsubsection{User-defined Types}

Lua programs can create new types,
called \IndexEmph{user-defined types}.
A user-defined type is always based on a base type,
which can be either table or userdata.
Objects of a user-defined type have an internal structure identical
to the corresponding base type,
but the programmer may define different semantics for each operation on them
\see{tag-method}.

The \verb|newtype| function creates a new type \see{pdf-newtype}
with a name selected by the programmer.
Types created by Lua programs are always based on tables;
types created in~C can be based on tables or on userdata.
The \verb|settagmethod| function defines new semantics for
the operations of this new type \see{tag-method}.
The \verb|settype| function changes the type of a given object
\see{pdf-settype}.

\subsection{\Index{Coercion}} \label{coercion}

Lua provides automatic conversion between string and number values at run time.
Any arithmetic operation applied to a string tries to convert
that string to a number, following the usual rules.
Conversely, whenever a number is used when a string is expected,
the number is converted to a string, in a reasonable format.
The format is chosen so that
a conversion from number to string then back to number
reproduces the original number \emph{exactly}.
The conversion does not necessarily produce nice-looking text for some numbers.
For complete control of how numbers are converted to strings,
use the \verb|format| function \see{format}.

\subsection{Variables}

There are two kinds of variables in Lua:
global variables
and local variables.
\Index{Global variables} do not need to be declared.
Variables are assumed to be global unless explicitly declared local
\see{localvar}.
Before the first assignment, the value of a variable is \nil\ %
(this default can be changed for global variables; see \See{tag-method}).

An ordinary Lua table is used to keep all global names and values.
This table can be accessed and changed with the \verb|globals| function \see{pdf-globals}. \Index{Local variables} are lexically scoped. Therefore, local variables can be freely accessed by functions defined inside their scope \see{visibility}. \subsection{Garbage Collection}\label{GC} Lua does automatic memory management. This means that you do not have to worry about allocating memory for new objects and freeing it when the objects are no longer needed. Lua manages memory automatically by running a \Index{garbage collector} from time to time and collecting all dead objects (all objects that are no longer accessible from Lua). All objects in Lua are subject to automatic management: tables, userdata, functions, and strings. Using the C~API, you can set garbage-collector tag methods for user-defined types based on userdata \see{tag-method}. Lua calls those functions when it is about to free a userdata of the corresponding type. Using this facility, you can coordinate Lua's garbage collection with external resource management (such as closing files, network or database connections, or freeing your own memory). Lua uses two numbers to control its garbage-collection cycles. One number counts how many bytes of dynamic memory Lua is using, and the other is a threshold. When the number of bytes crosses the threshold, Lua runs the garbage collector, which reclaims the memory of all dead objects. The byte counter is corrected, and then the threshold is reset to twice the value of the byte counter. Through the C~API, you can query those numbers, and change the threshold \see{GC-API}. Setting the threshold to zero actually forces an immediate garbage-collection cycle, while setting it to a huge number effectively stops the garbage collector. Using Lua code you have a more limited control over garbage-collection cycles, through the functions \verb|gcinfo| and \verb|collectgarbage| \see{predefined}. 
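
The counter-and-threshold rule described above can be mimicked by a toy model
written in plain Lua.
This is only a sketch of the arithmetic: the table \verb|gc|, the function
\verb|allocate|, and the byte counts are hypothetical names and values chosen
for illustration, and the model does not interact with the real collector.
It also assumes that ``crossing'' the threshold means becoming strictly greater
than it.

\begin{verbatim}
       -- toy model only; nothing here touches Lua's actual collector
       local gc = { count = 0, threshold = 1000 }   -- illustrative byte counts

       -- account for an allocation of `bytes';
       -- `live' is the number of bytes that would survive a cycle run now
       function allocate (gc, bytes, live)
         gc.count = gc.count + bytes
         if gc.count > gc.threshold then
           gc.count = live                -- dead objects reclaimed, counter corrected
           gc.threshold = 2 * gc.count    -- threshold reset to twice the counter
         end
       end

       allocate(gc, 600, 600)             -- 600 is below 1000: no cycle
       allocate(gc, 600, 700)             -- 1200 crosses 1000: a cycle runs
       print(gc.count, gc.threshold)      -- 700   1400
\end{verbatim}

In this model the more memory survives a cycle,
the longer the collector waits before the next one.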
\subsection{Weak Tables}\label{weak-table}

A \IndexEmph{weak table} is a table whose elements are
\IndexEmph{weak references}.
A weak reference is ignored by the garbage collector.
In other words,
if the only references to an object are weak references,
then the garbage collector will collect that object.

A weak table can have weak keys, weak values, or both.
A table with weak keys allows the collection of its keys,
but prevents the collection of its values.
A table with both weak keys and weak values allows the collection of both
keys and values.
In any case, if either the key or the value is collected,
the whole pair is removed from the table.
The weakness of a table is controlled by the
\verb|weakmode| function \see{weakmode}.

%------------------------------------------------------------------------------
\section{The Language}\label{language}

This section describes the lexis, the syntax, and the semantics of Lua.
In other words,
this section describes
which tokens are valid,
how they can be combined,
and what their combinations mean.

\subsection{Lexical Conventions} \label{lexical}

\IndexEmph{Identifiers} in Lua can be any string of letters,
digits, and underscores,
not beginning with a digit.
This coincides with the definition of identifiers in most languages.
(The definition of letter depends on the current locale:
any character considered alphabetic by the current locale
can be used in an identifier.)

The following \IndexEmph{keywords} are reserved,
and cannot be used as identifiers:
\index{reserved words}
\begin{verbatim}
       and       break     do        else      elseif
       end       for       function  global    if
       in        local     nil       not       or
       repeat    return    then      until     while
\end{verbatim}
(The keyword \rwd{global} is reserved for future use.)
%\IndexKW{and}\IndexKW{break}\IndexKW{do}\IndexKW{else}\IndexKW{elseif} %\IndexKW{end}\IndexKW{for}\IndexKW{function}\IndexKW{global}\IndexKW{if} %\IndexKW{in}\IndexKW{local}\IndexKW{nil}\IndexKW{not}\IndexKW{or} %\IndexKW{repeat}\IndexKW{return}\IndexKW{then}\IndexKW{until}\IndexKW{while} Lua is a case-sensitive language: \T{and} is a reserved word, but \T{And} and \T{\'and} (if the locale permits) are two different, valid identifiers. As a convention, identifiers starting with an underscore followed by uppercase letters (such as \verb|_INPUT|) are reserved for internal variables. The following strings denote other \Index{tokens}: \begin{verbatim} + - * / ^ % ~= <= >= < > == = ( ) { } [ ] ; : , . .. ... \end{verbatim} \IndexEmph{Literal strings} can be delimited by matching single or double quotes, and can contain the C-like escape sequences `\verb|\a|' (bell), `\verb|\b|' (backspace), `\verb|\f|' (form feed), `\verb|\n|' (newline), `\verb|\r|' (carriage return), `\verb|\t|' (horizontal tab), `\verb|\v|' (vertical tab), `\verb|\\|' (backslash), `\verb|\"|' (double quote), `\verb|\'|' (single quote), and `\verb|\|\emph{newline}' (that is, a backslash followed by a real newline, which results in a newline in the string). A character in a string may also be specified by its numerical value, through the escape sequence `\verb|\|\emph{ddd}', where \emph{ddd} is a sequence of up to three \emph{decimal} digits. Strings in Lua may contain any 8-bit value, including embedded zeros, which can be specified as `\verb|\000|'. Literal strings can also be delimited by matching \verb|[[| $\ldots$ \verb|]]|. Literals in this bracketed form may run for several lines, may contain nested \verb|[[| $\ldots$ \verb|]]| pairs, and do not interpret escape sequences. For convenience, when the opening \verb|[[| is immediately followed by a newline, this newline is not included in the string. 
This form is especially convenient for
writing strings that contain program pieces or
other quoted strings.
As an example, in a system using ASCII
(in which `\verb|a|' is coded as~97,
newline is coded as~10, and `\verb|1|' is coded as~49),
the four literals below are equivalent:
\begin{verbatim}
1) "alo\n123\""
2) '\97lo\10\04923"'
3) [[alo
123"]]
4) [[
alo
123"]]
\end{verbatim}

\IndexEmph{Numerical constants} may be written with an optional decimal part
and an optional decimal exponent.
Examples of valid numerical constants are
\begin{verbatim}
       3     3.0     3.1416  314.16e-2   0.31416E1
\end{verbatim}

\IndexEmph{Comments} start anywhere outside a string with a
double hyphen (\verb|--|) and run until the end of the line.
(There are no block comments in Lua.)
For convenience,
the first line of a chunk is skipped if it starts with \verb|#|.
This facility allows the use of Lua as a script interpreter
in Unix systems \see{lua-sa}.

\subsection{Variables}\label{variables}

Variables are places that store values.
%In Lua, variables are given by simple identifiers or by table fields.

A single name can denote a global variable, a local variable,
or a formal parameter in a function
(formal parameters are just local variables):
\begin{Produc}
\produc{var}{name}
\end{Produc}%
Square brackets are used to index a table:
\begin{Produc}
\produc{var}{exp \ter{[} exp \ter{]}}
\end{Produc}%
The first expression should result in a table value,
and the second expression identifies the specific place inside that table.

The syntax \verb|var.NAME| is just syntactic sugar for
\verb|var["NAME"]|:
\begin{Produc}
\produc{var}{exp \ter{.} name}
\end{Produc}%

Expressions are discussed in \See{expressions}.

The meaning of assignments and evaluations of global variables and
indexed variables can be changed by tag methods \see{tag-method}.
An assignment to a global variable \verb|x = val| is equivalent to a call \verb|setglobal("x", val)| and an assignment to an indexed variable \verb|t[i] = val| is equivalent to \verb|settable_event(t,i,val)|. An access to a global variable \verb|x| is equivalent to a call \verb|getglobal("x")| and an access to an indexed variable \verb|t[i]| is equivalent to a call \verb|gettable_event(t,i)|. See \See{tag-method} for a complete description of these functions (\verb|setglobal| and \verb|getglobal| are in the basic library; \T{settable\_event} and \T{gettable\_event} are used for explanatory purposes only). \subsection{Statements}\label{stats} Lua supports an almost conventional set of \Index{statements}, similar to those in Pascal or C. The conventional commands include assignment, control structures, and procedure calls. Non-conventional commands include table constructors \see{tableconstructor} and local variable declarations \see{localvar}. \subsubsection{Chunks}\label{chunks} The unit of execution of Lua is called a \Def{chunk}. A chunk is simply a sequence of statements, which are executed sequentially. Each statement can be optionally followed by a semicolon: \begin{Produc} \produc{chunk}{\rep{stat \opt{\ter{;}}}} \end{Produc}% The notation used above is the usual extended BNF, in which \rep{\emph{a}}~means 0 or more \emph{a}'s, and \opt{\emph{a}}~means an optional \emph{a}. Non-terminals are shown in \emph{italics}, keywords are shown in {\bf bold}, and other terminal symbols are shown in {\tt typewriter} font, enclosed in single quotes. The complete syntax of Lua in EBNF is given on page~\pageref{BNF}. \subsubsection{Blocks} A \Index{block} is a list of statements; syntactically, a block is equal to a chunk: \begin{Produc} \produc{block}{chunk} \end{Produc}% A block may be explicitly delimited: \begin{Produc} \produc{stat}{\rwd{do} block \rwd{end}} \end{Produc}% Explicit blocks are useful to control the scope of local variables \see{localvar}. 
Explicit blocks are also sometimes used to
add a \rwd{return} or \rwd{break} statement in the middle
of another block \see{control}.

\subsubsection{\Index{Assignment}} \label{assignment}

Lua allows \Index{multiple assignment}.
Therefore, the syntax for assignment
defines a list of variables on the left side
and a list of expressions on the right side.
The elements in both lists are separated by commas:
\begin{Produc}
\produc{stat}{varlist1 \ter{=} explist1}
\produc{varlist1}{var \rep{\ter{,} var}}
\produc{explist1}{exp \rep{\ter{,} exp}}
\end{Produc}%
Expressions are discussed in \See{expressions}.

Before the assignment,
the list of values is \emph{adjusted} to the length of
the list of variables.\index{adjustment}
If there are more values than are needed,
the excess values are thrown away.
If there are fewer values than are needed,
the list is extended with as many \nil's as needed.
If the list of expressions ends with a function call,
then all values returned by that function call enter the list of values
before the adjustment
(except when the call is enclosed in parentheses; see \See{expressions}).

An assignment statement first evaluates all values on the right side
and any index expressions on the left side,
and only then makes the assignments.
So, the code
\begin{verbatim}
       i = 3
       i, a[i] = i+1, 20
\end{verbatim}
sets \verb|a[3]| to 20, but does not affect \verb|a[4]|
because the \verb|i| in \verb|a[i]| is evaluated
before it is assigned 4.
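
The adjustment rules above can be sketched with a few assignments.
The function \verb|f| is a hypothetical helper that returns two values;
it is introduced here only for illustration.

\begin{verbatim}
       function f () return 10, 20 end   -- hypothetical helper with two results

       local x, y, z
       x, y = 1, 2, 3       -- more values than variables: the 3 is thrown away
       x, y, z = 1, 2       -- fewer values: z receives nil
       x, y, z = 0, f()     -- call in last position: x, y, z = 0, 10, 20
       x, y = (f())         -- parenthesized call gives one value: x = 10, y = nil
       print(x, y, z)       -- 10   nil   20
\end{verbatim}

The last assignment leaves \verb|z| untouched,
so it still holds the 20 stored by the previous statement.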
Multiple assignment can be used to exchange two values, as in
\begin{verbatim}
       x, y = y, x
\end{verbatim}

\subsubsection{Control Structures}\label{control}

The control structures
\rwd{if}, \rwd{while}, and \rwd{repeat} have the usual meaning and
familiar syntax:
\index{while-do statement}
\index{repeat-until statement}
\index{if-then-else statement}
\begin{Produc}
\produc{stat}{\rwd{while} exp \rwd{do} block \rwd{end}}
\produc{stat}{\rwd{repeat} block \rwd{until} exp}
\produc{stat}{\rwd{if} exp \rwd{then} block
  \rep{\rwd{elseif} exp \rwd{then} block}
  \opt{\rwd{else} block} \rwd{end}}
\end{Produc}%
There is also a \rwd{for} statement in two flavors \see{for}.

The \Index{condition expression} \M{exp} of a control structure may return any value.
All values different from \nil\ are considered true;
only \nil\ is considered false.

The \rwd{return} statement is used to return values
from a function or from a chunk.
\label{return}%
\index{return statement}%
Functions and chunks may return more than one value,
and so the syntax for the \rwd{return} statement is
\begin{Produc}
\produc{stat}{\rwd{return} \opt{explist1}}
\end{Produc}%

The \rwd{break} statement can be used to terminate the execution of a
\rwd{while}, \rwd{repeat}, or \rwd{for} loop,
skipping to the next statement after the loop:
\index{break statement}
\begin{Produc}
\produc{stat}{\rwd{break}}
\end{Produc}%
A \rwd{break} ends the innermost enclosing loop.

\NOTE
For syntactic reasons, \rwd{return} and \rwd{break}
statements can only be written as the \emph{last} statements of a block.
If it is really necessary to \rwd{return} or \rwd{break} in the
middle of a block,
then an explicit inner block can be used,
as in the idioms
`\verb|do return end|' and
`\verb|do break end|',
because now \rwd{return} and \rwd{break} are the last statements in the inner block.
In practice,
these idioms are only used during debugging.
(The idiom `\verb|do return end|' can be added at the beginning of a chunk
for syntax checking only.)
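
As a small sketch of the rule in the note above,
the fragment below places \rwd{break} as the last statement of a \rwd{then} block
and uses the `\verb|do return end|' idiom to return from the middle of a function body.
The function names \verb|find| and \verb|g| are hypothetical and serve only as illustration.

\begin{verbatim}
       -- returns the first index i in 1..n with t[i] == v, or nil if there is none
       function find (t, n, v)
         local i = 1
         local found
         while i <= n do
           if t[i] == v then
             found = i
             break            -- allowed: last statement of the `then' block
           end
           i = i + 1
         end
         return found         -- allowed: last statement of the function body
       end

       function g ()
         do return 1 end      -- allowed: return is last in the inner block
         print("never reached")
       end

       print(find({5, 7, 9}, 3, 7))    -- 2
       print(find({5, 7, 9}, 3, 4))    -- nil
       print(g())                      -- 1
\end{verbatim}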
\subsubsection{For Statement} \label{for}\index{for statement}

The \rwd{for} statement has two forms,
one for numbers and one for tables.

The numerical \rwd{for} loop repeats a block of code while a control variable
runs through an arithmetic progression.
It has the following syntax:
\begin{Produc}
\produc{stat}{\rwd{for} name \ter{=} exp \ter{,} exp \opt{\ter{,} exp}
  \rwd{do} block \rwd{end}}
\end{Produc}%
The \emph{block} is repeated for \emph{name} starting at the value of
the first \emph{exp}, until it reaches the second \emph{exp} by steps of the
third \emph{exp}.
More precisely, a \rwd{for} statement like
\begin{verbatim}
       for var = e1, e2, e3 do block end
\end{verbatim}
is equivalent to the code:
\begin{verbatim}
       do
         local var, _limit, _step = tonumber(e1), tonumber(e2), tonumber(e3)
         if not (var and _limit and _step) then error() end
         while (_step>0 and var<=_limit) or (_step<=0 and var>=_limit) do
           block
           var = var+_step
         end
       end
\end{verbatim}
Note the following:
\begin{itemize}\itemsep=0pt
\item \verb|_limit| and \verb|_step| are invisible variables.
The names are here for explanatory purposes only.
\item The behavior is \emph{undefined} if you assign to \verb|var| inside
the block.
\item If the third expression (the step) is absent, then a step of~1 is used.
\item Both the limit and the step are evaluated only once,
before the loop starts.
\item You can use \rwd{break} to exit a \rwd{for} loop.
\item The loop variable \verb|var| is local to the statement;
you cannot use its value after the \rwd{for} ends or is broken.
If you need the value of the loop variable \verb|var|,
then assign it to another variable before breaking or exiting the loop.
\end{itemize}

The table \rwd{for} statement traverses all pairs
(index,value) of a given table.
It has the following syntax:
\begin{Produc}
\produc{stat}{\rwd{for} name \ter{,} name \rwd{in} exp
  \rwd{do} block \rwd{end}}
\end{Produc}%
A \rwd{for} statement like
\begin{verbatim}
       for index, value in exp do block end
\end{verbatim}
is equivalent to the code:
\begin{verbatim}
       do
         local _t = exp
         local index, value = next(_t, nil)
         while index do
           block
           index, value = next(_t, index)
         end
       end
\end{verbatim}
Note the following:
\begin{itemize}\itemsep=0pt
\item \verb|_t| is an invisible variable.
The name is here for explanatory purposes only.
\item The behavior is \emph{undefined} if you assign to \verb|index| inside
the block.
\item The behavior is \emph{undefined} if you change
the table \verb|_t| during the traversal.
\item You can use \rwd{break} to exit a \rwd{for} loop.
\item The loop variables \verb|index| and \verb|value| are local to the statement;
you cannot use their values after the \rwd{for} ends.
If you need the value of \verb|index| or \verb|value|,
then assign them to other variables before breaking or exiting the loop.
\item The order in which table elements are traversed is undefined,
\emph{even for numerical indices}.
If you want to traverse indices in numerical order,
use a numerical \rwd{for}.
\end{itemize}

\subsubsection{Function Calls as Statements} \label{funcstat}

Because of possible side-effects,
function calls can be executed as statements:
\begin{Produc}
\produc{stat}{functioncall}
\end{Produc}%
In this case, all returned values are thrown away.
Function calls are explained in \See{functioncall}.

\subsubsection{Local Declarations} \label{localvar}

\Index{Local variables} may be declared anywhere inside a block.
The declaration may include an initial assignment:
\begin{Produc}
\produc{stat}{\rwd{local} namelist \opt{\ter{=} explist1}}
\produc{namelist}{name \rep{\ter{,} name}}
\end{Produc}%
If present, an initial assignment has the same semantics
as a multiple assignment \see{assignment}.
Otherwise, all variables are initialized with \nil.
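
A brief sketch of these declaration forms follows;
the variable names are arbitrary.
It shows a declaration with fewer values than names,
a declaration with no initial assignment,
and an explicit block limiting the scope of a local variable.

\begin{verbatim}
       local a, b = 10      -- adjusted as in a multiple assignment: b is nil
       local c              -- no initial assignment: c is nil
       local x = 1
       do
         local x = 2        -- a new variable, visible only inside this block
         print(x)           -- 2
       end
       print(x)             -- 1
       print(a, b, c)       -- 10   nil   nil
\end{verbatim}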
A chunk is also a block \see{chunks},
and so local variables can be declared outside any explicit block.
Such local variables die when the chunk ends.

Visibility rules for local variables are explained in \See{visibility}.

\subsection{\Index{Expressions}}\label{expressions}

%\subsubsection{\Index{Basic Expressions}}
The basic expressions in Lua are the following:
\begin{Produc}
\produc{exp}{\ter{(} exp \ter{)}}
\produc{exp}{\rwd{nil}}
\produc{exp}{number}
\produc{exp}{literal}
\produc{exp}{var}
\produc{exp}{function}
\produc{exp}{functioncall}
\produc{exp}{tableconstructor}
\end{Produc}%

An expression enclosed in parentheses always results in only one value.
Thus,
\verb|(f(x,y,z))| is always a single value,
even if \verb|f| returns several values.
(The value of \verb|(f(x,y,z))| is the first value returned by \verb|f|
or \nil\ if \verb|f| does not return any values.)

\emph{Numbers} and \emph{literal strings} are explained in \See{lexical};
variables are explained in \See{variables};
function definitions are explained in \See{func-def};
function calls are explained in \See{functioncall};
table constructors are explained in \See{tableconstructor}.

Expressions can also be built with arithmetic operators, relational operators,
and logical operators, all of which are explained below.

\subsubsection{Arithmetic Operators}
Lua supports the usual \Index{arithmetic operators}:
the binary \verb|+| (addition),
\verb|-| (subtraction), \verb|*| (multiplication),
\verb|/| (division), and \verb|^| (exponentiation);
and unary \verb|-| (negation).
If the operands are numbers, or strings that can be converted to
numbers \see{coercion},
then all operations except exponentiation have the usual meaning;
otherwise, an appropriate tag method is called \see{tag-method}.
An exponentiation always calls a tag method.
The standard mathematical library defines this method for numbers,
giving the expected meaning to \Index{exponentiation}
\see{mathlib}.
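
The following sketch illustrates how strings that can be converted
to numbers behave in arithmetic.
The variable names are illustrative only,
and exponentiation is left out because its meaning depends on
the tag method installed by the mathematical library.
\begin{verbatim}
       a = "10" + 5     -- the string "10" is converted to the number 10
       b = "3" * "4"    -- both operands are converted before multiplying
       c = -"2"         -- unary minus also converts its operand
       -- a is 15, b is 12, and c is -2
\end{verbatim}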
\subsubsection{Relational Operators}\label{rel-ops} The \Index{relational operators} in Lua are \begin{verbatim} == ~= < > <= >= \end{verbatim} These operators return \nil\ as false and a value different from \nil\ as true. Equality (\verb|==|) first compares the type of its operands. If the types are different, then the result is \nil. Otherwise, the values of the operands are compared. Numbers and strings are compared in the usual way. Tables, userdata, and functions are compared \emph{by reference}, that is, two tables are considered equal only if they are the \emph{same} table. Every time you create a new table (or userdata, or function), this new value is different from any previously existing value. \NOTE The conversion rules of \See{coercion} \emph{do not} apply to equality comparisons. Thus, \verb|"0"==0| evaluates to \emph{false}, and \verb|t[0]| and \verb|t["0"]| denote different entries in a table. \medskip The operator \verb|~=| is exactly the negation of equality (\verb|==|). The order operators work as follows. If both arguments are numbers, then they are compared as such. Otherwise, if both arguments are strings, then their values are compared according to the current locale. Otherwise, the ``lt'' tag method is called \see{tag-method}. \subsubsection{Logical Operators} The \Index{logical operators} in Lua are \index{and}\index{or}\index{not} \begin{verbatim} and or not \end{verbatim} Like the control structures \see{control}, all logical operators consider \nil\ as false and anything else as true. The conjunction operator \rwd{and} returns \nil\ if its first argument is \nil; otherwise, \rwd{and} returns its second argument. The disjunction operator \rwd{or} returns its first argument if it is different from \nil; otherwise, \rwd{or} returns its second argument. Both \rwd{and} and \rwd{or} use \Index{short-cut evaluation}, that is, the second operand is evaluated only if necessary. There are two useful Lua idioms that use logical operators. 
The first idiom is
\begin{verbatim}
       x = x or v
\end{verbatim}
which is equivalent to
\begin{verbatim}
       if x == nil then x = v end
\end{verbatim}
This idiom sets \verb|x| to a default value \verb|v| when \verb|x| is not set.

The second idiom is
\begin{verbatim}
       x = a and b or c
\end{verbatim}
which should be read as \verb|x = (a and b) or c|.
This idiom is equivalent to
\begin{verbatim}
       if a then x = b else x = c end
\end{verbatim}
provided that \verb|b| is not \nil.

\subsubsection{Concatenation} \label{concat}
The string \Index{concatenation} operator in Lua is
denoted by two dots (`\verb|..|').
If both operands are strings or numbers, then they are converted to
strings according to the rules mentioned in \See{coercion}.
Otherwise, the ``concat'' tag method is called \see{tag-method}.

\subsubsection{Precedence}
\Index{Operator precedence} in Lua follows the table below,
from lower to higher priority:
\begin{verbatim}
       and   or
       <   >   <=   >=   ~=   ==
       ..
       +   -
       *   /
       not   - (unary)
       ^
\end{verbatim}
All binary operators are left associative,
except for \verb|^| (exponentiation),
which is right associative.
\NOTE
The pre-compiler may rearrange the order of evaluation of
associative or commutative operators,
as long as these optimizations do not change normal results.
However, these optimizations may change some results
if you define non-associative (or non-commutative)
tag methods for these operators.
In general, you should not write code that depends on the order of evaluation.

\subsubsection{Table Constructors} \label{tableconstructor}
Table \Index{constructors} are expressions that create tables;
every time a constructor is evaluated, a new table is created.
Constructors can be used to create empty tables,
or to create a table and initialize some of its fields.
The general syntax for constructors is
\begin{Produc}
\produc{tableconstructor}{\ter{\{} fieldlist \ter{\}}}
\produc{fieldlist}{lfieldlist \Or ffieldlist \Or lfieldlist \ter{;} ffieldlist
  \Or ffieldlist \ter{;} lfieldlist}
\produc{lfieldlist}{\opt{explist1 \opt{\ter{,}}}}
\produc{ffieldlist}{\opt{ffieldlist1 \opt{\ter{,}}}}
\end{Produc}%

The form \emph{explist1} is used to initialize \IndexEmph{lists}.
The expressions in a list are assigned to consecutive numerical indices
in the table,
starting with~1.
For example,
\begin{verbatim}
       a = {"v1", "v2", 34}
\end{verbatim}
is equivalent to
\begin{verbatim}
       do
         local temp = {}
         temp[1] = "v1"
         temp[2] = "v2"
         temp[3] = 34
         a = temp
       end
\end{verbatim}

If the last expression in the list is a function call,
then all values returned by the call enter the list consecutively
\see{functioncall}.
If you want to avoid this,
enclose the function call in parentheses.

The form \emph{ffieldlist1} initializes other fields in a table:
\begin{Produc}
\produc{ffieldlist1}{ffield \rep{\ter{,} ffield}}
\produc{ffield}{\ter{[} exp \ter{]} \ter{=} exp \Or name \ter{=} exp}
\end{Produc}%
For example,
\begin{verbatim}
       a = {[f(k)] = g(y), x = 1, y = 3, [0] = b+c}
\end{verbatim}
is equivalent to
\begin{verbatim}
       do
         local temp = {}
         temp[f(k)] = g(y)
         temp["x"] = 1    -- or temp.x = 1
         temp["y"] = 3    -- or temp.y = 3
         temp[0] = b+c
         a = temp
       end
\end{verbatim}
An expression like \verb|{x = 1, y = 4}| is
in fact syntactic sugar for \verb|{["x"] = 1, ["y"] = 4}|.

Both forms may have an optional trailing comma
(for convenience of machine-generated code),
and can be used in the same constructor separated by
a semicolon.
For example, all forms below are correct.
\begin{verbatim}
       x = {;}
       x = {"a", "b",}
       x = {type="list"; "a", "b"}
       x = {f(0), f(1), f(2),; n=3,}
\end{verbatim}

\subsubsection{Function Calls} \label{functioncall}
A \Index{function call} in Lua has the following syntax:
\begin{Produc}
\produc{functioncall}{exp args}
\end{Produc}%
In a function call,
first \M{exp} and \M{args} are evaluated.
If the value of \M{exp} has type \emph{function},
then this function is called,
with the given arguments.
Otherwise, the ``function'' tag method is called,
having as first parameter the value of \M{exp},
followed by the original call arguments
\see{tag-method}.

The form
\begin{Produc}
\produc{functioncall}{exp \ter{:} name args}
\end{Produc}%
can be used to call ``methods''.
A call \verb|v:name(...)|
is syntactic sugar for \verb|v.name(v, ...)|,
except that \verb|v| is evaluated only once.

Arguments have the following syntax:
\begin{Produc}
\produc{args}{\ter{(} \opt{explist1} \ter{)}}
\produc{args}{tableconstructor}
\produc{args}{literal}
\end{Produc}%
All argument expressions are evaluated before the call.
A call of the form \verb|f{...}| is syntactic sugar for
\verb|f({...})|, that is,
the argument list is a single new table.
A call of the form \verb|f'...'|
(or \verb|f"..."| or \verb|f[[...]]|) is syntactic sugar for
\verb|f('...')|, that is,
the argument list is a single literal string.

Because a function can return any number of results
\see{return},
the number of results must be adjusted before they are used.
If the function is called as a statement \see{funcstat},
then its return list is adjusted to~0,
thus discarding all returned values.
If the function is called inside another expression,
or in the middle of a list of expressions,
then its return list is adjusted to~1,
thus discarding all returned values but the first one.
If the function is called as the last element of a list of expressions,
then no adjustment is made
(unless the call is enclosed in parentheses).
Here are some examples:
\begin{verbatim}
       f()                -- adjusted to 0 results
       g(f(), x)          -- f() is adjusted to 1 result
       g(x, f())          -- g gets x plus all values returned by f()
       a,b,c = f(), x     -- f() is adjusted to 1 result (and c gets nil)
       a,b,c = x, f()     -- f() is adjusted to 2 results
       a,b,c = f()        -- f() is adjusted to 3 results
       return f()         -- returns all values returned by f()
       return x,y,f()     -- returns x, y, and all values returned by f()
       {f()}              -- creates a list with all values returned by f()
       {f(), nil}         -- f() is adjusted to 1 result
\end{verbatim}

If you enclose a function call in parentheses,
then it is adjusted to return exactly one value:
\begin{verbatim}
       return x,y,(f())   -- returns x, y, and the first value from f()
       {(f())}            -- creates a table with exactly one element
\end{verbatim}

\subsubsection{\Index{Function Definitions}} \label{func-def}

The syntax for function definition is
\begin{Produc}
\produc{function}{\rwd{function} \ter{(} \opt{parlist1} \ter{)}
  block \rwd{end}}
\produc{stat}{\rwd{function} funcname \ter{(} \opt{parlist1} \ter{)}
  block \rwd{end}}
\produc{funcname}{name \rep{\ter{.} name} \opt{\ter{:} name}}
\end{Produc}%
The statement
\begin{verbatim}
       function f () ... end
\end{verbatim}
is syntactic sugar for
\begin{verbatim}
       f = function () ... end
\end{verbatim}
and the statement
\begin{verbatim}
       function t.a.b.c.f () ... end
\end{verbatim}
is syntactic sugar for
\begin{verbatim}
       t.a.b.c.f = function () ... end
\end{verbatim}

A function definition is an executable expression,
whose value has type \emph{function}.
When Lua pre-compiles a chunk,
all its function bodies are pre-compiled too.
Then, whenever Lua executes the function definition,
the function is \emph{instantiated} (or \emph{closed}).
This function instance (or \emph{closure})
is the final value of the expression.
Different instances of the same function
may refer to different non-local variables \see{visibility}.
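
The following is a minimal sketch of how two instances of the same
function definition can refer to different non-local variables,
assuming the lexical scoping rules of \See{visibility}.
The names \verb|new_counter|, \verb|c1|, and \verb|c2| are hypothetical
and chosen only for this illustration.
\begin{verbatim}
       function new_counter ()
         local count = 0          -- a fresh variable on each call
         return function ()       -- each evaluation creates a new closure
                  count = count + 1
                  return count
                end
       end

       c1 = new_counter()
       c2 = new_counter()
       print(c1())   --> 1
       print(c1())   --> 2
       print(c2())   --> 1  (c2 has its own count)
\end{verbatim}
Both closures are instances of the same inner function body,
but each one keeps its own \verb|count|.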
Parameters act as local variables,
initialized with the argument values:
\begin{Produc}
\produc{parlist1}{\ter{\ldots}}
\produc{parlist1}{namelist \opt{\ter{,} \ter{\ldots}}}
\end{Produc}%
\label{vararg}%
When a function is called,
the list of \Index{arguments} is adjusted to
the length of the list of parameters,
unless the function is a \Def{vararg function},
which is indicated by three dots (`\verb|...|')
at the end of its parameter list.
A vararg function does not adjust its argument list;
instead, it collects all extra arguments into an implicit parameter,
called \IndexLIB{arg}.
The value of \verb|arg| is a table,
with a field~\verb|n| whose value is the number of extra arguments,
and the extra arguments at positions 1,~2,~\ldots,~\verb|n|.

As an example, consider the following definitions:
\begin{verbatim}
       function f(a, b) end
       function g(a, b, ...) end
       function r() return 1,2,3 end
\end{verbatim}
Then, we have the following mapping from arguments to parameters:
\begin{verbatim}
       CALL            PARAMETERS

       f(3)             a=3, b=nil
       f(3, 4)          a=3, b=4
       f(3, 4, 5)       a=3, b=4
       f(r(), 10)       a=1, b=10
       f(r())           a=1, b=2

       g(3)             a=3, b=nil, arg={n=0}
       g(3, 4)          a=3, b=4, arg={n=0}
       g(3, 4, 5, 8)    a=3, b=4, arg={5, 8; n=2}
       g(5, r())        a=5, b=1, arg={2, 3; n=2}
\end{verbatim}

Results are returned using the \rwd{return} statement \see{return}.
If control reaches the end of a function
without encountering a \rwd{return} statement,
then the function returns with no results.

The \emph{colon} syntax
is used for defining \IndexEmph{methods},
that is, functions that have an implicit extra parameter \IndexVerb{self}.
Thus, the statement
\begin{verbatim}
       function t.a.b.c:f (...) ... end
\end{verbatim}
is syntactic sugar for
\begin{verbatim}
       t.a.b.c.f = function (self, ...) ... end
\end{verbatim}

\subsection{Visibility Rules} \label{visibility}
\index{visibility}
Lua is a lexically scoped language.
The scope of local variables begins at the first statement \emph{after}
their declaration and lasts until the end of the innermost block that
includes the declaration.
For instance:
\begin{verbatim}
  x = 10                -- global variable
  do                    -- new block
    local x = x         -- new `x', with value 10
    print(x)            --> 10
    x = x+1
    do                  -- another block
      local x = x+1     -- another x
      print(x)          --> 12
    end
    print(x)            --> 11
  end
  print(x)              --> 10  (the global one)
\end{verbatim}
Notice that, in a declaration like \verb|local x = x|,
the new \verb|x| being declared is not in scope yet,
so the second \verb|x| refers to the ``outside'' variable.

Because of these \Index{lexical scoping} rules,
local variables can be freely accessed by functions
defined inside their scope.
For instance:
\begin{verbatim}
  local counter = 0
  function inc (x)
    counter = counter + x
    return counter
  end
\end{verbatim}

Notice that each execution of a \rwd{local} statement
``creates'' new local variables.
Consider the following example:
\begin{verbatim}
  a = {}
  local x = 20
  for i=1,10 do
    local y = 0
    a[i] = function () y=y+1; return x+y end
  end
\end{verbatim}
In that code,
each function uses a different \verb|y| variable,
while all of them share the same \verb|x|.

\subsection{Error Handling} \label{error}

Because Lua is an extension language,
all Lua actions start from C~code in the host program
calling a function from the Lua library.
Whenever an error occurs during Lua compilation or execution,
the function \verb|_ERRORMESSAGE| is called \DefLIB{_ERRORMESSAGE}
(provided it is different from \nil),
and then the corresponding function from the library
(\verb|lua_dofile|, \verb|lua_dostring|,
\verb|lua_dobuffer|, or \verb|lua_call|)
is terminated, returning an error condition.

Memory allocation errors are an exception to the previous rule.
When memory allocation fails, Lua may not be able to execute the
\verb|_ERRORMESSAGE| function.
So, for this kind of error, Lua does not call the \verb|_ERRORMESSAGE| function; instead, the corresponding function from the library returns immediately with a special error code (\verb|LUA_ERRMEM|). This and other error codes are defined in \verb|lua.h| \see{luado}. The only argument to \verb|_ERRORMESSAGE| is a string describing the error. The default definition for this function calls \verb|_ALERT|, \DefLIB{_ALERT} which prints the message to \verb|stderr| \see{alert}. The standard I/O library redefines \verb|_ERRORMESSAGE| and uses the debug interface \see{debugI} to print some extra information, such as a call-stack traceback. Lua code can explicitly generate an error by calling the function \verb|error| \see{pdf-error}. Lua code can ``catch'' an error using the function \verb|call| \see{pdf-call}. \subsection{Tag Methods} \label{tag-method} A \IndexEmph{tag method} is a programmer-defined function that defines how Lua operations act over user-defined types (and, sometimes, over basic types as well). An \Def{event} is any operation that may invoke a tag method. Lua selects the tag method called for any specific event according to the types of the values involved in the event \see{TypesSec}. The function \IndexLIB{settagmethod} changes the tag method associated with a given (\M{type}, \M{event}) pair. The first parameter to \verb|settagmethod| is the type (represented by its name or tag), the second parameter is the event name (a string; see below), and the third parameter is the new method (a function), or \nil\ to restore the default behavior for the pair. A companion function \IndexLIB{gettagmethod} receives a type and an event name and returns the current method associated to them. Tag methods are called in the following events, identified by the given names. The semantics of tag methods is better explained by a Lua function describing the behavior of the interpreter at each event. 
Each event-handler function shows how a tag method is called,
its arguments (that is, its signature),
its results,
and the default behavior in the absence of a tag method.
The code shown here in Lua is only illustrative;
the real behavior is hard coded in the interpreter,
and it is much more efficient than this simulation.
All functions used in these descriptions
(\verb|rawget|, \verb|tonumber|, \verb|call|, etc.)\ 
are described in \See{predefined}.

\begin{description}

\item[``add'':]\IndexTM{add}
called when a \verb|+| operation is applied to non-numerical operands.

The function \verb|getbinmethod| below defines how Lua chooses a tag method
for a binary operation.
First, Lua tries the first operand.
If its type does not define a tag method for the operation,
then Lua tries the second operand.
If it also fails, then it gets a tag method from tag~0.
\begin{verbatim}
       function getbinmethod (op1, op2, event)
         return gettagmethod(tag(op1), event) or
                gettagmethod(tag(op2), event) or
                gettagmethod(0, event)
       end
\end{verbatim}
Using this function,
the tag method for the ``add'' event is
\begin{verbatim}
       function add_event (op1, op2)
         local o1, o2 = tonumber(op1), tonumber(op2)
         if o1 and o2 then  -- both operands are numeric
           return o1+o2  -- '+' here is the primitive 'add'
         else  -- at least one of the operands is not numeric
           local tm = getbinmethod(op1, op2, "add")
           if tm then
             -- call the method with both operands
             return tm(op1, op2)
           else  -- no tag method available: default behavior
             error("unexpected type at arithmetic operation")
           end
         end
       end
\end{verbatim}

\item[``sub'':]\IndexTM{sub}
called when a \verb|-| operation is applied to non-numerical operands.
Behavior similar to the ``add'' event.

\item[``mul'':]\IndexTM{mul}
called when a \verb|*| operation is applied to non-numerical operands.
Behavior similar to the ``add'' event.

\item[``div'':]\IndexTM{div}
called when a \verb|/| operation is applied to non-numerical operands.
Behavior similar to the ``add'' event.

\item[``pow'':]\IndexTM{pow}
called when a \verb|^| operation (exponentiation) is applied,
even for numerical operands.
\begin{verbatim}
       function pow_event (op1, op2)
         local tm = getbinmethod(op1, op2, "pow")
         if tm then
           -- call the method with both operands
           return tm(op1, op2)
         else  -- no tag method available: default behavior
           error("unexpected type at arithmetic operation")
         end
       end
\end{verbatim}

\item[``unm'':]\IndexTM{unm}
called when a unary \verb|-| operation is applied to a non-numerical operand.
\begin{verbatim}
       function unm_event (op)
         local o = tonumber(op)
         if o then  -- operand is numeric
           return -o  -- '-' here is the primitive 'unm'
         else  -- the operand is not numeric.
           -- Try to get a tag method from the operand;
           -- if it does not have one, try a "global" one (tag 0)
           local tm = gettagmethod(tag(op), "unm") or
                      gettagmethod(0, "unm")
           if tm then
             -- call the method with the operand and nil
             return tm(op, nil)
           else  -- no tag method available: default behavior
             error("unexpected type at arithmetic operation")
           end
         end
       end
\end{verbatim}

\item[``lt'':]\IndexTM{lt}
called when an order operation is applied to non-numerical or
non-string operands.
It corresponds to the \verb|<| operator.
\begin{verbatim}
       function lt_event (op1, op2)
         if type(op1) == "number" and type(op2) == "number" then
           return op1 < op2   -- numeric comparison
         elseif type(op1) == "string" and type(op2) == "string" then
           return op1 < op2   -- lexicographic comparison
         else
           local tm = getbinmethod(op1, op2, "lt")
           if tm then
             return tm(op1, op2)
           else
             error("unexpected type at comparison");
           end
         end
       end
\end{verbatim}
The other order operators use the \verb|"lt"| tag method according to the
usual equivalences:
\begin{verbatim}
       a>b    <=>  b<a
       a<=b   <=>  not (b<a)
       a>=b   <=>  not (a<b)
\end{verbatim}