% $Id: manual.tex,v 1.49 2001/01/31 19:53:01 roberto Exp roberto $ \documentclass[11pt]{article} \usepackage{fullpage} \usepackage{bnf} \usepackage{graphicx} %\usepackage{times} \catcode`\_=12 %\newcommand{\See}[1]{Section~\ref{#1}} \newcommand{\See}[1]{\S\ref{#1}} \newcommand{\see}[1]{(see~\See{#1})} \newcommand{\M}[1]{{\rm\emph{#1}}} \newcommand{\T}[1]{{\tt #1}} \newcommand{\Math}[1]{$#1$} \newcommand{\nil}{{\bf nil}} \def\tecgraf{{\sf TeC\kern-.21em\lower.7ex\hbox{Graf}}} \newcommand{\Index}[1]{#1\index{#1@{\lowercase{#1}}}} \newcommand{\IndexVerb}[1]{\T{#1}\index{#1@{\tt #1}}} \newcommand{\IndexEmph}[1]{\emph{#1}\index{#1@{\lowercase{#1}}}} \newcommand{\IndexTM}[1]{\index{#1 event@{``#1'' event}}\index{tag method!#1}} \newcommand{\Def}[1]{\emph{#1}\index{#1}} \newcommand{\IndexAPI}[1]{\T{#1}\DefAPI{#1}} \newcommand{\IndexLIB}[1]{\T{#1}\DefLIB{#1}} \newcommand{\DefLIB}[1]{\index{#1@{\tt #1}}} \newcommand{\DefAPI}[1]{\index{C API!#1@{\tt #1}}} \newcommand{\ff}{$\bullet$\ } \newcommand{\Version}{4.1 (alpha)} % LHF \renewcommand{\ter}[1]{{\rm`{\tt#1}'}} \newcommand{\NOTE}{\par\medskip\noindent\emph{NOTE}: } \makeindex \begin{document} %{=============================================================== \thispagestyle{empty} \pagestyle{empty} { \parindent=0pt \vglue1.5in {\LARGE\bf The Programming Language Lua} \hfill \vskip4pt \hrule height 4pt width \hsize \vskip4pt \hfill Reference Manual for Lua version \Version \\ \null \hfill Last revised on \today \\ \vfill \centering \includegraphics[width=0.7\textwidth]{nolabel.ps} \vfill \vskip4pt \hrule height 2pt width \hsize } \newpage \begin{quotation} \parskip=10pt \footnotesize \null\vfill \noindent Copyright \copyright\ 1994--2001 TeCGraf, PUC-Rio. All rights reserved. 
\noindent Permission is hereby granted, without written agreement and without license or royalty fees, to use, copy, modify, translate, and distribute this software and its documentation (hereby called the "package") for any purpose, including commercial applications, subject to the following conditions: \begin{itemize} \item The above copyright notice and this permission notice shall appear in all copies or substantial portions of this package. \item The origin of this package must not be misrepresented; you must not claim that you wrote the original package. If you use this package in a product, an acknowledgment in the product documentation would be greatly appreciated (but it is not required). \item Altered source versions must be plainly marked as such, and must not be misrepresented as being the original package. \end{itemize} The authors specifically disclaim any warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The package provided hereunder is on an ``as is'' basis, and the authors have no obligation to provide maintenance, support, updates, enhancements, or modifications. In no event shall TeCGraf, PUC-Rio, or the authors be held liable to any party for direct, indirect, special, incidental, or consequential damages arising out of the use of this package and its documentation. \noindent The Lua language and this implementation have been entirely designed and written by Waldemar Celes, Roberto Ierusalimschy, and Luiz Henrique de Figueiredo at TeCGraf, PUC-Rio in Brazil. \noindent This implementation contains no third-party code. \noindent Copies of this manual can be obtained at \verb|http://www.lua.org|. \bigskip \noindent The Lua logo was designed by A. Nakonechny. Copyright \copyright\ 1998. All rights reserved. 
\end{quotation} %}=============================================================== \newpage \title{\Large\bf Reference Manual of the Programming Language Lua \Version} \author{% Roberto Ierusalimschy\quad Luiz Henrique de Figueiredo\quad Waldemar Celes \vspace{1.0ex}\\ \smallskip \small\tt lua@tecgraf.puc-rio.br \vspace{2.0ex}\\ %MCC 08/95 --- \tecgraf\ --- Computer Science Department --- PUC-Rio } \date{{\small \tt\$Date: 2001/01/31 19:53:01 $ $}} \maketitle \pagestyle{plain} \pagenumbering{roman} \begin{abstract} \noindent Lua is a powerful, light-weight programming language designed for extending applications. Lua is also frequently used as a general-purpose, stand-alone language. Lua combines simple procedural syntax (similar to Pascal) with powerful data description constructs based on associative arrays and extensible semantics. Lua is dynamically typed, interpreted from bytecodes, and has automatic memory management with garbage collection, making it ideal for configuration, scripting, and rapid prototyping. This document describes version \Version\ of the Lua programming language and the Application Program Interface (API) that allows interaction between Lua programs and their host C~programs. \end{abstract} \def\abstractname{Resumo} \begin{abstract} \noindent Lua \'e uma linguagem de programa\c{c}\~ao poderosa e leve, projetada para estender aplica\c{c}\~oes. Lua tamb\'em \'e frequentemente usada como uma linguagem de prop\'osito geral. Lua combina programa\c{c}\~ao procedural (com sintaxe semelhante \`a de Pascal) com poderosas constru\c{c}\~oes para descri\c{c}\~ao de dados, baseadas em tabelas associativas e sem\^antica extens\'\i vel. Lua \'e tipada dinamicamente, interpretada a partir de \emph{bytecodes}, e tem gerenciamento autom\'atico de mem\'oria com coleta de lixo. Essas caracter\'{\i}sticas fazem de Lua uma linguagem ideal para configura\c{c}\~ao, automa\c{c}\~ao (\emph{scripting}) e prototipagem r\'apida. 
Este documento descreve a vers\~ao \Version\ da linguagem de programa\c{c}\~ao Lua e a Interface de Programa\c{c}\~ao (API) que permite a intera\c{c}\~ao entre programas Lua e programas C~hospedeiros. \end{abstract} \newpage \null \newpage \tableofcontents \newpage \setcounter{page}{1} \pagestyle{plain} \pagenumbering{arabic} \section{Introduction} Lua is an extension programming language designed to support general procedural programming with data description facilities. Lua is intended to be used as a powerful, light-weight configuration language for any program that needs one. Lua is implemented as a library, written in C. Being an extension language, Lua has no notion of a ``main'' program: it only works \emph{embedded} in a host client, called the \emph{embedding} program. This host program can invoke functions to execute a piece of code in Lua, can write and read Lua variables, and can register C~functions to be called by Lua code. Through the use of C~functions, Lua can be augmented to cope with a wide range of different domains, thus creating customized programming languages sharing a syntactical framework. Lua is free-distribution software, and is provided as usual with no guarantees, as stated in its copyright notice. The implementation described in this manual is available at the following URL's: \begin{verbatim} http://www.lua.org ftp://ftp.lua.org \end{verbatim} Like any other reference manual, this document is dry in places. For a discussion of the decisions behind the design of Lua, see the papers below, which are available at the web site above. \begin{itemize} \item R.~Ierusalimschy, L.~H.~de Figueiredo, and W.~Celes. Lua---an extensible extension language. \emph{Software: Practice \& Experience} {\bf 26} \#6 (1996) 635--652. \item L.~H.~de Figueiredo, R.~Ierusalimschy, and W.~Celes. The design and implementation of a language for extending applications. \emph{Proceedings of XXI Brazilian Seminar on Software and Hardware} (1994) 273--283. 
\item L.~H.~de Figueiredo, R.~Ierusalimschy, and W.~Celes. Lua: an extensible embedded language. \emph{Dr. Dobb's Journal} {\bf 21} \#12 (Dec 1996) 26--33. \end{itemize} \section{Environment and Chunks} All statements in Lua are executed in a \Def{global environment}. This environment is initialized with a call from the embedding program to \verb|lua_open| and persists until a call to \verb|lua_close|, or the end of the embedding program. If necessary, the host programmer can create multiple independent global environments, and freely switch between them \see{mangstate}. The global environment can be manipulated by Lua code or by the embedding program, which can read and write global variables using API functions from the library that implements Lua. \Index{Global variables} in Lua do not need to be declared. Any variable is assumed to be global unless explicitly declared local \see{localvar}. Before the first assignment, the value of a global variable is \nil\ % (this default can be changed; see \See{tag-method}). A table is used to keep all global names and values (tables are explained in \See{TypesSec}). The unit of execution of Lua is called a \Def{chunk}. A chunk is simply a sequence of statements, which are executed sequentially. Each statement can be optionally followed by a semicolon: \begin{Produc} \produc{chunk}{\rep{stat \opt{\ter{;}}}} \end{Produc}% Statements are described in \See{stats}. (The notation above is the usual extended BNF, in which \rep{\emph{a}} means 0 or more \emph{a}'s, \opt{\emph{a}} means an optional \emph{a}, and \oneormore{\emph{a}} means one or more \emph{a}'s. The complete syntax of Lua is given on page~\pageref{BNF}.) A chunk may be stored in a file or in a string inside the host program. When a chunk is executed, first it is pre-compiled into bytecodes for a virtual machine, and then the statements are executed in sequential order, by simulating the virtual machine. 
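As a small illustration of the definitions above, the following sketch shows a chunk made of a few statements, some of them followed by the optional semicolon. The variable names are invented for this example, and \verb|never_assigned| is assumed not to have been set by any earlier chunk or by the host program.
\begin{verbatim}
       a = 1; b = 2           -- two statements on one line
       c = a + b;             -- the trailing semicolon is optional
       d = never_assigned     -- a global never assigned evaluates to nil
\end{verbatim}
After this chunk runs, \verb|c| holds 3 and \verb|d| holds \nil.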
All modifications a chunk effects on the global environment persist after the chunk ends. Chunks may also be pre-compiled into binary form and stored in files; see program \IndexVerb{luac} for details. Text files with chunks and their binary pre-compiled forms are interchangeable. Lua automatically detects the file type and acts accordingly. \index{pre-compilation} \section{\Index{Types and Tags}} \label{TypesSec} Lua is a \emph{dynamically typed language}. This means that variables do not have types; only values do. Therefore, there are no type definitions in the language. All values carry their own type. Besides a type, all values also have a \IndexEmph{tag}. There are six \Index{basic types} in Lua: \Def{nil}, \Def{number}, \Def{string}, \Def{function}, \Def{userdata}, and \Def{table}. \emph{Nil} is the type of the value \nil, whose main property is to be different from any other value. \emph{Number} represents real (double-precision floating-point) numbers, while \emph{string} has the usual meaning. \index{eight-bit clean} Lua is 8-bit clean, and so strings may contain any 8-bit character, including embedded zeros (\verb|'\0'|) \see{lexical}. The \verb|type| function returns a string describing the type of a given value \see{pdf-type}. Functions are considered \emph{first-class values} in Lua. This means that functions can be stored in variables, passed as arguments to other functions, and returned as results. Lua can call (and manipulate) functions written in Lua and functions written in C. The type \emph{userdata} is provided to allow arbitrary \Index{C~pointers} to be stored in Lua variables. This type corresponds to a \verb|void*| and has no pre-defined operations in Lua, except assignment and equality test. However, by using \emph{tag methods}, the programmer can define operations for \emph{userdata} values \see{tag-method}. 
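The basic types and the first-class status of functions described above can be illustrated with a short sketch. The function names \verb|inc| and \verb|twice| are invented for this example; only \verb|type| and \verb|print| are assumed to be provided by the basic library.
\begin{verbatim}
       print(type(nil))       -- nil
       print(type(3.14))      -- number
       print(type("alo"))     -- string
       print(type(print))     -- function
       print(type({}))        -- table

       -- Functions are values: they can be passed as arguments
       function inc(n) return n + 1 end
       function twice(f, x) return f(f(x)) end

       r = twice(inc, 5)      -- r is 7
       g = inc                -- a function stored in another variable
       s = g(1)               -- s is 2
\end{verbatim}
Note that \verb|twice| receives \verb|inc| as an ordinary argument; no special syntax is needed to pass a function.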
The type \emph{table} implements \Index{associative arrays}, that is, \Index{arrays} that can be indexed not only with numbers, but with any value (except \nil). Therefore, this type may be used not only to represent ordinary arrays, but also symbol tables, sets, records, graphs, trees, etc. Tables are the main data structuring mechanism in Lua. To represent \Index{records}, Lua uses the field name as an index. The language supports this representation by providing \verb|a.name| as syntactic sugar for \verb|a["name"]|. Tables may also carry \emph{methods}: because functions are first-class values, table fields may contain functions. The form \verb|t:f(x)| is syntactic sugar for \verb|t.f(t,x)|, which calls the method \verb|f| from the table \verb|t| passing the table itself as the first parameter \see{func-def}.

Note that tables are \emph{objects}, and not values. Variables do not contain tables, only \emph{references} to them. Assignment, parameter passing, and returns always manipulate references to tables, and do not imply any kind of copy. Moreover, tables must be explicitly created before they are used \see{tableconstructor}.

\subsection{Tags}
Each type has a \emph{name}, and a numerical identifier, called a \Index{tag}. Tags are mainly used by C code, to avoid the manipulation of strings. Most operations over types, in the C API, require a tag to identify the type. In Lua, all operations over types work with either type names or tags.

\subsection{User-defined Types}
Lua programs can create new types, called \Index{User-defined Types}. A user-defined type is always based on a base type, either a table or a userdata. Objects of a user-defined type have an internal structure identical to the corresponding base type, but may have different semantics for each operation. The \verb|newtype| function creates a new type \see{pdf-newtype}. Types created by Lua programs are always based upon tables; types created by C can be based upon tables or upon userdata.
The \verb|settagmethod| function defines new semantics for the operations of this new type \see{tag-method}. The \verb|settype| function changes the type of a given object \see{pdf-settype}.

\section{Garbage Collection}\label{GC}
Lua does automatic memory management. To do that, Lua runs a \Index{garbage collector} from time to time. All objects in Lua are subject to automatic management: tables, userdata, functions, and strings. Lua uses two numbers to control its garbage-collection cycles. One number counts how many bytes of dynamic memory Lua is using, and the other is a threshold. When the number of bytes crosses the threshold, Lua runs the garbage collector, which reclaims the memory of all ``dead'' objects (that is, objects no longer accessible from Lua). The byte counter is corrected, and then the threshold is reset to twice the value of the byte counter.

Through the C API, you can consult those numbers, and change the threshold \see{GC-API}. Setting the threshold to zero actually forces an immediate garbage-collection cycle, while setting it to a huge number stops the garbage collector. From Lua code, you have more limited control of memory management, through the functions \verb|gcinfo| and \verb|collectgarbage|.

You can set garbage-collector tag methods for user-defined types based on userdata \see{tag-method}. Lua calls those functions when it is about to free a userdata of the corresponding type. Using this facility, you can coordinate Lua's garbage collection with external resource management (such as closing files or freeing your own memory).

\subsection{Weak Tables}\label{weak-table}
A \IndexEmph{weak table} is a table whose elements are \IndexEmph{weak references}. A weak reference is ignored by the garbage collector, so that if the only references to an object are weak references, the garbage collector will collect that object. A weak table can have weak keys, weak values, or both.
A table with weak keys allows the collection of its keys, but prevents the collection of its values. A table with both weak keys and weak values allows the collection of both. In any case, if either the key or the value is collected, the whole pair is removed from the table. The weakness of a table is controlled by the function \verb|weakmode| \see{weakmode}.

\section{The Language}
This section describes the lexis, the syntax, and the semantics of Lua.

\subsection{Lexical Conventions} \label{lexical}
\IndexEmph{Identifiers} in Lua can be any string of letters, digits, and underscores, not beginning with a digit. This coincides with the definition of identifiers in most languages, except that the definition of letter depends on the current locale: any character considered alphabetic by the current locale can be used in an identifier. The following words are \emph{reserved}, and cannot be used as identifiers:
\index{reserved words}
\begin{verbatim}
       and       break     do        else      elseif
       end       for       function  global    if
       in        local     nil       not       or
       repeat    return    then      until     while
\end{verbatim}
(\rwd{global} is reserved for future use.)

Lua is a case-sensitive language: \T{and} is a reserved word, but \T{And} and \T{\'and} (if the locale permits) are two different, valid identifiers. As a convention, identifiers starting with underscore followed by uppercase letters (such as \verb|_INPUT|) are reserved for internal variables.

The following strings denote other \Index{tokens}:
\begin{verbatim}
       +     -     *     /     ^     %
       ~=    <=    >=    <     >     ==    =
       (     )     {     }     [     ]
       ;     :     ,     .     ..    ...
\end{verbatim}

\IndexEmph{Literal strings} can be delimited by matching single or double quotes, and can contain the C-like escape sequences `\verb|\a|' (bell), `\verb|\b|' (backspace), `\verb|\f|' (form feed), `\verb|\n|' (newline), `\verb|\r|' (carriage return), `\verb|\t|' (horizontal tab), `\verb|\v|' (vertical tab), `\verb|\\|' (backslash), `\verb|\"|' (double quote), `\verb|\'|' (single quote), and `\verb|\|\emph{newline}' (that is, a backslash followed by a real newline, which results in a newline in the string). A character in a string may also be specified by its numerical value, through the escape sequence `\verb|\|\emph{ddd}', where \emph{ddd} is a sequence of up to three \emph{decimal} digits. Strings in Lua may contain any 8-bit value, including embedded zeros, which can be specified as `\verb|\000|'.

Literal strings can also be delimited by matching \verb|[[| \dots\ \verb|]]|. Literals in this bracketed form may run for several lines, may contain nested \verb|[[| \dots\ \verb|]]| pairs, and do not interpret escape sequences. When the \verb|[[| is immediately followed by a newline, this newline is not included in the string. This form is especially convenient for writing strings that contain program pieces or other quoted strings.

As an example, in a system using ASCII, the following four literals are equivalent:
\begin{verbatim}
1)   "alo\n123\""
2)   '\97lo\10\04923"'
3)   [[alo
     123"]]
4)   [[
     alo
     123"]]
\end{verbatim}

\IndexEmph{Comments} start anywhere outside a string with a double hyphen (\verb|--|) and run until the end of the line. Moreover, the first line of a chunk is skipped if it starts with \verb|#|. This facility allows the use of Lua as a script interpreter in Unix systems \see{lua-sa}.

\IndexEmph{Numerical constants} may be written with an optional decimal part and an optional decimal exponent.
Examples of valid numerical constants are
\begin{verbatim}
       3     3.0     3.1416  314.16e-2   0.31416E1
\end{verbatim}

\subsection{\Index{Coercion}} \label{coercion}
Lua provides some automatic conversions between values at run time. Any arithmetic operation applied to a string tries to convert that string to a number, following the usual rules. Conversely, whenever a number is used when a string is expected, that number is converted to a string, in a reasonable format. The format is chosen so that a conversion from number to string then back to number reproduces the original number \emph{exactly}. Thus, the conversion does not necessarily produce nice-looking text for some numbers. For complete control of how numbers are converted to strings, use the \verb|format| function \see{format}.

\subsection{Statements}\label{stats}
Lua supports an almost conventional set of \Index{statements}, similar to those in Pascal or C. The conventional commands include assignment, control structures, and procedure calls. Non-conventional commands include table constructors \see{tableconstructor} and local variable declarations \see{localvar}.

\subsubsection{Blocks}
A \Index{block} is a list of statements; syntactically, a block is equal to a chunk:
\begin{Produc}
\produc{block}{chunk}
\end{Produc}%

A block may be explicitly delimited:
\begin{Produc}
\produc{stat}{\rwd{do} block \rwd{end}}
\end{Produc}%
Explicit blocks are useful to control the scope of local variables \see{localvar}. Explicit blocks are also sometimes used to add a \rwd{return} or \rwd{break} statement in the middle of another block \see{control}.

\subsubsection{\Index{Assignment}} \label{assignment}
Lua allows \Index{multiple assignment}. Therefore, the syntax for assignment defines a list of variables on the left side and a list of expressions on the right side.
The elements in both lists are separated by commas:
\begin{Produc}
\produc{stat}{varlist1 \ter{=} explist1}
\produc{varlist1}{var \rep{\ter{,} var}}
\end{Produc}%
This statement first evaluates all values on the right side and any indices on the left side, and then makes the assignments. So, the code
\begin{verbatim}
       i = 3
       i, a[i] = 4, 20
\end{verbatim}
sets \verb|a[3]| to 20, but does not affect \verb|a[4]| because the \verb|i| in \verb|a[i]| is evaluated before it is assigned \verb|4|.

Multiple assignment can be used to exchange two values, as in
\begin{verbatim}
       x, y = y, x
\end{verbatim}

Before the assignment, the list of values is adjusted to the length of the list of variables. If there are more values than are needed, the excess values are thrown away. If there are fewer values than are needed, the list is extended with as many \nil's as needed. If the list of expressions (\M{explist1}) ends with a function call, all values returned by the function call enter the list of values, before the adjustment.

A single name can denote a global variable, a local variable, or a formal parameter:
\begin{Produc}
\produc{var}{name}
\end{Produc}%

Square brackets are used to index a table:
\begin{Produc}
\produc{var}{exp \ter{[} exp \ter{]}}
\end{Produc}%
The first expression (\M{exp}) should result in a table value; the field of that table indexed by the value of the second expression gets the assigned value.

The syntax \verb|var.NAME| is just syntactic sugar for \verb|var["NAME"]|:
\begin{Produc}
\produc{var}{exp \ter{.} name}
\end{Produc}%

The meaning of assignments and evaluations of global variables and indexed variables can be changed by tag methods \see{tag-method}. Actually, an assignment \verb|x = val|, where \verb|x| is a global variable, is equivalent to a call \verb|setglobal("x", val)| and an assignment \verb|t[i] = val| is equivalent to \verb|settable_event(t,i,val)|.
See \See{tag-method} for a complete description of these functions (\verb|setglobal| is in the basic library; \T{settable\_event} is used for explanatory purposes only).

\subsubsection{Control Structures}\label{control}
The control structures \rwd{if}, \rwd{while}, and \rwd{repeat} have the usual meaning and familiar syntax:
%(there is also a \rwd{for} statement; see \See{for}):
\index{while-do statement}
\index{repeat-until statement}
\index{if-then-else statement}
\begin{Produc}
\produc{stat}{\rwd{while} exp \rwd{do} block \rwd{end}}
\produc{stat}{\rwd{repeat} block \rwd{until} exp}
\produc{stat}{\rwd{if} exp \rwd{then} block \rep{\rwd{elseif} exp \rwd{then} block} \opt{\rwd{else} block} \rwd{end}}
\end{Produc}%
The \Index{condition expression} \M{exp} of a control structure may return any value. All values different from \nil\ are considered true; only \nil\ is considered false.

The \rwd{return} statement is used to return values from a function or from a chunk.
\label{return}%
\index{return statement}%
Because functions or chunks may return more than one value, the syntax for the \rwd{return} statement is
\begin{Produc}
\produc{stat}{\rwd{return} \opt{explist1}}
\end{Produc}%

The \rwd{break} statement can be used to terminate the execution of a loop, skipping to the next statement after the loop:
\index{break statement}
\begin{Produc}
\produc{stat}{\rwd{break}}
\end{Produc}%
A \rwd{break} ends the innermost enclosing loop (\rwd{while}, \rwd{repeat}, or \rwd{for}).

\NOTE For syntactic reasons, \rwd{return} and \rwd{break} statements can only be written as the \emph{last} statements of a block. If it is really necessary to \rwd{return} or \rwd{break} in the middle of a block, an explicit inner block can be used, as in the idiom `\verb|do return end|', because now \rwd{return} is the last statement in the inner block.

\subsubsection{For Statement} \label{for}\index{for statement}
The \rwd{for} statement has two forms, one for numbers and one for tables.
\newpage
The numerical \rwd{for} loop has the following syntax:
\begin{Produc}
\produc{stat}{\rwd{for} name \ter{=} exp \ter{,} exp \opt{\ter{,} exp} \rwd{do} block \rwd{end}}
\end{Produc}%
A \rwd{for} statement like
\begin{verbatim}
       for var = e1, e2, e3 do block end
\end{verbatim}
is equivalent to the code:
\begin{verbatim}
       do
         local var, _limit, _step = tonumber(e1), tonumber(e2), tonumber(e3)
         if not (var and _limit and _step) then error() end
         while (_step>0 and var<=_limit) or (_step<=0 and var>=_limit) do
           block
           var = var+_step
         end
       end
\end{verbatim}
Note the following:
\begin{itemize}\itemsep=0pt
\item \verb|_limit| and \verb|_step| are invisible variables. The names are here for explanatory purposes only.
\item The behavior is \emph{undefined} if you assign to \verb|var| inside the block.
\item If the third expression (the step) is absent, then a step of~1 is used.
\item Both the limit and the step are evaluated only once, before the loop starts.
\item The variable \verb|var| is local to the statement; you cannot use its value after the \rwd{for} ends.
\item You can use \rwd{break} to exit a \rwd{for}. If you need the value of the index, assign it to another variable before breaking.
\end{itemize}

The table \rwd{for} statement traverses all pairs (index,value) of a given table. It has the following syntax:
\begin{Produc}
\produc{stat}{\rwd{for} name \ter{,} name \rwd{in} exp \rwd{do} block \rwd{end}}
\end{Produc}%
A \rwd{for} statement like
\begin{verbatim}
       for index, value in exp do block end
\end{verbatim}
is equivalent to the code:
\begin{verbatim}
       do
         local _t = exp
         local index, value = next(_t, nil)
         while index do
           block
           index, value = next(_t, index)
         end
       end
\end{verbatim}
Note the following:
\begin{itemize}\itemsep=0pt
\item \verb|_t| is an invisible variable. The name is here for explanatory purposes only.
\item The behavior is \emph{undefined} if you assign to \verb|index| inside the block.
\item The behavior is \emph{undefined} if you change the table \verb|_t| during the traversal.
\item The variables \verb|index| and \verb|value| are local to the statement; you cannot use their values after the \rwd{for} ends.
\item You can use \rwd{break} to exit a \rwd{for}. If you need the value of \verb|index| or \verb|value|, assign them to other variables before breaking.
\item The order in which table elements are traversed is undefined, \emph{even for numerical indices}. If you want to traverse indices in numerical order, use a numerical \rwd{for}.
\end{itemize}

\subsubsection{Function Calls as Statements} \label{funcstat}
Because of possible side-effects, function calls can be executed as statements:
\begin{Produc}
\produc{stat}{functioncall}
\end{Produc}%
In this case, all returned values are thrown away. Function calls are explained in \See{functioncall}.

\subsubsection{Local Declarations} \label{localvar}
\Index{Local variables} may be declared anywhere inside a block. The declaration may include an initial assignment:
\begin{Produc}
\produc{stat}{\rwd{local} declist \opt{init}}
\produc{declist}{name \rep{\ter{,} name}}
\produc{init}{\ter{=} explist1}
\end{Produc}%
If present, an initial assignment has the same semantics as a multiple assignment. Otherwise, all variables are initialized with \nil.

A chunk is also a block, and so local variables can be declared outside any explicit block.

The scope of local variables begins \emph{after} the declaration and lasts until the end of the block. Thus, the code \verb|local print=print| creates a local variable called \verb|print| whose initial value is that of the \emph{global} variable of the same name.
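The rules for the numerical \rwd{for} and for local declarations can be seen together in the following sketch. It is only an illustration; all variable names are invented for the example, and the globals \verb|sum|, \verb|inner|, \verb|outer|, and \verb|second| exist only to expose the results.
\begin{verbatim}
       -- Sum the odd numbers from 1 to 9 with an explicit step of 2
       sum = 0
       for i = 1, 9, 2 do
         sum = sum + i          -- i takes the values 1, 3, 5, 7, 9
       end                      -- sum is now 25

       -- A local declared in a block hides the global of the same name
       x = 10
       do
         local x = x + 1        -- the x on the right is still the global x
         inner = x              -- inner is 11
       end
       outer = x                -- the global x is unchanged: outer is 10

       -- Missing initial values are filled with nil
       do
         local p, q = 1
         second = q             -- second is nil
       end
\end{verbatim}
The second block relies on the scope rule stated above: because the scope of the local \verb|x| begins after its declaration, the initializer \verb|x + 1| still reads the global.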
\subsection{\Index{Expressions}} \subsubsection{\Index{Basic Expressions}} The basic expressions in Lua are \begin{Produc} \produc{exp}{\ter{(} exp \ter{)}} \produc{exp}{\rwd{nil}} \produc{exp}{number} \produc{exp}{literal} \produc{exp}{var} \produc{exp}{upvalue} \produc{exp}{function} \produc{exp}{functioncall} \produc{exp}{tableconstructor} \end{Produc}% An expression enclosed in parentheses always results in only one value. Numbers (numerical constants) and literal strings are explained in \See{lexical}; variables are explained in \See{assignment}; upvalues are explained in \See{upvalue}; function definitions are explained in \See{func-def}; function calls are explained in \See{functioncall}. Table constructors are explained in \See{tableconstructor}. An access to a global variable \verb|x| is equivalent to a call \verb|getglobal("x")| and an access to an indexed variable \verb|t[i]| is equivalent to a call \verb|gettable_event(t,i)|. See \See{tag-method} for a description of these functions (\verb|getglobal| is in the basic library; \T{gettable\_event} is used for explanatory purposes only). \subsubsection{Arithmetic Operators} Lua supports the usual \Index{arithmetic operators}: the binary \verb|+| (addition), \verb|-| (subtraction), \verb|*| (multiplication), \verb|/| (division), and \verb|^| (exponentiation); and unary \verb|-| (negation). If the operands are numbers, or strings that can be converted to numbers (according to the rules given in \See{coercion}), then all operations except exponentiation have the usual meaning. Otherwise, an appropriate tag method is called \see{tag-method}. An exponentiation always calls a tag method. The standard mathematical library redefines this method for numbers, giving the expected meaning to \Index{exponentiation} \see{mathlib}. 
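A few worked cases may make the interaction between the arithmetic operators and the coercion rules of \See{coercion} more concrete. This is a minimal sketch with invented variable names; it assumes the default behavior, with no user-defined tag methods installed.
\begin{verbatim}
       a = "10" + 1       -- the string "10" is converted: a is 11
       b = "3" * "4"      -- both operands are converted: b is 12
       c = 7 / 2          -- c is 3.5
       d = -"2"           -- unary minus also converts: d is -2
       e = (1 + 2) * 3    -- parentheses group as usual: e is 9
\end{verbatim}
Exponentiation is left out of the sketch on purpose, because, as stated above, its meaning for numbers depends on the tag method installed by the standard mathematical library.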
\subsubsection{Relational Operators} The \Index{relational operators} in Lua are \begin{verbatim} == ~= < > <= >= \end{verbatim} These operators return \nil\ as false and a value different from \nil\ as true. Equality (\verb|==|) first compares the tags of its operands. If they are different, then the result is \nil. Otherwise, their values are compared. Numbers and strings are compared in the usual way. Tables, userdata, and functions are compared by reference, that is, two tables are considered equal only if they are the \emph{same} table. Every time you create a new table (or userdata, or function) this new value is different from any previously existing value. The operator \verb|~=| is exactly the negation of equality (\verb|==|). \NOTE The conversion rules of \See{coercion} \emph{do not} apply to equality comparisons. Thus, \verb|"0"==0| evaluates to \emph{false}, and \verb|t[0]| and \verb|t["0"]| denote different entries in a table. \medskip The order operators work as follows. If both arguments are numbers, then they are compared as such. Otherwise, if both arguments are strings, then their values are compared using lexicographical order. Otherwise, the ``lt'' tag method is called \see{tag-method}. \subsubsection{Logical Operators} The \Index{logical operators} in Lua are \index{and}\index{or}\index{not} \begin{verbatim} and or not \end{verbatim} Like the control structures, all logical operators consider \nil\ as false and anything else as true. The conjunction operator \verb|and| returns \nil\ if its first argument is \nil; otherwise, it returns its second argument. The disjunction operator \verb|or| returns its first argument if it is different from \nil; otherwise, it returns its second argument. Both \verb|and| and \verb|or| use \Index{short-cut evaluation}, that is, the second operand is evaluated only if necessary. There are two useful Lua idioms that use logical operators. 
The first idiom is
\begin{verbatim}
       x = x or v
\end{verbatim}
which is equivalent to
\begin{verbatim}
       if x == nil then x = v end
\end{verbatim}
This idiom sets \verb|x| to a default value \verb|v| when \verb|x| is not set.

The second idiom is
\begin{verbatim}
       x = a and b or c
\end{verbatim}
which should be read as \verb|x = (a and b) or c|. This idiom is equivalent to
\begin{verbatim}
       if a then x = b else x = c end
\end{verbatim}
provided that \verb|b| is not \nil.

\subsubsection{Concatenation} \label{concat}
The string \Index{concatenation} operator in Lua is denoted by two dots (`\IndexVerb{..}'). If both operands are strings or numbers, then they are converted to strings according to the rules in \See{coercion}. Otherwise, the ``concat'' tag method is called \see{tag-method}.

\subsubsection{Precedence}
\Index{Operator precedence} in Lua follows the table below, from the lowest to the highest priority (operators on the same row have the same priority):
\begin{verbatim}
       and   or
       <   >   <=  >=  ~=  ==
       ..
       +   -
       *   /
       not  - (unary)
       ^
\end{verbatim}
All binary operators are left associative, except for \verb|^| (exponentiation), which is right associative.

\NOTE The pre-compiler may rearrange the order of evaluation of associative or commutative operators, as long as these optimizations do not change normal results. However, these optimizations may change some results if you define non-associative (or non-commutative) tag methods for these operators.

\subsubsection{Table Constructors} \label{tableconstructor}
Table \Index{constructors} are expressions that create tables; every time a constructor is evaluated, a new table is created. Constructors can be used to create empty tables, or to create a table and initialize some of its fields.
The general syntax for constructors is
\begin{Produc}
\produc{tableconstructor}{\ter{\{} fieldlist \ter{\}}}
\produc{fieldlist}{lfieldlist \Or ffieldlist \Or lfieldlist \ter{;} ffieldlist
  \Or ffieldlist \ter{;} lfieldlist}
\produc{lfieldlist}{\opt{explist1 \opt{\ter{,}}}}
\produc{ffieldlist}{\opt{ffieldlist1}}
\end{Produc}%

The form \emph{explist1} is used to initialize lists.
The expressions in the list are assigned to consecutive numerical indices,
starting with~1.
For example,
\begin{verbatim}
       a = {"v1", "v2", 34}
\end{verbatim}
is equivalent to
\begin{verbatim}
       do
         local temp = {}
         temp[1] = "v1"
         temp[2] = "v2"
         temp[3] = 34
         a = temp
       end
\end{verbatim}
If the last expression in the list is a function call,
all values returned by the call enter the list \see{functioncall}.

The form \emph{ffieldlist1} initializes other fields in a table:
\begin{Produc}
\produc{ffieldlist1}{ffield \rep{\ter{,} ffield} \opt{\ter{,}}}
\produc{ffield}{\ter{[} exp \ter{]} \ter{=} exp \Or name \ter{=} exp}
\end{Produc}%
For example,
\begin{verbatim}
       a = {[f(k)] = g(y), x = 1, y = 3, [0] = b+c}
\end{verbatim}
is equivalent to
\begin{verbatim}
       do
         local temp = {}
         temp[f(k)] = g(y)
         temp.x = 1    -- or temp["x"] = 1
         temp.y = 3    -- or temp["y"] = 3
         temp[0] = b+c
         a = temp
       end
\end{verbatim}
An expression like \verb|{x = 1, y = 4}| is
in fact syntactic sugar for \verb|{["x"] = 1, ["y"] = 4}|.

Both forms may have an optional trailing comma,
and can be used in the same constructor separated by
a semicolon.
For example, all forms below are correct.
\begin{verbatim}
       x = {;}
       x = {"a", "b",}
       x = {type="list"; "a", "b"}
       x = {f(0), f(1), f(2),; n=3,}
\end{verbatim}

\subsubsection{Function Calls} \label{functioncall}
A \Index{function call} in Lua has the following syntax:
\begin{Produc}
\produc{functioncall}{exp args}
\end{Produc}%
First, \M{exp} and \M{args} are evaluated.
If the value of \M{exp} has type \emph{function},
then this function is called,
with the given arguments.
Otherwise, the ``function'' tag method is called,
having as first parameter the value of \M{exp},
followed by the original call arguments
\see{tag-method}.

The form
\begin{Produc}
\produc{functioncall}{exp \ter{:} name args}
\end{Produc}%
can be used to call ``methods''.
A call \verb|v:name(...)|
is syntactic sugar for \verb|v.name(v, ...)|,
except that \verb|v| is evaluated only once.

Arguments have the following syntax:
\begin{Produc}
\produc{args}{\ter{(} \opt{explist1} \ter{)}}
\produc{explist1}{\rep{exp \ter{,}} exp}
\produc{args}{tableconstructor}
\produc{args}{literal}
\end{Produc}%
All argument expressions are evaluated before the call.
A call of the form \verb|f{...}| is syntactic sugar for
\verb|f({...})|, that is,
the argument list is a single new table.
A call of the form \verb|f'...'|
(or \verb|f"..."| or \verb|f[[...]]|) is syntactic sugar for
\verb|f('...')|, that is,
the argument list is a single literal string.

Because a function can return any number of results
\see{return},
the number of results must be adjusted before they are used.
If the function is called as a statement \see{funcstat},
then its return list is adjusted to~0,
thus discarding all returned values.
If the function is called inside another expression,
or in the middle of a list of expressions,
then its return list is adjusted to~1,
thus discarding all returned values but the first one.
If the function is called as the last element of a list of expressions,
then no adjustment is made.
Here are some examples:
\begin{verbatim}
       f()                -- adjusted to 0 results
       g(f(), x)          -- f() is adjusted to 1 result
       g(x, f())          -- g gets x plus all values returned by f()
       a,b,c = f(), x     -- f() is adjusted to 1 result (and c gets nil)
       a,b,c = x, f()     -- f() is adjusted to 2 results
       a,b,c = f()        -- f() is adjusted to 3 results
       return f()         -- returns all values returned by f()
       return x,y,f()     -- returns x, y, and all values returned by f()
       {f()}              -- creates a list with all values returned by f()
       {f(), nil}         -- f() is adjusted to 1 result
\end{verbatim}

If you enclose a function call in parentheses,
then it is adjusted to return exactly one value:
\begin{verbatim}
       return x, y, (f())   -- returns x, y, and one value from f()
       {(f())}              -- creates a table with exactly one element
\end{verbatim}

\subsubsection{\Index{Function Definitions}} \label{func-def}

The syntax for function definition is
\begin{Produc}
\produc{function}{\rwd{function} \ter{(} \opt{parlist1} \ter{)}
  block \rwd{end}}
\produc{stat}{\rwd{function} funcname \ter{(} \opt{parlist1} \ter{)}
  block \rwd{end}}
\produc{funcname}{name \rep{\ter{.} name} \opt{\ter{:} name}}
\end{Produc}%
The statement
\begin{verbatim}
       function f () ... end
\end{verbatim}
is just syntactic sugar for
\begin{verbatim}
       f = function () ... end
\end{verbatim}
and the statement
\begin{verbatim}
       function v.c.f () ... end
\end{verbatim}
is syntactic sugar for
\begin{verbatim}
       v.c.f = function () ... end
\end{verbatim}

A function definition is an executable expression,
whose value has type \emph{function}.
When Lua pre-compiles a chunk,
all its function bodies are pre-compiled too.
Then, whenever Lua executes the function definition,
its upvalues are fixed \see{upvalue},
and the function is \emph{instantiated} (or \emph{closed}).
This function instance (or \emph{closure})
is the final value of the expression.
Different instances of the same function
may have different upvalues.
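
As a rough illustration of instantiation,
the sketch below creates two instances of the same inner function.
The names \verb|adder|, \verb|add1|, and \verb|add10| are hypothetical,
and the example assumes the upvalue syntax described in \See{upvalue}.
\begin{verbatim}
       function adder (n)
         -- each call to adder executes the inner function definition,
         -- creating a new closure with its own frozen copy of n
         return function (x) return x + %n end
       end

       add1 = adder(1)     -- instance whose upvalue n is 1
       add10 = adder(10)   -- instance whose upvalue n is 10
       -- add1(5) yields 6, while add10(5) yields 15
\end{verbatim}
Both closures share the same pre-compiled body;
they differ only in the value frozen for \verb|n|.
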
Parameters act as local variables,
initialized with the argument values:
\begin{Produc}
\produc{parlist1}{\ter{\ldots}}
\produc{parlist1}{name \rep{\ter{,} name} \opt{\ter{,} \ter{\ldots}}}
\end{Produc}%
\label{vararg}%
When a function is called,
the list of \Index{arguments} is adjusted to
the length of the list of parameters,
unless the function is a \Def{vararg function},
which is indicated by three dots (`\verb|...|') at the end of its parameter list.
A vararg function does not adjust its argument list;
instead, it collects all extra arguments into an implicit parameter,
called \IndexLIB{arg}.
The value of \verb|arg| is a table,
with a field~\verb|n| whose value is the number of extra arguments,
and the extra arguments at positions 1,~2,~\ldots,~\verb|n|.

As an example, consider the following definitions:
\begin{verbatim}
       function f(a, b) end
       function g(a, b, ...) end
       function r() return 1,2,3 end
\end{verbatim}
Then, we have the following mapping from arguments to parameters:
\begin{verbatim}
       CALL            PARAMETERS

       f(3)             a=3, b=nil
       f(3, 4)          a=3, b=4
       f(3, 4, 5)       a=3, b=4
       f(r(), 10)       a=1, b=10
       f(r())           a=1, b=2

       g(3)             a=3, b=nil, arg={n=0}
       g(3, 4)          a=3, b=4,   arg={n=0}
       g(3, 4, 5, 8)    a=3, b=4,   arg={5, 8; n=2}
       g(5, r())        a=5, b=1,   arg={2, 3; n=2}
\end{verbatim}

Results are returned using the \rwd{return} statement \see{return}.
If control reaches the end of a function
without encountering a \rwd{return} statement,
then the function returns with no results.

The \emph{colon} syntax
is used for defining \IndexEmph{methods},
that is, functions that have an implicit extra parameter \IndexVerb{self}.
The statement
\begin{verbatim}
       function v.c:f (...) ... end
\end{verbatim}
is just syntactic sugar for
\begin{verbatim}
       v.c.f = function (self, ...) ... end
\end{verbatim}
Note that the function gets an extra formal parameter called \verb|self|.
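
A small sketch of the colon syntax in use follows;
the table \verb|account| and its field names are hypothetical.
\begin{verbatim}
       account = {balance = 0}

       -- same as: account.deposit = function (self, v) ... end
       function account:deposit (v)
         self.balance = self.balance + v
       end

       account:deposit(100)           -- sugar for account.deposit(account, 100)
       account.deposit(account, 50)   -- explicit form of the same call
       -- account.balance is now 150
\end{verbatim}
The colon form only saves writing the receiver twice;
both calls above reach the same function with the same first argument.
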
\subsection{Visibility and Upvalues} \label{upvalue}
\index{visibility}\index{upvalues}

A function body may refer to its own local variables
(which include its parameters) and to global variables,
as long as they are not \emph{shadowed} by local
variables with the same name from enclosing functions.
A function \emph{cannot} access a local
variable from an enclosing function,
since such variables may no longer exist when the function is called.
However, a function may access the \emph{value} of a local variable
from an enclosing function, using \emph{upvalues},
whose syntax is
\begin{Produc}
\produc{upvalue}{\ter{\%} name}
\end{Produc}%

An upvalue is somewhat similar to a variable expression,
but its value is \emph{frozen} when the function in which it
appears is instantiated.
The name used in an upvalue may be the name of any variable visible
at the point where the function is defined,
that is,
global variables and local variables
from the \emph{immediately enclosing} function.
Note that when the upvalue is a table,
only the \emph{reference} to that table
(which is the value of the upvalue) is frozen;
the table contents can be changed at will.
Using table values as upvalues is a technique for having
writable but private state attached to functions.
Here are some examples:
\begin{verbatim}
       a,b,c = 1,2,3   -- global variables
       local d
       function f (x)
         local b = {}  -- x and b are local to f; b shadows the global b
         local g = function (a)
           local y     -- a and y are local to g
           p = a       -- OK, access local `a'
           p = c       -- OK, access global `c'
           p = b       -- ERROR: cannot access a variable in outer function
           p = %b      -- OK, access frozen value of `b' (local to `f')
           %b = 3      -- ERROR: cannot change an upvalue
           %b.x = 3    -- OK, change the table contents
           p = %c      -- OK, access frozen value of global `c'
           p = %y      -- ERROR: `y' is not visible where `g' is defined
           p = %d      -- ERROR: `d' is not visible where `g' is defined
         end           -- g
       end             -- f
\end{verbatim}

\subsection{Error Handling} \label{error}

Because Lua is an extension language,
all Lua actions start from C~code in the host program
calling a function from the Lua library.
Whenever an error occurs during Lua compilation or execution,
the function \verb|_ERRORMESSAGE| is called \DefLIB{_ERRORMESSAGE}
(provided it is different from \nil),
and then the corresponding function from the library
(\verb|lua_dofile|, \verb|lua_dostring|,
\verb|lua_dobuffer|, or \verb|lua_call|)
is terminated, returning an error condition.

Memory allocation errors are an exception to the previous rule.
When memory allocation fails, Lua may not be able to execute the
\verb|_ERRORMESSAGE| function.
So, for this kind of error, Lua does not call
the \verb|_ERRORMESSAGE| function;
instead, the corresponding function from the library
returns immediately with a special error code (\verb|LUA_ERRMEM|).
This and other error codes are defined in \verb|lua.h|;
\See{luado}.

The only argument to \verb|_ERRORMESSAGE| is a string
describing the error.
The default definition for
this function calls \verb|_ALERT|, \DefLIB{_ALERT}
which prints the message to \verb|stderr| \see{alert}.
The standard I/O library redefines \verb|_ERRORMESSAGE|
and uses the debug facilities \see{debugI}
to print some extra information,
such as a call stack traceback.
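
For instance, a program could install its own handler
along the following lines.
This is only a sketch:
the message prefix is arbitrary,
and it assumes that \verb|_ALERT| still has its default definition.
\begin{verbatim}
       function _ERRORMESSAGE (msg)
         -- msg is the string describing the error
         _ALERT("my program: " .. msg .. "\n")
       end
\end{verbatim}
Setting \verb|_ERRORMESSAGE| to \nil\ suppresses the call altogether,
since the function is called only when it is different from \nil.
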
Lua code can explicitly generate an error by calling the
function \verb|error| \see{pdf-error}.
Lua code can ``catch'' an error using the function
\verb|call| \see{pdf-call}.

\subsection{Tag Methods} \label{tag-method}\index{tag method}

A tag method is a programmer-defined function
that defines how Lua operations act over user-defined types
(and, sometimes, over basic types as well).
An \Def{event} is any operation that may invoke a tag method.

Lua selects the tag method called for any specific event
according to the types of the values involved
in the event \see{TypesSec}.
The function \IndexLIB{settagmethod} changes the tag method
associated with a given pair \M{(type, event)}.
Its first parameter is the type (its name or its tag),
the second parameter is the event name (a string; see below),
and the third parameter is the new method (a function),
or \nil\ to restore the default behavior for the pair.
A companion function \IndexLIB{gettagmethod}
receives a type and an event name and returns the
current method associated with the pair.

Tag methods are called in the following events,
identified by the given names.
The semantics of tag methods is better explained by a Lua function
describing the behavior of the interpreter at each event.
Each event-handler function shows how a tag method is called,
its arguments (that is, its signature),
its results,
and the default behavior in the absence of a tag method.
The code shown here in Lua is only illustrative;
the real behavior is hard coded in the interpreter,
and it is much more efficient than this simulation.
All functions used in these descriptions
(\verb|rawget|, \verb|tonumber|, \verb|call|, etc.)
are described in \See{predefined}.

\begin{description}

\item[``add'':]\IndexTM{add}
called when a \verb|+| operation is applied to non-numerical operands.

The function \verb|getbinmethod| below defines how Lua chooses a tag method
for a binary operation.
First, Lua tries the first operand.
If its type does not define a tag method for the operation,
then Lua tries the second operand.
If it also fails, then it gets a tag method from tag~0.
\begin{verbatim}
       function getbinmethod (op1, op2, event)
         return gettagmethod(tag(op1), event) or
                gettagmethod(tag(op2), event) or
                gettagmethod(0, event)
       end
\end{verbatim}
Using this function,
the tag method for the ``add'' event is
\begin{verbatim}
       function add_event (op1, op2)
         local o1, o2 = tonumber(op1), tonumber(op2)
         if o1 and o2 then  -- both operands are numeric
           return o1+o2  -- '+' here is the primitive 'add'
         else  -- at least one of the operands is not numeric
           local tm = getbinmethod(op1, op2, "add")
           if tm then
             -- call the method with both operands
             return tm(op1, op2)
           else  -- no tag method available: default behavior
             error("unexpected type at arithmetic operation")
           end
         end
       end
\end{verbatim}

\item[``sub'':]\IndexTM{sub}
called when a \verb|-| operation is applied to non-numerical operands.
Behavior similar to the ``add'' event.

\item[``mul'':]\IndexTM{mul}
called when a \verb|*| operation is applied to non-numerical operands.
Behavior similar to the ``add'' event.

\item[``div'':]\IndexTM{div}
called when a \verb|/| operation is applied to non-numerical operands.
Behavior similar to the ``add'' event.

\item[``pow'':]\IndexTM{pow}
called when a \verb|^| operation (exponentiation) is applied,
even for numerical operands.
\begin{verbatim}
       function pow_event (op1, op2)
         local tm = getbinmethod(op1, op2, "pow")
         if tm then
           -- call the method with both operands
           return tm(op1, op2)
         else  -- no tag method available: default behavior
           error("unexpected type at arithmetic operation")
         end
       end
\end{verbatim}

\item[``unm'':]\IndexTM{unm}
called when a unary \verb|-| operation is applied to a non-numerical operand.
\begin{verbatim}
       function unm_event (op)
         local o = tonumber(op)
         if o then  -- operand is numeric
           return -o  -- '-' here is the primitive 'unm'
         else
           -- the operand is not numeric.
           -- Try to get a tag method from the operand;
           -- if it does not have one, try a "global" one (tag 0)
           local tm = gettagmethod(tag(op), "unm") or
                      gettagmethod(0, "unm")
           if tm then
             -- call the method with the operand and nil
             return tm(op, nil)
           else  -- no tag method available: default behavior
             error("unexpected type at arithmetic operation")
           end
         end
       end
\end{verbatim}

\item[``lt'':]\IndexTM{lt}
called when an order operation is applied to non-numerical or
non-string operands.
It corresponds to the \verb|<| operator.
\begin{verbatim}
       function lt_event (op1, op2)
         if type(op1) == "number" and type(op2) == "number" then
           return op1 < op2   -- numeric comparison
         elseif type(op1) == "string" and type(op2) == "string" then
           return op1 < op2   -- lexicographic comparison
         else
           local tm = getbinmethod(op1, op2, "lt")
           if tm then
             return tm(op1, op2)
           else
             error("unexpected type at comparison");
           end
         end
       end
\end{verbatim}
The other order operators use the \verb|"lt"| tag method according to the
usual equivalences:
\begin{verbatim}
       a>b    <=>  b<a
       a<=b   <=>  not (b<a)
       a>=b   <=>  not (a