The stack size is derived from 'stack_last', when needed. Moreover,
the handling of stack sizes is more consistent, always excluding the
extra space except when allocating/deallocating the array.
The previous stackless implementations marked all 'luaV_execute'
invocations as fresh. However, re-entering 'luaV_execute' when
resuming a coroutine should not be a fresh invocation. (It works
because 'unroll' called 'luaV_execute' for each call entry, but
it was slower than letting 'luaV_execute' finish all non-fresh
invocations.)
A "with stack" implementation gains too little in performance to be
worth all the noise from C-stack overflows.
This commit is almost a sketch, to test performance. There are several
pending stuff:
- review control of C-stack overflow and error messages;
- what to do with setcstacklimit;
- review comments;
- review unroll of Lua calls.
The field 'L->oldpc' is not always updated when control returns to a
function; an invalid value can seg. fault when computing 'changedline'.
(One example is an error in a finalizer; control can return to
'luaV_execute' without executing 'luaD_poscall'.) Instead of trying to
fix all possible corner cases, it seems safer to be resilient to invalid
values for 'oldpc'. Valid but wrong values at most cause an extra call
to a line hook.
Errors in finalizers need a valid 'pc' to produce an error message,
even if the error is not propagated. Therefore, calls to the GC (which
may call finalizers) inside luaV_execute must save the 'pc'.
Macro 'checkstackGC' was doing a GC step after resizing the stack;
the GC could shrink the stack and undo the resize. Moreover, macro
'checkstackp' also does a GC step, which could remove the preallocated
CallInfo when calling a function. (Its name has been changed to
'checkstackGCp' to emphasize that it calls the GC.)
Instead of an explicit value (field 'b'), true and false use different
tag variants. This avoids reading an extra field and results in more
direct code. (Most code that uses booleans needs to distinguish between
true and false anyway.)
The difference in performance between immediate operands and K operands
does not seem to justify all those extra opcodes. We only keep OP_ADDI,
due to its ubiquity and because the difference is a little more relevant.
(Later, OP_SUBI will be implemented by OP_ADDI, negating the constant.)
In arithmetic/bitwise operators, the call to metamethods is made
in a separate opcode following the main one. (The main
opcode skips this next one when the operation succeeds.) This
change reduces slightly the size of the binary and the complexity
of the arithmetic/bitwise opcodes. It also simplfies the treatment
of errors and yeld/resume in these operations, as there are much
fewer cases to consider. (Only OP_MMBIN/OP_MMBINI/OP_MMBINK,
instead of all variants of all arithmetic/bitwise operators.)
The family of opcodes OP_ADDK (arithmetic operators with K constant)
were not being handled in 'luaV_finishOp', which completes their
task after an yield.
When initializing a to-be-closed variable, check whether it has a
'__close' metamethod (or is a false value) and raise an error if
if it hasn't. This produces more accurate error messages. (The
check before closing still need to be done: in the C API, the value
is not constant; and the object may lose its '__close' metamethod
during the block.)
Instead of updating 'L->top' in every place that may call a
metamethod, the metamethod functions themselves (luaT_trybinTM and
luaT_callorderTM) correct the top. (When calling metamethods from
the C API, however, the callers must preserve 'L->top'.)
- OP_NEWTABLE can use 'ra + 1' to set top (instead of ci->top);
- OP_CLOSE doesn't need to set top ('Protect' already does that);
- OP_TFORCALL must use 'ProtectNT', to preserve the top already set.
(That was a small bug, because iterators could be called with
extra parameters besides the state and the control variable.)
- Comments and an extra test for the bug in previous item.
Open upvalues are kept alive together with their corresponding
stack. This change makes a simpler and safer fix to the issue in
commit 440a5ee78c, about upvalues in the list of open upvalues
being collected while others are being created. (That previous fix
may not be correct.)
When creating an upvalue, an emergency collection can collect the
previous upvalue where the new one would be linked. The following
code can trigger the bug, using valgrind on Lua compiled with the
-DHARDMEMTESTS option:
local x; local y
(function () return y end)();
(function () return x end)()
This commit brings a new implementation for HARDMEMTESTS, which forces
an emergency GC whenever possible. It also fixes some issues detected
with this option:
- A small bug in lvm.c: a closure could be collected by an emergency
GC while being initialized.
- Some tests: a memory address can be immediatly reused after a GC;
for instance, two consecutive '{}' expressions can return exactly the
same address, if the first one is not anchored.
OP_RETURN must update trap before updating stack. (Bug detected with
-DHARDSTACKTESTS). Also, in 'luaF_close', do not create a variable
with 'uplevel(uv)', as the stack may change and invalidate this
value. (This is not a bug, but could become one if 'upl' was used
again.)
Opcodes OP_NEWTABLE and OP_SETLIST use the same representation to
store the size of the array part of a table. This new representation
can go up to 2^33 (8 + 25 bits).
OP_NEWTABLE is followed by an OP_EXTRAARG, so that it can keep
the exact size of the array part of the table to be created.
(Functions 'luaO_int2fb'/'luaO_fb2int' were removed.)
In the generic for loop, it is simpler for OP_TFORLOOP to use the
same 'ra' as OP_TFORCALL. Moreover, the internal names of the loop
temporaries "(for ...)" don't need to leak internal details (even
because the numerical for loop doesn't have a fixed role for each of
its temporaries).
Ensure that operation macros, such as 'luai_numdiv' and 'luai_numidiv',
operate only on variables, or at most at 's2v(ra)'. ('s2v' is a nop, a
cast from pointer to pointer.)
- In 'readutf8esc' (llex.c), the overflow check must be done before
shifting the accumulator. It was working because tests were using
64-bit longs. Failed with 32-bit longs.
- In OP_FORPREP (lvm.c), avoid negating an unsigned value. Visual
Studio gives a warning for that operation, despite being well
defined in ISO C.
- In 'luaV_execute' (lvm.c), 'cond' can be defined only when needed,
like all other variables.
When calling metamethods for things like 'a < 3.0', which generates
the opcode OP_LTI, the C register tells that the operand was
converted to an integer, so that it can be corrected to float when
calling a metamethod.
This commit also includes some other stuff:
- file 'onelua.c' added to the project
- opcode OP_PREPVARARG renamed to OP_VARARGPREP
- comparison opcodes rewritten through macros
Changed some implementation details; in particular, it is back using
an internal variable to keep the index, with the control variable
being only a copy of that internal variable. (The direct use of
the control variable demands a check of its type for each access,
which offsets the gains from the use of a single variable.)
The numerical 'for' loop over integers now uses a precomputed counter
to control its number of iteractions. This change eliminates several
weird cases caused by overflows (wrap-around) in the control variable.
(It also ensures that every integer loop halts.)
Also, the special opcodes for the usual case of step==1 were removed.
(The new code is already somewhat complex for the usual case,
but efficient.)
The function 'string.gmatch' now has an optional 'init' argument,
similar to 'string.find' and 'string.match'. Moreover, there was
some reorganization in the manipulation of indices in the string
library.
This commit also includes small janitorial work in the manual
and in comments in the interpreter loop.