#
# Run this Tcl script to generate the sqlite.html file.
#
set rcsid {$Id: opcode.tcl,v 1.1 2000/06/09 01:58:37 drh Exp $}

puts { SQLite Virtual Machine Opcodes

SQLite Virtual Machine Opcodes

}
puts "

(This page was last modified on [lrange $rcsid 3 4] GMT)

"

set fd [open [lindex $argv 0] r]
set file [read $fd [file size [lindex $argv 0]]]
close $fd
set current_op {}
foreach line [split $file \n] {
  set line [string trim $line]
  if {[string index $line 1]!="*"} {
    set current_op {}
    continue
  }
  if {[regexp {^/\* Opcode: } $line]} {
    set current_op [lindex $line 2]
    set Opcode($current_op:args) [lrange $line 3 end]
    lappend OpcodeList $current_op
    continue
  }
  if {$current_op==""} continue
  if {[regexp {^\*/} $line]} {
    set current_op {}
    continue
  }
  set line [string trim [string range $line 3 end]]
  if {$line==""} {
    append Opcode($current_op:text) \n
  } else {
    append Opcode($current_op:text) \n$line
  }
}
unset file

puts {

Introduction

To execute an SQL statement, the SQLite library first parses the SQL, analyzes the statement, then generates a short program to execute the statement. The program is generated for a "virtual machine" implemented by the SQLite library. This document describes the operation of that virtual machine.

The source code for the virtual machine is in the vdbe.c source file. All of the opcode definitions further down in this document are contained in comments in that source file. In fact, the opcode table in this document was generated by scanning the vdbe.c source file and extracting the necessary information from comments. So the source code comments are the canonical source of information about the virtual machine. When in doubt, refer to the source code.

Each instruction in the virtual machine consists of an opcode and up to three operands named P1, P2 and P3. P1 may be an arbitrary integer. P2 must be a non-negative integer. P2 is always the jump destination in any operation that might cause a jump. P3 is a null-terminated string or NULL. Some operators use all three operands. Some use one or two. Some operators use none of the operands.
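
The instruction format described above can be modelled with a small Python record. This is only an illustrative sketch, not the layout used in vdbe.c; the names Instruction and make_instruction are hypothetical.

```python
from typing import NamedTuple, Optional

class Instruction(NamedTuple):
    """Hypothetical model of one VM instruction (not the vdbe.c struct)."""
    opcode: str
    p1: int = 0                 # arbitrary integer
    p2: int = 0                 # non-negative; the jump destination if the opcode can jump
    p3: Optional[str] = None    # string operand, or None for NULL

def make_instruction(opcode, p1=0, p2=0, p3=None):
    # Enforce the one constraint the text states: P2 is never negative.
    if p2 < 0:
        raise ValueError("P2 must be a non-negative integer")
    return Instruction(opcode, p1, p2, p3)

# Two instructions taken from the EXPLAIN listing in this document.
open_tbl = make_instruction("Open", 0, 1, "tbl1")
goto_top = make_instruction("Goto", 0, 2)
```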

The virtual machine begins execution on instruction number 0. Execution continues until (1) a Halt instruction is seen, or (2) the program counter becomes one greater than the address of the last instruction, or (3) there is an execution error. When the virtual machine halts, all memory that it allocated is released and all database files it may have had open are closed.

The virtual machine also contains an operand stack of unlimited depth. Many of the opcodes use operands from the stack. The details are described in the descriptions of each opcode.
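
The execution rules and the operand stack can be sketched together as a toy interpreter loop. The function run and the behaviour assigned to the Integer and Goto opcodes here are assumptions made for illustration; the opcode descriptions extracted from vdbe.c remain the authority.

```python
def run(program):
    """Run a list of (opcode, p1, p2, p3) tuples; return (reason, stack)."""
    stack = []      # operand stack with no fixed depth limit
    pc = 0          # execution begins at instruction number 0
    while True:
        if pc >= len(program):
            return "ran off end", stack     # stop condition (2)
        opcode, p1, p2, p3 = program[pc]
        if opcode == "Halt":
            return "halt", stack            # stop condition (1)
        elif opcode == "Integer":
            stack.append(p1)                # assumed: push P1 onto the stack
            pc += 1
        elif opcode == "Goto":
            pc = p2                         # P2 is the jump destination
        else:
            return "error", stack           # stop condition (3)

program = [
    ("Integer", 20, 0, None),
    ("Goto", 0, 3, None),
    ("Integer", 99, 0, None),   # skipped by the Goto
    ("Halt", 0, 0, None),
]
reason, stack = run(program)
```

In this sketch an unrecognized opcode stands in for an execution error, and a jump past the end of the program is treated the same as running off the end.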

The virtual machine can have zero or more cursors. Each cursor is a pointer into a single GDBM file. There can be multiple cursors pointing at the same file. All cursors operate independently. The only way for the virtual machine to interact with a GDBM file is through a cursor. Instructions in the virtual machine can create a new cursor (Open), read data from a cursor (Field), advance the cursor to the next entry in the GDBM file (Next), and perform many other operations. All cursors are automatically closed when the virtual machine terminates.
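
As a rough illustration of cursor independence, the sketch below models a table as a Python list of (key, record) pairs instead of a GDBM file. The Cursor class and its method names are hypothetical; they only loosely mirror the Next, Field and Key opcodes.

```python
class Cursor:
    """Hypothetical cursor over a table modelled as a list of (key, record) pairs."""
    def __init__(self, table):
        self.table = table
        self.pos = -1            # positioned before the first entry

    def next(self):
        """Advance to the next entry; return False when no entries remain."""
        self.pos += 1
        return self.pos < len(self.table)

    def field(self, n):
        """Return column n of the current record."""
        return self.table[self.pos][1][n]

    def key(self):
        """Return the key of the current record."""
        return self.table[self.pos][0]

tbl1 = [(1, ("a", 10)), (2, ("b", 30))]
c1, c2 = Cursor(tbl1), Cursor(tbl1)   # two cursors on the same table
c1.next()
c1.next()                             # c1 is now on the second entry
c2.next()                             # c2 is still on the first entry
```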

The virtual machine contains an arbitrary number of fixed memory locations with addresses beginning at zero and growing upward. Each memory location can hold an arbitrary string. The memory cells are typically used to hold the result of a scalar SELECT that is part of a larger expression.

The virtual machine contains an arbitrary number of sorters. Each sorter is able to accumulate records, sort those records, then play the records back in sorted order. Sorters are used to implement the ORDER BY clause of a SELECT statement. The fact that the virtual machine allows multiple sorters is an historical accident. In practice no more than one sorter (sorter number 0) ever gets used.
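
The accumulate, sort, and play back cycle of a sorter might look like this in outline. The Sorter class is a hypothetical stand-in and says nothing about how vdbe.c actually sorts.

```python
class Sorter:
    """Hypothetical sorter: accumulate records, sort once, then replay them."""
    def __init__(self):
        self.records = []

    def put(self, sort_key, record):
        self.records.append((sort_key, record))

    def sort(self):
        # Sort on the key only, so records with equal keys keep insertion order.
        self.records.sort(key=lambda pair: pair[0])

    def playback(self):
        for _, record in self.records:
            yield record

# Roughly: SELECT name FROM people ORDER BY age
sorter = Sorter()
for name, age in [("carol", 41), ("alice", 29), ("bob", 35)]:
    sorter.put(age, name)
sorter.sort()
ordered = list(sorter.playback())
```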

The virtual machine may contain an arbitrary number of "Lists". Each list stores a list of integers. Lists are used to hold the GDBM keys of the records in a GDBM file that need to be modified. The WHERE clause of an UPDATE or DELETE statement scans through the table and writes the GDBM key of every record to be modified into a list. Then the list is played back and the table is modified in a separate step. It is necessary to do this in two steps since making a change to a GDBM file can alter the scan order.
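
The two-step pattern can be sketched with an ordinary Python dictionary standing in for the GDBM file. The table contents and variable names are invented for the example.

```python
# key -> value of column "two"; the data is made up for this sketch
table = dict([(1, 5), (2, 25), (3, 12), (4, 40)])

# Step 1: scan the table and remember the key of every matching record.
keys_to_delete = []
for key, two in table.items():
    if two < 20:
        keys_to_delete.append(key)

# Step 2: play the list back and modify the table.
for key in keys_to_delete:
    del table[key]
```

Python dictionaries have an analogous restriction: deleting entries from a dictionary while iterating over it raises a RuntimeError, which is why the scan and the modification are separated here as well.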

The virtual machine can contain an arbitrary number of "Sets". Each set holds an arbitrary number of strings. Sets are used to implement the IN operator with a constant right-hand side.
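
A set used for an IN test can be pictured as follows. This is a hypothetical model with invented data; it assumes the set is filled once from the constant list and then probed for each row.

```python
# Roughly: SELECT color FROM t WHERE color IN ('red', 'green', 'blue')
in_values = set(["red", "green", "blue"])   # built once, before the scan
rows = ["red", "pink", "blue"]
matches = [color for color in rows if color in in_values]
```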

The virtual machine can open a single external file for reading. This external read file is used to implement the COPY command.

Finally, the virtual machine can have a single set of aggregators. An aggregator is a device used to implement the GROUP BY clause of a SELECT. An aggregator has one or more slots that can hold values being extracted by the select. The number of slots is the same for all aggregators and is defined by the AggReset operation. At any point in time a single aggregator is current or "has focus". There are operations to read or write to memory slots of the aggregator in focus. There are also operations to change the focus aggregator and to scan through all aggregators.
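
The idea of a focus aggregator with a fixed number of slots can be sketched as below. The Aggregators class, its method names, and the choice to initialize slots to zero are assumptions for illustration only.

```python
class Aggregators:
    """Hypothetical model of the aggregator set used for GROUP BY."""
    def __init__(self, nslots):
        self.nslots = nslots     # same slot count for every aggregator
        self.by_key = dict()     # group key -> list of slots
        self.focus = None        # the aggregator that currently has focus

    def set_focus(self, key):
        """Focus the aggregator for key, creating it on first use."""
        if key not in self.by_key:
            self.by_key[key] = [0] * self.nslots
        self.focus = self.by_key[key]

    def get(self, slot):
        return self.focus[slot]

    def set(self, slot, value):
        self.focus[slot] = value

# Roughly: SELECT dept, count(*), sum(pay) FROM staff GROUP BY dept
# with slot 0 holding the count and slot 1 holding the sum.
agg = Aggregators(nslots=2)
for dept, pay in [("eng", 10), ("ops", 7), ("eng", 5)]:
    agg.set_focus(dept)
    agg.set(0, agg.get(0) + 1)
    agg.set(1, agg.get(1) + pay)

# Scan through all aggregators to produce the result rows.
result = sorted((key, slots[0], slots[1]) for key, slots in agg.by_key.items())
```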

Viewing Programs Generated By SQLite

Every SQL statement that SQLite interprets results in a program for the virtual machine. But if you precede the SQL statement with the keyword "EXPLAIN" the virtual machine will not execute the program. Instead, the instructions of the program will be returned like a query result. This feature is useful for debugging and for learning how the virtual machine operates.

You can use the sqlite command-line tool to see the instructions generated by an SQL statement. The following is an example:

}

proc Code {body} {
  puts {}
  regsub -all {&} [string trim $body] {\&amp;} body
  regsub -all {>} $body {\&gt;} body
  regsub -all {<} $body {\&lt;} body
  regsub -all {\(\(\(} $body {} body
  regsub -all {\)\)\)} $body {} body
  puts $body
  puts {}
}

Code {
$ (((sqlite ex1)))
sqlite> (((.explain)))
sqlite> (((explain delete from tbl1 where two<20;)))
addr  opcode        p1     p2     p3
----  ------------  -----  -----  -------------------------------------
0     ListOpen      0      0
1     Open          0      1      tbl1
2     Next          0      9
3     Field         0      1
4     Integer       20     0
5     Ge            0      2
6     Key           0      0
7     ListWrite     0      0
8     Goto          0      2
9     Noop          0      0
10    ListRewind    0      0
11    ListRead      0      14
12    Delete        0      0
13    Goto          0      11
14    ListClose     0      0
}

puts {

All you have to do is add the "EXPLAIN" keyword to the front of the SQL statement. If you first give sqlite the ".explain" command, it sets the output mode so that the program is easier to read.
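
The same EXPLAIN prefix can also be issued from a program. The sketch below uses Python's standard sqlite3 module with an in-memory database; the column "one" in tbl1 is invented for the example. Current SQLite releases use a different opcode set and return more columns than the listing in this document, so the output will not match it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1(one TEXT, two INTEGER)")

# Prefixing a statement with EXPLAIN returns the program as ordinary rows
# instead of executing it.
rows = conn.execute("EXPLAIN DELETE FROM tbl1 WHERE two<20").fetchall()
for row in rows:
    addr, opcode, p1, p2, p3 = row[:5]
    print(addr, opcode, p1, p2, p3)
conn.close()
```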

If sqlite has been compiled without the "-DNDEBUG=1" option (that is, with the NDEBUG preprocessor macro not defined) then you can put the SQLite virtual machine in a mode where it will trace its execution by writing messages to standard output. There are special comments to turn tracing on and off. Use the --vdbe-trace-on-- comment to turn tracing on and the --vdbe-trace-off-- comment to turn tracing back off.

The Opcodes

}
puts "

There are currently [llength $OpcodeList] opcodes defined by the virtual machine."
puts {All currently defined opcodes are described in the table below. This table was generated automatically by scanning the source code from the file vdbe.c.

}
puts {
Opcode Name    Description
}
foreach op [lsort -dictionary $OpcodeList] {
  puts "$op"
  puts "[string trim $Opcode($op:text)]
"
}
puts {


Back to the SQLite Home Page

}