2000-06-09 05:58:35 +04:00
|
|
|
#
|
|
|
|
# Run this Tcl script to generate the sqlite.html file.
|
|
|
|
#
|
2004-06-16 07:02:00 +04:00
|
|
|
set rcsid {$Id: arch.tcl,v 1.13 2004/06/16 03:02:04 drh Exp $}
|
2004-05-31 19:06:28 +04:00
|
|
|
source common.tcl
|
|
|
|
header {Architecture of SQLite}
|
2000-06-09 05:58:35 +04:00
|
|
|
puts {
|
2004-05-31 19:06:28 +04:00
|
|
|
<h2>The Architecture Of SQLite</h2>
|
|
|
|
|
|
|
|
<h3>Introduction</h3>
|
2000-06-09 05:58:35 +04:00
|
|
|
|
|
|
|
<table align="right" border="1" cellpadding="15" cellspacing="1">
|
|
|
|
<tr><th>Block Diagram Of SQLite</th></tr>
|
2004-02-18 19:56:32 +03:00
|
|
|
<tr><td><img src="arch2.gif"></td></tr>
|
2000-06-09 05:58:35 +04:00
|
|
|
</table>
|
2001-09-20 05:44:42 +04:00
|
|
|
<p>This document describes the architecture of the SQLite library.
|
|
|
|
The information here is useful to those who want to understand or
|
|
|
|
modify the inner workings of SQLite.
|
|
|
|
</p>
|
|
|
|
|
|
|
|
<p>
|
2000-06-09 05:58:35 +04:00
|
|
|
A block diagram showing the main components of SQLite
|
2000-06-09 18:14:32 +04:00
|
|
|
and how they interrelate is shown at the right. The text that
|
2000-06-09 05:58:35 +04:00
|
|
|
follows will provide a quick overview of each of these components.
|
|
|
|
</p>
|
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>History</h3>
|
2004-02-18 19:56:32 +03:00
|
|
|
|
|
|
|
<p>
|
|
|
|
There are two main C interfaces to the SQLite library:
|
|
|
|
<b>sqlite_exec()</b> and <b>sqlite_compile()</b>. Prior to
|
|
|
|
version 2.8.0 (2003-Feb-16) only sqlite_exec() was supported.
|
|
|
|
For version 2.8.0, the sqlite_exec and sqlite_compile methods
|
|
|
|
existed as peers. Beginning with version 2.8.13, the sqlite_compile
|
|
|
|
method is the primary interface, and sqlite_exec is implemented
|
2004-03-14 14:57:58 +03:00
|
|
|
using sqlite_compile. Externally, this change is an enhancement
|
|
|
|
that maintains backwards compatibility. But internally,
|
2004-02-18 19:56:32 +03:00
|
|
|
the plumbing is very different. The diagram at the right shows
|
|
|
|
the structure of SQLite for version 2.8.13 and following.
|
|
|
|
</p>
|
|
|
|
|
2004-06-16 07:02:00 +04:00
|
|
|
<p>
|
|
|
|
This document describes the structure for SQLite version 2.X.
|
|
|
|
SQLite version 3.0.0 introduces many new features and capabilities.
|
|
|
|
The basic architecture of the library remains the same. However,
|
|
|
|
some of the details described here are different. For example,
|
|
|
|
the code was in the file <b>os.c</b> has now been split out into
|
|
|
|
several file, on for each operating system. And
|
|
|
|
the prefix on the names of API routines changed from <b>sqlite_</b>
|
|
|
|
to <b>sqlite3_</b>.
|
|
|
|
</p>
|
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>Interface</h3>
|
2000-06-09 05:58:35 +04:00
|
|
|
|
2004-02-18 19:56:32 +03:00
|
|
|
<p>Much of the public interface to the SQLite library is implemented by
|
|
|
|
functions found in the <b>main.c</b> source file though some routines are
|
|
|
|
scattered about in other files where they can have access to data
|
|
|
|
structures with file scope. The
|
2000-09-29 17:30:53 +04:00
|
|
|
<b>sqlite_get_table()</b> routine is implemented in <b>table.c</b>.
|
2004-02-18 19:56:32 +03:00
|
|
|
<b>sqlite_step()</b> is found in <b>vdbe.c</b>.
|
|
|
|
<b>sqlite_mprintf()</b> is found in <b>printf.c</b>.
|
2000-09-29 17:30:53 +04:00
|
|
|
The Tcl interface is implemented by <b>tclsqlite.c</b>. More
|
2000-06-09 05:58:35 +04:00
|
|
|
information on the C interface to SQLite is
|
|
|
|
<a href="c_interface.html">available separately</a>.<p>
|
|
|
|
|
|
|
|
<p>To avoid name collisions with other software, all external
|
|
|
|
symbols in the SQLite library begin with the prefix <b>sqlite</b>.
|
2001-09-20 05:44:42 +04:00
|
|
|
Those symbols that are intended for external use (in other words,
|
|
|
|
those symbols which form the API for SQLite) begin
|
2000-06-09 05:58:35 +04:00
|
|
|
with <b>sqlite_</b>.</p>
|
|
|
|
|
2004-06-16 07:02:00 +04:00
|
|
|
<h3>SQL Command Processor</h3>
|
2004-02-18 19:56:32 +03:00
|
|
|
|
|
|
|
<p>
|
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>Tokenizer</h3>
|
2000-06-09 05:58:35 +04:00
|
|
|
|
|
|
|
<p>When a string containing SQL statements is to be executed, the
|
|
|
|
interface passes that string to the tokenizer. The job of the tokenizer
|
|
|
|
is to break the original string up into tokens and pass those tokens
|
|
|
|
one by one to the parser. The tokenizer is hand-coded in C.
|
2001-09-20 05:44:42 +04:00
|
|
|
All of the code for the tokenizer
|
2000-06-09 05:58:35 +04:00
|
|
|
is contained in the <b>tokenize.c</b> source file.</p>
|
|
|
|
|
|
|
|
<p>Note that in this design, the tokenizer calls the parser. People
|
|
|
|
who are familiar with YACC and BISON may be used to doing things the
|
2001-09-20 05:44:42 +04:00
|
|
|
other way around -- having the parser call the tokenizer. The author
|
|
|
|
of SQLite
|
|
|
|
has done it both ways and finds things generally work out nicer for
|
2000-06-09 05:58:35 +04:00
|
|
|
the tokenizer to call the parser. YACC has it backwards.</p>
|
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>Parser</h3>
|
2000-06-09 05:58:35 +04:00
|
|
|
|
|
|
|
<p>The parser is the piece that assigns meaning to tokens based on
|
|
|
|
their context. The parser for SQLite is generated using the
|
|
|
|
<a href="http://www.hwaci.com/sw/lemon/">Lemon</a> LALR(1) parser
|
2003-03-19 06:14:00 +03:00
|
|
|
generator. Lemon does the same job as YACC/BISON, but it uses
|
2000-06-09 18:14:32 +04:00
|
|
|
a different input syntax which is less error-prone.
|
2000-06-09 05:58:35 +04:00
|
|
|
Lemon also generates a parser which is reentrant and thread-safe.
|
|
|
|
And lemon defines the concept of a non-terminal destructor so
|
|
|
|
that it does not leak memory when syntax errors are encountered.
|
|
|
|
The source file that drives Lemon is found in <b>parse.y</b>.</p>
|
|
|
|
|
|
|
|
<p>Because
|
|
|
|
lemon is a program not normally found on development machines, the
|
|
|
|
complete source code to lemon (just one C file) is included in the
|
|
|
|
SQLite distribution in the "tool" subdirectory. Documentation on
|
|
|
|
lemon is found in the "doc" subdirectory of the distribution.
|
|
|
|
</p>
|
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>Code Generator</h3>
|
2000-06-09 05:58:35 +04:00
|
|
|
|
2000-06-09 18:14:32 +04:00
|
|
|
<p>After the parser assembles tokens into complete SQL statements,
|
2000-06-09 05:58:35 +04:00
|
|
|
it calls the code generator to produce virtual machine code that
|
2004-03-14 14:57:58 +03:00
|
|
|
will do the work that the SQL statements request. There are many
|
|
|
|
files in the code generator: <b>build.c</b>, <b>copy.c</b>,
|
|
|
|
<b>delete.c</b>,
|
|
|
|
<b>expr.c</b>, <b>insert.c</b>, <b>pragma.c</b>,
|
|
|
|
<b>select.c</b>, <b>trigger.c</b>, <b>update.c</b>, <b>vacuum.c</b>
|
2000-06-09 18:14:32 +04:00
|
|
|
and <b>where.c</b>.
|
|
|
|
In these files is where most of the serious magic happens.
|
|
|
|
<b>expr.c</b> handles code generation for expressions.
|
|
|
|
<b>where.c</b> handles code generation for WHERE clauses on
|
2004-03-14 14:57:58 +03:00
|
|
|
SELECT, UPDATE and DELETE statements. The files <b>copy.c</b>,
|
|
|
|
<b>delete.c</b>, <b>insert.c</b>, <b>select.c</b>, <b>trigger.c</b>
|
|
|
|
<b>update.c</b>, and <b>vacuum.c</b> handle the code generation
|
|
|
|
for SQL statements with the same names. (Each of these files calls routines
|
2000-06-09 18:14:32 +04:00
|
|
|
in <b>expr.c</b> and <b>where.c</b> as necessary.) All other
|
|
|
|
SQL statements are coded out of <b>build.c</b>.</p>
|
2000-06-09 05:58:35 +04:00
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>Virtual Machine</h3>
|
2000-06-09 05:58:35 +04:00
|
|
|
|
|
|
|
<p>The program generated by the code generator is executed by
|
|
|
|
the virtual machine. Additional information about the virtual
|
|
|
|
machine is <a href="opcode.html">available separately</a>.
|
|
|
|
To summarize, the virtual machine implements an abstract computing
|
|
|
|
engine specifically designed to manipulate database files. The
|
2000-06-09 18:14:32 +04:00
|
|
|
machine has a stack which is used for intermediate storage.
|
|
|
|
Each instruction contains an opcode and
|
2000-06-09 05:58:35 +04:00
|
|
|
up to three additional operands.</p>
|
|
|
|
|
2004-03-14 14:57:58 +03:00
|
|
|
<p>The virtual machine itself is entirely contained in a single
|
2000-06-09 05:58:35 +04:00
|
|
|
source file <b>vdbe.c</b>. The virtual machine also has
|
2004-03-14 14:57:58 +03:00
|
|
|
its own header files: <b>vdbe.h</b> that defines an interface
|
|
|
|
between the virtual machine and the rest of the SQLite library and
|
|
|
|
<b>vdbeInt.h</b> which defines structure private the virtual machine.
|
|
|
|
The <b>vdbeaux.c</b> file contains utilities used by the virtual
|
|
|
|
machine and interface modules used by the rest of the library to
|
|
|
|
construct VM programs.</p>
|
2000-06-09 05:58:35 +04:00
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>Backend</h3>
|
2004-02-18 19:56:32 +03:00
|
|
|
|
|
|
|
<p>The backend is an abstraction layer that presents a uniform interface
|
|
|
|
to the virtual machine for either the B-Tree drivers for disk-based
|
|
|
|
databases or the Red/Black Tree driver for in-memory databases.
|
|
|
|
The <b>btree.h</b> source file contains the details.</p>
|
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>Red/Black Tree</h3>
|
2004-02-18 19:56:32 +03:00
|
|
|
|
2004-06-16 07:02:00 +04:00
|
|
|
<p>In SQLite version 2.X, in-memory databases are stored in a red/black
|
|
|
|
tree implementation contain in the <b>btree_rb.c</b> source file. In
|
|
|
|
version 3.0, the red/black tree implementation is omitted. In-memory
|
|
|
|
databases use the same B-Tree and pager code with a special flag that
|
|
|
|
tells the pager to maintain all data in cache and never write to the
|
|
|
|
disk.
|
2004-02-18 19:56:32 +03:00
|
|
|
</p>
|
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>B-Tree</h3>
|
2000-06-09 05:58:35 +04:00
|
|
|
|
2001-09-16 04:13:26 +04:00
|
|
|
<p>An SQLite database is maintained on disk using a B-tree implementation
|
|
|
|
found in the <b>btree.c</b> source file. A separate B-tree is used for
|
2001-09-29 03:11:24 +04:00
|
|
|
each table and index in the database. All B-trees are stored in the
|
2004-06-16 07:02:00 +04:00
|
|
|
same disk file.</p>
|
2000-06-09 05:58:35 +04:00
|
|
|
|
2001-09-20 05:44:42 +04:00
|
|
|
<p>The interface to the B-tree subsystem is defined by the header file
|
|
|
|
<b>btree.h</b>.
|
|
|
|
</p>
|
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>Page Cache</h3>
|
2001-09-16 04:13:26 +04:00
|
|
|
|
2001-09-20 05:44:42 +04:00
|
|
|
<p>The B-tree module requests information from the disk in 1024 byte
|
|
|
|
chunks. The page cache is reponsible for reading, writing, and
|
2002-08-06 16:04:59 +04:00
|
|
|
caching these chunks.
|
2001-09-20 05:44:42 +04:00
|
|
|
The page cache also provides the rollback and atomic commit abstraction
|
2001-09-16 04:13:26 +04:00
|
|
|
and takes care of reader/writer locking of the database file. The
|
|
|
|
B-tree driver requests particular pages from the page cache and notifies
|
2001-09-29 03:11:24 +04:00
|
|
|
the page cache when it wants to modify pages or commit or rollback
|
2001-09-16 04:13:26 +04:00
|
|
|
changes and the page cache handles all the messy details of making sure
|
|
|
|
the requests are handled quickly, safely, and efficiently.</p>
|
2001-09-20 05:44:42 +04:00
|
|
|
|
|
|
|
<p>The code to implement the page cache is contained in the single C
|
|
|
|
source file <b>pager.c</b>. The interface to the page cache subsystem
|
|
|
|
is defined by the header file <b>pager.h</b>.
|
|
|
|
</p>
|
|
|
|
|
2004-05-31 19:06:28 +04:00
|
|
|
<h3>OS Interface</h3>
|
2001-09-20 05:44:42 +04:00
|
|
|
|
|
|
|
<p>
|
|
|
|
In order to provide portability between POSIX and Win32 operating systems,
|
2002-08-06 16:04:59 +04:00
|
|
|
SQLite uses an abstraction layer to interface with the operating system.
|
2001-09-20 05:44:42 +04:00
|
|
|
The <b>os.c</b> file contains about 20 routines used for opening and
|
|
|
|
closing files, deleting files, creating and deleting locks on files,
|
|
|
|
flushing the disk cache, and so forth. Each of these functions contains
|
|
|
|
two implementations separated by #ifdefs: one for POSIX and the other
|
|
|
|
for Win32. The interface to the OS abstraction layer is defined by
|
|
|
|
the <b>os.h</b> header file.
|
|
|
|
</p>
|
2000-06-09 05:58:35 +04:00
|
|
|
}
|
2004-05-31 19:06:28 +04:00
|
|
|
footer $rcsid
|