SQLite Expert Extension

This folder contains code for a simple system to propose useful indexes given a database and a set of SQL queries. It works as follows:

  1. The user database schema is copied to a temporary database.

  2. All SQL queries are prepared against the temporary database. Information about the WHERE and ORDER BY clauses, and about other query features that affect index selection, is recorded.

  3. The information gathered in step 2 is used to create candidate indexes - indexes that the planner might have made use of in the previous step, had they been available.

  4. A subset of the data in the user database is used to generate statistics for all existing indexes and the candidate indexes generated in step 3 above.

  5. The SQL queries are prepared a second time. If the planner uses any of the indexes created in step 3, they are recommended to the user.
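The core idea in steps 1, 3 and 5 can be illustrated outside the extension. The following is a minimal sketch in Python using only the standard `sqlite3` module; it is not how sqlite3expert.c is implemented. It assumes the candidate indexes are supplied by the caller (the real code derives them from the queries in steps 2 and 3) and it skips the statistics generation of step 4. The function name `suggest_indexes` and the index names are hypothetical.

```python
import sqlite3


def suggest_indexes(user_db, query, candidates):
    """Return the candidate CREATE INDEX statements the planner chooses for `query`.

    `candidates` maps a (hypothetical) index name to its CREATE INDEX statement.
    `query` must not contain unbound parameters such as "?".
    """
    # Step 1: copy the user schema (tables only, no data) to a scratch database.
    # Internal tables such as sqlite_sequence cannot be re-created, so skip them.
    scratch = sqlite3.connect(":memory:")
    rows = user_db.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (ddl,) in rows:
        scratch.execute(ddl)

    # Step 3 (simplified): create the candidate indexes in the scratch database.
    for ddl in candidates.values():
        scratch.execute(ddl)

    # Step 5: ask the planner how it would run the query. The last column of
    # each EXPLAIN QUERY PLAN row is a human-readable description of one step.
    plan = scratch.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    scratch.close()
    words = " ".join(row[-1] for row in plan).split()

    # Recommend only the candidates that appear by name in the plan.
    return [ddl for name, ddl in candidates.items() if name in words]


# Example usage with a throwaway in-memory database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t1(a, b, c)")
candidates = {
    "t1_idx_a": "CREATE INDEX t1_idx_a ON t1(a)",
    "t1_idx_b": "CREATE INDEX t1_idx_b ON t1(b)",
}
for ddl in suggest_indexes(db, "SELECT * FROM t1 WHERE a = 5", candidates):
    print(ddl)
```

Because the scratch database holds no data and no statistics, this sketch only shows whether the planner is able to use an index, not whether the index pays off on real data. Step 4 exists to close that gap.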

C API

The SQLite expert C API is defined in sqlite3expert.h. Most uses will proceed as follows:

  1. An sqlite3expert object is created by calling sqlite3_expert_new(). A database handle opened by the user is passed as an argument.

  2. The sqlite3expert object is configured with one or more SQL statements by making one or more calls to sqlite3_expert_sql(). Each call may specify a single SQL statement, or multiple statements separated by semicolons.

  3. Optionally, the sqlite3_expert_config() API may be used to configure the size of the data subset used to generate index statistics. Using a smaller subset of the data can speed up the analysis.

  4. sqlite3_expert_analyze() is called to run the analysis.

  5. One or more calls are made to sqlite3_expert_report() to extract components of the results of the analysis.

  6. sqlite3_expert_destroy() is called to free all resources.

Refer to comments in sqlite3expert.h for further details.

sqlite3_expert application

The file "expert.c" contains the code for a command line application that uses the API described above. It can be compiled with (for example):

  gcc -O2 sqlite3.c expert.c sqlite3expert.c -o sqlite3_expert

Assuming the database is named "test.db", it can then be run to analyze a single query:

  ./sqlite3_expert -sql <sql-query> test.db

Or to analyze every query in a text file:

  ./sqlite3_expert -file <text-file> test.db

By default, sqlite3_expert generates index statistics using all the data in the user database. For a large database, this may be prohibitively time-consuming. The "-sample" option configures sqlite3_expert to generate statistics from an integer percentage of the rows in the user database, as follows:

  # Generate statistics based on 25% of the user database rows:
  ./sqlite3_expert -sample 25 -sql <sql-query> test.db

  # Do not generate any statistics at all:
  ./sqlite3_expert -sample 0 -sql <sql-query> test.db