© Kevin Languedoc 2016

Kevin Languedoc, Build iOS Database Apps with Swift and SQLite, 10.1007/978-1-4842-2232-4_12

12. Analyzing SQLite Databases

Kevin Languedoc

(1)Montreal, Canada

This last chapter will focus on the different tools the SQLite platform provides to help you analyze your app's databases. Except for the ANALYZE statement, which can be executed with the sqlite3_exec command, the other tools are external software programs. Specifically, in this chapter we will discuss and explore, through examples, the following technologies:

  • The ANALYZE statement

  • The sqldiff tool

  • The sqlite3_analyzer tool

Other than the ANALYZE statement, which we will explore in Swift, all the other tools are external tools that can help you analyze your databases for support and/or for development optimization.

The Analyze Statement

The role of the ANALYZE statement is to gather information on tables and indexes in a database through statistics. SQLite accomplishes this task by creating a sqlite_stats1 table in your app's database when the ANALYZE statement is executed. SQLite passes this information to the SQLite Query Optimizer, which in turn uses the collected information to use the best query algorithm for the best performance.

If you build the database or enable SQLITE3_ENABLE_STAT3 or SQLITE3_ENABLE_STAT4, additional histogram information is gathered and stored in the sqlite3_stat3 and sqlite3_stat4 tables respectively.

To run the ANALYZE functionality in a database, you simply need to execute the ANALYZE statement, as follows:

ANALYZE schema

This will create a stat1 table for the whole database. If you only want to target a table, you could issue a command like:

ANALYZE schema.tablename

Of course, you can also build a stat1 table for an index in a table using the following command:

ANALYZE schema with or without the schema prefix:

ANALYZE schema.indexname

or,

ANALYZE indexname

In Swift, the ANALYZE statement can be executed like any other query, as follows:

func analyzeDatabase(_ name:String){

        let sql:String = "ANALYZE Chinook_Sqlite.sqlite"
        if(sqlite3_open(dbPath.path!, &db)==SQLITE_OK){
            if(sqlite3_prepare_v2(db, sql.cString(using: String.Encoding.utf8)!, -1, &sqlStatement, nil)==SQLITE_OK){
        while(sqlite3_step(sqlStatement)==SQLITE_ROW){
                    let output = String(cString: UnsafePointer<Int8>(sqlite3_column_text(sqlStatement, 0)))
                       print(output)
                   }
            }
        }
        sqlite3_close(db)
}

The sqldiff Tool

The sqldiff utility is an external application from SQLite. You can download it along with SQLite3_Analyzer—and SQLite, for that matter–from the download page at www.sqlite.org . Look for the zip package for OSX x86.

You only need to unzip the compressed file to a handy directory on your OSX machine. These tools are equally available for Windows and Linux. The instructions that follow are for OSX only; however, I am sure they work the same way on Windows and Linux.

The sqldiff utility is used to compare two SQLite databases and generates a SQL script to convert the source database (database1) into the target database (database2). The sqldiff tool is easy to use. A sample output is shown in Figure 12-1. From a Terminal window, navigate to the directory where you unzipped the utility, and sqldiff will issue the following command:

A427374_1_En_12_Fig1_HTML.jpg
Figure 12-1. sqldiff output to standard output
sqldiff database1.sqlite databse2.sqlite

To illustrate, I will run the utility on the Chinook_SQLite.sqlite database that I used for the backups. The second database is DbToBackup.sqlite, which is another empty database I created for the backups.

The utility has a number of options that you can use as well. For instance:

  • --changeset FILE

  • --lib or L

  • --primaryKey

  • --schema

  • --summary

  • --table TABLENAME

  • --transaction

  • --vtab

The –changeset option re-directs the output to a file. The –lib option loads a user-defined library prior to comparing the databases, like collating_sequences, for instance. If you prefer using the primary key in a table instead of the rowed, then–primaryKey is the way to go. Using the –schema flag, you can show the differences in the schema only, excluding the content. To pinpoint the changes that have occurred in two tables, use –summary. However, the actual changes won’t be displayed. Using the –table option allows you to compare specific table content. The –transaction option allows you to generate one large transaction for the complete operation. Finally, the –vtab choice works with virtual tables like FTS3, FTS5, and rtree tables.

The sqldiff utility works by comparing pairwise rowids, unless you are using the –primaryKey option. The output is generated as updates if the content is in similar tables. If the two databases have distinct tables, then the source tables are dropped (DELETE) and new ones created, along with using INSERT to insert the content.

There are, of course, some limitations with the current version of the tool. For instance, the utility only works on tables, and rowids must be accessible, unless you are using the –primaryKey option. Also, the content of virtual tables isn’t compared unless it results in a physical table. However, using the tool in this capacity can create corrupted databases.

The sqlite3_analyzer tool

The sqlite3_analyzer is a handy tool used to measure the effective use of the space in the tables in the target database. Like the sqldiff tool before, the sqlite3_analyzer is a command-line tool that is included in the same download package as the sqldiff tool.

The utility generates a text (ASCII)-based report in a text file. It is a human-readable file. To demonstrate, I will run the tool with the Chinook_Sqlite.sqlite database file from before.

From a Terminal window, navigate to the directory where the sqlite3_analyzer tool is situated and issue the following command:

sqlite3_analyzer database.sqlite

For example, here is the output of the DbToBackup.sqlite database:

Last login: Mon Sep  5 17:04:01 on ttys001
Kevins-MacBook-Air:∼ kevinlanguedoc$ /Users/kevinlanguedoc/Documents/sqlitetools/sqlite3_analyzer /Users/kevinlanguedoc/Documents/sqlitetools/DbToBackup.sqlite
/** Disk-Space Utilization Report For /Users/kevinlanguedoc/Documents/sqlitetools/DbToBackup.sqlite


Page size in bytes................................ 32768     
Pages in the whole file (measured)................ 4         
Pages in the whole file (calculated).............. 4         
Pages that store data............................. 4          100.0 %
Pages on the freelist (per header)................ 0            0.0 %
Pages on the freelist (calculated)................ 0            0.0 %
Pages of auto-vacuum overhead..................... 0            0.0 %
Number of tables in the database.................. 4         
Number of indices................................. 0         
Number of defined indices......................... 0         
Number of implied indices......................... 0         
Size of the file in bytes......................... 131072    
Bytes of user payload stored...................... 0            0.0 %


*** Page counts for all tables with their indices *****************************

CARS.............................................. 1           25.0 %
SQLITE_MASTER..................................... 1           25.0 %
SQLITE_SEQUENCE................................... 1           25.0 %
SQLITE_STAT1...................................... 1           25.0 %


*** Page counts for all tables and indices separately *************************

CARS.............................................. 1           25.0 %
SQLITE_MASTER..................................... 1           25.0 %
SQLITE_SEQUENCE................................... 1           25.0 %
SQLITE_STAT1...................................... 1           25.0 %


*** All tables ****************************************************************

Percentage of total database...................... 100.0 %    
Number of entries................................. 3         
Bytes of storage consumed......................... 131072    
Bytes of payload.................................. 296          0.23 %
Average payload per entry......................... 98.67     
Average unused bytes per entry.................... 43543.67  
Maximum payload per entry......................... 141       
Entries that use overflow......................... 0            0.0 %
Primary pages used................................ 4         
Overflow pages used............................... 0         
Total pages used.................................. 4         
Unused bytes on primary pages..................... 130631      99.66 %
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 130631      99.66 %

*** Table CARS ****************************************************************


Percentage of total database...................... 25.0 %    
Number of entries................................. 0         
Bytes of storage consumed......................... 32768     
Bytes of payload.................................. 0            0.0 %
B-tree depth...................................... 1         
Average payload per entry......................... 0.0       
Average unused bytes per entry.................... 0.0       
Maximum payload per entry......................... 0         
Entries that use overflow......................... 0         
Primary pages used................................ 1         
Overflow pages used............................... 0         
Total pages used.................................. 1         
Unused bytes on primary pages..................... 32760       99.976 %
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 32760       99.976 %

*** Table SQLITE_MASTER *******************************************************


Percentage of total database...................... 25.0 %    
Number of entries................................. 3         
Bytes of storage consumed......................... 32768     
Bytes of payload.................................. 296          0.90 %
B-tree depth...................................... 1         
Average payload per entry......................... 98.67     
Average unused bytes per entry.................... 10783.67  
Maximum payload per entry......................... 141       
Entries that use overflow......................... 0            0.0 %
Primary pages used................................ 1         
Overflow pages used............................... 0         
Total pages used.................................. 1         
Unused bytes on primary pages..................... 32351       98.7 %
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 32351       98.7 %


*** Table SQLITE_SEQUENCE *****************************************************

Percentage of total database...................... 25.0 %    
Number of entries................................. 0         
Bytes of storage consumed......................... 32768     
Bytes of payload.................................. 0            0.0 %
B-tree depth...................................... 1         
Average payload per entry......................... 0.0       
Average unused bytes per entry.................... 0.0       
Maximum payload per entry......................... 0         
Entries that use overflow......................... 0         
Primary pages used................................ 1         
Overflow pages used............................... 0         
Total pages used.................................. 1         
Unused bytes on primary pages..................... 32760       99.976 %
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 32760       99.976 %

*** Table SQLITE_STAT1 ********************************************************


Percentage of total database...................... 25.0 %    
Number of entries................................. 0         
Bytes of storage consumed......................... 32768     
Bytes of payload.................................. 0            0.0 %
B-tree depth...................................... 1         
Average payload per entry......................... 0.0       
Average unused bytes per entry.................... 0.0       
Maximum payload per entry......................... 0         
Entries that use overflow......................... 0         
Primary pages used................................ 1         
Overflow pages used............................... 0         
Total pages used.................................. 1         
Unused bytes on primary pages..................... 32760       99.976 %
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 32760       99.976 %


*** Definitions ***************************************************************

Page size in bytes


    The number of bytes in a single page of the database file.
    Usually 1024.


Number of pages in the whole file

    The number of 32768-byte pages that go into forming the complete
    database


Pages that store data

    The number of pages that store data, either as primary B*Tree pages or
    as overflow pages. The number at the right is the data pages divided by
    the total number of pages in the file.


Pages on the freelist

    The number of pages that are not currently in use but are reserved for
    future use. The percentage at the right is the number of freelist pages
    divided by the total number of pages in the file.


Pages of auto-vacuum overhead

    The number of pages that store data used by the database to facilitate
    auto-vacuum. This is zero for databases that do not support auto-vacuum.


Number of tables in the database

    The number of tables in the database, including the SQLITE_MASTER table
    used to store schema information.


Number of indices

    The total number of indices in the database.

Number of defined indices

    The number of indices created using an explicit CREATE INDEX statement.

Number of implied indices

    The number of indices used to implement PRIMARY KEY or UNIQUE constraints
    on tables.


Size of the file in bytes

    The total amount of disk space used by the entire database files.

Bytes of user payload stored


    The total number of bytes of user payload stored in the database. The
    schema information in the SQLITE_MASTER table is not counted when
    computing this number. The percentage at the right shows the payload
    divided by the total file size.


Percentage of total database

    The amount of the complete database file that is devoted to storing
    information described by this category.


Number of entries

    The total number of B-Tree key/value pairs stored under this category.

Bytes of storage consumed

    The total amount of disk space required to store all B-Tree entries
    under this category. This is the total number of pages used times
    the pages size.


Bytes of payload

    The amount of payload stored under this category. Payload is the data
    part of table entries and the key part of index entries. The percentage
    at the right is the bytes of payload divided by the bytes of storage
    consumed.


Average payload per entry

    The average amount of payload on each entry. This is just the bytes of
    payload divided by the number of entries.


Average unused bytes per entry

    The average amount of free space remaining on all pages under this
    category on a per-entry basis. This is the number of unused bytes on
    all pages divided by the number of entries.


Non-sequential pages

    The number of pages in the table or index that are out of sequence.
    Many filesystems are optimized for sequential file access, so a small
    number of non-sequential pages might result in faster queries,
    especially for larger database files that do not fit in the disk cache.
    Note that after running VACUUM, the root page of each table or index is
    at the beginning of the database file and all other pages are in a
    separate part of the database file, resulting in a single non-
    sequential page.

Maximum payload per entry


    The largest payload size of any entry.

Entries that use overflow

    The number of entries that use one or more overflow pages.

Total pages used

    This is the number of pages used to hold all information in the current
    category. This is the sum of index, primary, and overflow pages.


Index pages used

    This is the number of pages in a table B-tree that hold only key (rowid)
    information and no data.


Primary pages used

    This is the number of B-tree pages that hold both key information and data.

Overflow pages used

    The total number of overflow pages used for this category.

Unused bytes on index pages

    The total number of bytes of unused space on all index pages. The
    percentage at the right is the number of unused bytes divided by the
    total number of bytes on index pages.


Unused bytes on primary pages

    The total number of bytes of unused space on all primary pages. The
    percentage at the right is the number of unused bytes divided by the
    total number of bytes on primary pages.


Unused bytes on overflow pages

    The total number of bytes of unused space on all overflow pages. The
    percentage at the right is the number of unused bytes divided by the
    total number of bytes on overflow pages.


Unused bytes on all pages

    The total number of bytes of unused space on all primary and overflow
    pages. The percentage at the right is the number of unused bytes
    divided by the total number of bytes.

*******************************************************************************
The entire text of this report can be sourced into any SQL database
engine for further analysis. All of the text above is an SQL comment.
The data used to generate this report follows:
*/
BEGIN;
CREATE TABLE space_used(
   name clob,        -- Name of a table or index in the database file
   tblname clob,     -- Name of associated table
   is_index boolean, -- TRUE if it is an index, false for a table
   is_without_rowid boolean, -- TRUE if WITHOUT ROWID table  
   nentry int,       -- Number of entries in the BTree
   leaf_entries int, -- Number of leaf entries
   depth int,        -- Depth of the b-tree
   payload int,      -- Total amount of data stored in this table or index
   ovfl_payload int, -- Total amount of data stored on overflow pages
   ovfl_cnt int,     -- Number of entries that use overflow
   mx_payload int,   -- Maximum payload size
   int_pages int,    -- Number of interior pages used
   leaf_pages int,   -- Number of leaf pages used
   ovfl_pages int,   -- Number of overflow pages used
   int_unused int,   -- Number of unused bytes on interior pages
   leaf_unused int,  -- Number of unused bytes on primary pages
   ovfl_unused int,  -- Number of unused bytes on overflow pages
   gap_cnt int,      -- Number of gaps in the page layout
   compressed_size int  -- Total bytes stored on disk
);
INSERT INTO space_used VALUES('sqlite_master','sqlite_master',0,0,3,3,1,296,0,0,141,0,1,0,0,32351,0,0,32768);
INSERT INTO space_used VALUES('sqlite_stat1','sqlite_stat1',0,0,0,0,1,0,0,0,0,0,1,0,0,32760,0,0,32768);
INSERT INTO space_used VALUES('cars','cars',0,0,0,0,1,0,0,0,0,0,1,0,0,32760,0,0,32768);
INSERT INTO space_used VALUES('sqlite_sequence','sqlite_sequence',0,0,0,0,1,0,0,0,0,0,1,0,0,32760,0,0,32768);
COMMIT;

This analysis report provides very detailed information on the use of space in the pages that are in the database. What follows is the extract on the Cars table. The primary concern to me as a database developer is the amount of free space in the database’s table. In this case, it is at 99.976 %, which is fantastic. This means that there isn’t any bloat in the pages.

A page is a unit of storage in a database—any database, really. In this table, only one page is used. However, that database is free to use as many pages as is needed to efficiently manage itself. When there is a lot of activity from inserts, deletes, and updates, the database can be become bloated with unused space in the pages.

In order to regain that space and optimize the database, you can execute the VACUUM command on the database, as we have discussed before, which removes empty space. You can also re-index the tables to optimize table I/O efficiency.

*** Table CARS ****************************************************************

Percentage of total database...................... 25.0 %    
Number of entries................................. 0         
Bytes of storage consumed......................... 32768     
Bytes of payload.................................. 0            0.0 %
B-tree depth...................................... 1         
Average payload per entry......................... 0.0       
Average unused bytes per entry.................... 0.0       
Maximum payload per entry......................... 0         
Entries that use overflow......................... 0         
Primary pages used................................ 1         
Overflow pages used............................... 0         
Total pages used.................................. 1         
Unused bytes on primary pages..................... 32760       99.976 %
Unused bytes on overflow pages.................... 0         
Unused bytes on all pages......................... 32760       99.976 %

Summary

This chapter is different from the other chapters in that the focus was primarily outside of Xcode and Swift. Most of these tools that we looked at are run in the Terminal. The sqldiff lets us compare two databases and copy the schema and content from one to another. The sqlite3_analyzer tool generates a report on how the space is used in the pages in the database. We also looked at the ANALYZE statement, which creates a stat1 table to store statistics on the tables, which are then used in the SQLite Query Optimizer to select the best algorithm to use to perform various queries against a database.

I hope this book serves you well.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset