Transaction log

Before explaining how to set up data replication, let's quickly look at how PostgreSQL handles changes in data at the lower level.

When PostgreSQL processes a command that changes the data in the database, it writes the new data on disk to make it persistent. There are two locations on disk where the data is written:

The data files located on Linux by default at /var/lib/postgresql/10/main/base. Here, the data is stored: tables, indexes, temporary tables, and other objects. The size of this directory is only limited by the size of the disk.
The transaction log located by default at /var/lib/postgresql/10/main/pg_wal. Here, the log of most recent changes into the data files is stored. Its size is limited in the configuration and by default is around 1 GB.

The same data is always written into both locations. This may seem redundant, but there is a reason for this:

Imagine there is a transaction inserting a record with a text value test into a big table, and it so happens that the server crashes in the middle of the transaction. The letters te are written and the rest was not written to disk. When the database is started again, it would not be able to tell if the record is corrupted, because it would not know if it is the word test or only the letters te were supposed to be in the field. OK, this can be solved using check sums. How would the database find the corrupted record? It would be very expensive to validate the check sums for the whole database after every unexpected restart.

The solution for this is that before writing data into the data files, PostgreSQL always writes it into the transaction log. The transaction log (also called the write-ahead log) is a list of changes made by PostgreSQL to a data file. The transaction log appears as a set of 16 MB files (called WAL-files) located in the subdirectory pg_wal under the PostgreSQL database path. Each file contains a lot of records that basically tell which data file should be changed in which way. Only when the transaction log is saved on disk the database write the data to the data files. When the transaction log is full, PostgreSQL deletes the oldest segment of it to reuse disk space. The transaction log is relatively small, and the server is able to go through it after an unexpected shutdown.

Now, after a system failure, the following will happen after restart:

If the system crashed during the writing of the transaction log, PostgreSQL will identify that the log record is incomplete because the check sum will not match. It would discard this log record and perform a rollback of transactions that were writing the data into that record.
If the system crashed during writing to the data files but the transaction log is not corrupted, PostgreSQL will go through the transaction log, validate that the content is written into the data files, and correct the data files when necessary. There is no need to scan all the data files because it knows from the transaction log which part of which data file was supposed to be changed and how.

The process of replaying the transaction log is called recovery, and this is what the database is always doing after unexpected restarts. If one had the full transaction log from the time the database server was initialized until the current time, then it would be possible to recover the state of the database to any point in time in the past. To provide this functionality, PostgreSQL can be configured to archive the transaction log somewhere instead of deleting old WAL files. This archive can be then used to perform so called point-in-time recovery of the database on another machine.

Table of Contents for Transaction log

Create new playlist

Sign In

Sign Up

Table of Contents for
Transaction log