Full text search indexing

GIN indexes are the preferred index type for full text search. You can create one directly on the document by indexing the result of the to_tsvector function, as follows:

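CREATE INDEX ON <table_name> USING GIN (to_tsvector('english', <attribute name>));
-- Note: the one-argument form, to_tsvector(<attribute name>), cannot be used in an
-- index expression. Its result depends on default_text_search_config, so it is not
-- IMMUTABLE, and CREATE INDEX rejects it. Always name the configuration explicitly.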

The planner can use an expression index only when the query predicate matches the index definition. For example, if the predicate has the form to_tsvector('english', ...) @@ to_tsquery(...), then the index defined on to_tsvector('english', ...) can be used to evaluate the query.
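
As a minimal sketch of this matching rule, the following uses a hypothetical table documents with a text column body (neither name comes from the original text). On a very small table the planner may still prefer a sequential scan, so the EXPLAIN output depends on the data and settings:

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TABLE documents (id serial PRIMARY KEY, body text);
CREATE INDEX documents_body_fts_idx
    ON documents USING GIN (to_tsvector('english', body));

-- The predicate repeats the indexed expression exactly,
-- so the planner can consider documents_body_fts_idx.
EXPLAIN
SELECT id FROM documents
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'index & search');

-- The predicate omits the configuration name, so it does not match
-- the indexed expression and this index cannot be used for it.
EXPLAIN
SELECT id FROM documents
WHERE to_tsvector(body) @@ to_tsquery('index & search');
```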

GiST can be used to index both tsvector and tsquery values, while GIN can be used to index tsvector only. The GiST index is lossy and can return false matches, so PostgreSQL automatically rechecks the returned rows and filters out the false matches. False matches can reduce performance, because each one costs a random access to the table row in order to be rechecked. The GIN index stores only the lexemes of a tsvector and not the weight labels. Because of this, the GIN index can also be considered lossy when weights are involved in the query.
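
The sketch below illustrates both index types on a stored tsvector column, together with a query that involves weights. The table articles, its columns, and the index names are hypothetical and chosen only for this example:

```sql
-- Hypothetical table with a stored, weighted tsvector column.
CREATE TABLE articles (
    id    serial PRIMARY KEY,
    title text,
    body  text,
    tsv   tsvector
);

-- Label title lexemes with weight A and body lexemes with weight B.
UPDATE articles
SET tsv = setweight(to_tsvector('english', coalesce(title, '')), 'A')
       || setweight(to_tsvector('english', coalesce(body, '')), 'B');

-- Either index type accepts a tsvector column.
CREATE INDEX articles_tsv_gist_idx ON articles USING GIST (tsv);
CREATE INDEX articles_tsv_gin_idx  ON articles USING GIN (tsv);

-- This query restricts the match to weight A. GIN does not store weight
-- labels, so matching rows must be rechecked against the table.
SELECT id FROM articles
WHERE tsv @@ to_tsquery('english', 'postgres:A');
```

In practice you would normally keep only one of the two indexes. Both are created here only to show that the syntax is the same.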

The performance of both GIN and GiST indexes depends on the number of unique words, so it is recommended to use dictionaries to reduce the total number of unique words. Compared with GiST, the GIN index is faster to search, but it is slower to build and requires more space. Increasing the value of the maintenance_work_mem setting can improve the GIN index's build time, but this has no effect on the GiST index.
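
To make these two recommendations concrete, here is a small sketch. The table pages, its column, the index name, and the 512MB value are assumptions chosen for illustration, not tuning advice:

```sql
-- A stemming dictionary maps inflected forms to one lexeme, which
-- lowers the number of unique words the index has to store.
SELECT ts_lexize('english_stem', 'stars');  -- returns {star}

-- Hypothetical table used for the index build below.
CREATE TABLE pages (id serial PRIMARY KEY, content text);

-- Raise maintenance_work_mem for this session only, build the GIN
-- index, then restore the previous value.
SET maintenance_work_mem = '512MB';
CREATE INDEX pages_content_fts_idx
    ON pages USING GIN (to_tsvector('english', content));
RESET maintenance_work_mem;
```

Using SET rather than editing postgresql.conf keeps the larger memory budget limited to the session that builds the index.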
