Dealing with NULLs

NULL is a special value that any field or expression can have, except for when it is explicitly forbidden. NULL means the absence of any value. It can also be treated as an unknown value in some cases. In relation to logical values, NULL is neither true nor false. Working with NULLs can be confusing because almost all operators, when taking NULL as an argument, return NULL. If one tries to compare some values and one of them is NULL, the result will also be NULL, which is not true.

For example, consider the following condition:

WHERE a > b

This will return NULL if a or b have a NULL value. This can be expected, but for the following condition, this is not so clear:

WHERE a = b

Here, if both a and b have a value of NULL, the result will still be NULL. The equal operator = always returns NULL if any of the arguments is NULL. Similarly, the following will also be evaluated as NULL, even if a has a NULL value:

WHERE a = NULL

To check the expression for a NULL value, a special predicate is used: IS NULL.

In the previous examples, if it is necessary to find records when a = b or both a and b are NULL, the condition should be changed this way:

WHERE a = b OR (a IS NULL AND b IS NULL)

There is a special construct that can be used to check the equivalence of expressions taking NULL into account: IS NOT DISTINCT FROM. The preceding example can be implemented in the following way, which has the same logic:

WHERE a IS NOT DISTINCT FROM b

Logical operators are different. They can sometimes return a not NULL value even if they take NULL as an argument. For logical operators, NULL means an unknown value.

The operator AND always returns false when any of the operands is false, even if the second is NULL. OR always returns true if one of the arguments is true. In all other cases, the result is unknown, therefore NULL:

car_portal=> SELECT true AND NULL, false AND NULL, true OR NULL, false OR NULL, NOT NULL;
?column? | ?column? | ?column? | ?column? | ?column?
----------+----------+----------+----------+----------
| f | t | |

The subquery expression IN deals with NULLs in a way that might not seem obvious:

car_portal=> SELECT 1 IN (1, NULL) as in;
in
----
t

car_portal=> SELECT 2 IN (1, NULL) as in;
in
----

(1 row)

When evaluating the IN expression and the value that is being checked does not appear in the list of values inside IN (or in the result of a subquery) but there is a NULL value, the result will be also NULL and not false. This is easy to understand if we treat the NULL value as unknown, just as the logical operators do. In the first example, it is clear that 1 is included in the list inside the IN expression. Therefore, the result is true. In the second example, 2 is not equal to 1, but it is unknown about the NULL value. That's why the result is also unknown.

Functions can treat NULL values differently. Their behavior is determined by their code. Most built-in functions return NULL if any of the arguments are NULL.

Aggregating functions work with NULL values in a different way. They work with many rows and therefore many values. In general, they ignore NULL. sum calculates the total of all not-null values and ignores NULL. sum returns NULL only when all the received values are NULL. For avg, max and min it is the same. But for count it is different. count returns the number of not-null values. So if all the values are NULL, count returns zero.

In contrast to some other databases in PostgreSQL, an empty string is not NULL.

Consider the following example:

car_portal=> SELECT a IS NULL, b IS NULL, a = b FROM (SELECT ''::text a, NULL::text b) v;
?column? | ?column? | ?column?
----------+----------+----------
f | t |

There are a couple of functions designed to deal with NULL values: COALESCE and NULLIF.

The COALESCE function takes any number of arguments of the same data type or compatible types. It returns the value of the first of its arguments that IS NOT NULL:

COALESCE(a, b, c)

The preceding code is equivalent to the following:

CASE WHEN a IS NOT NULL THEN a WHEN b IS NOT NULL THEN b ELSE c END

NULLIF takes two arguments and returns NULL if they are equal. Otherwise, it returns the value of the first argument.

This is somehow the opposite of COALESCE:

NULLIF (a, b)

The preceding code is equivalent to the following:

CASE WHEN a = b THEN NULL ELSE a END

Another aspect of NULL values is that they are ignored by unique constraints. This means that if a field of a table is defined as unique, it is still possible to create several records having a NULL value of that field. Additionally, b-tree indexes, which are most commonly used, do not index NULL values. Consider the following query:

SELECT * FROM t WHERE a IS NULL

The preceding code will not use an index if it is created on the column, a.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset