The PostgreSQL PL/pgSQL control statements

The PostgreSQL control structure is an essential part of the PL/pgSQL language; it enables developers to code very complex business logic inside PostgreSQL.

Declaration statements

The general syntax of a variable declaration is as follows:

name [ CONSTANT ] type [ COLLATE collation_name ] [ NOT NULL ] [ { DEFAULT | := | = } expression ];

Let's see what the keywords mean:

  • name: The name should follow the naming rules discussed in Chapter 3, PostgreSQL Basic Building Blocks. For example, the name should not start with an integer.
  • CONSTANT: The variable cannot be assigned another value after the initialization. This is useful in defining constant variables such as Pi.
  • type: The type of variable can be simple such as integer, user defined data type, pseudo type, record, and so on. Since a type is created implicitly on creating a table, one can use this type to declare a variable.

    Tip

    In PostgreSQL, the following two declarations are equivalent; however, the second declaration is more portable with Oracle.

    Myrow tablename ;
    Myrow tablename%ROWTYPE;
    
  • NOT NULL: Not null causes a runtime error to be raised if the variable is assigned to null. Not null variables must be initialized.
  • DEFAULT: Causes the initialization of the variable to be delayed until the block is entered. This is useful in defining a timestamp variable to indicate when the function is called but not the function pre-compilation time.
  • Expression is a combination of one or more explicit values, operators, and functions that can be evaluated to another value.

The PostgreSQL PL/pgSQL function body is composed of nested blocks with an optional declaration section and a label. Variables are declared in the declare section of the block, shown as follows:

[<<label>>]
[ DECLARE
declarations]
BEGIN
statements
END [label];

The BEGIN and END keywords are not used in this context to control transactional behavior, but only for grouping. The declaration section is used for declaring variables, and the label is used to give a name for the block as well as to give fully qualified names to the variables. Finally, in each PL/pgSQL function, there is a hidden block labeled with the function name that contains predefined variables such as FOUND. To understand the function block, let us take a look at the following code, which defines the factorial function in a recursive manner:

CREATE OR REPLACE FUNCTION factorial(INTEGER ) RETURNS INTEGER AS $$
BEGIN
  IF $1 IS NULL OR $1 < 0 THEN
    RAISE NOTICE 'Invalid Number';
    RETURN NULL;
  ELSIF $1 = 1 THEN
    RETURN 1;
  ELSE
    RETURN factorial($1 - 1) * $1;
  END IF;
END;
$$ LANGUAGE 'plpgsql'

The block defines the variable scope; in our example, the scope of the argument variable $1 is the whole function. Also, as shown in the example, there is no declaration section.

To understand the scope between different code blocks, let us write the factorial function in a slightly different manner, which is as follows:

CREATE OR REPLACE FUNCTION factorial(INTEGER ) RETURNS INTEGER AS $$
DECLARE
  fact ALIAS FOR $1;
BEGIN
  IF fact IS NULL OR fact < 0 THEN
    RAISE NOTICE 'Invalid Number';
    RETURN NULL;
  ELSIF fact = 1 THEN
    RETURN 1;
  END IF;

  DECLARE
    result INT;
  BEGIN
    result = factorial(fact - 1) * fact;
    RETURN result;
  END;
END;
$$ LANGUAGE 'plpgsql'

The preceding function is composed of two blocks: the variable fact is an alias for the first argument. In the sub-block, the result variable is declared with a type integer. Since the fact variable is defined in the upper block, it can also be used in the sub-block. The result variable can be used only in the sub-block.

Assignment statements

The assignment operators := and = are used to assign an expression to a variable, as follows:

variable { := | = } expression;

For the variable names, one should choose names that do not conflict with the column names. This is important when writing parameterized SQL statements.

The = operator is supported not only for PostgreSQL version 9.4 but also for the previous versions. Unfortunately, the documentations for the previous versions does not mention it. Moreover, since the = operator is used in SQL for equality comparison, it is preferable to use the := operator to reduce confusion.

In certain contexts, it is important to pick up the right assignment operator:

The following example shows a case when one cannot use = and =: interchangeably:

  CREATE OR REPLACE FUNCTION cast_numeric_to_int (numeric_value numeric, round boolean = TRUE /*correct use of "=". Using ":=" will raise a syntax error */)
  RETURNS INT AS
$$
BEGIN
RETURN (CASE
         WHEN round = TRUE  THEN  CAST (numeric_value AS INTEGER)
         WHEN numeric_value>= 0 THEN CAST (numeric_value -.5 AS INTEGER)
         WHEN numeric_value< 0 THEN CAST (numeric_value +.5 AS INTEGER)
         ELSE NULL
       END);
END;
$$ LANGUAGE plpgsql;
car_portal=# SELECT cast_numeric_to_int(2.3, round:= true /*correct use of :=*/);
cast_numeric_to_int
---------------------
                   2
(1 row)

car_portal=# SELECT cast_numeric_to_int(2.3, round= true );
ERROR:  column "round" does not exist
LINE 1: SELECT cast_numeric_to_int(2.3, round= true );

The assignment expression can be a single atomic value such as pi = 3.14, or it can be a row:

DO $$
DECLARE
  test record;
BEGIN
  test =: ROW (1,'hello', 3.14);
  RAISE notice '%', test;
END;
$$

Finally, it can be a result of the evaluation of the SELECT statement, as shown in the following example. Note that the SELECT statement should not return a set:

car_portal=# DO $$
DECLARE
number_of_accounts INT:=0;
BEGIN
number_of_accounts:= (SELECT COUNT(*) FROM car_portal_app.account)::INT;
RAISE NOTICE 'number_of accounts: %', number_of_accounts;
END;$$
LANGUAGE plpgSQL;
NOTICE:  number_of accounts: 1
DO

There are other techniques for assigning values to the variables from a query that returns a single row:

SELECT select_expressions INTO [STRICT] targets FROM ...;
INSERT ... RETURNING expressions INTO [STRICT] targets;
UPDATE ... RETURNING expressions INTO [STRICT] targets;
DELETE ... RETURNING expressions INTO [STRICT] targets;

Often, the expressions are column names, while the targets are variable names. In the case of select into, the target can be of the type record.

The query INSERT … RETURNING is often used to return the default value of a certain column; this can be used to define the ID of a primary key using the SERIAL and BIGSERIAL data types, which is shown as follows:

CREATE TABLE test (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL
);

DO $$
DECLARE
  auto_generated_id INT;
BEGIN
  INSERT INTO test(name) VALUES ('Hello World') RETURNING id INTO auto_generated_id;
  RAISE NOTICE 'The primary key is: %', auto_generated_id;
END
$$;
--- the result of executing the above
--NOTICE:  The primary key is: 1
--DO

Tip

One could get the default value when inserting a row in plain SQL using CTE, as follows:

WITH get_id AS (
  INSERT INTO test(name) VALUES ('Hello World') RETURNING id
) SELECT * FROM get_id;

Finally, one could use qualified names to perform assignment; in trigger functions, one could use NEW and OLD to manipulate the values of these records.

Conditional statements

PostgreSQL supports the IF and CASE statements, which allow a conditional execution based on a certain condition. PostgreSQL supports the IF statement construct, as follows:

IF boolean_expression THEN statement [statement]...
 [ELSIF boolean_expression THEN statement [statement]...]
   [ELSIF boolean_expression THEN statement [statement]...]...
 [ELSE statement [statement]...] END IF;

The case statement comes in two forms, as follows:

CASE search-expression
    WHEN expression [, expression [ ... ]] THEN
statements
[ WHEN expression [, expression [ ... ]] THEN
statements
    ... ]
[ ELSE
statements ]
END CASE;

CASE
    WHEN boolean-expression THEN
statements
[ WHENboolean-expression THEN
statements
    ... ]
[ ELSE
statements ]
END CASE;

To understand the IF statement, let us assume that we would like to convert the advertisement rank to text, as follows:

CREATE OR REPLACE FUNCTION cast_rank_to_text (rank int) RETURNS TEXT AS
$$
DECLARE
  rank ALIAS FOR $1;
  rank_result TEXT;
BEGIN
  IF rank = 5 THEN
    rank_result = 'Excellent';
  ELSIF rank = 4 THEN
    rank_result = 'Very Good';
  ELSIF rank = 3 THEN
    rank_result = 'Good';
  ELSIF rank = 2 THEN
    rank_result ='Fair';
  ELSIF rank = 1 THEN
    rank_result ='Poor';
  ELSE
    rank_result ='No such rank';
  END IF;
  RETURN rank_result;
END;
$$
Language plpgsql;

--- to test the function
SELECT n,cast_rank_to_text(n) FROM generate_series(1,5) as foo(n);

When any branch of the IF statement is executed due to the IF condition being met, then the execution control returns to the first statement after END IF, assuming the RETURN statement is not executed inside this branch. If none of the conditions are met for IF or ELSIF, then the ELSE branch will be executed. Also note that one could nest all the control structures; so, one can have an IF statement inside another one.

The following code snippet implements the preceding function using the CASE statement:

CREATE OR REPLACE FUNCTION cast_rank_to_text (rank int) RETURNS TEXT AS
$$
DECLARE
  rank ALIAS FOR $1;
  rank_result TEXT;
BEGIN
  CASE rank
  WHEN 5 THEN rank_result = 'Excellent';
  WHEN 4 THEN rank_result = 'Very Good';
  WHEN 3 THEN rank_result = 'Good';
  WHEN 2 THEN rank_result ='Fair';
  WHEN 1 THEN rank_result ='Poor';
  ELSE rank_result ='No such rank';
  END CASE;
  RETURN rank_result;
END;
$$
Language plpgsql;

In the CASE statement, if any branch of the case matches the selector, the execution of the case is terminated and the execution control goes to the first statement after CASE. Moreover, in the previous form of CASE, one cannot use it to match NULL values, since NULL equality is NULL. To overcome this limitation, one should specify the matching condition explicitly using the second form of CASE statement, as follows:

CREATE OR REPLACE FUNCTION cast_rank_to_text (rank int) RETURNS TEXT AS
$$
DECLARE
  rank ALIAS FOR $1;
  rank_result TEXT;
BEGIN
  CASE
  WHEN rank=5 THEN rank_result = 'Excellent';
  WHEN rank=4 THEN rank_result = 'Very Good';
  WHEN rank=3 THEN rank_result = 'Good';
  WHEN rank=2 THEN rank_result ='Fair';
  WHEN rank=1 THEN rank_result ='Poor';
  WHEN rank IS NULL THEN RAISE EXCEPTION 'Rank should be not NULL';
  ELSE rank_result ='No such rank';
  END CASE;
  RETURN rank_result;
END;
$$
Language plpgsql;
--- to test
SELECT cast_rank_to_text(null);

Finally, the CASE statement raises an exception if no branch is matched and the ELSE branch is not specified, as follows:

DO
$$
DECLARE
i int := 0;
BEGIN
case WHEN i=1 then
           RAISE NOTICE 'i is one';
END CASE;
END;
$$
LANGUAGE plpgsql;
ERROR:  case not found
HINT:  CASE statement is missing ELSE part.
CONTEXT:  PL/pgSQL function inline_code_block line 5 at CASE

Iteration

Iteration is used to repeat a block of statements to achieve a certain goal. With iteration, one often needs to specify the starting point and the ending condition. In PostgreSQL, there are several statements for iterating through the results and for performing looping, including LOOP, CONTINUE, EXIT, FOR, WHILE, and FOR EACH.

The loop statement

The basic LOOP statement has the following structure:

LOOP
  Statements
END LOOP;

To understand the LOOP statement, let us rewrite the factorial function, as follows:

CREATE OR REPLACE FUNCTION factorial (fact int) RETURNS BIGINT AS
$$
DECLARE
  result bigint = 1;
BEGIN
  IF fact = 1 THEN
    RETURN 1;
  ELSIF fact IS NULL or fact < 1 THEN
    RAISE EXCEPTION 'Provide a positive integer';
  ELSE
    LOOP
      result = result*fact;
      fact = fact-1;
      EXIT WHEN fact = 1;
    END Loop;
  END IF;
  RETURN result;
END;
$$
LANGUAGE plpgsql;

In the preceding code, the conditional EXIT statement is used to prevent infinite looping by exiting the LOOP statement. When an EXIT statement is encountered, the execution control goes to the first statement after the LOOP. To control the execution of the statements inside the LOOP statement, PL/pgSQL also provides the CONTINUE statement, which works somewhat like the EXIT statement. Thus, instead of forcing termination, the CONTINUE statement forces the next iteration of the loop to take place, skipping any code in between.

Tip

The usage of the CONTINUE and EXIT statements, especially in the middle of a code block, is not encouraged because it breaks the execution order, which makes the code harder to read and understand.

The while loop statement

The WHILE statement keeps executing a block of statements while a particular condition is met. Its syntax is as follows:

WHILE boolean-expression LOOP
    statements
END LOOP ;

The following example uses the while loop to print the days of the current month:

DO
$$
DECLARE
  first_day_in_month date := date_trunc('month', current_date)::date;
  last_day_in_month date := (date_trunc('month', current_date)+ INTERVAL '1 MONTH - 1 day')::date;
  counter date = first_day_in_month;
BEGIN
  WHILE (counter <= last_day_in_month) LOOP
    RAISE notice '%', counter;
    counter := counter + interval '1 day';
  END LOOP;
  
END;
$$
LANGUAGE plpgsql;

The for loop statement

PL/pgSQL provides two forms of the FOR statement, and they are used to:

  • Iterate through the rows returned by an SQL query
  • Iterate through a range of integer values

The syntax of the for loop statement is:

FOR index_name IN [ REVERSE ] expression1 .. expression2 [ BY expression ] LOOP
    statements
END LOOP;

index_name is the name of a local variable of the type integer. This local variable scope is the FOR loop. Statements inside the loop can read this variable, but cannot change its value. Finally, one can change this behavior by defining the variable in the declaration section of the outer block. expression1 and expression2 must be evaluated to integer values; if expression1 equals expression2 then the FOR loop is run only once.

The REVERSE key word is optional, and it is used to define the order in which the range will be generated (ascending or descending). If REVERSE is omitted, then expression1 should be smaller than expression2, otherwise the loop will not be executed. Finally, BY defines the steps between two successive numbers in the range. Note that the BY value should always be a positive integer. The following example shows a FOR loop iterating over a negative range of numbers in the reverse order:

DO
$$
BEGIN
FOR j IN REVERSE -1 .. -10 BY 2 LOOP
    Raise notice '%', j;
END LOOP;
END;
$$
LANGUAGE plpgsql;
--- output should be
NOTICE:  -1
NOTICE:  -3
NOTICE:  -5
NOTICE:  -7
NOTICE:  -9

To iterate through the result of a set query, the syntax is different, as follows:

[ FOR index_name IN { cursor  | ( select_statement )} LOOP
      statement...
END LOOP ;

index_name is a local variable in the outer block. Its type is not restricted to simple types such as integer and text. However, its type might be a composite or a RECORD data type. In PL/pgSQL, one could iterate over a cursor result or over a SELECT statement result.

Tip

Cursor is a special data object that can be used to encapsulate a SELECT query, and then to read the query result a few rows at a time.

The following example shows all the database names:

DO $$
DECLARE
  database RECORD;
BEGIN
  FOR database IN SELECT * FROM pg_database LOOP
    RAISE notice '%', database.datname;
  END LOOP;
END;
$$;
--- output should be like
NOTICE:  template1
NOTICE:  template0
NOTICE:  postgres
...

Returning from the function

The PostgreSQL return statement is used for terminating the function, and for returning the control to the caller. The return statement has different forms such as RETURN, RETURN NEXT, RETURN QUERY, RETURN expression, RETURN QUERY EXECUTE, and so on. The return statement can return a single value or a set to the caller, as will be shown in this chapter. In this context, let us consider the following anonymous function:

postgres=# DO $$
BEGIN
RETURN;
RAISE NOTICE 'This statement will not be executed';
END
$$
LANGUAGE plpgsql;
DO
postgres=#

As shown in the preceding example, the function is terminated before the execution of the RAISE statement due to the RETURN statement.

Returning void

A void type is used in a function for performing some side effects such as logging; the built-in function pg_sleep is used to delay the execution of a server process in seconds:

df pg_sleep
                            List of functions
   Schema   |   Name   | Result data type | Argument data types |  Type
------------+----------+------------------+---------------------+--------
 pg_catalog | pg_sleep | void             | double precision    | normal

Returning a single row

PL/pgSQL can be used to return a single row from a function; an example of this type is the factorial function.

Tip

Some developers refer to PL/pgSQL and SQL functions returning a single-row, single-column scalar variable as scalar functions.

The return type can be base, composite, domain, pseudo type, or domain data type. The following function returns a JSON representation of a certain account:

CREATE OR REPLACE FUNCTION car_portal_app.get_account_in_json (account_id INT) RETURNs JSON AS
$$
  SELECT row_to_json(account) FROM car_portal_app.account WHERE account_id = $1;
$$
LANGUAGE SQL;
-- For test
WITH inserted_account AS (
  INSERT into car_portal_app.account VALUES (DEFAULT, 'first_name', 'last_name', '[email protected]','some_pass') RETURNING account_id
) SELECT  car_portal_app.get_account_in_json (account_id) FROM inserted_account;
--Result
"{"account_id":15,"first_name":"first_name","last_name":"last_name","email":"[email protected]","password":"some_pass"}"

Returning multiple rows

Set Returning Functions (SRFs) can be used to return a set of rows. The row type can either be base type such as integer, composite, table type, pseudo type, or domain type.

To return a set from a function, the keyword SETOF is used to mark the function as an SRF, as follows:

CREATE OR REPLACE FUNCTION advertisement_in_last_7_days() RETURNS SETOF car_portal_app.advertisement AS $$
  SELECT * FROM car_portal_app.advertisement WHERE advertisement_date >= now() -INTERVAL '7 days';
$$ LANGUAGE SQL;

--To test
SELECT * FROM advertisement_in_last_7_days()

If the return type is not defined, one could:

  • Define a new data type and use it
  • Use return table
  • Use output parameters and record data type

Let us assume that we would like to return only the car_id and the account_seller_id, as in this case, we do not have a data type. Thus, the preceding function can be written as:

CREATE OR REPLACE FUNCTION advertisement_in_last_7_days() RETURNS TABLE (car_id INT, seller_account_id INT) AS $$
BEGIN
  RETURN QUERY SELECT a.car_id, a.seller_account_id FROM car_portal_app.advertisement a WHERE advertisement_date >= now() -INTERVAL '7 days';
END;
$$ LANGUAGE plpgSQL;

--To TEST
SELECT * FROM advertisement_in_last_7_days()

Note that in the preceding function, we use an alias; else, PL/pgSQL will raise an error, as follows:

ERROR:  column reference "car_id" is ambiguous
...
DETAIL:  It could refer to either a PL/pgSQL variable or a table column.
QUERY:  SELECT car_id, seller_account_id FROM

Moreover, in this function, RETURN QUERY is used only to show a different option in returning a query result.

The preceding function can also be written using the OUT variables; actually, the return table is implemented internally as indicated by the error using the OUT variables, as follows:

CREATE OR REPLACE FUNCTION advertisement_in_last_7_days(OUT car_id INT, OUT seller_account_id INT) RETURNS SETOF RECORD AS $$
BEGIN
  RETURN QUERY SELECT a.car_id, a.seller_account_id FROM car_portal_app.advertisement a WHERE advertisement_date >= now() -INTERVAL '7 days';
END;
$$ LANGUAGE plpgsql;

Function predefined variables

The PL/pgSQL functions have several special variables that are created automatically in the top-level block. For example, if the function returns a trigger, then several variables such as NEW, OLD, and TG_OP are created.

In addition to the trigger special values, there is a Boolean variable called FOUND. This is often used in combination with DML and PERFORM statements to conduct sanity checks. The value of the FOUND variable is affected by the SELECT, INSERT, UPDATE, DELETE, and PERFORM statements. These statements set FOUND to true if at least one row is selected, inserted, updated, or deleted.

The PERFORM statement is similar to the SELECT statement, but it discards the result of the query. Finally, the EXECUTE statement does not change the value of the FOUND variable. The following examples show how the FOUND variable is affected by the INSERT and PERFORM statements:

DO $$
BEGIN
  CREATE TABLE t1(f1 int);
  --- Set FOUND to true
  INSERT INTO t1 VALUES (1);
  RAISE NOTICE '%', FOUND;
  --- Set FOUND to false
  PERFORM* FROM t1 WHERE f1 = 0;
  RAISE NOTICE '%', FOUND;
  DROP TABLE t1;
END;
$$LANGUAGE plpgsql;
--- output
NOTICE:  t
NOTICE:  f

In addition to the preceding query, one could get the last OID—object identifier—for an inserted row as well as the affected number of rows by using the INSERT, UPDATE, and DELETE statements via the following commands:

GET DIAGNOSTICS variable = item;

Assuming that there is variable called i of type integer, one can get the affected number of rows, as follows:

GET DIAGNOSTICS i = ROW_COUNT;
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset