PostgreSQL function parameters

In Chapter 4, PostgreSQL Advanced Building Blocks, we discussed the function categories immutable, stable, and volatile. In this section, we will continue with other function options. These options are not PL/pgSQL language-specific.

Function authorization-related parameters

The first parameters are related to security, and can have one of the following values:

  • SECURITY DEFINER
  • SECURITY INVOKER

The default value for this option is SECURITY INVOKER, which indicates that the function will be executed with the privileges of the user who calls it. The SECURITY DEFINER functions will be executed using the privileges of the user who created it. For the SECURITY INVOKER functions, the user must have the permissions to execute the CRUD operations in the function; otherwise, the function will raise an error. The SECURITY INVOKER functions are very useful in defining triggers, or for promoting the user to perform tasks only supported by the function.

To test these security parameters, let us create two dummy functions, and execute them in different sessions, as follows:

psql -U postgres -h localhost -d car_portal
car_portal=# CREATE FUNCTION test_security_definer () RETURNS  TEXT AS $$ SELECT 'current_user :'||current_user || ' session_user: ' || session_user; $$ LANGUAGE SQL SECURITY DEFINER;
CREATE FUNCTION
car_portal=# CREATE FUNCTION test_security_invoker () RETURNS  TEXT AS $$ SELECT 'current_user :'||current_user || ' session_user: ' || session_user; $$ LANGUAGE SQL SECURITY INVOKER;
CREATE FUNCTION
car_portal=# SELECT test_security_definer();
test_security_definer
----------------------------------------------
current_user :postgres session_user: postgres
(1 row)

car_portal=# SELECT test_security_invoker();
test_security_invoker
----------------------------------------------
current_user :postgres session_user: postgres
(1 row)
car_portal=# q
$ psql -U car_portal_app -h localhost -d car_portal
car_portal=> SELECT test_security_invoker();
test_security_invoker
----------------------------------------------------------
current_user :car_portal_app session_user: car_portal_app
(1 row)

car_portal=> SELECT test_security_definer();
test_security_definer
----------------------------------------------------
current_user :postgres session_user: car_portal_app
(1 row)

The two functions test_security_definer and test_security_invoker are identical except for the security parameter. When the two functions are executed by a postgres user, the result of the two functions is identical to current_user :postgres session_user: postgres. This is simply because the one who created the function and the one who called it is the same user.

When the user car_portal_app executes the two preceding functions, the result of the test_security_definer function is current_user :postgres session_user: car_portal_app. In this case, the session_user is car_portal_app, since it has started the session using a psql client. However, the current_user who executes the SELECT statement SELECT 'current_user :'||current_user || ' session_user: ' || session_user; is postgres.

Function planner-related parameters

The following three parameters are used by the planner to determine the cost of executing the function, the number of rows that are expected to be returned, and whether the function pushes down when evaluating predicates. These parameters are:

  • Leakproof: Leakproof means that the function has no side effects. It does not reveal any information about its argument. For example, it does not throw error messages about its argument. This parameter affects the views with the security_barrier parameter.
  • Cost: It declares the execution cost per row; the default value for the C language function is 1, and for PL/pgSQL it is 100. The cost is used by the planner to determine the best execution plan.
  • Rows: The estimated number of returned rows by the function if the function is a returning set. The default value is 1000.

To understand the effect of the rows, let us consider the following example:

CREATE OR REPLACE FUNCTION a() RETURNS SET OF INTEGER AS $$
  SELECT 1;
$$
LANGUAGE SQL;

EXPLAIN SELECT * FROM a() CROSS JOIN (Values(1),(2),(3)) as foo;
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.25..47.80 rows=3000 width=8)
   ->  Function Scan on a  (cost=0.25..10.25 rows=1000 width=4)
   ->  Materialize  (cost=0.00..0.05 rows=3 width=4)
         ->  Values Scan on "*VALUES*"  (cost=0.00..0.04 rows=3 width=4)
(4 rows)

The SQL function return type is SET OF INTEGER, which means that the planner expected more than one row to be returned from the function. Since the ROWS parameter is not specified, the planner uses the default value which is 1000. Finally, due to CROSS JOIN, the total estimated number of rows is 3000, which is calculated as 3 * 1000.

In the preceding example, a wrong estimation is not critical. However, in a real life example, where one might have several joins, the error of rows estimation will be propagated and amplified leading to bad execution plans.

The COST function parameter determines when the function will be executed such as:

  • It determines the function execution order
  • It determines if the function call can be pushed down

The following example shows how the execution order for functions is affected by the function cost. Let us assume we have two functions, as follows:

CREATE OR REPLACE FUNCTION slow_function (anyelement) RETURNS BOOLEAN AS
$$
BEGIN
  RAISE NOTICE 'Slow function %', $1;
  RETURN TRUE;
END;
$$
LANGUAGE PLPGSQL COST 10000;


CREATE OR REPLACE FUNCTION fast_function (anyelement) RETURNS BOOLEAN AS
$$
BEGIN
  RAISE NOTICE 'Fast function %', $1;
  RETURN TRUE;
END;
$$
LANGUAGE PLPGSQL COST 0.0001;

The fast_function and the slow_function are identical except for the cost parameter:

EXPLAIN SELECT * FROM pg_language WHERE lanname ILIKE '%sql%' AND slow_function(lanname)AND fast_function(lanname);
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
Seq Scan on pg_language  (cost=0.00..101.05 rows=1 width=114)
   Filter: (fast_function(lanname) AND (lanname ~~* '%sql%'::text) AND slow_function(lanname))
(2 rows)

EXPLAIN SELECT * FROM pg_language WHERE fast_function(lanname) AND slow_function(lanname) AND lanname ILIKE '%sql%';
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
Seq Scan on pg_language  (cost=0.00..101.05 rows=1 width=114)
   Filter: (fast_function(lanname) AND (lanname ~~* '%sql%'::text) AND slow_function(lanname))
(2 rows)

The preceding two SQL statements are identical, but the predicates are shuffled. Both statements give the same execution plan. Notice how the predicates are arranged in the filter execution plane node. The fast_function is evaluated first followed by the ILIKE operator, and finally, the slow_function is pushed.

When executing one of the preceding statements, one will get the following result:

SELECT * FROM pg_language WHERE lanname ILIKE '%sql%' AND slow_function(lanname)AND fast_function(lanname);
NOTICE:  Fast function internal
NOTICE:  Fast function c
NOTICE:  Fast function sql
NOTICE:  Slow function sql
NOTICE:  Fast function plpgsql
NOTICE:  Slow function plpgsql
lanname | lanowner | lanispl | lanpltrusted | lanplcallfoid | laninline | lanvalidator | lanacl
---------+----------+---------+--------------+---------------+-----------+--------------+--------
sql     |       10 | f       | t            |             0 |         0 |         2248 |
plpgsql |       10 | t       | t            |         11751 |     11752 |        11753 |
(2 rows)

Notice that fast_function was executed four times, and slow_function was executed only twice. This behavior is known as short-circuit evaluation. slow_function is executed only when fast_function and the ILIKE operator have returned true.

Tip

In PostgreSQL, the ILIKE operator is equivalent to the ~~* operator, and LIKE is equivalent to the ~~ operator.

As discussed in Chapter 4, PostgreSQL Advanced Building Blocks, views can be used to implement authorization, and they can be used to hide data from some users. The function cost parameter can be exploited in the earlier versions of postgres to crack views; however, this has been improved by the introduction of the LEAKPROOF and SECURITY_BARRIER flags.

To be able to exploit the function cost parameter to get data from a view, several conditions should be met, some of which are as follows:

  • The function cost should be very low
  • The function should be marked as LEAKPROOF. Note that only super users are allowed to mark functions as LEAKPROOF.
  • The view security barrier flag should not be set.
  • The function should be executed and not ignored due to short-circuit evaluation.

Meeting all these conditions is very difficult; the following code shows a hypothetical example of exploiting views:

CREATE OR REPLACE VIEW pg_sql_pl AS
SELECT lanname FROM pg_language WHERE lanname ILIKE '%sql%';
-- Only super user can do that
ALTER FUNCTION fast_function(anyelement)  LEAKPROOF;
SELECT * FROM pg_sql_pl WHERE fast_function(lanname);
NOTICE:  Fast function internal
NOTICE:  Fast function c
NOTICE:  Fast function sql
NOTICE:  Fast function plpgsql
lanname
---------
sql
plpgsql
(2 rows)

In the preceding example, the view itself should not show c and internal. By exploiting the function cost, the function was executed before executing the filter lanname ILIKE '%sql%'; exposing information that will never be shown by the view.

Tip

Since only superusers are allowed to mark a function as leak proof, exploiting the function cost in the newer versions of postgres is not possible.

Function configuration-related parameters

Function configuration-related parameters are the setting parameters. These parameters can be used to determine the resources such as the amount of memory required to perform an operation—work_mem—, or they can be used to determine the execution behavior, such as disabling a sequential scan or nested loop joins. Certainly, only parameters which have the context of the user can be used.

The SET clause causes the specified setting parameter to be set with a specified value when the function is entered; the same setting parameter value is reset back to its default value when the function exits. The parameter configuration setting can be set explicitly for the whole function, can be overwritten locally inside the function, and can inherit the value from the session setting using the FROM CURRENT clause.

These configuration parameters are often used to tweak the function performance in the case of limited resources, legacy code, bad design, wrong statistics estimation, and so on. For example, let us assume that a function behaves badly due to database normalization. In this case, refactoring the database might be expensive to perform. To solve this problem, one could alter the execution plan by enabling or disabling some settings, as show in the following example:

car_portal=# SET enable_seqscan TO OFF;
SET
car_portal=# CREATE OR REPLACE FUNCTION configuration_test () RETURNS VOID AS
$$
BEGIN
RAISE NOTICE 'Current session enable_seqscan value: %', (SELECT setting FROM pg_settings WHERE name ='enable_seqscan')::text;
RAISE NOTICE 'Function work_mem: %', (SELECT setting FROM pg_settings WHERE name ='work_mem')::text;

---
---SQL statement here will use index scan when possible
---
SET LOCAL enable_seqscan TO TRUE;
RAISE NOTICE 'Override session enable_seqscan value: %', (SELECT setting FROM pg_settings WHERE name ='enable_seqscan')::text;
---
---SQL statement here will use index scan when possible
---
END;
$$
LANGUAGE PLPGSQL
SET enable_seqscan FROM current
SET work_mem = '10MB';
CREATE FUNCTION
car_portal=# SELECT configuration_test () ;
NOTICE:  Current session enable_seqscan value: off
NOTICE:  Function work_mem: 10240
NOTICE:  Override session enable_seqscan value: on
-[ RECORD 1 ]------+-
configuration_test |
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset