Advanced grouping

Databases are often used as a data source for any kind of reporting. In reports, it is quite common to show in the same table subtotals, totals, and grand totals that summarize the data rows implying in fact grouping and aggregation. Consider the following report displaying the number of advertisements by car make and quarter, displaying also the totals for each quarter (aggregating all makes), and the grand total. Here is how to select the data for the report:

car_portal=> SELECT to_char(advertisement_date, 'YYYY-Q') as quarter, make, count(*) 
FROM advertisement a
INNER JOIN car c ON a.car_id = c.car_id
INNER JOIN car_model m ON m.car_model_id = c.car_model_id
GROUP BY quarter, make;
quarter | make | count
---------+---------------+-------
2014-4 | Peugeot | 12
2014-2 | Daewoo | 8
2014-4 | Skoda | 5
...

To make the subtotals, one would need to do additional queries, like this:

SELECT to_char(advertisement_date, 'YYYY-Q') as quarter, count(*) 
FROM advertisement a
INNER JOIN car c ON a.car_id = c.car_id
INNER JOIN car_model m ON m.car_model_id = c.car_model_id
GROUP BY quarter;

SELECT count(*)
FROM advertisement a
INNER JOIN car c ON a.car_id = c.car_id
INNER JOIN car_model m ON m.car_model_id = c.car_model_id;

The joins in these queries are redundant, they do not affect the results, but we leave them here to illustrate that the queries are actually the same as the first one, only the grouping makes the difference.

PostgreSQL provides a way to combine the three queries in a single one using a special construct, GROUP BY GROUPING SETS, as follows:

SELECT to_char(advertisement_date, 'YYYY-Q') as quarter, make, count(*) 
FROM advertisement a
INNER JOIN car c ON a.car_id = c.car_id
INNER JOIN car_model m ON m.car_model_id = c.car_model_id
GROUP BY GROUPING SETS ((quarter, make), (quarter), ())
ORDER BY quarter NULLS LAST, make NULLS LAST;
quarter | make | count
---------+---------------+-------
2014-1 | Alfa Romeo | 2
2014-1 | Audi | 5
...
2014-1 | Volvo | 9
2014-1 | | 121 <- This is the subtotal for the quarter 2014-1
2014-2 | Audi | 18
2014-2 | Citroen | 40
...
2014-2 | Volvo | 12
2014-2 | | 266 <- This is the subtotal for the quarter 2014-2
2014-3 | Audi | 11
...
2015-1 | Volvo | 4
2015-1 | | 57 <- This is the subtotal for the quarter 2015-1
| | 784 <- This is the grand total for the whole report

The construct GROUPING SETS ((quarter, make), (quarter), ()) makes the query work as a UNION ALL of the same queries with different GROUP BY clauses:

  • GROUP BY quarter, make
  • GROUP BY quarter--here, the make field will get the NULL value
  • All the rows grouped into a single group, both fields quarter and make will get NULL values

Generally speaking, the GROUP BY clause takes not just expressions, but grouping elements that could be expressions or constructs like GROUPING SETS.

Other possible grouping elements are ROLLUP and CUBE:

  • ROLLUP (a, b, c) is equivalent to GROUPING SETS ((a, b, c), (a, b), (c), ())
  • CUBE (a, b, c) is equivalent to GROUPING SETS ((a, b, c), (a, b), (a, c), (b, c), (a), (b), (c), ())--all possible combinations of the argument expressions

It is also possible to use CUBE and ROLLUP inside GROUPING SETS, as well as combine different grouping elements in one GROUP BY clause like this: GROUP BY a, CUBE (b, c). This would be equivalent to GROUP BY GROUPING SETS ((a, b, c), (a, b), (a, c), (a))--basically a combination of the grouping elements.

Although GROUPING SETS implements the logic of UNION ALL of the results of queries with different GROUP BY clauses, in the background, it works differently from the set operation. PostgreSQL will scan the tables and perform joining and filtering only once. This means that the usage of these grouping techniques can help optimize the performance of your solutions.

The documentation about GROUPING SETS is available at https://www.postgresql.org/docs/current/static/queries-table-expressions.html#QUERIES-GROUPING-SETS.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset