Chapter 3. DB2 UDB’s statistics, analytic, and OLAP functions 129
The SQL shown in Example 3-14 determines the median value for a table that
may include either an odd number of rows, or an even number of rows. In the
case of an odd number of rows, the virtual table ‘dt3’ has only a single row in it,
while in the case of an even number of rows, ‘dt3’ has 2 rows in it. When the
following query is issued against the table shown in Example 3-12, the median
value is 25.70, and when executed against the table shown in Example 3-13, the
median value is computed as 28.85.
Example 3-14 Compute median value with an even number of data points in the set
WITH
dt1 AS (SELECT purchases, ROWNUMBER() OVER (ORDER BY purchases) AS num
FROM cust_data),
dt2 AS (SELECT COUNT(purchases) + 1 AS count FROM dt1),
dt3 AS (SELECT purchases FROM dt1,dt2
WHERE num = FLOOR(count/2e0) OR num = CEILING(count/2e0))
SELECT DECIMAL(AVG(purchases),10,2) AS median FROM dt3
RANK example
Assume we would like to rank employees by total compensation and listed
alphabetically.
The SQL in Example 3-15 shows that the ORDER BY clause in the RANK ()
OVER statement controls only the ranking sequence and not output sequence.
Important: Note that there are no nulls in the set of data points. If nulls are
present, then they should be
excluded from the evaluation by adding to the
above query a predicate such as ...WHERE purchases IS NOT NULL.
Note: It is worth observing that the value for average purchases is distorted by
the customer making a purchase of 1000.70. This may or may not be a data
entry error.
Note: When specifying an OLAP function like Rank, Dense_Rank,
Row_Number, a window is specified that defines the rows over which the
function is applied, and in what order. This window is specified via the OVER()
clause.