“There is flattery in friendship.”
– William Shakespeare
The syntax for the Quantile function
QUANTILE (<partitions>, <column-name> ,<sort-key> [DESC | ASC])
[QUALIFY QUANTILE (<column-name>) {< | > | = | <= | >=} <number-of-rows>]
A Quantile is used to divide rows into a number of categories or grouping of roughly the same number of rows in each group. The percentile is the QUANTILE most commonly used in business. This means that the request is based on a value of 100 for the number of partitions. It is also possible to have quartiles (based on 4), tertiles (based on 3) and deciles (based on 10).
By default, both the QUANTILE column and the QUANTILE value itself will be output in ascending sequence. As in some cases, the ORDER BY clause may be used to reorder the output for display. Here the order of the output does not change the meaning of the output, unlike a summation where the values are being added together and all need to appear in the proper sequence.
This example determines the percentile for every row in the Sales
table based on the daily sales amount and sorts it into sequence
by the value being categorized, which here is daily sales.
SELECT Product_ID, Sale_Date, Daily_Sales
,QUANTILE (100, Daily_Sales ) AS "Quantile"
FROM Sales_Table
WHERE Product_ID < 3000
AND Sale_Date > 1000930 ;
Notice that the amount of 32800.50 in the first two rows has the same percentile value. They are the same value and will therefore be put into the same partition.
SELECT Product_ID ,Sale_Date ,Daily_Sales
,QUANTILE (100, Daily_Sales , Sale_Date DESC ) AS "Quantile"
FROM Sales_Table
WHERE Product_ID < 3000
AND Sale_Date >= 1000930 ;
Notice the first two rows. This is because the Sale date DESC, impacts the first two rows. Why? Since these rows have the same value, it uses the Sale_Date column as a tiebreaker for the sequencing and makes them different from each other. Hence, they are assigned to different values in different partitions.
This example uses a QUALIFY to show only
products that sell in the top 60 Percentile
SELECT Product_ID, Sale_Date, Daily_Sales
,QUANTILE (100, Daily_Sales, Sale_Date ) as "Percentile"
FROM Sales_Table
QUALIFY "Percentile" >= 60 ;
Like the aggregate functions, OLAP functions must read all required rows before performing their operation. Therefore, the WHERE clause cannot be used. Where the aggregates use HAVING, the OLAP functions uses QUALIFY. The QUALIFY evaluates the result to determine which ones to return.
SELECT Product_ID, Sale_Date, Daily_Sales
,QUANTILE (100, Daily_Sales, Sale_Date ) as "Percentile"
FROM Sales_Table
QUALIFY "Percentile" >= 70
ORDER BY "percentile" DESC ;
The ORDER BY changes the sequence of the rows being listed, not the meaning of the percentile. The above functions both determined that the highest number in the column is the highest percentile. The data value sequence ascends as the percentile ascends or descends as the percentile descends. When the sort in the QUANTILE function is changed to ASC the data value sequence changes to ascend as the percentile descends. The sequence of the percentile does not change, but the data value sequence is changed to ascend (ASC) instead of the default, which is to descend (DESC).
SELECT Product_ID ,Sale_Date, Daily_Sales
,QUANTILE (100, Daily_Sales ASC, Sale_Date)
as "Percentile"
FROM Sales_Table
QUALIFY "Percentile" >= 70 ;
The example SELECT above uses the ASC to cause the data values to go contradictory to the percentile.
SELECT Product_ID, Sale_Date, Daily_Sales
,QUANTILE (100, Daily_Sales ASC, Sale_Date ASC)
as "Percentile"
FROM Sales_Table
QUALIFY "Percentile" >= 70 ;
The next SELECT modifies the above query to incorporate the sale date as a tiebreaker and reverse the ordering for the two rows with sales of $32,800.50.
SELECT Product_ID, Sale_Date, Daily_Sales
,QUANTILE (4, Daily_Sales , Sale_Date ) AS "Quartiles"
FROM Sales_Table WHERE Product_ID in (1000, 2000) ;
Instead of 100 the example above uses a quartile (QUANTILE based on 4 partitions).
SELECT Product_ID, Sale_Date, Daily_Sales
,QUANTILE (4, Daily_Sales , Sale_Date ) AS "Quartiles"
FROM Sales_Table WHERE Product_ID = 1000;
Assigning a different value to the <partitions> indicator of the QUANTILE function changes the number of partitions established. Each Quantile partition is assigned a number starting at 0 increasing to a value that is one less than the partition number specified. So, with a Quantile of 4 the partitions are 0 through 3 and for 10, the partitions are assigned 0 through 9. Then, all the rows are distributed as evenly as possible into each partition from highest to lowest values. Normally, extra rows with the lowest value begin back in the lowest numbered partitions.