Chapter 4 - Aggregation

“Teradata climbed Aggregate Mountain and delivered a better way to Sum It.”

- Tera-Tom Coffing

Quiz – You calculate the Answer Set in your own Mind

Aggregation_Table

Employee_No   Salary
423400        100000.00
423401        100000.00
423402        NULL

SELECT AVG(Salary)   AS "AVG"
      ,COUNT(Salary) AS SalCnt
      ,COUNT(*)      AS RowCnt
FROM   Aggregation_Table ;

image

What would the result set be from the above query? The next slide shows answers!

Answer – You calculate the Answer Set in your own Mind

image

SELECT AVG(Salary)   AS "AVG"
      ,COUNT(Salary) AS SalCnt
      ,COUNT(*)      AS RowCnt
FROM   Aggregation_Table ;

image

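Here are your answers! AVG is 100000.00 because the NULL salary is ignored, SalCnt is 2 because only two salaries are not NULL, and RowCnt is 3 because Count(*) counts every row.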
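The quiz can be checked with a minimal sketch like the one below. It uses Python's built-in sqlite3 module as a stand-in for Teradata (an assumption made only so the example runs anywhere), loaded with the three rows of Aggregation_Table shown earlier.

```python
import sqlite3

# SQLite stands in for Teradata here; the rows mirror Aggregation_Table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Aggregation_Table (Employee_No INTEGER, Salary REAL)")
conn.executemany(
    "INSERT INTO Aggregation_Table VALUES (?, ?)",
    [(423400, 100000.00), (423401, 100000.00), (423402, None)],
)

# AVG and COUNT(Salary) skip the NULL salary; COUNT(*) counts every row.
row = conn.execute(
    'SELECT AVG(Salary) AS "AVG", COUNT(Salary) AS SalCnt, COUNT(*) AS RowCnt '
    "FROM Aggregation_Table"
).fetchone()
print(row)  # (100000.0, 2, 3)
```

The average is 200000.00 divided by 2, not by 3, which is the first rule of aggregation at work.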

The 3 Rules of Aggregation

image

1) Aggregates Ignore Null Values.

2) Aggregates WANT to come back in one row.

3) You CAN’T mix Aggregates with normal columns unless you use a GROUP BY.

image

There are Five Aggregates

There are FIVE AGGREGATES which are the following:

MIN – The Minimum Value.

MAX – The Maximum Value.

AVG – The Average of the Column Values.

SUM – The Sum Total of the Column Values.

COUNT – The Count of the Column Values.

SELECT MIN(Salary)
      ,MAX(Salary)
      ,SUM(Salary)
      ,AVG(Salary)
      ,COUNT(*)
FROM   Employee_Table ;

“Don’t count the days, make the days count.”

– Muhammad Ali

The five aggregates are listed above. Muhammad Ali was way off in his quote. He meant to say, "Don’t you count the days, make the data count for you."

Quiz – How many rows come back?

image

How many rows will the above query produce in the result set?

Answer – How many rows come back?

image

How many rows will the above query produce in the result set? The answer is one.
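As a rough illustration of the one-row answer, the sketch below runs the five aggregates through Python's built-in sqlite3 module (a stand-in for Teradata). The Employee_Table columns and rows here are hypothetical, invented only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout and data; the real Employee_Table is not shown in the text.
conn.execute(
    "CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, Salary REAL)"
)
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?, ?)",
    [(1, 100, 40000.0), (2, 200, 50000.0), (3, 200, 60000.0), (4, 400, None)],
)

# No GROUP BY, so all five aggregates come back together in a single row.
rows = conn.execute(
    "SELECT MIN(Salary), MAX(Salary), SUM(Salary), AVG(Salary), COUNT(*) "
    "FROM Employee_Table"
).fetchall()
print(len(rows))  # 1
print(rows[0])    # (40000.0, 60000.0, 150000.0, 50000.0, 4)
```

Four rows go in and one row comes out. The NULL salary is left out of MIN, MAX, SUM and AVG, but the row is still counted by COUNT(*).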

Troubleshooting Aggregates

image

If you have a normal (non-aggregate) column in your query, you must have a corresponding GROUP BY clause. Without it, the query fails with an error.

GROUP BY when Aggregates and Normal Columns Mix

image

If you have a normal (non-aggregate) column in your query, you must have a corresponding GROUP BY clause that names that column.

GROUP BY Delivers one row per Group

image

The GROUP BY Dept_No clause allows the aggregates to be calculated per Dept_No. The data has also been sorted with the ORDER BY clause.
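A minimal sketch of one row per group, again using Python's built-in sqlite3 module in place of Teradata. The Employee_Table rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: two employees in Dept 200, one in 300, two in 400.
conn.execute(
    "CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, Salary REAL)"
)
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?, ?)",
    [
        (1, 200, 40000.0),
        (2, 200, 50000.0),
        (3, 400, 60000.0),
        (4, 400, 80000.0),
        (5, 300, 30000.0),
    ],
)

# Dept_No is a normal column, so it must appear in the GROUP BY.
rows = conn.execute(
    "SELECT Dept_No, SUM(Salary), COUNT(*) "
    "FROM Employee_Table "
    "GROUP BY Dept_No "
    "ORDER BY Dept_No"
).fetchall()
for r in rows:
    print(r)
# (200, 90000.0, 2)
# (300, 30000.0, 1)
# (400, 140000.0, 2)
```

Five rows collapse into three, one per distinct Dept_No.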

GROUP BY Dept_No or GROUP BY 1 is the same thing

image

Both queries above produce the same result. The GROUP BY allows you to either name the column or use its position number in the SELECT list, just like the ORDER BY.
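The equivalence can be sketched as follows. SQLite (via Python's sqlite3 module) is used as a stand-in for Teradata and also accepts a position number in GROUP BY. The table rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, Salary REAL)"
)
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?, ?)",
    [(1, 200, 40000.0), (2, 200, 50000.0), (3, 400, 60000.0)],
)

# Group by the column name.
by_name = conn.execute(
    "SELECT Dept_No, AVG(Salary) FROM Employee_Table "
    "GROUP BY Dept_No ORDER BY Dept_No"
).fetchall()

# Group by position: 1 means the first column in the SELECT list.
by_position = conn.execute(
    "SELECT Dept_No, AVG(Salary) FROM Employee_Table "
    "GROUP BY 1 ORDER BY 1"
).fetchall()

print(by_name == by_position)  # True
print(by_name)                 # [(200, 45000.0), (400, 60000.0)]
```

The position number is shorter to type, but the column name keeps working if someone later reorders the SELECT list.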

Limiting Rows and Improving Performance with WHERE

image

Will Dept_No 300 be calculated? Of course you know it will . . . NOT!

WHERE Clause in Aggregation limits unneeded Calculations

image

The system avoids reading any Dept_No values other than 200 and 400. This means that only rows with a Dept_No of 200 or 400 will come off the disk to be calculated.
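A small sketch of the WHERE clause limiting what gets aggregated, using Python's built-in sqlite3 module as a stand-in for Teradata. The rows are hypothetical, and the example shows only the logical result, not Teradata's disk access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, Salary REAL)"
)
# Hypothetical rows; Dept 300 exists but should never be totaled.
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?, ?)",
    [
        (1, 200, 40000.0),
        (2, 200, 50000.0),
        (3, 300, 30000.0),
        (4, 400, 60000.0),
        (5, 400, 80000.0),
    ],
)

# WHERE removes rows before any aggregate is calculated.
rows = conn.execute(
    "SELECT Dept_No, SUM(Salary) "
    "FROM Employee_Table "
    "WHERE Dept_No IN (200, 400) "
    "GROUP BY Dept_No "
    "ORDER BY Dept_No"
).fetchall()
print(rows)  # [(200, 90000.0), (400, 140000.0)]
```

Dept_No 300 never reaches the SUM, so no total is produced for it.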

Keyword HAVING tests Aggregates after they are Totaled

image

The HAVING clause only works on aggregate totals. The WHERE clause filters out rows before the calculation, while the HAVING clause filters the aggregate totals after the calculation, eliminating the totals that fail its test.

Aggregates Return Null on Empty Tables

image

When an aggregate such as MIN, MAX, SUM or AVG is run against an empty table, it returns a null value. COUNT is the exception and returns zero.
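The empty-table behavior can be sketched as below, with Python's built-in sqlite3 module standing in for Teradata. The table layout is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The table is created but no rows are inserted.
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Salary REAL)")

row = conn.execute(
    "SELECT MIN(Salary), MAX(Salary), SUM(Salary), AVG(Salary), COUNT(*) "
    "FROM Employee_Table"
).fetchone()

# Python shows SQL NULL as None; COUNT(*) still reports zero rows.
print(row)  # (None, None, None, None, 0)
```

Note that one row still comes back even though the table has none, because aggregates want to come back in one row.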

Keyword HAVING is like an Extra WHERE Clause for Totals

image

The HAVING clause only works on aggregate totals, and in the above example, only groups with Count(*) > 2 are returned.
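A minimal sketch of HAVING as a filter on totals, using Python's built-in sqlite3 module as a stand-in for Teradata. The rows are hypothetical and are not the data behind the pictured query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, Salary REAL)"
)
# Hypothetical rows: Dept 200 has three employees, 300 has one, 400 has two.
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?, ?)",
    [
        (1, 200, 40000.0),
        (2, 200, 50000.0),
        (3, 200, 45000.0),
        (4, 300, 30000.0),
        (5, 400, 60000.0),
        (6, 400, 80000.0),
    ],
)

# Every department is counted first; HAVING then keeps only counts above 2.
rows = conn.execute(
    "SELECT Dept_No, COUNT(*) "
    "FROM Employee_Table "
    "GROUP BY Dept_No "
    "HAVING COUNT(*) > 2 "
    "ORDER BY Dept_No"
).fetchall()
print(rows)  # [(200, 3)]
```

Departments 300 and 400 are calculated and then thrown away, which is why a WHERE clause is the cheaper filter whenever the test does not involve an aggregate.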

Getting the Average Values Per Column

SELECT 'Product_ID' AS "Column Name"
      ,COUNT(*) / COUNT(DISTINCT(Product_ID)) AS "Avg Rows"
FROM   Sales_Table ;

Column Name   Avg Rows
Product_ID    7

SELECT 'Sale_Date' AS "Column Name"
      ,COUNT(*) / COUNT(DISTINCT(Sale_Date)) AS "Avg Rows"
FROM   Sales_Table ;

Column Name   Avg Rows
Sale_Date     3

The first query retrieved the average rows per value for the column Product_ID. The second query did the same, but for the column Sale_Date.
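The arithmetic behind "Avg Rows" is total rows divided by distinct values. The sketch below shows it with Python's built-in sqlite3 module as a stand-in for Teradata. The Sales_Table rows are hypothetical (2 products on 3 dates), so the numbers differ from the 7 and 3 above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)"
)
# Hypothetical data: every product has one sale on every date, 6 rows in all.
products = [1000, 2000]
dates = ["2000-09-28", "2000-09-29", "2000-09-30"]
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?, ?)",
    [(p, d, 100.0) for p in products for d in dates],
)

per_product = conn.execute(
    "SELECT 'Product_ID', COUNT(*) / COUNT(DISTINCT Product_ID) FROM Sales_Table"
).fetchone()
per_date = conn.execute(
    "SELECT 'Sale_Date', COUNT(*) / COUNT(DISTINCT Sale_Date) FROM Sales_Table"
).fetchone()

print(per_product)  # ('Product_ID', 3)  -> 6 rows / 2 distinct products
print(per_date)     # ('Sale_Date', 2)   -> 6 rows / 3 distinct dates
```

Both counts are integers, so the division drops any fraction. A table where the rows do not divide evenly would show a rounded-down average.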

Average Values Per Column For all Columns in a Table

SELECT 'Product_ID' AS "Column Name"
      ,COUNT(*) / COUNT(DISTINCT(Product_ID)) AS "Avg Rows"
      ,'Sale_Date' AS "Column Name2"
      ,COUNT(*) / COUNT(DISTINCT(Sale_Date)) AS "Avg Rows2"
FROM   Sales_Table ;

image

The query above retrieved the average rows per value for both columns in the table.

Three types of Advanced Grouping

There are three advanced grouping options:

GROUP BY Grouping Sets

GROUP BY Rollup

GROUP BY Cube

SELECT Product_ID
      ,EXTRACT (MONTH FROM Sale_Date) AS MTH
      ,EXTRACT (YEAR FROM Sale_Date) AS YR
      ,SUM(Daily_Sales) AS SUM_Daily_Sales
FROM   Sales_Table
GROUP BY GROUPING SETS (Product_ID, MTH, YR)
ORDER BY Product_ID DESC, MTH DESC, YR DESC ;

Be prepared to be amazed. There are three advanced options listed above for grouping data. Each is more powerful than the one before. The next pages will give great examples.

GROUP BY Grouping Sets

image

The GROUP BY GROUPING SETS query above will show you what your Daily_Sales were for each Product_ID, for each month, and for each year.
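One way to picture GROUPING SETS (Product_ID, MTH, YR) is as three separate GROUP BY queries stacked with UNION ALL. The sketch below emulates it that way in Python's built-in sqlite3 module, because SQLite has no GROUPING SETS. The Sales_Table rows are hypothetical, and strftime replaces Teradata's EXTRACT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)"
)
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?, ?)",
    [
        (1000, "2000-09-28", 100.0),
        (1000, "2000-10-01", 200.0),
        (2000, "2000-09-28", 50.0),
        (2000, "2001-10-01", 25.0),
    ],
)

# One SELECT per grouping set; the columns not being grouped are NULL.
emulated_sql = """
SELECT Product_ID, NULL AS MTH, NULL AS YR, SUM(Daily_Sales)
FROM Sales_Table
GROUP BY Product_ID
UNION ALL
SELECT NULL, CAST(strftime('%m', Sale_Date) AS INTEGER), NULL, SUM(Daily_Sales)
FROM Sales_Table
GROUP BY CAST(strftime('%m', Sale_Date) AS INTEGER)
UNION ALL
SELECT NULL, NULL, CAST(strftime('%Y', Sale_Date) AS INTEGER), SUM(Daily_Sales)
FROM Sales_Table
GROUP BY CAST(strftime('%Y', Sale_Date) AS INTEGER)
"""
rows = conn.execute(emulated_sql).fetchall()
for r in rows:
    print(r)  # (Product_ID, MTH, YR, SUM_Daily_Sales), None where not grouped
```

With this data the result has six rows: two product totals (300.0 and 75.0), two month totals (150.0 for month 9 and 225.0 for month 10), and two year totals (350.0 for 2000 and 25.0 for 2001).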

GROUP BY Rollup

image

GROUP BY ROLLUP displays what the Daily_Sales were for each Product_ID, for each distinct month, for each month per year and for each year, plus a grand total.

GROUP BY Rollup Result Set

image

This is the full result set from the previous GROUP BY ROLLUP query.

GROUP BY Cube

image

GROUP BY CUBE displays what the Daily_Sales were for each Product_ID, for each distinct month, for each month per year and for each year, plus a grand total. CUBE produces a subtotal for every combination of the grouped columns.
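In standard SQL, the difference between ROLLUP and CUBE comes down to which grouping sets each one expands to. The Python sketch below lists them. The helper names rollup_sets and cube_sets are hypothetical, and the column order (Product_ID, MTH, YR) is an assumption, since the pictured queries are not reproduced in the text.

```python
from itertools import combinations


def rollup_sets(cols):
    """Grouping sets of ROLLUP(cols): each leading prefix, down to the grand total ()."""
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]


def cube_sets(cols):
    """Grouping sets of CUBE(cols): every subset of the columns, including ()."""
    return [combo for n in range(len(cols), -1, -1) for combo in combinations(cols, n)]


cols = ["Product_ID", "MTH", "YR"]  # assumed column order

rollup = rollup_sets(cols)
cube = cube_sets(cols)

print(len(rollup))  # 4
print(len(cube))    # 8
for grouping_set in cube:
    print(grouping_set)  # the empty tuple () is the grand total
```

ROLLUP on three columns yields 4 levels of totals, while CUBE yields all 8 combinations, which is why the CUBE result set is the larger of the two.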

GROUP BY CUBE Result Set

image

GROUP BY CUBE Result Set

image

In Nexus, follow these steps:

1) Right-click on the Sales_Table and choose Super Join Builder.

2) Select all the columns.

3) Choose the Analytics tab on the top right, then choose Grouping Sets in the Analytics tab.

4) Drag the Product_ID column to the Product, the Sale_Date column to the Date column, and the Daily_Sales column to the Sum.

5) Check all the Group By Functions on the right of the screen.

6) Hit Execute or Send SQL to Nexus. Done!
