Creating Pivot Tables

The last thing we'll do today is look at a method for generating a pivot table using aggregates and grouping. Pivot tables, also called cross-tabs, are different from regular result sets because the column headings are drawn from the data itself. One column in the data is selected to become the headings for the cross-tab report. To understand this better, you need to see the result set for a pivot table. Then we will step through the process of building it. The result set we will produce analyzes current (1998) and prior (1997) year orders for the first five months of the year:

Month     Orders 1998 Orders 1997 Percent Change
--------- ----------- ----------- -------------------------
January            55          33                 66.666666
February           54          29                 86.206896




March              73          30                143.333333
April              74          31                138.709677
May                14          32                -56.250000
NULL              270         155                 74.193548

The months (extracted from the export date) are displayed on the left. The years (also extracted from the export date) are displayed at the top. Each cell (intersection of a month and year) contains the result of an aggregate based on that month and year. At the right is the percent change from the prior year to the current year for each month. At the bottom is a year-to-date total.

Note

Everything we have learned so far will end up in this query. Make sure that you feel pretty good about the material. You might even want to work on some of the exercises at the end before you go on.

We will take this example slowly, as we did the rest. Work through each stage of the code. Test it, change it, break it: It's the only way to learn.


The key to this query is the CASE operator. You learned about this yesterday, and now it's time to put it to good use.

We'll use CASE to examine the year of each order and decide whether the order belongs in a count of orders for 1998 or 1997. This query looks at 10 individual rows, performing that analysis:

set rowcount 10
select OrderDate,
       datepart(yy, OrderDate) as 'Year',
       case datepart(yy, OrderDate)
              when 1998 then 1
              else 0
       end as 'is1998'
from Orders
where OrderDate >= '12/30/97'
set rowcount 0

Results:

OrderDate                   Year        1998
--------------------------- ----------- -----------
1997-12-30 00:00:00.000            1997           0
1997-12-30 00:00:00.000            1997           0
1997-12-30 00:00:00.000            1997           0
1997-12-31 00:00:00.000            1997           0
1997-12-31 00:00:00.000            1997           0
1998-01-01 00:00:00.000            1998           1



1998-01-01 00:00:00.000            1998           1
1998-01-01 00:00:00.000            1998           1
1998-01-02 00:00:00.000            1998           1
1998-01-02 00:00:00.000            1998           1

Note

The first and last lines set a limit on the number of rows returned by a query. I wanted to limit the output to just 10 rows.


The WHERE condition was specifically chosen to find rows that select rows in 1997 and 1998. This enables you to see that the CASE expression returns 0 for rows that do not fall in 1998 and 1 for rows that do fall in 1998.

The expression to find rows in 1997 uses the same logic. Here is a query that lists the orders and includes a column for each year. The left column displays the month of the order.

set rowcount 10
select datename(mm, OrderDate) as 'Month',
       case datepart(yy, OrderDate)
              when 1998 then 1
              else 0
       end as 'is1998',
       case datepart(yy, OrderDate)
              when 1997 then 1
              else 0
       end as 'is1997'
from Orders
where OrderDate >= '12/30/97'
set rowcount 0

Results:

Month                          is1998      is1997
------------------------------ ----------- -----------
December                                 0           1
December                                 0           1
December                                 0           1
December                                 0           1
December                                 0           1
January                                  1           0
January                                  1           0
January                                  1           0
January                                  1           0
January                                  1           0

Notice that there is still no grouping clause, so each row in the result reflects a single row in the Orders table.

If we had a table with the columns Month, is1998, and is1997, it would be a simple matter to produce the grouped result. We would add up all the ones and zeros in each column to come up with a total number of rows. Here's the query we would write:

/* query we would write if we could */
select
    Month,
    sum(is1998) as 'Orders 1998',
    sum(is1997) as 'Orders 1997'
  from
       Orders
 where
       OrderDate >= '1/1/97'and
       OrderDate <  '1/1/99'
 group by
       MonthNumber, Month with rollup
having
       MonthNumber is null or
       MonthName is not null

You should add the date condition to avoid working with rows that are outside the range of either year column. That will save processing time.

To get the overall totals, I've added the ROLLUP operator. Make certain to group first by the month number rather than the month name so that the data is sorted chronologically, not alphabetically. (You learned about that problem earlier today.) I've added the HAVING clause to suppress rows created by the extra grouping level.

Unfortunately, we don't have those columns in our table, but we already have expressions for them. Just write the same query, but substitute the expressions from the preceding code. Listing 4.10 has the integrated code.

Code Listing 4.10. Summing the CASE Expressions Provides a Count of Rows Matching the Criteria
 1: select
 2:        datename(mm, OrderDate) as 'Month',
 3:        sum(case datepart(yy, OrderDate)
 4:               when 1998 then 1
 5:               else 0
 6:        end) as 'Orders 1998',
 7:        sum(case datepart(yy, OrderDate)
 8:               when 1997 then 1
 9:               else 0
10:        end) as 'Orders 1997'
11:   from
12:        Orders
13:  where




14:        OrderDate >= '1/1/97'and
15:        OrderDate <  '1/1/99'
16:  group by
17:        datepart(mm, OrderDate),
18:        datename(mm, OrderDate) with rollup
19: having
20:        datepart(mm, OrderDate) is null or
21:        datename(mm, OrderDate) is not null
					

The result set displays the month with the orders for each of the years:

Month                          Orders 1998 Orders 1997
------------------------------ ----------- -----------
January                                 55          33
February                                54          29
March                                   73          30
April                                   74          31
May                                     14          32
June                                     0          30
July                                     0          33
August                                   0          33
September                                0          37
October                                  0          38
November                                 0          34
December                                 0          48
NULL                                   270         408

Now we need to perform the year-to-year comparison. We'll limit the query to the first five months of the year (there's no data after May for 1998). To do that, use a WHERE clause.

We can calculate the percent change with this expression:

100 * Orders 1998 / Orders 1997 - 100 as 'Percent Change'

Again, it's just a matter of substituting the expressions we already have for Orders 1998 and Orders 1998 into this formula and adding it to the select list as shown in Listing 4.11.

Code Listing 4.11. Final Query Displaying Percent Change from Prior to Current Year
 1: select
 2:        DateName(mm, OrderDate) as 'Month',
 3:        sum(case DatePart(yy, OrderDate)
 4:               when 1998 then 1
 5:               else 0 end)      as 'Orders 1998',
 6:        sum(case DatePart(yy, OrderDate)
 7:               when 1997 then 1
 8:               else 0 end)      as 'Orders 1997',
 9:        (100. *
10:        sum(case DatePart(yy, OrderDate)
11:               when 1998 then 1.
12:               else 0. end)
13:        /
14:        sum(case DatePart(yy, OrderDate)
15:               when 1997 then 1.
16:               else 0. end)) - 100. as 'Percent Change'
17:   from Orders
18:  where OrderDate >= '1/1/1997'
19:    and OrderDate < '1/1/1999'
20:    and datepart(mm, OrderDate) <= 5
21:  group by
22:        DatePart(mm, OrderDate),
23:        DateName(mm, OrderDate) with rollup
24: having
25:        datepart(mm, OrderDate) is null
26:     or datename(mm, OrderDate) is not null

In this code, it's worth noting that all the numbers in the percentage calculation have been converted to the float data type by adding a decimal point (1., 0., 100.). Otherwise, the percentage would be calculated by using integer arithmetic and would be incredibly inaccurate.

We will come back to this code again later in the book. This query and queries like it are very common requirements of SQL programmers. Tomorrow, after you have learned to join tables, you will be asked to produce a similar analysis of the total dollar activity for the company.

Later in the book, you will learn to use a number of methods that will simplify queries like this one. User-defined functions and computed columns will enable users to invoke the detailed CASE logic more simply. Views will enable you to predefine whole queries to access similar data more easily. Stored procedures will enable you to execute a query like this one by name.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset