Chapter 10 – Temporary Tables

“I cannot imagine any condition which would cause this ship to founder. Modern shipbuilding has gone beyond that.”

- E. I. Smith, Captain of the Titanic

There are two types of Temporary Tables

Derived Table

Exists only within a query

Materialized by a SELECT Statement inside a query

Space comes from the User’s Spool space

Deleted when the query ends

Temporary Tables

Hive automatically deletes all temporary tables at the end of the
user's session. Tables are stored in the user's scratch directory and
not in the Hive warehouse directory. Scratch acts as the data
sandbox for a user, located by default in /tmp/hive-<username>.

The two types of Temporary tables are Derived and Temporary.

CREATING A Derived Table

image

The SELECT Statement that creates and populates the Derived table is always inside Parentheses.

CREATING A Derived Table using the WITH Command

image

When using the WITH Command, we can CREATE our Derived table before running the main query.

The Same Derived Query shown Two Different Ways

image

Both techniques above can be used to create a derived table.

Most Derived Tables Are Used to Join To Other Tables

image

Most derived tables involve calculations, aggregations or ordered analytics. This allows tables and derived columns to mix well on the final report. Above, we are finding all employees who make a salary that is greater than the average salary within their own department. We created a derived table that holds all departments and the average salary within the department. We then join the derived table (named TeraTom) to the employee_table where we can check the salary vs. the avg(salary).

The Three Components of a Derived Table

image

image

A derived table will always have a SELECT query to materialize the derived table with data. The SELECT query always starts with an open parenthesis and ends with a close parenthesis.

image

The derived table must be given a name. Above we called our derived table TeraTom.

image

You will need to define (alias) the columns in the derived table. Above we allowed Dept_No to default to Dept_No, but we had to specifically alias AVG(Salary) as AVGSAL.

Every derived table must have the three components listed above. All derived tables are deleted when the query is done.

Visualize This Derived Table

image

Our example above shows the data in the derived table named TeraTom. This query allows us to see each employee and the plus or minus avg of their salary compared to the other workers in their department.

Our Join Example with A Different Column Aliasing Style

image

Column Aliasing Can Default for Normal Columns

image

In a derived table, you will always have a SELECT query in parenthesis, and you will always name the table. You have options when aliasing the columns. As in the example above, you can let normal columns default to their current name.

Our Join Example with the WITH Syntax

image

The WITH statement created and populated the derived table we named TeraTom. After, the user must then provide a second SELECT statement to use the derived table in the actual query. If there are not two select statements the query will error.

Quiz - Answer the Questions

SELECT Dept_No, First_Name, Last_Name, AVGSAL

FROM Employee_Table

INNER JOIN

(SELECT Dept_No as Depty, AVG(Salary) as AVGSAL

  FROMEmployee_Table

  GROUP BY Dept_No) as TeraTom ON Dept_No = Depty  ;

1) What is the name of the derived table? __________

2) How many columns are in the derived table? _______

3) What is the name of the derived table columns? ______

4) Is there more than one row in the derived table? _______

5) What common keys join the Employee and Derived? _______

6) Why were the join keys named differently? ______________

Fill in the quiz and show you are a master of Hadoop and derived tables.

Answer to Quiz - Answer the Questions

SELECT Dept_No, First_Name, Last_Name, AVGSAL

FROM Employee_Table

INNER JOIN

(SELECT Dept_No as Depty, AVG(Salary) as AVGSAL

  FROMEmployee_Table

  GROUP BY Dept_No) as TeraTom ON Dept_No = Depty ;

1) What is the name of the derived table? TeraTom

2) How many columns are in the derived table? 2

3) What’s the name of the derived columns? Depty and AVGSAL

4) Is their more than one row in the derived table? Yes

5) What keys join the tables? Dept_No and Depty

6) Why were the join keys named differently? If both were named Dept_No, we would error unless we full qualified the column.

You are now the master of Hadoop and derived tables.

Clever Tricks on Aliasing Columns in a Derived Table

image

Understanding all options on aliasing your columns makes you an expert.

Two Derived Tables Joining to a Permanent Table

image

We created two derived tables named T and S and joined them all together with the permanent table named Employee_Table. We created the derived table named T using a WITH clause and then created the derived table S at the bottom of the query.

The Key to Multiple WITH Tables

image

The key to multiple tables in a WITH statement is to remember that there is only one WITH statement. Multiple tables within the WITH are separated by a comma. This query could have been achieved without any derived tables, but it is designed to simply show you form.

Joining Two WITH Tables to a Permanent Table

image

We created two derived tables named T and S and joined them all together with the permanent table named Employee_Table. Both derived tables (T and S) were created and materialized with a single WITH statement.

Using a Derived Table and Row_Number

WITH Results AS

( SELECT

ROW_NUMBER()

OVER(ORDER BY Product_ID, Sale_Time) AS RowNumber,

Product_ID, Sale_Time

FROM Sales_Table

) SELECT *

FROM Results

WHERE RowNumber BETWEEN 8 AND 14

image

In the example above, we are using a derived table called Results and then utilizing a WHERE clause to only take certain RowNumbers.

LEAD

select product_id, Sale_Time, daily_sales

, lead(daily_sales) over(partition by product_id order by Sale_Time)

AS Next_Value

from sales_table

image

This LEAD example shows the value of Daily_Sales and then the next value on the same line. The example on the next page will use a derived table to show how to find the first or last occurrence when partitioning. This will be valuable.

Finding the First Occurrence

WITH Derived_Tbl AS
(select Product_ID as Prod

            ,Cast(Sale_Time as Date) as Sale_Date

            ,Daily_Sales

,lead(daily_sales) over(partition by product_id

               order by Sale_Time ASC)  AS Next_Sales

,Row_Number() over (partition by product_id

               order by Sale_Time ASC) AS Row_Num

from sales_table)

Select * from Derived_Tbl Where Row_Num = 1 ;

image

Use a Derived Table and a Row_Number command and you can find the first occurrence of your data. Above, we partitioned by Product_ID, but we did the order by statement in ASC mode on Sale_Time. We just got the first Sale_Time. We also threw in the next row's Daily_Sales, which happened after the first row's Sale_Time.

Finding the Last Occurrence

WITH Derived_Tbl AS
(select Product_ID as Prod

            ,Cast(Sale_Time as Date) as Sale_Date

            ,Daily_Sales

,lead(daily_sales) over(partition by product_id

               order by Sale_Time DESC)  AS Previous_Sales

,Row_Number() over (partition by product_id

               order by Sale_Time DESC) AS Row_Num

from sales_table)

Select * from Derived_Tbl Where Row_Num = 1 ;

image

Use a Derived Table and a Row_Number command and you can find the last occurrence of your data. Above, we partitioned by Product_ID, but we did the order by statement in DESC mode on Sale_Time. We just got the last Sale_Time. We also threw in the previous row's Daily_Sales, which happened before the last row's Sale_Time.

Creating a Temporary Table

There are three ways to create a Temporary Table

CREATE TEMPORARY TABLE tmp1 (c1 string);

CREATE TEMPORARY TABLE tmp2 AS ...

CREATE TEMPORARY TABLE tmp3 LIKE ...

image

A temporary table can be the same name as a permanent table, but the user cannot access the permanent table during that session without dropping or renaming the temporary table. Once the session is over the temporary table is deleted.

Creating, Populating and Querying a Temporary Table

image

Above we create a temporary table named DeptTemp. We populated DeptTemp with an INSERT/SELECT statement to pull in the data from the Department_Table. We are then able to query DeptTemp as many times as we want. We cannot query it from another session. No other users can see or access DeptTemp. Once we end our session, DeptTemp is dropped.

Creating a Temporary Table Using the LIKE Keyword

image

Above we create a temporary table named DeptTemp. We populated DeptTemp with an INSERT/SELECT statement to pull in the data from the Department_Table. We are then able to query DeptTemp as many times as we want. We cannot query it from another session. No other users can see or access DeptTemp. Once we end our session, DeptTemp is dropped.

Creating a Temporary Table Using the LIKE Keyword

image

Above we create a temporary table named DeptTemp. We populated DeptTemp with an INSERT/SELECT statement to pull in the data from the Department_Table. We are then able to query DeptTemp as many times as we want. We cannot query it from another session. No other users can see or access DeptTemp. Once we end our session, DeptTemp is dropped.

Creating a Temporary Table and Populating it Simultaneously

image

Using the CREATE TABLE AS syntax we have created and populated our table with a single step.

Creating a Temporary Table that Joins Multiple Tables

image

Using the CREATE TABLE AS syntax we not only created and populated our table with a single step, but we joined two tables together to create one temporary table.

Many Users Can Use the Same Temporary Table Name

CREATE Temporary  TABLE   Dept_Agg

                   ( Dept_no                       Int

                     ,Sum_Salary                 Decimal(10,2)

                   )

image

Many users can create the same table with
the same name, but each gets their own copy
and nobody can see another user's data

Above we see two users creating the same table with the same name. Both users will get their own table. They can populate their table with different data, but neither can see the other's table. Each gets their own secure version of the data. When they logoff, the table is dropped.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset