Chapter 7 – Sub-query Functions

“An invasion of Armies can be resisted, but not an idea whose time has come.”

- Victor Hugo

An IN List is much like a Subquery

image

This query is easy to understand. It uses an IN List to find all Employees who are in Dept_No 100 or Dept_No 200.

An IN List Never has Duplicates – Just like a Subquery

image

What is going on with this IN List? Why in the world are their duplicates in there? Will this query even work? What will the result set look like? Duplicate values are ignored here. We got the same rows back as before, and it is as if the system ignored the duplicate values in the IN List. That is exactly what happened.

The Subquery

image

The query above is a Subquery which means there are multiple queries in the same SQL. The bottom query runs first, and its purpose is to build a distinct list of values that it passes to the top query. The top query then returns the result set. This query solves the problem: Show all Employees in Valid Departments!

The Three Steps of How a Basic Subquery Works

image

The bottom query runs first and builds a distinct IN list. Then the top query runs using the list.

These are Equivalent Queries

image

Both queries above are the same. Query 2 has values in an IN list. Query 1 runs a subquery to build the values in the IN list.

The Final Answer Set from the Subquery

image

SELECT *FROM  Employee_Table

WHERE Dept_No IN (SELECTDept_No FROM Department_Table) ;

image

Quiz- Answer the Difficult Question

image

How are Subqueries similar to Joins between two tables?

A great question was asked above. Do you know the key to answering? Turn the page!

Answer to Quiz- Answer the Difficult Question

image

How are Subqueries similar to Joins between two tables?

A Subquery between two tables or a Join between two tables will each need a common key
that represents the relationship. This is called a Primary Key/Foreign Key relationship.

A Subquery will use a common key linking the two tables together very similar to a join! When subquerying between two tables, look for the common link between the two tables. Most of the time they both have a column with the same name, but this is not always true.

Should you use a Subquery or a Join?

image

If you only want to see a report where the final result set has only columns from one table, use a Subquery. Obviously, if you need columns on the report where the final result set has columns from both tables, you have to do a Join

Quiz- Write the Subquery

image

Write the Subquery

Select all columns in the Customer_Table if the customer has placed an order!

Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table. Good luck! Advice: Look for the common key among both tables!

Answer to Quiz- Write the Subquery

image

The common key among both tables is Customer_Number. The bottom query runs first and delivers a distinct list of Customer_Number values which the top query uses in the IN List!

Quiz- Write the More Difficult Subquery

image

Write the Subquery

Select all columns in the Customer_Table if the customer
has placed an order over $10,000.00 Dollars!

Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table that is greater than $10,000.00.

Answer to Quiz- Write the More Difficult Subquery

image

Here is your answer!

Quiz- Write the Subquery with an Aggregate

image

Write the Subquery

Select all columns in the Employee_Table if the employee makes
a greater Salary than the AVERAGE Salary.

Another opportunity knocking! Would someone please answer the query door?

Answer to Quiz- Write the Subquery with an Aggregate

image

Hadoop won't allow you to use a subquery. However, you can do a LEFT SEMI JOIN. A LEFT SEMI JOIN will only place the results from the left table. Use the right table (TeraTom) as a qualifier.

Quiz- Write the Correlated Subquery

image

Write the Correlated Subquery

Select all columns in the Employee_Table if the employee makes a
greater Salary than the AVERAGE Salary (within their own Department).

Another opportunity knocking! This is a tough one, and only the best get this written correctly.

Answer to Quiz- Write the Correlated Subquery

image

Hadoop doesn't yet support a correlated subquery so you must use a derived table instead.

Quiz- Write the NOT Subquery

image

Write the Subquery

Select all columns in the Customer_Table if the
Customer has NOT placed an order.

Another opportunity knocking! Write the above query!

Answer to Quiz- Write the NOT Subquery

image

When a NOT IN subquery encounters a NULL value it returns nothing. Since the bottom query is passing up the Customer_Number to the top query if there are NULL values in any Customer_Number the top query returns nothing. That is why we utilized the IS NOT NULL in the bottom WHERE clause. That clause eliminates any NULL values from being passed to the top.

Quiz- Write the Subquery using a WHERE Clause

image

Write the Subquery

Select order_number, customer_number, and order_total
from the Customer_Table if an order was placed in the
Order_Table that by a customer with ‘Bill’ anywhere in
their name.

On the quiz above, take the opportunity to demonstrate what you have learned and show your brilliance!

Answer - Write the Subquery using a WHERE Clause

image

Great job on writing your query just like the above.

Quiz – How many rows return on a NOT IN with a NULL?

image

How many rows return from the query now that a
NULL value is in a Customer_Number?

We really didn’t place a new row inside the Order_Table with a NULL value for the Customer_Number column, but in theory, if we had, how many rows would return?

Answer – How many rows return on a NOT IN with a NULL?

image

How many rows return from the query now that a
NULL value is in a Customer_Number?

ZERO rows will return

The answer is no rows come back. This is because when you have a NULL value in a NOT IN list, the system doesn’t know the value of NULL, so it returns nothing.

How to handle a NOT IN with potential NULL Values?

image

You can utilize a WHERE clause that tests to make sure Customer_Number IS NOT NULL. This should be used when a NOT IN could encounter a NULL.

Using a Correlated Exists

image

Use EXISTS to find which Customers have placed an Order?

SELECTCustomer_Number, Customer_Name
FROMCustomer_Table as Top1
WHEREEXISTS
(SELECT *FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;

The EXISTS command will determine via a Boolean if something is True or False. If a customer placed an order, it EXISTS. Using the Correlated Exists statement will bring back customers who have placed an order. EXISTS is different than IN as it is less restrictive as you will soon understand.

How a Correlated Exists matches up

image

Only customers who placed an order return with the above Correlated EXISTS. It is the final WHERE clause in the subquery that co-relates the Customer_Number from the top query to the Customer_Number of the bottom query. They are correlated.

The Correlated NOT Exists

image

The NOT EXISTS command determines via a Boolean if something is True or False. If a customer has not placed an order, it does not EXIST. Using the Correlated NOT Exists statement, only customers who have NOT placed an order will return in the answer set. NOT EXISTS is often used because it returns data even if NULL values are encountered. NOT EXISTS is different than NOT IN as it handles NULL values without any problems.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset