Chapter 4 – Distinct, Group By and TOP

“A bird does not sing because it has the answers, it sings because it has a song."

- Anonymous

The Distinct Command

image

DISTINCT eliminates duplicates from returning in the Answer Set.

Distinct vs. GROUP BY

image

Both examples produce the exact same result

Class_Code

?

FR

JR

SO

SR

Rules for Distinct Vs. GROUP BY

(1) Many Duplicates – use GROUP BY

(2) Few Duplicates – use DISTINCT

(3) Space Exceeded - use GROUP BY

Distinct and GROUP BY in the two examples return the same answer set.

Quiz – How many rows come back from the Distinct?

image

How many rows will come back from the above SQL?

Answer – How many rows come back from the Distinct?

SELECT       Distinct Class_Code, Grade_Pt

FROM         Student_Table

ORDER BY   Class_Code, Grade_Pt ;

image

How many rows will come back from the above SQL? 10. All rows came back. Why? Because there are no exact duplicates that contain a duplicate Class_Code and Duplicate Grade_Pt combined. Each row in the SELECT list is distinct.

TOP Command

image

In the above example, we brought back 3 rows only. This is because of the TOP 3 statement which means to get an answer set, and then bring back the first 3 rows in that answer set. Because this example does not have an ORDER BY statement, you can consider this example as merely bringing back 3 random rows.

TOP Command is brilliant when ORDER BY is used!

image

In the above example, we brought back 3 rows only. This is because of the TOP 3 statement which means to get an answer set, and then bring back the first 3 rows. Because this example uses an ORDER BY statement, the data brought back is from the top 3 students with the highest Grade_Pt. This is the real power of the TOP command. Use it with an ORDER BY!

TOP Command with PERCENT

image

In the above example, we brought back 2 rows only. This is because of the TOP 20 percent statement. We brought back 20 percent of the rows.

TOP Command with PERCENT

image

In the above example, we brought back 5 rows only. This is because of the TOP 50 percent statement. We brought back 50 percent of the rows, but also notice the ORDER BY statement. We brought back the top 50 percent of students with the highest grade_pt averages.

The TOP Command WITH TIES

image

By using the TOP WITH TIES Command, this will bring in the TOP amount along with ANY ties. So while you might only ask for the top 2 with ties, you might get 4 rows back. Why did 4 rows return here? Which row came back first? Four rows returned with the first row coming back as a NULL for Class_Code. Then the next row returned was one of the freshman. There were two other freshman that tie. All ‘FR’ come back in a tie!

The TOP Command Using a Variable

image

“You miss 100 percent of the shots you never take.”

– Wayne Gretzky

You can use the TOP command in conjunction with a variable. Above, we declared a variable called @TOPVAR. We set the variable to 5. If we highlight the DECLARE and the Query we get the answer set providing the TOP 5 salaried employees.

The TOP 1 Command for a Random Sample

SELECT       TOP(1) *

FROM          Employee_Table

ORDER BY NewID() ;

image

For those times that you need a single random row, you can use the TOP 1 technique and sorting by newid(). When you combine the TOP 1 with a sort by newid(), they return a random row each time the query executes. There is a performance cost to using TOP(1) and newid() because the system has to add a uniqueidentifier to every row and then sort by the uniqueidentifier. A better solution is to add a tablesample option to the table when randomly selecting a single row from a large table. Be careful though, because if the table is smaller sometimes no rows come back.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset