Chapter 8 - The WHERE Clause

“I saw the angel in the marble and carved until I set him free.”

- Michelangelo

Using Limit to bring back a Sample

LIMIT { ALL | <integer-number> }

The following is an example using LIMIT:

SELECT *

FROM Employee_table

LIMIT 3 ;

image

Redshift offers a unique capability in its SQL to limit the number of rows returned from the table's data. It is a LIMIT clause and it is normally added at end of a valid SELECT statement with the above example and syntax. This example uses a LIMIT clause to reduce the rows returned, but in reality, the limiting of rows comes from the WHERE clause.

Using Limit With an Order By Statement

image

The brilliance of the example above is that we have sorted the data using an ORDER BY statement. Since we are sorting by Salary DESC, and we have a limit of 5 rows, this will bring back the top 5 salaried employees..

The WHERE Clause limits Returning Rows

image

The WHERE Clause here filters how many ROWS are coming back. In this example, I am asking for the report to only rows WHERE the first name is Henry.

Using a Column ALIAS throughout the SQL

image

When you ALIAS a column, you give it a new name for the report header, but a good rule of thumb is to refer to the column by the alias throughout the query.

Double Quoted Aliases are for Reserved Words and Spaces

image

“Write a wise saying and your name will live forever.”

-Anonymous

When you ALIAS a column, you give it a new name for the report header, but a good rule of thumb is to refer to the column by the alias throughout the query. Whoever wrote the above quote was way off. "Write a wise alias and it will live until the query ends – bummer".

Character Data needs Single Quotes in the WHERE Clause

image

In the WHERE clause, if you search for Character data such as first name, you need single quotes around it. You Don’t single-quote integers.

Character Data needs Single Quotes, but Numbers Don’t

image

Character data (letters) need single quotes, but you need NO Single Quotes for Integers (numbers). Remember, you never use double quotes except for aliasing.

NULL means UNKNOWN DATA so Equal (=) won’t Work

image

First thing you need to know about a NULL is it is unknown data. It is NOT a zero. It is missing data. Since we don’t know what is in a NULL, you can’t use an = sign. You must use IS NULL or IS NOT NULL.

Use IS NULL or IS NOT NULL when dealing with NULLs

SELECT *

FROM    Student_Table

WHERE    Class_Code IS NULL ;

image

If you are looking for a row that holds NULL value, you need to put ‘IS NULL’. This will only bring back the rows with a NULL value in it.

NULL is UNKNOWN DATA so NOT Equal won’t Work

image

The same goes with = NOT NULL. We can’t compare a NULL with any equal sign. We can only deal with NULL values with IS NULL and IS NOT NULL.

Use IS NULL or IS NOT NULL when dealing with NULLs

SELECT *

FROM    Student_Table

WHERE    Class_Code IS NOT NULL ;

image

Much like before, when you want to bring back the rows that do not have NULLs in them, you put an ‘IS NOT NULL’ in the WHERE Clause.

Using Greater Than Or Equal To (>=)

image

All rows returned have a Grade_Pt >= 3.0

The WHERE Clause doesn’t just deal with ‘Equals’. You can look for things that are GREATER or LESSER THAN along with asking for things that are GREATER/LESSER THAN or EQUAL to.

AND in the WHERE Clause

image

Notice the WHERE statement and the word AND. In this example, qualifying rows must have a Class_Code = ‘FR’ and also must have a First_Name of ‘Henry’. Notice how the WHERE and the AND clause are on their own line. Good practice!

Troubleshooting AND

image

What is going wrong here? You are using an AND to check the same column. What you are basically asking with this syntax is to see the rows that have BOTH a Grade_Pt of 3.0 and a 4.0. That is impossible, so no rows will be returned.

OR in the WHERE Clause

image

Notice above in the WHERE Clause we use OR. Or allows for either of the parameters to be TRUE in order for the data to qualify and return.

Troubleshooting Or

image

Notice above in the WHERE Clause we use OR. Or allows for either of the parameters to be TRUE in order for the data to qualify and return. The first example errors and is a common mistake. The second example is perfect.

Troubleshooting Character Data

image

This query errors! What is WRONG with this syntax? No Single quotes around SR.

Using Different Columns in an AND Statement

image

Notice that AND separates two different columns, and the data will come back if both are TRUE.

Quiz – How many rows will return?

image

SELECT * FROM Student_Table

WHERE  Grade_Pt = 4.0 OR Grade_Pt = 3.0

AND           Class_Code = 'SR' ;

Which Seniors have a 3.0 or a 4.0 Grade_Pt average. How many rows will return?

A) 2

C) Error

B) 1

D) 3

Answer to Quiz – How many rows will return?

image

We had two rows return! Isn’t that a mystery? Why?

What is the Order of Precedence?

image     ()

image     NOT

image     AND

image     OR

SELECT     *

FROM       Student_Table

WHERE Grade_Pt = 4.0 OR Grade_Pt = 3.0

AND         Class_Code = 'SR' ;

Syntax has an ORDER OF PRECEDENCE. It will read anything with parentheses around it first. Then, it will read all the NOT statements. Then, the AND statements. FINALLY, the OR Statements. This is why the last query came out odd. Let’s fix it and bring back the right answer set.

Using Parentheses to change the Order of Precedence

image

This is the proper way of looking for rows that have both a Grade_Pt of 3.0 or 4.0 AND also having a Class_Code of ‘SR’. Only ONE row comes back. Parentheses are evaluated first, so this allows you to direct exactly what you want to work first.

Using an IN List in place of OR

image

Using an IN List is a great way of looking for rows that have both a Grade_Pt of 3.0 or 4.0 AND also have a Class_Code of ‘SR’. Only ONE row comes back.

The IN List is an Excellent Technique

image

The IN Statement avoids retyping the same column name separated by an OR. The IN allows you to search the same column for a list of values. Both queries above are equal, but the IN list is a nice way to keep things easy and organized.

IN List vs. OR brings the same Results

image

The IN Statement avoids retyping the same column name separated by an OR. The IN allows you to search the same column for a list of values. Both queries above are equal, but the IN list is a nice way to keep things easy and organized.

Using a NOT IN List

SELECT  *

FROM          Student_Table

WHERE   Grade_Pt  NOT IN (2.0, 3.0, 4.0) ;

image

“First you imitate, then you innovate.”

- Miles Davis

You can also ask to see the results that ARE NOT IN your parameter list. That requires the column name and a NOT IN. Neither the IN nor NOT IN can search for NULLs! Miles Davis got this IT quote all wrong. First you innovate, and then you sue anyone who imitates. Please make a note of it!

A Technique for Handling Nulls with a NOT IN List

image

This is a great technique to look for a NULL when using a NOT IN List.

Another Technique for Handling Nulls with a NOT IN List

image

This is a great technique to eliminate any NULL values when using a NOT IN List.

BETWEEN is Inclusive

SELECT   *

FROM      Student_Table

WHERE   Grade_Pt BETWEEN 2.0 AND 4.0 ;

image

This is a BETWEEN. What this allows you to do is see if a column falls in a range. It is inclusive meaning that in our example, we will be getting the rows that also have a 2.0 and 4.0 in their column!

NOT BETWEEN is Also Inclusive

image

"The difference between genius and stupidity is that genius has its limits."

Albert Einstein

This is a NOT BETWEEN example. What this allows you to do is see if a column does not fall in a range. It is inclusive meaning that in our example, we will be getting no rows where the grade_pt is between a 2.0 and 4.0 in their column! The 2.0 and the 4.0 will also not return.

LIKE command Underscore is Wildcard for one Character

image

The _ underscore sign is a wildcard for any a single character. We are looking for anyone who has an 'a' as the second letter of their last name.

LIKE Command Works Differently on Char Vs Varchar

image

It is important that you know the data type of the column you are using with your LIKE command. VARCHAR and CHAR data differ slightly.

The Ilike Command Is NOT Case Sensitive

image

With Redshift, the ilike command is NOT case sensitive, but the like command is case sensitive. These rows came back because they have an 'AR' in positions 2 and 3 of their last_name. The 'AR' are not really capitalized, but that is why you use the ilike command. It doesn't care about case!

Troubleshooting LIKE Command on Character Data

image

This is a CHAR(20) data type. That means that any words under 20 characters will pad spaces behind them until they reach 20 characters. You will not get any rows back from this example because technically, no row ends in an ‘N’, but instead ends in a space.

Introducing the TRIM Command

image

This is a CHAR(20) data type. That means that every Last_Name is going to be 20 characters long. Most names are not really 20 characters long, so spaces are padded at the end to ensure filling up all 20 characters. We need to do the TRIM command to remove the leading and trailing spaces. Once the spaces are trimmed, we can find out whose name ends in 'n'.

Quiz – What Data is Left Justified and What is Right?

SELECT   *

FROM       Sample_Table

WHERE   Column1 IS NULL

AND         Column2 IS NULL ;

image

Which Column from the Answer Set could have a DATA TYPE of INTEGER, and which could have Character Data?

Numbers are Right Justified and Character Data is Left

SELECT   *

FROM       Sample_Table

WHERE   Column1 IS NULL

AND         Column2 IS NULL ;

image

All Integers will start from the right and move left. Thus, Col1 was defined during the table create statement to hold an INTEGER. The next page shows a clear example.

Answer – What Data is Left Justified and What is Right?

SELECT  Employee_No, First_Name

FROM     Employee_Table

WHERE  Employee_No = 2000000;

image

All Integers will start from the right and move left. All Character data will start from the left and move to the right.

An Example of Data with Left and Right Justification

SELECT Student_ID, Last_Name

FROM    Student_Table ;

image

This is how a standard result set will look. Notice that the integer type in Student_ID starts from the right and goes left. Character data type in Last_Name moves left to right like we are use to seeing while reading English.

A Visual of CHARACTER Data vs. VARCHAR Data

image

Character data pads spaces to the right and Varchar uses a 2-byte VLI instead.

Use the TRIM command to remove spaces on CHAR Data

image

By using the TRIM command on the Last_Name column, you are able to trim off any spaces from the end. Once we use the TRIM on Last_Name, we have eliminated any spaces at the end, so now we are set to bring back anyone with a Last_Name that truly ends in ‘n’!

Like and Your Escape Character of Choice

image

Sometimes you want to use the LIKE command, but you also want to search for the values of percent (%) or Underscore (_). You can turn off these wildcards by using an escape character. The following example uses the escape character @ to search for strings that include "_" just after the word "start". The @ sign just in front of the underscore (_) means that the underscore is no longer a wildcard, but an actual literal underscore.

Like and the Default Escape Character

image

Sometimes you want to use the LIKE command, but you also want to search for the values of percent (%) or Underscore (_). You can turn off these wildcards by using an escape character. The following example uses the default escape characters \ to search for strings that include underscore "_" just after the word "start". The \ just in front of the underscore means that the underscore is no longer a wildcard, but a literal underscore.

Similar To Operators

.

Matches any single character.

*

Matches zero or more occurrences.

+

Matches one or more occurrences.

?

Matches zero or one occurrence.

|

Specifies alternative matches; for example, E | H means E or H.

^

Matches the beginning-of-line character.

$

Matches the end-of-line character.

$

$ Matches the end of the string.

[]

Brackets specify a matching list, that should match one expression in the list. A
  caret (^) precedes a nonmatching list, which matches any character except for
     the expressions represented in the list.

()

Parentheses group items into a single logical item.

{m}

Repeat the previous item exactly m times.

{m,}

Repeat the previous item m or more times.

{m,n}

Repeat the previous item at least m and not more than n times.

[: :]

Matches any character within a POSIX character class. In the following character
  classes, Amazon Redshift supports only ASCII characters: [:alnum:],
     [:alpha:],[:lower:], [:upper:]

POSIX pattern matching supports the above metacharacters

Similar To Operators

%

Matches any sequence of zero or more characters.

_

Matches any single character.

|

Denotes alternation (either of two alternatives).

*

Repeat the previous item zero or more times.

+

Repeat the previous item one or more times.

?

Repeat the previous item zero or one time.

{m}

Repeat the previous item exactly m times.

{m,}

Repeat the previous item m or more times.

{m,n}

Repeat the previous item at least m and not more than n times.

()

Parentheses group items into a single logical item.

[. . .]

A bracket expression specifies a character class, just as in
    POSIX regular expressions.

SELECT First_Name

,Last_Name

FROM   Employee_Table

WHERE First_Name similar to '%e%|%h%'

ORDER BY First_Name;

The following example finds all employees with a First_Name that contain "e" or "h". Regular expression matching using SIMILAR TO is computationally expensive. We recommend using LIKE whenever possible especially when processing a very large number of rows. For example, the following queries are functionally identical, but the query that uses LIKE executes several times faster than the query that uses a regular expression. The next page shows the answer set.

Similar To Example With Lower Case Letters

image

The example above finds all employees with a First_Name that contains an "e" or an "h". Herbert did not return because he has an 'h', but he returned because he does have an 'e' in his First_Name.

Similar To Example With Lower and Upper Case Letters

image

The example above finds all employees with a First_Name that contains an "i" or a capital "H". Notice that "John" is no longer in the answer set (like he was in the previous example). John has an "h" in it, but not a capital "H".

Similar To Example With Multiple Occurrences

image

The example above finds all employees with a First_Name that contains two l's. Both Billy and William contain two 'l's. Notice that both names have the letter 'l' back to back. Notice that the name William also has the letter 'i' in it twice also, but if I have changed the query to look for 'i' instead of 'l‘, then William would not have come back. The occurrences must follow consecutively. I will show the same query on the next page, but use the 'i' instead of 'l'.

Multiple Occurrences Must Be Consecutive

image

The name William has the letter 'i' in it twice, but no rows came back. This is because the occurrences must follow each other consecutively. There needs to be a name with two occurrences of the letter 'i' back to back!

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset