Filtering Data with WHERE Clauses

The first half of today's lesson will show you how to use a WHERE clause to restrict the rows returned from a table. Learning to limit rows with WHERE will improve system performance by reducing the amount of data returned to (and processed by) your application programs.

Queries that don't include a WHERE clause operate on all rows in a table. Later, when you learn to UPDATE and DELETE data, you will use WHERE clauses to determine what data in your tables is deleted and/or updated. If you don't include a WHERE clause in a DELETE statement, you will delete every row in the table, which is not a good idea. An example of an update that affects all the rows in a table is

Update Orders Set OrderDate = '1/1/2000'

This will modify every OrderDate in the table, regardless of what you really might have wanted to do.

WHERE clauses provide a test for each row to determine whether some condition is true or false for that row. The true or false answer is generated from criteria you initially set up in the test, which determines whether a row is returned. If a row meets the requirements you set for return, that row will be displayed. Subsequently, if a row does not meet the criteria, it will not be returned. Setting up the test and criteria generally follows this format:

<expression> <comparison-operator> <expression>

The expression can be any combination of columns, constants, variables (more on variables in the second week), functions, and operators. The WHERE clause must follow the table list (the FROM clause) in the SELECT statement, as shown in the following example:

use northwind
select EmployeeID, FirstName, LastName, city
From Employees
Where city = 'London'

Results:

EmployeeID  FirstName  LastName             city
----------- ---------- -------------------- ---------------
5           Steven     Buchanan             London
6           Michael    Suyama               London
7           Robert     King                 London
9           Anne       Dodsworth            London

(4 row(s) affected)

The comparison-operator defines the type of comparison that will occur between any given row and the criteria set forth for return. Here are the comparisons we'll learn about today:

  • Equality (=) and inequality (<>, !=)

  • Range search (>, >=, <, <=, BETWEEN, NOT BETWEEN)

  • List membership (IN, NOT IN)

  • Pattern match (LIKE, NOT LIKE)

Matching Column Data with Equality Statements

We'll work with the Products table, from the Northwind database, today for our exercises. Imagine that you would like a complete price list for all the items listed in the table. You could write this query inside your query analyzer:

select ProductName, UnitPrice
from Products

This query returns 77 rows, but I am interested in only the price of tofu today. To request a price for a particular product, you would include a WHERE clause that restricts the output to only rows having a value of 'Tofu' in the ProductName field. The following code example shows the WHERE clause in action:

select ProductName, UnitPrice
from Products
where ProductName = 'Tofu'

Results:

ProductName                              UnitPrice
---------------------------------------- ---------------------
Tofu                                                   23.2500

(1 row(s) affected)

The output includes a single row, with the price of tofu. When you execute a query with a WHERE clause, the server searches the table for rows matching your conditions. Only rows matching the search conditions are then returned in the result set (see Figure 2.1).

Figure 2.1. The condition described in the WHERE clause limits the result set to the rows in the table that match the condition.


Caution

What if you use double quotes to define a string? Here's the same query, but I've identified the string with double quotes instead of single quotes to set off the string data:

select ProductName, UnitPrice
from Products
where ProductName = "Tofu"

Sometimes, double quotes work fine and the results are exactly the same as with single quotes, but in other cases, you might get a message like this:

Server: Msg 207, Level 16, State 3, Line 1
Invalid column name 'Tofu'.

The server did not recognize that "Tofu" was a string value. The string value passed in double quotes has been interpreted as a (nonexistent) column name.

The server behaves differently with double quotes depending on your current version of the server (prior to version 7, SQL Server did not differentiate between single and double quotes). Server behavior also changes based on settings that you choose or your administrator chooses for you.

The short explanation is that the ANSI SQL standard (we discussed this standard yesterday) requires that identifiers (table names, database names, column names, and so forth) be optionally identified with double quotes. That's useful when a table name includes a space or other special character, as in this query:

select OrderID, ProductID
from "Order Details"
where OrderID = 12345

Without the quote marks, the server wouldn't know where the table name ended and the next word in the query began. This behavior can be changed with the set option, set quoted_identifier <on|off>. (You can also use brackets, as in [Order Details], to set off identifiers whenever you need to.)

When quoted_identifier is on (which is true by default), double quotes identify column names and table names, but not character strings. When it's off, double quotes identify character strings but not columns and tables.

Here's the bottom line. Use brackets to identify table and column names if necessary (as in [Order Details]). Use single quotes to identify strings (as in 'Tofu'). Never use double quotes. Ever. To help you out, SQL Query Analyzer turns single-quoted strings (but not double-quoted strings) red in the text window.


Uppercase and Lowercase Strings

Case sensitivity is a function of the server and database configuration. By default, SQL Server is not case sensitive. That means in most instances the server does not distinguish between uppercase and lowercase character strings.

What that means in a WHERE clause is that uppercase and lowercase won't matter there either. For example, in the query we tried earlier, it doesn't matter how the search string is capitalized. If we pass the string as 'tofu' (lowercase 't'), the query will still work:

select ProductName, UnitPrice
from Products
where ProductName = 'tofu'

Results:

ProductName                              UnitPrice
---------------------------------------- ---------------------
Tofu                                                   23.2500

(1 row(s) affected)

Caution

This is important to remember. If you received different results from this query than those shown here, the SQL Server you are connected to is running in case-sensitive (binary sorted) mode. This will make your queries case sensitive. This will also affect table and column naming and sort orders. You should ask the database administrator to find out for sure.

However, before you panic, try the following test. Run this query that checks whether the server distinguishes between uppercase and lowercase 'A':

if 'A'= 'a'
    print 'Dictionary'
else
    print 'Binary'

If the return message is 'Dictionary', you're using dictionary sorting and case sensitivity is not an issue. If you did not specify this setting when you installed SQL Server 2000, the default is used, which is dictionary sorting.

If the message is 'Binary', your results might differ from those shown in a number of places throughout the book. You will want to keep that in mind.

Changing this setting is not hard, but it does require the SQL Server administrator to perform the task. You can find out how to do this in Books Online by going to the following topic: "How to rebuild the master database."


Matching Rows Using Not Equals (!= or <>)

Matching rows when they do not equal a value is very similar to the test for equality. SQL Server accepts two different symbols for not equals: <> (which is familiar to BASIC programmers) and != (which is more familiar to C programmers). The two operators behave identically in your query, so use whichever one is more comfortable for you.

The following code is an example of using a not equals operator with a numeric value to find some needed data. Imagine that you are looking for all discontinued products. To find this information, you will need to query the Discontinued column in the Products table. The Discontinued column is considered a bit column, and can accept a value of only 0 or 1. The typical convention is to use 0 for false and 1 for true. In this case, a value of 0 in the Discontinued column means the product was not discontinued and a value of 1 means that it was; note that in the WHERE clause that follows.

select ProductName, UnitPrice
from products
where discontinued = 1

Results:

ProductName                              UnitPrice
---------------------------------------- ---------------------
Chef Anton's Gumbo Mix                                 21.3500
Mishi Kobe Niku                                        97.0000
Alice Mutton                                           39.0000
Guaraná Fantástica                                      4.5000
Rössle Sauerkraut                                      45.6000
Thüringer Rostbratwurst                               123.7900
Singaporean Hokkien Fried Mee                          14.0000
Perth Pasties                                          32.8000

By using a WHERE clause that checks whether the Discontinued column contains a 1 (meaning the product was discontinued), I can list only those products that match that condition. Notice that I didn't include the Discontinued column in the result set. It's quite common to exclude criteria columns in a final result presented to a user.

Tip

Not equals is slower than most other conditions in almost all situations. The server usually cannot find a way to "optimize" (that is, run efficiently) queries based solely on not equals conditions. If you can find a way to restate a condition without a not equals clause, it will probably be faster.

For example, because you know that Discontinued is limited to 0 and 1, change the WHERE clause from

where discontinued <> 0

to

where discontinued = 1

or

where discontinued >= 1

In general, use what you know about the data to simplify the server's work.


Searching for Numeric Values

In the previous query, we looked for rows by matching the product name to a string constant. (You will learn how to match strings using wildcards later today.) In addition to searching for string values, you can also search for numeric values. To find books with a price of $23.25, use the WHERE clause with a numeric comparison, as shown in the following example:

Select ProductName, UnitPrice
from Products
where UnitPrice = 23.25

Results:

ProductName                              UnitPrice
---------------------------------------- ---------------------
Tofu                                                   23.2500

(1 row(s) affected)

Notice that the numeric value is passed without quotes. What if you use quotes here?

select ProductName, UnitPrice
from Products
where UnitPrice = '23.25'

Results:

Server: Msg 260, Level 16, State 1, Line 1
Disallowed implicit conversion from data type varchar to
data type money, table 'Northwind.dbo.Products', column
'UnitPrice'. Use the CONVERT function to run this query.

The query failed because the server tried to compare a number (the UnitPrice column) to the character string, '23.25'. By using the quotes, you changed 23.25 from a number value to a string value, and the comparison caused an error because the server cannot perform implicit (that is, automatic) conversions between these data types. So, to properly query for numbers, you will be best served by not putting the numbers in quotes. Tomorrow, we'll learn how to convert these data types by using the cast() and convert() functions.

Returning Multiple Rows

All the queries have so far returned only one row. What if the condition applies to multiple rows? All rows matching the conditions are returned in the result set. Here's an example of a query that returns multiple rows:

select ProductName, UnitPrice
from Products
where UnitPrice = 12.5

Results:

ProductName                              UnitPrice
---------------------------------------- ---------------------
Gorgonzola Telino                                      12.5000
Scottish Longbreads                                    12.5000

(2 row(s) affected)

Although the preceding query looks the about the same as the others we have seen so far, the difference is in the actual data. Because there is more than one row where the unit price matches 12.50, they are all returned by the SELECT statement. Also, as you can see, you can use columns in the WHERE clause that are not requested in the SELECT columns list.

Note

This is the last time that I will include the message

(2 row(s) affected)

in sample output unless it is important to see the server response. By now, you have probably realized that these messages come back every time, and that they provide an accurate count of the number of rows returned by the query.

If you want to suppress the rowcount report at the end of every query, you have two methods available.

One method is to execute this statement in the SQL Query Analyzer window:

set nocount on

That will suppress the count messages for that window, but it won't affect any other window you might open. You will also need to remember to run it every time you start the SQL Query Analyzer.

You can also use the Connections tab in the Options dialog box of the SQL Query Analyzer (Tools, Options, Connections). Click Set Nocount to suppress the count display for all new connections. This option is retained after you exit and return to the SQL Query Analyzer.

There are times when you will find the rowcount display useful. To switch it back on, use

set nocount off


Numeric Data and Display Formats in the WHERE Clause

Each of the products in the previous result set has a price of $12.50. It's important to notice that the formats of the price in input and output are different. The server interprets numbers by converting the value into an internal arithmetic format. It doesn't matter if we enter $12.50, 12.50, or 12.5. If the number (which is converted internally by the server to an internal numeric value) matches the value in the column, the condition is true.

On the other hand, output format depends entirely on the software that you are using to display the data. UnitPrice is defined as money, which allows four decimal positions (and permits numeric values up to about 900 trillion). The SQL Query Analyzer displays all four digits of money type data.

The UnitsInStock column is defined as a small integer (smallint, permitting no decimal positions and values ranging up to about 32000). Because a smallint value never includes decimal data, the default display for these values in SQL Query Analyzer only shows the integer part of the number. This example shows how the SQL Query Analyzer displays UnitsInStock values for the rows we just looked at a moment ago:

select ProductName, UnitsInStock
from Products
where UnitPrice = 12.5

Results:

ProductName                              UnitsInStock
---------------------------------------- ------------
Gorgonzola Telino                                   0
Scottish Longbreads                                 6

In your own applications, you can modify how the any of the data returned is displayed. Most report builders and application development environments are able to determine whether to display the data in a result set. As the developer, you are free to override the way data is displayed so that it makes the most sense for your users. This is usually done within the software you are using to display the data.

Retrieving Empty Result Sets

Sometimes you will present a condition that matches no rows. For example, a query asking for products with a negative price will never return any rows:

select ProductName, UnitPrice
from products
where UnitPrice = -10

Results:

ProductName                              UnitPrice
---------------------------------------- ---------------------

(0 row(s) affected)

In spite of the fact that no rows were found, a result set will still be returned. The result set just won't have rows of data in it.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset