When you join tables, you get a result set where each row can contain values from several tables. In this last section, you will learn to merge two or more result sets into a single result set by using the UNION keyword.
Consider the problem of developing a mailing list of suppliers and customers. You would want to produce a list of names and addresses for each table. You don't need to join the tables; after all, there are no corresponding columns to join the tables. Figure 5.11 shows how rows from two separate queries are merged into a single result set.
Here is the Customers and Suppliers mailing list query. Notice how the UNION statement connects two complete, self-sufficient select statements. (I've limited it to Customers and Suppliers in the United States to shorten the output, and I've truncated every column to make the data fit on the printed page.)
1:select 2: left(CompanyName, 20) CompanyName, 3: left(Address, 20) Address, 4: City, 5: Region as St, 6: PostalCode as Zip 7:from 8: Customers 9:where 10: Country = 'USA' 11:union 12:select 13: left(CompanyName, 20) , 14: left(Address, 20) , 15: City, 16: Region, 17: PostalCode 18:from 19: Suppliers 20:where 21: Country = 'USA'
Results:
CompanyName Address City St Zip -------------------- -------------------- --------- -- ----- Bigfoot Breweries 3400 - 8th Avenue Su Bend OR 97101 Grandma Kelly's Home 707 Oxford Rd. Ann Arbor MI 48104 Great Lakes Food Mar 2732 Baker Blvd. Eugene OR 97403 Hungry Coyote Import City Center Plaza 51 Elgin OR 97827 Lazy K Kountry Store 12 Orchestra Terrace Walla Wal WA 99362 Let's Stop N Shop 87 Polk St. Suite 5 San Franc CA 94117 Lonesome Pine Restau 89 Chiaroscuro Rd. Portland OR 97219 New England Seafood Order Processing Dep Boston MA 02134 New Orleans Cajun De P.O. Box 78934 New Orlea LA 70117 Old World Delicatess 2743 Bering St. Anchorage AK 99508 Rattlesnake Canyon G 2817 Milton Dr. Albuquerq NM 87110 Save-a-lot Markets 187 Suffolk Ln. Boise ID 83720 Split Rail Beer & Al P.O. Box 555 Lander WY 82520 The Big Cheese 89 Jefferson Way Sui Portland OR 97201 The Cracker Box 55 Grizzly Peak Rd. Butte MT 59801 Trail's Head Gourmet 722 DaVinci Blvd. Kirkland WA 98034 White Clover Markets 305 - 14th Ave. S. S Seattle WA 98128
You can't tell from the final result whether any specific row started out as a customer or supplier.
Tip
If you needed to, you could include a column to indicate the source table. In each select list, add a column with a constant string, like this:
select left(CompanyName, 20) CompanyName, 'Customer'as Type, left(Address, 20) Address, … union select left(CompanyName, 20) , 'Supplier'as Type, left(Address, 20) , …
Now, each row will include a value for type. If the row emerges from the Customers query, the value for type will be 'Customer'; otherwise, it will be 'Supplier'.
The first query sets the stage for the number of columns, their data types (including length, precision, and scale), and column aliases. Each subsequent select statement must have the same number of columns as the first. When SQL Server matches up each successive query in a UNION, each column in a query must be a compatible data type to the corresponding column in the first query. Column names and column aliases in subsequent queries are ignored.
You have a lot of leeway when you use UNION. If you want to add the Employees table to the mailing list, you need to deal with the lack of a company name in that table. Also, in both Customers and Suppliers, the ContactName column is not separated into FirstName and LastName. You will need to address both of these issues. Listing 5.8 is the revised query with three result sets.
In Listing 5.8, you can see that the second Select statement on line 13 that retrieves data from the Employees table does not contain a company name. However, the UNION syntax requires that the position for company name contain data. To allow this to work, we have added a static string, 'Our Company Name', in that position.
To understand UNION clearly, you need to understand how it interacts with various keywords. I'll provide you with some guidance here, but you can work with these on your own.
You should notice that the previous result set is sorted alphabetically by the first column, CompanyName. The server sorts the data to remove any duplicate values. None of the data here is duplicated, and in most cases, the results of UNION queries won't be.
You can suppress the sort step and the related search for duplicates by adding the keyword ALL after UNION. That will improve query performance with no loss of data integrity.
These three keywords are associated only with the local query prior to the UNION operation. WHERE clauses provide the conditions governing the current select statement. Grouping is performed within the individual query, and HAVING clauses are evaluated within the query as well.
You can specify the order of a UNION query only after the final SELECT statement. That ordering will govern the entire result set.