An important element in PROC SQL is its Data Definition Language (DDL) capabilities. From creating and deleting tables (see Chapter 5, “Creating, Populating, and Deleting Tables”) and indexes to altering table structures and columns, the DDL provides PROC SQL programmers with a way to change (or redefine) the definition of one or more existing tables. The ALTER TABLE statement permits columns to be added, modified, or dropped in a table with the ADD, MODIFY, or DROP clauses. When a table’s columns or attributes are modified, the table’s structural dynamics also change. The following sections examine the various ways tables can be modified in the SQL procedure.
As requirements and needs change, a database’s initial design may require one or more new columns to be added. Before any new columns can be added, complete ownership of the table must be granted. When you have exclusive access, each new column that you add is automatically added at the end of the table’s descriptor record. This means that the ALTER TABLE statement’s ADD clause modifies the table without reading or writing data.
Suppose you were given a new requirement to improve your ability to track the status of inventory levels. It is determined that your organization can achieve this new capability by adding a new column to the INVENTORY table. The ADD clause is used in the ALTER TABLE statement to define the new column, INVENTORY_STATUS, and its attributes. The new column’s purpose is to identify the following inventory status values: “In-Stock”, “Out-of-Stock”, and “Back Ordered”.
SQL Code
PROC SQL; ALTER TABLE INVENTORY ADD inventory_status char(12); QUIT;
Once the new column is added, the SAS log indicates that 6 columns exist in the INVENTORY table.
SAS Log Results
PROC SQL; ALTER TABLE INVENTORY ADD inventory_status char(12); NOTE: Table WORK.INVENTORY has been modified, with 6 columns. QUIT; NOTE: PROCEDURE SQL used: real time 0.44 seconds |
The output shows the INVENTORY_STATUS column added at the end of the INVENTORY table.
Results
The CONTENTS Procedure -----Alphabetic List of Variables and Attributes----- # Variable Type Len Pos Format Informat Label __________________________________________________ __________________________________ 4 invencst Num 6 22 DOLLAR10.2 Inventory Cost 2 invenqty Num 3 19 Inventory Quantity 6 inventory_status Char 12 4 5 manunum Num 3 28 Manufacturer Number 3 orddate Num 4 0 MMDDYY10. MMDDYY10. Date Inventory Last Ordered 1 prodnum Num 3 16 Product Number -----Variables Ordered by Position----- # Variable Type Len Pos Format Informat Label __________________________________________________ __________________________________ 1 prodnum Num 3 16 Product Number 2 invenqty Num 3 19 Inventory Quantity 3 orddate Num 4 0 MMDDYY10 . MMDDYY10. Date Inventory Last Ordered 4 invencst Num 6 22 DOLLAR10 .2 Inventory Cost 5 manunum Num 3 28 Manufacturer Number 6 INVENTORY_STATUS Char 12 4 |
Column position is not normally important in relational database processing. But, there are times when a particular column order is desired, for example when SELECT * (select all) syntax is specified. To add one or more columns in a designated order, the SQL standard provides a couple of choices. You can:
Create a new table with the columns in the desired order and load the data into the new table,
Create a view that puts the columns in the desired order and then access the view in lieu of the table (see Chapter 8, “Working with Views,” for a detailed explanation).
Suppose you had to add the INVENTORY_STATUS column so it is inserted between the ORDDATE and INVENCST columns and not just added as the last column in the table. The following example shows how this can be done. As before, we begin by adding the INVENTORY_STATUS column to the INVENTORY table. Then, we create a new table called INVENTORY_COPY and load the data from INVENTORY in the following column order: PRODNUM, INVENQTY, ORDDATE, INVENTORY_STATUS, INVENCST, and MANUNUM.
PROC SQL;
ALTER TABLE INVENTORY
ADD INVENTORY_STATUS CHAR(12);
CREATE TABLE INVENTORY_COPY AS
SELECT PRODNUM, INVENQTY, ORDDATE, INVENTORY_STATUS,
INVENCST, MANUNUM
FROM INVENTORY;
QUIT;
PROC CONTENTS DATA=INVENTORY_COPY;
RUN;
The PROC CONTENTS output below shows the positioning of the columns in the new INVENTORY_COPY table including the new INVENTORY_STATUS column that was added.
Results
The CONTENTS Procedure -----Variables Ordered by Position----- # Variable Type Len Pos Format Informat Label ________________________________________________ ____________________________________ 1 prodnum Num 3 16 Product Number 2 invenqty Num 3 19 Inventory Quantity 3 orddate Num 4 0 MMDDYY10. MMDDYY10. Date Inventory Last Ordered 4 INVENTORY_STATUS Char 12 4 5 invencst Num 6 22 DOLLAR10.2 Inventory Cost 6 manunum Num 3 28 Manufacturer Number |
Another way of controlling a table’s column order is to create a view or virtual table (for more information on views, see Chapter 8, “Working with Views”), from an existing table by specifying the desired column order. Using a CREATE VIEW statement and a SELECT query you can construct a new view so that the columns appear in a desired order. Essentially the view contains no data, just the PROC SQL query’s instructions that were used to create it. The biggest advantage of creating a view to reorder the columns defined in a table is that a view not only avoids the creation of a physical table, but hides sensitive data from unauthorized viewing. In the next example, a new view called INVENTORY_VIEW is created from the INVENTORY table with selected columns appearing in a specific order.
SQL Code
PROC SQL; CREATE VIEW INVENTORY_VIEW AS SELECT PRODNUM, INVENQTY, INVENTORY_STATUS FROM INVENTORY; QUIT;
The PROC CONTENTS output below shows the positioning of the columns in the new view including the new INVENTORY_STATUS column that was added earlier.
Results
The CONTENTS Procedure -----Alphabetic List of Variables and Attributes----- # Variable Type Len Pos Label ________________________________________________ _________________ 2 invenqty Num 3 8 Inventory Quantity 3 inventory_status Char 12 11 1 prodnum Num 3 0 Product Number |
Column definitions (length, informat, format, and label) can be modified with the MODIFY clause in the ALTER TABLE statement. PROC SQL enables a character or numeric column to have its length changed. In the next example, suppose you had to reduce the length of the character column MANUCITY in the MANUFACTURERS table from 20 bytes to a length of 15 bytes to conserve space. The CHAR column-definition is used in the MODIFY clause in the ALTER TABLE statement to redefine the length of the column.
PROC SQL; ALTER TABLE MANUFACTURERS MODIFY MANUCITY CHAR(15); QUIT;
SAS Log Results
PROC SQL; ALTER TABLE MANUFACTURERS MODIFY MANUCITY CHAR(15); NOTE: Table WORK.MANUFACTURERS has been modified, with 4 columns. QUIT; NOTE: PROCEDURE SQL used: real time 0.50 seconds |
The PROC CONTENTS output below illustrates the changed column length made to the MANUCITY column in the MANUFACTURERS table.
Results
The CONTENTS Procedure
Data Set Name: WORK.MANUFACTURERS
Observations: 6
Member Type: DATA
Variables: 4
Engine: V8
Indexes: 0
Created: 14:21 Tuesday, November 9,
1999 Observation Length: 45
-----Alphabetic List of Variables and
Attributes-----
# Variable Type Len Pos
Label
____________________________________
______________________
3 manucity Char 15 25
Manufacturer City
2 manuname Char 25 0
Manufacturer Name
1 manunum Num 3 42
Manufacturer Number
4 manustat Char 2 40
Manufacturer State |
The column length can also be changed using the PROC SQL LENGTH= option in the SELECT clause of the CREATE TABLE statement. This construct avoids your having to use the ALTER TABLE statement, as illustrated in the previous example, as well as using a DATA step. The next example shows the LENGTH= option to reduce the length of the MANUCITY column from 20 bytes to 15 bytes.
SQL Code
PROC SQL; CREATE TABLE MANUFACTURERS_MODIFIED AS SELECT MANUNUM, MANUNAME, MANUCITY LENGTH=15, MANUSTAT FROM MANUFACTURERS; QUIT;
SAS Log Results
PROC SQL; CREATE TABLE MANUFACTURERS_MODIFIED AS SELECT MANUNUM, MANUNAME, MANUCITY LENGTH=15 , MANUSTAT FROM MANUFACTURERS; NOTE: Table WORK.MANUFACTURERS_MODIFIED created, with 6 rows and 4 columns. QUIT; NOTE: PROCEDURE SQL used (Total process time): real time 0.12 seconds cpu time 0.01 seconds |
A column that is initially defined as numeric can also have its length changed in PROC SQL. The SQL procedure ignores a field width in these situations and defines all numeric columns with a maximum width of 8 bytes. The reason is that numeric columns are always defined with the maximum precision allowed by the SAS System. To override this limitation, it is recommended that you use a LENGTH= option in the SELECT clause of the CREATE TABLE statement or the LENGTH statement in a DATA step to assign (or reassign) any numeric column lengths to the desired size. You can also improve query results by assigning indexes only to those columns that have many unique values or that you use regularly in joins.
In the next example, the numeric column MANUNUM has its length changed (or redefined) from 3 bytes to 4 bytes using the LENGTH= option in the SELECT clause of the CREATE TABLE statement.
SQL Code
PROC SQL; CREATE TABLE MANUFACTURERS_MODIFIED AS SELECT MANUNUM LENGTH=4, MANUNAME, MANUCITY, MANUSTAT FROM MANUFACTURERS; QUIT;
The PROC CONTENTS output illustrates the changed column length assigned to the numeric MANUNUM column in the MANUFACTURERS_MODIFIED table.
Results
The CONTENTS Procedure
Alphabetic List of Variables and
Attributes
# Variable Type Len Label
3 manucity Char 15
Manufacturer City
2 manuname Char 25
Manufacturer Name
1 manunum Num 4
Manufacturer Number
4 manustat Char 2
Manufacturer State |
In the next example, the numeric column MANUNUM has its length changed (or redefined) from 3 bytes to 4 bytes using the LENGTH statement in a DATA step. To avoid truncation or data problems, you should verify that a column having a shorter length can handle existing data. Because PROC SQL does not produce any notes or warnings if numeric values are truncated, you are required to know your data.
DATA Step Code
DATA MANUFACTURERS; LENGTH MANUNUM 4.; SET MANUFACTURERS; RUN;
SAS Log Results
DATA MANUFACTURERS; LENGTH MANUNUM 4.; SET MANUFACTURERS; RUN; NOTE: There were 6 observations read from the dataset WORK.MANUFACTURERS. NOTE: The data set WORK.MANUFACTURERS has 6 observations and 4 variables. NOTE: DATA statement used: real time 0.44 seconds |
The PROC CONTENTS output below illustrates the changed column length assigned to the numeric MANUNUM column in the MANUFACTURERS table.
Results
The CONTENTS Procedure -----Alphabetic List of Variables and Attributes----- # Variable Type Len Pos Label ___________________________________________ _______________ 3 manucity Char 15 29 Manufacturer City 2 manuname Char 25 4 Manufacturer Name 1 manunum Num 4 0 Manufacturer Number 4 manustat Char 2 44 Manufacturer State |
You can permanently change a column’s format with the MODIFY clause of the ALTER TABLE statement—and not just for the duration of the step. Suppose you had to increase the size of the DOLLARw.d format from DOLLAR9.2 to DOLLAR12.2 to allow larger product cost (PRODCOST) values in the PRODUCTS table to print properly.
SQL Code
PROC SQL;
ALTER TABLE PRODUCTS
MODIFY PRODCOST FORMAT=DOLLAR12.2;
QUIT;
SAS Log Results
PROC SQL; ALTER TABLE PRODUCTS MODIFY PRODCOST FORMAT=DOLLAR12.2; NOTE: Table WORK.PRODUCTS has been modified, with 5 columns. QUIT; NOTE: PROCEDURE SQL used: real time 0.33 seconds |
You can modify a column’s label information with the ALTER TABLE statement MODIFY clause. Because the label information is part of the descriptor record, changes to this value have no impact on the data itself. Suppose you had to change the label corresponding to the product cost (PRODCOST) column in the PRODUCTS table so when printed it displayed “Retail Product Cost”.
PROC SQL; ALTER TABLE PRODUCTS MODIFY PRODCOST LABEL="Retail Product Cost"; QUIT;
SAS Log Results
PROC SQL; ALTER TABLE PRODUCTS MODIFY PRODCOST LABEL="Retail Product Cost"; NOTE: Table WORK.PRODUCTS has been modified, with 5 columns. QUIT; NOTE: PROCEDURE SQL used: real time 0.00 seconds |
The SQL procedure does provide an ANSI approach to renaming columns in a table. By specifying the SELECT clause in the CREATE TABLE statement, you can rename columns, although it can be tedious if a large number of columns exist in the table. The next example illustrates a SELECT clause in a CREATE TABLE statement being used to rename the ITEM column to ITEM_PURCHASED in the PURCHASES table. As the example below illustrates you should refrain from specifying the same table name in the CREATE TABLE statement as specified in the FROM clause. Recursive references to the target table can cause data integrity problems.
SQL Code
PROC SQL;
CREATE TABLE PURCHASES AS
SELECT CUSTNUM, ITEM AS ITEM_PURCHASED, UNITS, UNITCOST
FROM PURCHASES;
QUIT;
PROC SQL;
CREATE TABLE PURCHASES AS
SELECT CUSTNUM, ITEM AS ITEM_PURCHASED,
UNITS, UNITCOST
FROM PURCHASES;
WARNING: This CREATE TABLE statement recursively
references the
target table. A consequence of this is a possible
data integrity
problem.
NOTE: Table WORK.PURCHASES created, with 7 rows
and 4 columns.
QUIT;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.41 seconds
cpu time 0.02 seconds |
An alternative approach to renaming columns in a table consists of using the RENAME= SAS data set option in a SELECT statement’s FROM clause. Suppose you needed to rename ITEM in the PURCHASES table to ITEM_PURCHASED. In the next example, the RENAME= SAS data set option can be specified in one of two ways, as illustrated below. Either approach is syntactically correct.
SQL Code
PROC SQL; SELECT * FROM PURCHASES (RENAME=ITEM=ITEM_PURCHASED); QUIT; < or > PROC SQL; SELECT * FROM PURCHASES (RENAME=(ITEM=ITEM_PURCHASED)); QUIT;
PROC SQL; SELECT * FROM PURCHASES(RENAME=ITEM=ITEM_PURCHASED); QUIT; NOTE: PROCEDURE SQL used: real time 0.31 seconds cpu time 0.02 seconds |
The SQL procedure does not provide a standard ANSI approach to renaming a table in a SAS library. Consequently, the DATASETS procedure is the recommended method to accomplish this relatively simple task. Suppose you had to rename the PRODUCTS table in the WORK library to MANUFACTURED_PRODUCTS.
SAS Code
PROC DATASETS LIBRARY=WORK; CHANGE PRODUCTS = MANUFACTURED_PRODUCTS; RUN;
PROC DATASETS LIBRARY=work; Directory Libref WORK Engine V9 Physical Name D:SAS Version 9.1 SAS Temporary Files\_TD1704 File Name D:SAS Version 9.1 SAS Temporary Files\_TD1704 Member File # Name Type Size Last Modified 1 CUSTOMERS DATA 5120 16Aug04:23:37:30 2 CUSTOMERS2 DATA 5120 16Aug04:23:37:30 3 INVENTORY DATA 5120 16Aug04:23:39:22 4 INVOICE DATA 5120 16Aug04:23:37:32 5 MANUFACTURERS DATA 5120 17Aug04:00:07:40 6 PRODUCTS DATA 17408 17Aug04:00:10:38 7 PURCHASES DATA 5120 17Aug04:00:17:12 CHANGE PRODUCTS = MANUFACTURED_PRODUCTS; RUN; NOTE: Changing the name WORK.PRODUCTS to WORK .MANUFACTURED_PRODUCTS (memtype=DATA). |
An assortment of novel approaches has been used to rename tables. One approach, shown below, uses the CREATE TABLE statement with the SELECT query to create a new table with the desired table name followed by the DROP TABLE statement to delete the old table. You should be aware, however, that this is not an efficient method to rename a table.
PROC SQL; CREATE TABLE MANUFACTURED_PRODUCTS AS SELECT * FROM PRODUCTS; DROP TABLE PRODUCTS; QUIT;