Chapter 07 - Performing Operations on Tables

How to change data and characteristics of your database tables - SQL Server permits you to change the column values of tables and add new columns to a table.
Operations made on multiple tables - You'll lean to combine data from multiple tables using relational joins and the UNION statement.
Remove one or more unwanted rows form a table - SQL Server permits you to remove selected rows from a table with the DELETE statement.

Information that's stored in multiple tables is often combined in a single query. To combine rows logically across tables, the tables must be created with related columns of data. You must be able to issue queries that not only combine but also eliminate rows from multiple tables, so that only the requisite rows appear in the resultant display.

You'll need to use an UPDATE statement to change one (or more) row's column values for a table. You may need to change a row simply to correct information that was incorrectly entered. You'll want to be able to remove rows that are no longer needed in a table. You use a DELETE statement to remove one or more rows from a table.

In addition to changing table rows, you'll find that you may need to add additional columns of information to a database table. You use the ALTER TABLE statement primarily to add one more columns to an already existing table, as well as change other characteristics of the table.

In this chapter, you'll learn to write queries that retrieve rows of related information from multiple tables. You'll also learn to update and delete the rows of a database table. Also, you'll learn to change the characteristics of database tables including how to add columns to a table.

Updating Rows

Changing column values of rows is one of the operations that you must be able to perform to maintain a database. You use an UPDATE statement to modify the column values of table rows. The simplified syntax of an UPDATE statement is as follows:

UPDATE table_name
SET column_name_1 = value,......column_name_n = value
WHERE column_name comparison operator value

You use the WHERE clause to identify the rows to be changed. The WHERE clause, used as part of a SELECT statement, narrows the scope of your selection of rows that will be returned or affected by the query. In an UPDATE statement, the WHERE clause is used to identify the rows that are changed, rather than the rows to be displayed.

You can use the UPDATE statement to change erroneous entries or misspellings for the column values of existing rows of a table.

In the following example, the values for the department and badge columns of the Employees table are changed for the employee Bob Smith. If more than one row has an employee named Bob Smith, the department and badge number of each row is changed.

update employees
set department = 'SALES', badge = 1232
where name = 'Bob Smith'

You can also update views as well as tables with the UPDATE statement. You simply use the name of the View in place of the table name in the UPDATE clause of the UPDATE statement. In many different operations, views are treated the same as tables.

You can use UPDATE to change multiple rows that match the criteria specified by the WHERE clause. In the following example, all rows that contain the department SALES are changed to MARKETING.

update employees
set department = 'MARKETING'
where department = 'SALES'

You must be careful that you specify only the rows you want changed. If you omit a WHERE clause from an UPDATE statement, the change specified in the SET clause is made to every row of the table. The following example shows a change that's made to all rows of a table:

update employees
set wageclass = 'W0'

There are usually two reasons why an UPDATE statement wouldn't contain a WHERE clause: because the WHERE was inadvertently omitted, or because you purposely want to change a column for all rows of a table. For example, when you add a new column to a table with the ALTER TABLE command, you may have assigned a null values for the new column for all existing rows of the table. You can use an UPDATE statement without a WHERE clause to add a non-null value to the new column for all rows.

You can use a SELECT count(*) statement with the same criteria—specifically, your WHERE clause—that you plan to use in your UPDATE statement to learn the number of rows that will be subsequently changed by your UPDATE statement. By first determining the number of rows that will be affected by your UPDATE, you're more likely to notice any mistakes in your criteria.

See the section in Chapter 8 entitled Using COUNT

SET Clause Options

You can also use an expression or the keywords DEFAULT and NULL in the SET clause of an UPDATE statement. If you use an expression in a SET clause rather than a constant value, the expression is first evaluated, and its result is assigned to the rows that are specified in the UPDATE statement.

In the following example, a raise in the hourly rate is given to all employees by updating the rate column of the pays table.

update pays
set rate=rate+2

You can use the keyword NULL to change the column value of the specified rows of a table to nulls. The table column that's to be assigned a null value must have been created with the NULL characteristic originally. In the following example, an employee who has been moved out of his current department but not yet assigned to another department has his department changed to a null.

update employees
set department=null
where name='Bob Smith'

You can also use the UPDATE statement to assign a DEFAULT value if a default value has been associated with the table column. In the following example, the department for an employee is changed to the default value which was previously established and associated with the department column.

update employees
set department=default
where name ='Sally Springer'

See the Chapter 11 sections entitled Creating Defaults and Binding Defaults

If a default doesn't exist for the column and the column permits nulls, the column value is changed to a null.

Overview of the Update Process

An update can be performed in SQL Server in two ways. Under certain conditions, changes can be made directly to the rows of database tables. When a direct update of the table row can be made, the operation is done quickly with little overhead to perform the operation. An update directly to the rows of a table is referred to as an update in place.

A second way in which you can change the rows of a table is an indirect or deferred update. In such an update, the change is made by deleting the row to be modified and then inserting the row as a new row with the new values in place. A deferred update is slower because two operations are required to made a change the row of a table.

The conditions under which a direct update can be performed are primarily determined by restrictions set on the database table. The following conditions must be met for a direct update to be performed on a table:

The updated column can't be part of a clustered index.
A update trigger can't be defined on the table.
The table can't be set up for replication.

Also, a number of conditions must be met for an update in place to be performed on updates that change a single row:

For a table column that's defined using a datatype of variable length (such as VARCHAR), the updated row must fit in the same database page. In general, this means that the information you're inserting can't be larger than the information you're replacing.
If a non-clustered index that allows duplicates is defined for the column, the updated column must be a fixed-size datatype or composed of multiple fixed datatypes.
If a unique, non-clustered index is defined on the column and the WHERE clause of the UPDATE statement uses an exact match for a unique index, the updated column must be a fixed-size datatype or composed of multiple fixed datatypes. The column in the WHERE clause can be the same column as the updated column.
The byte size of the updated row value can't be more than 50 percent different from the original row, and the total number of new bytes must be equal to or less than 24.

The following set of conditions must be met for updates that change multiple rows to be performed in place:

The updated column must be defined as a fixed-length datatype.
The updated column can't be part of a unique, non-clustered index.
If a non-unique, non-clustered index is defined on the column and the WHERE clause of the UPDATE statement isn't the same column as the updated column, an updated column must be a fixed-size datatype or composed of multiple fixed datatypes.
The table can't include a timestamp column.

See the Chapter 12 section entitled Understanding Locks

See the Chapter 10 section entitled Defining Indexes

See the Chapter 14 section entitled Using INSERT and UPDATE Triggers

If needed—perhaps because you'll be making many subsequent updates on your database tables—you can plan the table design so that all updates are direct. You can consider all the restrictions for direct updates to ensure that your updates are performed as quickly as possible.

You can use the query option SHOWPLAN to determine whether an update was direct or deferred.

Deleting Rows

Removing rows from a database table is another operation that you must be able to perform to maintain a database. Use a DELETE FROM statement to remove table rows. The syntax of a DELETE [FROM] statement is as follows:

DELETE [FROM] table_name
WHERE column_name = 'value'

You don't need to use the keyword FROM; it's optional in the DELETE statement. You can delete rows from tables,as well as update tables, through views.

In the following example, the operation of the DELETE statement removes all rows that match the criteria specified in the WHERE clause of the DELETE statement. In this case, we're removing all rows that contain the department "SALES".

delete from employees
where department = 'SALES'

You can first use a COUNT function in a SELECT statement that has an identical WHERE clause to your DELETE statement to determine the number of rows that will be subsequently removed.

I executed a DELETE statement and, even though I uppercased all letters to reference the column values in the WHERE clause of my DELETE statement, it deleted rows that contain the same value in lowercase letters. Why did this happen?

You must be aware of the default sort order that was selected when SQL Server was installed. Microsoft SQL Server installs with a case-insensitive sort order and this is why the DELETE statement didn't distinguish between upper- and lowercase. If you want all your subsequent DELETE statements—as well as UPDATE statements—to be case-sensitive, you might want to update SQL Server using SQL Setup from the SQL Server program group to effectively reinstall SQL Server with a case-sensitive sort order specified.

See the Appendix A section entitled "Installing Server and Client Software"

You can use a DELETE FROM statement to remove multiple rows as well as individual rows. However, use the DELETE FROM statement carefully. If you don't use a WHERE clause in a DELETE FROM statement, all table rows are removed—leaving you with an empty table. You'll receive no warning before the DELETE FROM statement is executed. In the following example, all rows of the specified table are deleted:

delete from employees

You should define a transaction when you use statements such as DELETE FROM and UPDATE so that you can later change your mind and undo the removal or change of rows. See Chapter 12, "Understanding Transactions and Locking" to learn how to define and undo a transaction. It's often desirable to keep your transactions to a few statements as necessary, so that other users can access the latest information about a table.

I began entering a DELETE FROM statement and got as far as the name of the table when I accidentally executed the query. When I try to read the table, all the rows are gone. Other products ask me to confirm an unrecoverable action such as the deletion of all table rows. I'm surprised that SQL Server doesn't do this. How can I prevent this from happening again?

Because SQL Server won't ask you to confirm the deletion or update of one or all table rows, you must be sure that you want the action to be performed before you allow it to execute.

The execution of a DELETE statement without a WHERE clause that removes all rows of a table is most often an accident. If you want to delete all rows of a table, but still keep the table intact, you should use the TRUNCATE statement. The syntax of TRUNCATE table is:

Truncate table_name

The advantage of using a TRUNCATE TABLE statement is that the removal of rows is completed faster than an equivalent DELETE statement. The truncate statement is faster because it removes pages of information that contain multiple tables rows at a time rather than the delete statement, which removes individual rows at a time. However, you can't recover table rows with the TRUNCATE TABLE statement. Unlike the DELETE statement, the TRUNCATE statement does not maintain a copy of the deleted rows even if it's part of a defined transaction.

TRUNCATE TABLE and DELETE TABLE retain the database table. If you want to permanently remove a table, as well all rows that it contains, you can use the DROP TABLE statement (which uses the following syntax):

DROP TABLE table_name

After you drop a table, you can't recover the rows that it contained except from a previously made backup copy of the table.

Another advantage of the truncate statement is that it won't log the removal of the information in the transaction log. If you have a situation where your transaction log has become full, you can still use the TRUNCATE statement to remove rows and free up space in the database.

Adding Columns with ALTER TABLE

You primarily use the ALTER TABLE command to add more columns to an existing table. You're limited in the operations you can perform on existing columns. For example, you can't delete a column or change the size or datatype of an existing column.

Other implementations of SQL do allow changes even to the datatype of existing columns to be performed through the ALTER TABLE statement. The ALTER TABLE statement in SQL Server and Transact-SQL, however, doesn't permit datatype changes to existing rows. In Microsoft SQL Server, you must create a new table, read the rows out of the old table and into the new table to effect such a change.

For more information, see the sections entitled Changing the Width of a Table Column and Removing a Column from a Table later in this chapter.

The syntax of the ALTER TABLE statement is as follows:

ALTER TABLE [[<database.>]<owner.>]<table_name>
ADD <column_name> <datatype> NULL [constraint],
<column_name> <datatype> NULL...] [constraint]
[WITH NOCHECK]
[DROP [CONSTRAINT]
      constraint_name [..., constraint_name_n]]

When ALTER TABLE is executed, it doesn't expand existing rows. It changes only the internal description of the added columns in the system tables. Each time an existing row is read from the disk, SQL Server adds the additional null entry for the new column or columns before it's available to a user.

When a new row is written to the disk, SQL Server creates the new row with the additional column and its value. SQL Server writes the row with the additional column unless no value is specified for the new row and its value remains a null. In the following example, sp_help is used to display the existing characteristics of a table in which three columns are defined:

sp_help employees3
Name                           Owner                       Type
------------------------------ --------------------------------------
employees3                     dbo                         user table
Data_located_on_segment        When_created
------------------------------ ---------------------------
default                        Jul 5 1994 10:08PM
Column_name     Type            Length Nulls
Default_name    Rule_name
--------------- --------------- ------ ----- --------------- ------
name            char            30     0     (null)          (null)
department      char            30     0     (null)          (null)
badge           int             4      0     (null)          (null)
Object does not have any indexes.
No defined keys for this object.

ALTER TABLE is used to add a new column to the table. You use the sp_help procedure to verify that the new columns have been added to the table. In the following example, SELECT displays all rows of the new table, including nulls in the new column for all rows:

alter table employees3
add wageclass char(2) null
sp_help employees3
Name                           Owner                          Type
------------------------------ ------------------------------ ----------------
employees3                     dbo                            user table
Data_located_on_segment        When_created
------------------------------ ---------------------------
default                        Jul 5 1994 10:08PM
Column_name     Type            Length Nulls
Default_name    Rule_name
--------------- --------------- ------ ----- --------------- ---------------
name            char            30     0     (null)          (null)
department      char            30     0     (null)          (null)
badge           int             4      0     (null)          (null)
wageclass       char            2      1     (null)          (null)
Object does not have any indexes.
No defined keys for this object.
select * from employees3
name                           department                     badge   wageclass
------------------------------ ------------------------------ ------- ---------
Stan Humphries                 Field Service                  3211    (null)
Fred Stanhope                  Field Service                  6732    (null)
Sue Sommers                    Logistics                      4411    (null)
Lance Finepoint                Library                        5522    (null)
Mark McGuire                   Field Service        1997      (null)
Sally Springer                 Sales                9998      (null)
Ludmilla Valencia              Software             7773      (null)
Barbara Lint                   Field Service        8883      (null)
Jeffrey Vickers                Mailroom             8005      (null)
Jim Walker                     Unit Manager         7779      (null)
Bob Smith                      SALES                1234      (null)
(11 row(s) affected)

You can use an UPDATE statement to define values for new columns that are added to a table with ALTER TABLE.

The null values are inserted when a new column is added to the table with the ALTER TABLE statement. In the following example, all table rows have a new value added to the column that was added with an earlier UPDATE TABLE statement. A subsequent SELECT statement is used to display all rows of the table, which includes the new column values.

update employees3
set wageclass='w4'
(11 row(s) affected)
select * from employees3
name                      department            badge       wageclass
---------------------- ---------------------- -----------   ---------
Stan Humphries            Field Service        3211        w4
Fred Stanhope             Field Service        6732        w4
Sue Sommers               Logistics            4411        w4
Lance Finepoint           Library              5522        w4
Mark McGuire              Field Service        1997        w4
Sally Springer            Sales                9998        w4
Ludmilla Valencia         Software             7773        w4
Barbara Lint              Field Service        8883        w4
Jeffrey Vickers           Mailroom             8005        w4
Jim Walker                Unit Manager         7779        w4
Bob Smith                 SALES                1234        w4
(11 row(s) affected)

You can also define a new column that you've added to a table with the identity characteristic. In Chapter 5, "Creating Database Tables and Using Datatypes" you learned that the identify characteristic permits you to define an initial value for the first row of the table (the seed) and a value that's added to each successive column to automatically generate a new column value (the increment).

You can't assign the identity characteristic to an existing column. Only new columns that are added to a table with the ALTER TABLE command can be defined with the identity characteristic. Also, if the value automatically generated for a new column by the identity mechanism exceeds the allowable values for the column's datatype, the ALTER TABLE statement fails and an error is displayed.

See the Chapter 5 section entitled "identity Property."

In the following example, an additional column is added to table pays, which is defined with the identity characteristic and can be subsequently used as a row number.

Alter table pays
add row_number identity(1,1)

You can also add one or more columns to a table using the SQL Enterprise Manager. To add a column to an existing table through the SQL Enterprise Manager, follow these steps:

Right-click a selected table to which you want to add a column.
Choose Edit from the menu.
Enter one or more columns in the Manage Tables dialog box. You enter a column name, choose a datatype and a size for the datatype.
Click the Save Table tool on the toolbar to keep the additional columns that you've added to a table. In Figure 7.1, an additional column (wageclass) is added to the employees table.

Fig. 7.1 - You can't deselect the null property on a column added to an existing table.

You can also double-click a table to open it and edit its structure. This might be the case where you want to add a new column to the table.

Changing the Width of a Table Column

Recall that you can't use ALTER TABLE to change the size of an existing table column or its datatype. You can use ALTER TABLE only to add a new column. You also can't change the size of the column of a table through the SQL Enterprise Manager. You can, however, drop a column or narrow its datatype by creating a new table with smaller but compatible datatypes and fewer columns.

In the following example, a new table is created in which the name and datatype of the first column is identical to the first column in an existing table, but the first column in the new table is smaller in size. In the new table, the second column is defined as VARCHAR instead of CHAR, as it's defined in the second column of the existing table. The new table's third column is defined as SMALLINT instead of INT, as it's defined in the corresponding Badge column in the existing table. The SMALLINT datatype uses half the storage space of INT.

After the new table is created, all rows of the older table are loaded into the new table with an INSERT statement. A SELECT statement is then used to display the rows of the new table.

create table employees4
(name char(15), department varchar(20),badge smallint)
insert into employees4
select name,department,badge from employees
(11 row(s) affected)
select * from employees4
name            department           badge
--------------- -------------------- ------
Stan Humphries  Field Service        3211
Fred Stanhope   Field Service        6732
Sue Sommers     Logistics            4411
Lance Finepoint Library              5522
Mark McGuire    Field Service        1997
Sally Springer  Sales                9998
L. Valencia     Software             7773
Barbara Lint    Field Service        8883
Jeffrey Vickers Mailroom             8005
Jim Walker      Unit Manager         7779
Bob Smith       SALES                1234
(11 row(s) affected)

The sp_help procedure shows the difference between datatypes in the corresponding columns of the two tables used in the example. The example shows only the relevant parts of the display returned by sp_help.

sp_help employees
Name                           Owner                          Type
------------------------------ ------------------------------ ------
employees                      dbo                       user table
...
name            char            20     0     (null)          (null)
department      char            20     0     deptdefault     (null)
badge           int             4      0     (null)          (null)
...
sp_help employees4
Name                           Owner                          Type
------------------------------ ------------------------------ ------
employees4                     dbo                         user table
--------------- --------------- ------ ----- --------------- --------
...
name            char            15     0     (null)          (null)
department      varchar         20     0     (null)          (null)
badge           smallint        2      0     (null)          (null)
...

The INSERT table statement successfully completes because the data from the earlier table is compatible with the columns defined for the new table. If the data isn't compatible between the tables, you'll receive an error. In the following example, a new table is created that defines a column as a character datatype. The attempted insertion of the corresponding column from one table results in an error because the datatypes can't be implicitly converted.

create table onecolumn
(badge char(4))
insert into onecolumn
select badge from employees
Msg 257, Level 16, State 1
Implicit conversion from datatype 'int' to 'char' is not allowed.
Use the CONVERT function to run this query.

If you're transferring a large number of rows between tables, you can first set a database option called select into/bulkcopy. If the select into/bulkcopy option is set, your rows are copied into a new table faster because SQL Server keeps less information in its transaction logs about your operation. The lack of complete log information about your operation, which prevents an undo or rollback operation to be done later, is probably not important because you still have the rows in the original table should the need arise to undo any operations.

From an ISQL/W command line, the select into/bulk copy option can be set on or off by issuing the following command:

sp_dboption database_name, 'select into/bulkcopy', TRUE|FALSE

For example, the following command turns on select into/bulkcopy for the database employees:

sp_dboption database_employees, 'select into/bulkcopy', true

You can also change a database option using the graphical interface of the SQL Enterprise Manager rather than a command line.

Removing a Column from a Table

Although you can't remove a column from a table with the ALTER TABLE command, you can remove a column from a table through a series of operations. You also can't remove a column from a table with the SQL Enterprise Manager. First, create a new table that you define with all but one of the columns in an existing table. Then use an INSERT statement to copy rows from the original table to the new table, minus the column that you didn't define in the new table.

In the following example, a new table is defined that contains only two of the three columns defined in an existing table. INSERT is used with a SELECT statement that references only two of the three columns of the original table in the SELECT clause.

create table employees5
(name char(20), badge int))
insert into employees5
select name,badge from employees

Adding Constraints with ALTER TABLE

You can also use the ALTER TABLE command to add, drop, apply, or bypass constraints or checks on existing table. Constraints are defined to provide data integrity on added columns. The ALTER TABLE statement, like the CREATE TABLE statement, allows you to add a column to a table with primary and foreign key, unique, and check and default constraints. You can add or drop constraints to or from a table without adding a new column. The syntax for constrains is identical to the syntax used for defining constraints in the CREATE TABLE statement.

See the Chapter 5 section entitled Creating and Using Constraints

In the following example, a unique CONSTRAINT is added to the badge column for the table employees2.

ALTER TABLE employees2
ADD
CONSTRAINT badgeunc UNIQUE NONCLUSTERED (badge)

Microsoft added a number of additional options to the ALTER TABLE statement in version 6 of SQL Server. All the additions were characteristics that were made to a table in other ways before version 6. You can continue to use the older and more direct ways of changing table characteristics. For example, an index, default, or rule can be defined and subsequently associated with a table using CREATE INDEX, CREATE RULE, or CREATE DEFAULT commands.

The changes that were made to the ALTER TABLE statement, as well as many other statements, allow Transact-SQL to meet the specifications of a standardized specification of SQL, ANSI SQL. The additions for ANSI compatibility result in multiple ways of performing the same operations, sometimes using different keywords or syntax.

You can easily drop a constraint from a table using the drop constraint clause of the alter table statement. You simply specify the name of the constraint to be removed from a table after the keywords drop constraint. For example, to remove a default constraint on the department column for the employees table, enter the following statement:

alter table employees
drop constraint department_default

Using the WITH NOCHECK Clause

You can add a NOCHECK clause to an ALTER TABLE statement to specify that a CHECK or FOREIGN KEY constraint shouldn't be applied on the existing rows of a table. The constraints added with the ALTER TABLE statement that contains the WITH NOCHECK clause are in effect only for rows that are subsequently changed or inserted. You can use a NOCHECK clause in an ALTER TABLE statement when you're certain that the existing data doesn't violate the constraints to speed up the execution of the ALTER TABLE statement.

You can't use WITH NOCHECK to override the initial checking of PRIMARY KEY and UNIQUE constraints. By default, SQL Server applies the constraints to existing rows in the table as well as new rows that are added or changed later. You'll receive an error message and the ALTER TABLE statement will fail if existing data violates your constraint.

You can also specify that a CHECK constraint that's added to a table through the ALTER TABLE statement isn't applied to the existing rows of a table through the NOT FOR REPLICATION clause. NOT FOR REPLICATION operates as though the WITH NOCHECK clause was added to the ALTER TABLE statement. The NOT FOR REPLICATION clause is added to an ALTER TABLE statement for a different purpose than the WITH NOCHECK clause.

If you set up the automatic copying of a table and table rows from one server system to another, the actual work of ensuring the server system that receives a copy of the data is done by an intermediate server. The NOT FOR REPLICATION clause is added to an ALTER TABLE statement to prevent the table copy on the intermediate server from being checked, an unnecessary operation.

See the Chapter 20 section entitled "Setting Up and Managing Replication"

Adding Constraints Through the SQL Enterprise Manager

You can add table and column constraints through the SQL Enterprise Manager. To add a constraint to a table or column through the Enterprise Manager, follow these steps:

Right-click a selected tables that you want to which you want to add a constraint in the main window of the Server Manager.
Choose Edit from the menu. You can also double-click the table.
Click the Advanced Features tool on the toolbar. You can click the Primary Key/Identity, Foreign Keys, Unique Constraints, or Check Constraints tabs to create each type of constraint.
Enter the requisite information in the Constraint box that appears after you click a tab. For example, in figure 7.2, a Check Constraint is entered on the departments column for the employees table to prevent any department from being entered and stored that isn't one of three department values.
Fig. 7.2. - The Not for Replication check box can also be checked when the Check Constraint option is defined.
Click the Save Table toolbar button to apply the constraint to the table.

Performing Relational Joins

The rows of different tables can be combined to display and operate on the data using the same statements used in single tables. The rows of multiple tables can be combined in various ways. The first way is called an equi-join or natural join.

You perform a natural join or equi-join by matching equal values for rows in shared columns between multiple tables. You must define one of the two tables so that a column from one of the tables is duplicated in the second table. The column from the original table can be its primary key if it also is a column with values that make the rows of the table unique. The duplicate column that's added to a second table is referred to as a foreign key.

You define a foreign key to permit rows from different tables to be related. In a sense, the matching columns are used to form virtual rows that span a database table. Although each table is limited to 250 columns, matching columns to combine rows from multiple tables can result in almost an unlimited number of columns that can be combined across tables.

You use a standard SELECT statement with a WHERE clause to retrieve data from two or more tables. The syntax of a SELECT statement that's used to join two tables is as follows:

SELECT column_name_1,...column_name_n
FROM table_name_1, table_name_2
WHERE primary_key_column=
foreign_key_column

The following create table and insert statements are used to create a new table in the same database to be used for subsequent relational join examples. Each row that's added to the pays table matches one row of the employees table.

create table pays
(hours_worked int, rate int,badge int)
go
insert into pays
values (40,10,3211);
go
insert into pays
values (40,9,6732);
go
insert into pays
values (52,10,4411);
go
insert into pays
values (39,11,5522);
go
insert into pays
values (51,10,1997);
go
insert into pays
values (40,8,9998);
go
insert into pays
values (55,10,7773);
insert into pays
values (40,9,8883);
go
insert into pays
values (60,7,8005);
go
insert into pays
values (37,11,7779);
go

In the following example, three columns from two tables are displayed after the Badge column is used to combine the rows that have matching Badge numbers.

select name, department,hours_worked
from employees,pays
where employees.badge=pays.badge
name                 department
hours_worked
-------------------- -------------------- --
Stan Humphries       Field Service        40
Fred Stanhope        Field Service        40
Sue Sommers          Logistics            52
Lance Finepoint      Library              39
Mark McGuire         Field Service        51
Sally Springer       Sales                40
Ludmilla Valencia    Software             55
Barbara Lint         Field Service        40
Jeffrey Vickers      Mailroom             60
Jim Walker           Unit Manager         37
(10 row(s) affected)

An equi-join doesn't eliminate any of the table columns from the temporary tables that are formed by the join of tables. You must use a WHERE clause to match the corresponding rows of the tables. You also shouldn't use the asterisk wildcard character to reference all columns of the combined tables. If you use an asterisk, the columns with matching values are displayed twice.

In the following example, the rows of two tables are accessed without using a WHERE clause. SQL Server forms a cross-product of the rows in both tables. If you don't try to combine the rows using matching columns, each row of the second table in the FROM clause is added to every row of the first table. The Badge column is displayed from both tables because the asterisk wildcard is used in the SELECT clause.

select *
from employees,pays
name               department      badge    hours_worked rate   badge
------------------ -------------- -------- ------------ ------  -----
Stan Humphries     Field Service   3211     40           9       6732
Stan Humphries     Field Service   3211     40           10      3211
Stan Humphries     Field Service   3211     52           10      4411
Stan Humphries     Field Service   3211     39           11      5522
Stan Humphries     Field Service   3211     51           10      1997
Stan Humphries     Field Service   3211     40           8       9998
Stan Humphries     Field Service   3211     55           10      7773
Stan Humphries     Field Service   3211     40           9       8883
Stan Humphries     Field Service   3211     60           7       8005
Stan Humphries     Field Service   3211     37           11      7779
Fred Stanhope      Field Service   6732     40           9       6732
...
(100 row(s) affected)

The combination of all rows of the second table with the first table results in a cross-product, also called a Cartesian Product, of the two tables. In the example, the employees table and the pays table each contains 10 rows, so the resultant cross-product creates 100 rows in the temporary table. However, only 10 of the 100 rows belong together. The badge numbers match in one out of every 10 rows between the two tables.

I joined together two tables that didn't have matching columns with a SELECT statement that didn't contain a WHERE clause. I was surprised that SQL Server returned a huge table.

The SELECT statement operates on multiple tables whether or not they were designed to be combined with a relational join. It's important to always use a WHERE clause, which eliminates rows that don't have matching column values. If you don't use a WHERE clause, you'll receive a temporary table that contains the cross-product of the number of rows in the first table multiplied by the number of rows in the second table. For example, two tables that each contains only 100 rows joined without a WHERE clause will return 10,000 rows.

If you reference one of the columns used to match the rows across both tables, you must indicate the table in which the column is defined. Any time you reference a column that has the same name in multiple tables, you must somehow specify which column from which table to prevent ambiguity. The following example displays an error because SQL Server doesn't know from which table to display the Badge column.

select badge
from employees,pays
where employees.badge=pays.badge
Msg 209, Level 16, State 1
Ambiguous column name badge

To avoid ambiguity, the table columns used for matching rows are preceded by the table in which they're defined and separated by a period (.). In the following example, the Badge column is displayed from the first table by preceding the name of the Badge column with its table name.

select employees.badge
from employees,pays
where employees.badge=pays.badge
badge
-----------
3211
6732
4411
5522
1997
9998
7773
8883
8005
7779
(10 row(s) affected)

Using Range Variables

In the previous example, the name of the table is used to prevent ambiguity when referencing table columns in a SELECT statement when multiple tables are referenced. In fact, what appears in the examples to be the name of the table preceding the column name actually is a range variable.

Other dialects of SQL refer to a range variable as an alias.

Range variables are symbolic references for tables that are specified in the FROM clause of a SELECT statement. You can use a range variable in a preceding clause. such as the SELECT clause, or in a clause that comes after the FROM clause, such has a WHERE clause. Define a range variable by specifying a character constant following the name of a table in the FROM clause of a SELECT statement, as in the following syntax:

...
From table_name_1 range_name_1, ...,table_name_n range_name_n
...

You can define a range variable for each table that's specified in the FROM clause. You can use as many as 30 characters—the limit for any permanent or temporary object in Transact-SQL—to define the range variable. A range variable can be defined to provide a shorter reference to a table in a SELECT statement.In the following example, a range variable is defined for each table. The range variables are used in both the SELECT and WHERE clauses.

select e.badge,p.id
from employees e,pays p
where e.badge=p.id
badge       id
----------- ----
3211        3211
3211        3211
3211        3211
3211        3211
6732        6732
4411        4411
5522        5522
1997        1997
9998        9998
7773        7773
8883        8883
8005        8005
7779        7779
3211        3211
3211        3211
3211        3211
3211        3211
(17 row(s) affected)

Range variables are called range variables because after they're defined, the symbolic reference applies to, or ranges through, the table. As in the previous example, you can define a range variable to be a single character and use it as a short nickname for a table.

Range variables can be quite handy because several tables with long names can be specified in a SELECT statement. You can combine rows from as many as 16 tables in SQL Server using Transact-SQL. If you don't explicitly define range variables in the FROM clause, they're implicitly created using the complete name of each table.

In the following example, the rows from three tables are combined using the Badge columns that are defined in all the tables. The range variables are implicitly defined to the table names and are used in both the SELECT and WHERE clauses. The WHERE clause first combines the first and second tables; then the first and third tables use the AND Boolean operator.

select pays.badge, name,department, payrate
from employees,pays, salaries
where employees.badge = pays.badge
and employees.badge=salaries.badge

Transact-SQL automatically establishes range variables for use in queries. If you don't specify a range variable for the name of a table in the FROM clause, a range variable is created with the same name as the table, as in the following example:

select name
from employees

is internally rewritten as:

select employees.name
from employees employees

Although it seems unnecessary to create range variables when only a single table is named in the query, they are mandatory when you reference tables that contain columns with the same names in the same query.

Using Many-to-One and One-to-Many Joins

You may not have tables that have only one corresponding row in each table. In previous examples, only a single row in the employees table matches the value of a single row in the pays table. It's possible that you'll have to create or work with existing tables in which more than one entry is a match for the entries in another table.

In the ensuing examples, rows have been added with identical badge numbers in the employees table. Three employees have been added, each with a last name of Smith and each with the same badge number. In the tables referenced in the following examples, badge isn't defined as a primary key, so duplicate badge numbers can be present. Three employees with last names of Humphries have also been added, each with the same badge number as the original employee, Stan Humphries. The following example shows the rows of the employees table after the additional seven rows are added.

Although you usually define a primary and foreign key using the corresponding columns of tables that you subsequently want to combine rows form for display, you aren't required to define the columns as keys. SQL Server will permit you to perform joins on tables that don't have primary or foreign key definitions. You should realize that the assignment of primary and foreign keys to a table isn't required, though it's often desirable, if you combine data from different tables.

select * from employees
order by badge
name                 department           badge
-------------------- -------------------- ----
Bob Smith            SALES                1234
Henry Smith          Logistics            1234
Susan Smith          Executive            1234
Mark McGuire         Field Service        1997
Gertie Humphries     Sales                3211
Stan Humphries       Field Service        3211
Stan Humphries Jr    Sales                3211
Winkie Humphries     Mailroom             3211
Sue Sommers          Logistics            4411
Lance Finepoint      Library              5522
Fred Stanhope        Field Service        6732
Ludmilla Valencia    Software             7773
Jim Walker           Unit Manager         7779
Jeffrey Vickers      Mailroom             8005
Barbara Lint         Field Service        8883
Sally Springer       Sales                9998
(16 row(s) affected)

The pays table is unaltered and contains only the original 10 rows, as shown in the following example:

select * from pays
order by badge
hours_worked rate        badge
------------ ----------- -----
51           10          1997
40           10          3211
52           10          4411
39           11          5522
40           9           6732
55           10          7773
37           11          7779
60           7           8005
40           9           8883
40           8           9998
(10 row(s) affected)

You can combine tables that have an unequal number of matching rows. The following example joins the employees table, in which two sets of entries match a single entry for the badge column in the pays table. The join of the employees table with the pays table is called many-to-one.

select name,pays.badge,hours_worked,rate
from employees,pays
where employees.badge=pays.badge
name                 badge     hours_worked   rate
----                 -----     ------------   ----
Fred Stanhope        6732      40             9
Stan Humphries       3211      40             10
Gertie Humphries     3211      40             10
Stan Humphries Jr    3211      40             10
Winkie Humphries     3211      40             10
Sue Sommers          4411      52             10
Lance Finepoint      5522      39             11
Mark McGuire         1997      51             10
Sally Springer       9998      40             8
Ludmilla Valencia    7773      55             10
Barbara Lint         8883      40             9
Jeffrey Vickers      8005      60             7
Jim Walker           7779      37             11
(13 row(s) affected)

If you switch the order of the joined tables, it becomes a one-to-many join. The following example returns the same rows that were returned in the previous example:

select name,pays.badge,hours_worked,rate
from pays,employees
where pays.badge=employees.badge
name                 badge       hours_worked   rate
----                 -----       ------------   ----
Fred Stanhope        6732        40             9
Stan Humphries       3211        40             10
Gertie Humphries     3211        40             10
Stan Humphries Jr    3211        40             10
Winkie Humphries     3211        40             10
Sue Sommers          4411        52             10
Lance Finepoint      5522        39             11
Mark McGuire         1997        51             10
Sally Springer       9998        40             8
Ludmilla Valencia    7773        55             10
Barbara Lint         8883        40             9
Jeffrey Vickers      8005        60             7
Jim Walker           7779        37             11
(13 row(s) affected)

Using Many-to-Many Joins

You may also want to join tables where more than one row matches more than one row in a second table, which is referred to as a many-to-many join. In the following example, two tables are combined after one row is added to pays with a 3211 badge number, 73 hours_worked, and rate of 31.

select name,pays.badge,hours_worked,rate
from employees,pays
where employees.badge=pays.badge
name                 badge       hours_worked  rate
----                 -----       ------------  ----
Fred Stanhope        6732        40            9
Stan Humphries       3211        40            10
Gertie Humphries     3211        40            10
Stan Humphries Jr    3211        40            10
Winkie Humphries     3211        40            10
Sue Sommers          4411        52            10
Lance Finepoint      5522        39            11
Mark McGuire         1997        51            10
Sally Springer       9998        40            8
Ludmilla Valencia    7773        55            10
Barbara Lint         8883        40            9
Jeffrey Vickers      8005        60            7
Jim Walker           7779        37            11
Stan Humphries       3211        73            31
Gertie Humphries     3211        73            31
Stan Humphries Jr    3211        73            31
Winkie Humphries     3211        73            31
(17 row(s) affected)

The additional row is added to the temporary table that's displayed. If the row value is restricted to only the badge number 3211, eight rows are returned.

Many-to-many queries are often not desirable and can produce results that are difficult to follow. In most cases, it's best to implement either a one-to-many or a many-to-one relationship, even if it entails adding an intermediary table.

For more information on database design approaches, see Chapter 2, "Data Modeling and Database Design"

select name,pays.badge,hours_worked,rate
from employees,pays
where employees.badge=pays.badge
and pays.badge=3211
name                 badge       hours_worked  rate
----                 -----       ------------  ----
Stan Humphries       3211        40            10
Gertie Humphries     3211        40            10
Stan Humphries Jr    3211        40            10
Winkie Humphries     3211        40            10
Stan Humphries       3211        73            31
Gertie Humphries     3211        73            31
Stan Humphries Jr    3211        73            31
Winkie Humphries     3211        73            31
(8 row(s) affected)

Using Outer Joins

In the previous join examples, we excluded the rows in either of the two tables that didn't have corresponding or matching rows.

Previous examples, in which the rows of two tables were joined with a WHERE statement, included all rows of both tables. However, a query that includes all rows from both tables is probably never useful, except to understand the way in which SQL Server combines the rows. You can combine any two or more tables with a WHERE clause and receive a set of rows that were never meant to be combined and thus receive a meaningless result.

In practice, you'll combine only the rows from tables that have been created to be matched together. Tables that are designed to be combined have common columns of information so that a WHERE clause can be included in a query to eliminate the rows that don't belong together (those that have identical values).

You must ensure that the information used to combine tables—the corresponding values in common columns—remain valid. If the value in one table is changed, the corresponding identical value (or values, if a one-to-many relationship exists) must also be updated in other tables.

Referential integrity involves ensuring that you have valid information in common columns across tables used to join tables. You'll read more about referential integrity in subsequent chapters. Chapter 14, "Creating and Managing Triggers" discusses the mechanism for maintaining referential integrity and Chapter 10, "Managing and Using Indexes and Keys" discusses the common table columns on which joins are based.

Use outer joins to return table rows that have both matching and non-matching values. You may need to return the rows that don't contain matching values in the common table columns for either one table or the other tables specified in the SELECT statement.

If, for example, you join the employees table with the pays table used in the previous examples, you can specify the return of rows with matching values along with rows without matching values. The specification of the outer join is positional, which means that you use a special symbol the precedes or follows the comparison operator in the WHERE clause of a SELECT statement.

An outer join references one of the two tables joined using the table's position in the WHERE clause. A left-outer join specifies the table to the left of a comparison operator and a right-outer join specifies the table to the right of a comparison operator. The following table shows the symbol combination used for outer joins.

Symbol Combination Join

*= Left-outer join

=* Right-outer join

A left-outer join (*=) retains non-matching rows for the table on the left of the symbol combination in a WHERE statement. A right-outer join (=*) retains non-matching rows for the table on the right of the symbol combination.

In the following example, a SELECT statement specifies a join in the WHERE clause to return only rows that contain matching values in a common column for the two tables:

select *
from employees,pays
where employees.badge=pays.badge
name             department     badge     hours_worked rate     badge
---------------- -------------- --------- ------------ -------- ------
Stan Humphries     Field Service  3211      40           10       3211
Gertie Humphries   Sales          3211      40           10       3211
Stan Humphries Jr. Sales          3211      40           10       3211
Winkie Humphries   Mailroom       3211      40           10       3211
Fred Stanhope      Field Service  6732      40           9        6732
Sue Sommers        Logistics      4411      52           10       4411
Lance Finepoint    Library        5522      39           11       5522
Mark McGuire       Field Service  1997      51           10       1997
Sally Springer     Sales          9998      40           8        9998
Ludmilla Valencia  Software       7773      55           10       7773
Barbara Lint       Field Service  8883      40           9        8883
Jeffrey Vickers    Mailroom       8005      60           7        8005
Jim Walker         Unit Manager   7779      37           11       7779
Stan Humphries     Field Service  3211      73           31       3211
Gertie Humphries   Sales          3211      73           31       3211
Stan Humphries Jr. Sales          3211      73           31       3211
Winkie Humphries   Mailroom       3211      73           31       3211
(17 row(s) affected)

In the next example, a left-outer join is used in the WHERE clause of a SELECT statement to specify that both rows containing matching values for a common column and the rows from the left table (employees) are included in the rows returned. This might be the case where you needed to find out what employees don't have a pay rate associated with them.

Before the following query was executed, additional rows were added to the employees table that don't have corresponding values in a common column in the pays table:

select *
from employees,pays
where employees.badge*=pays.badge
name               department    badge     hours_worked rate     badge
------------------ ------------- --------- ------------ -------- -----
Stan Humphries     Field Service  3211      40           10       3211
Stan Humphries     Field Servic   3211      73           31       3211
Fred Stanhope      Field Service  6732      40           9        6732
Sue Sommers        Logistics      4411      52           10       4411
Lance Finepoint    Library        5522      39           11       5522
Mark McGuire       Field Service  1997      51           10       1997
Sally Springer     Sales          9998      40           8        9998
Ludmilla Valencia  Software       7773      55           10       7773
Barbara Lint       Field Service  8883      40           9        8883
Jeffrey Vickers    Mailroom       8005      60           7        8005
Jim Walker         Unit Manager   7779      37           11       7779
Bob Smith          SALES          1234      (null)     (null)   (null)
Bob Jones          Sales          2223      (null)     (null)   (null)
Gertie Humphries   Sales          3211      40           10       3211
Gertie Humphries   Sales          3211      73           31       3211
Stan Humphries Jr. Sales          3211      40           10       3211
Stan Humphries Jr. Sales          3211      73           31       3211
Winkie Humphries   Mailroom       3211      40           10       3211
Winkie Humphries   Mailroom       3211      73           31       3211
Susan Smith        Executive      1234      (null)     (null)   (null)
Henry Smith        Logistics      1234      (null)     (null)   (null)
(21 row(s) affected)

Recall that null values don't match, so rows that contain nulls in the primary and foreign key columns are displaying only with outer joins and won't be seen with equi-joins.

In the next example, a right-outer join is used in the WHERE clause of a SELECT statement to specify that both rows that contain matching values for a common column and the rows from the right table (pays) are included in the rows returned. Two additional rows are first added to the pays table that don't have corresponding values in a common column in the employees table.

insert into pays
values (40,10,5555)
insert into pays
values (40,10,5555)
select *
from employees,pays
where employees.badge=*pays.id
name               department     badge     hours_worked rate     id
------------------ ---------- --------- ------------ -------- -----
Stan Humphries      Field Service 3211      40           10       3211
Gertie Humphries    Sales         3211      40           10       3211
Stan Humphries Jr.  Sales         3211      40           10       3211
Winkie Humphries    Mailroom      3211      40           10       3211
Fred Stanhope       Field Service 6732      40           9        6732
Sue Sommers         Logistics     4411      52           10       4411
Lance Finepoint     Library       5522      39           11       5522
Mark McGuire        Field Service 1997      51           10       1997
Sally Springer      Sales         9998      40           8        9998
Ludmilla Valencia   Software      7773      55           10       7773
Barbara Lint        Field Service 8883      40           9        8883
Jeffrey Vickers     Mailroom      8005      60           7        8005
Jim Walker          Unit Manager  7779      37           11       7779
Stan Humphries      Field Service 3211      73           31       3211
Gertie Humphries    Sales         3211      73           31       3211
Stan Humphries Jr.  Sales         3211      73           31       3211
Winkie Humphries    Mailroom      3211      73           31       3211
(null)              (null)        (null)    40           10       5555
(null)              (null)        (null)    40           10       5555
(19 row(s) affected)

Left- and right-outer joins can be used to shows rows that contain nulls which wouldn't have corresponding entries across tables, and would only be displayed with other non-matching entries.

Combining Query Results with UNION

Use a UNION to combine the results of two or more queries. A UNION merges the results of the first query with the results of a second query. UNION implicitly removes duplicate rows between the queries. A UNION returns a single results set that consists of all the rows that belong to the first table, the second table, or both tables.

You should define the queries that contain a UNION clause so that they're compatible. The queries should have the same number of columns and a common column defined for each table. You also can't use a UNION within the definition of a view.

The syntax for queries that include a UNION clause is as follows:

SELECT column_name_1, ..., column_name_n
FROM table_name_1, ... , table_name_n
WHERE column_name comparison_operator value
[GROUP BY...]
[HAVING ...
UNION
SELECT column_name_1, ..., column_name_n
FROM table_name_1, ... , table_name_n
WHERE column_name comparison_operator value
[GROUP BY...]
[HAVING...]
[ORDER BY...]
[COMPUTE...

In the following example, the badge numbers that are common to both tables are displayed using two select statements that are bound with a UNION clause. The ORDER BY clause is used after the last query to order the final results. The ORDER BY clause appears only after the last SELECT statement. Recall that UNION implicitly removes duplicate rows, as defined by the query.

select badge from employees
union
select badge from pays
order by badge
badge
-----------
1234
1997
3211
4411
5522
6732
7773
7779
8005
8883
9998
(11 row(s) affected)

In the following example, the same set of queries is used except the ALL keyword is added to the UNION clause. This retains query-defined duplicates (only the Badge column) in the resultant rows. The duplicate rows from both tables are retained.

select badge from employees
union all
select badge from pays
order by badge
badge
------
1234
1234
1234
1997
1997
3211
3211
3211
3211
3211
3211
4411
4411
5522
5522
6732
6732
7773
7773
7779
7779
8005
8005
8883
8883
9998
9998
(27 row(s) affected)

In the following example, the datatypes referenced in the query for one of the two columns aren't compatible. The execution of the example returns an error because of this.

select name,badge from employees
union
select hours_worked,badge from pay
Msg 257, Level 16, State 1
Implicit conversion from datatype 'char' to 'int' is not allowed.

 Use the CONVERT function to run this query.

You can use a UNION clause with queries to combine the rows from two compatible tables and merge the rows into a new third table. To illustrate this merge, in which the results are kept in a permanent table, a new table is created that has the same datatypes as the existing employees table. Several rows are first inserted into the new table.

create table employees2
(name char(20),department char(20),badge int)
go
insert into employees2
values ('Rod Gilbert','Sales',3339)
go
insert into employees2
values ('Jean Ratele','Sales',5551)
go
insert into employees2
values ('Eddie Giacomin','Sales',8888)

Each table now has rows that contain employees records. If you use a UNION clause to combine the SELECT statements along with INSERT INTO, the resultant rows can be retained in a new table.

The SELECT statement that references the employees table uses WHERE to restrict the rows returned to only those with the Sales department. All three rows of the employees2 table are in the Sales department, so no WHERE clause is necessary.

create table employees3
(name char(20),department char(20),badge int))
go
insert into employees3
select * from employees
where department='Sales'
union
select * from employees2
(6 row(s) affected)
select * from employees3
name                 department           badge
----                 ----------           -----
Eddie Giacomin       Sales                8888
Gertie Humphries     Sales                3211
Jean Ratele          Sales                5551
Rod Gilbert          Sales                3339
Sally Springer       Sales                9998
Stan Humphries Jr    Sales                3211
(6 row(s) affected)

As in the previous example, you can use UNION to combine multiple tables, or combinations of selected columns and rows from tables, into an existing or new table. You can have tables with identical columns at different office locations in which rows are added throughout the day.

At the end of the work day, you can use a set of SELECT statements with a UNION clause to add the separate collection tables to a master table at a central location. After the rows are copied to the master table, the rows of the collection tables can be removed using a DELETE FROM statement without a WHERE clause, or a TRUNCATE statement as mentioned earlier in this chapter.

You can combine up to 16 SELECT statements by adding additional UNION clauses between each set of SELECT statements. You can use parentheses to control the order of the UNIONs. The SELECT statements within parentheses are performed before those that are outside parentheses. You don't need to use parentheses if all the UNIONs are UNION ALL. You also don't need parentheses if none of the UNIONs is UNION ALL.

In the following example, three tables are combined using a UNION clause. IN is used with the first table to specify multiple forms of one department. The second table doesn't use WHERE and all rows are selected. The third table specifies only a single case-sensitive department name, assuming the sort order is case-sensitive.

You don't need parentheses to control the order of the merges because no UNIONs use ALL. The resultant rows are ordered by Badge and Name using an ORDER BY clause that can appear only after the last SELECT statement.

select name,badge
from employees
where department in ('SALES','Sales','sales')
union
select name,badge
from employees2
union
select name,badge
from employees3
where department='Sales'
order by badge,name
name                 badge
----                 -----
Bob Smith            1234
Gertie Humphries     3211
Stan Humphries Jr    3211
Rod Gilbert          3339
Jean Ratele          5551
Eddie Giacomin       8888
Sally Springer       9998
(7 row(s) affected)

From Here...

In this chapter you learned to add rows and delete rows from a database tables as well as changed the characteristics of a table. You also learned how to perform operations on multiple tables using relational joins and UNION statements. Additional related information can be found in the following chapters:

Chapter 2, "Data Modeling and Database Design" teaches you how you can create tables that logically lend themselves to the types of joins we've covered here.
Chapter 10, "Managing and Using Indexes and Keys" teaches you how to define the common columns in tables that are used to perform joins.
Chapter 12, "Understanding Transactions and Locking" teaches you how to set up a query so that you can undo it.
Chapter 14, "Creating and Managing Triggers" teaches you how to maintain referential integrity, which ensures that correct rows are joined between tables.

QUE Home Page

For technical support For our books And software contact support@mcp.com

Table of Contents

06 - Retrieving Data with Transact-SQL

08 - Using Functions