Information that's stored in multiple tables is often combined in a single query. To combine rows logically across tables, the tables must be created with related columns of data. You must be able to issue queries that not only combine but also eliminate rows from multiple tables, so that only the requisite rows appear in the resultant display.
You'll need to use an UPDATE statement to change one (or more) row's column values for a table. You may need to change a row simply to correct information that was incorrectly entered. You'll want to be able to remove rows that are no longer needed in a table. You use a DELETE statement to remove one or more rows from a table.
In addition to changing table rows, you'll find that you may need to add additional columns of information to a database table. You use the ALTER TABLE statement primarily to add one more columns to an already existing table, as well as change other characteristics of the table.
In this chapter, you'll learn to write queries that retrieve rows of related information from multiple tables. You'll also learn to update and delete the rows of a database table. Also, you'll learn to change the characteristics of database tables including how to add columns to a table.
UPDATE table_name
SET column_name_1 = value,......column_name_n = value
WHERE column_name comparison operator value
You use the WHERE clause to identify the rows to be changed. The WHERE clause, used as part of a SELECT statement, narrows the scope of your selection of rows that will be returned or affected by the query. In an UPDATE statement, the WHERE clause is used to identify the rows that are changed, rather than the rows to be displayed.
You can use the UPDATE statement to change erroneous entries or misspellings for the column values of existing rows of a table.
In the following example, the values for the department and badge columns of the Employees table are changed for the employee Bob Smith. If more than one row has an employee named Bob Smith, the department and badge number of each row is changed.
update employees
set department = 'SALES', badge = 1232
where name = 'Bob Smith'
You can also update views as well as tables with the UPDATE statement. You simply use the name of the View in place of the table name in the UPDATE clause of the UPDATE statement. In many different operations, views are treated the same as tables.
You can use UPDATE to change multiple rows that match the criteria specified by the WHERE clause. In the following example, all rows that contain the department SALES are changed to MARKETING.
update employees
set department = 'MARKETING'
where department = 'SALES'
You must be careful that you specify only the rows you want changed. If you omit a WHERE clause from an UPDATE statement, the change specified in the SET clause is made to every row of the table. The following example shows a change that's made to all rows of a table:
update employees
set wageclass = 'W0'
There are usually two reasons why an UPDATE statement wouldn't contain a WHERE clause: because the WHERE was inadvertently omitted, or because you purposely want to change a column for all rows of a table. For example, when you add a new column to a table with the ALTER TABLE command, you may have assigned a null values for the new column for all existing rows of the table. You can use an UPDATE statement without a WHERE clause to add a non-null value to the new column for all rows.
You can use a SELECT count(*) statement with the same criteriaspecifically, your WHERE clausethat you plan to use in your UPDATE statement to learn the number of rows that will be subsequently changed by your UPDATE statement. By first determining the number of rows that will be affected by your UPDATE, you're more likely to notice any mistakes in your criteria.
See the section in Chapter 8 entitled Using COUNT
SET
Clause Options
You can also use an expression or the keywords DEFAULT and NULL in the SET clause of an UPDATE statement. If you use an expression in a SET clause rather than a constant value, the expression is first evaluated, and its result is assigned to the rows that are specified in the UPDATE statement.
In the following example, a raise in the hourly rate is given to all employees by updating the rate column of the pays table.
update pays
set rate=rate+2
You can use the keyword NULL to change the column value of the specified rows of a table to nulls. The table column that's to be assigned a null value must have been created with the NULL characteristic originally. In the following example, an employee who has been moved out of his current department but not yet assigned to another department has his department changed to a null.
update employees
set department=null
where name='Bob Smith'
You can also use the UPDATE statement to assign a DEFAULT value if a default value has been associated with the table column. In the following example, the department for an employee is changed to the default value which was previously established and associated with the department column.
update employees
set department=default
where name ='Sally Springer'
See the Chapter 11 sections entitled Creating Defaults and Binding Defaults
If a default doesn't exist for the column and the column permits nulls, the column value is changed to a null.
A second way in which you can change the rows of a table is an indirect or deferred update. In such an update, the change is made by deleting the row to be modified and then inserting the row as a new row with the new values in place. A deferred update is slower because two operations are required to made a change the row of a table.
The conditions under which a direct update can be performed are primarily determined by restrictions set on the database table. The following conditions must be met for a direct update to be performed on a table:
Also, a number of conditions must be met for an update in place to be performed on updates that change a single row:
The following set of conditions must be met for updates that change multiple rows to be performed in place:
See the Chapter 12 section entitled Understanding Locks
See the Chapter 10 section entitled Defining Indexes
See the Chapter 14 section entitled Using INSERT and UPDATE Triggers
If neededperhaps because you'll be making many subsequent updates on your database tablesyou can plan the table design so that all updates are direct. You can consider all the restrictions for direct updates to ensure that your updates are performed as quickly as possible.
You can use the query option SHOWPLAN to determine whether an update was direct or deferred.
DELETE [FROM] table_name
WHERE column_name = 'value'
You don't need to use the keyword FROM; it's optional in the DELETE statement. You can delete rows from tables,as well as update tables, through views.
In the following example, the operation of the DELETE statement removes all rows that match the criteria specified in the WHERE clause of the DELETE statement. In this case, we're removing all rows that contain the department "SALES".
delete from employees
where department = 'SALES'
You can first use a COUNT function in a SELECT statement that has an identical WHERE clause to your DELETE statement to determine the number of rows that will be subsequently removed.
I executed a DELETE statement and, even though I uppercased all letters to reference the column values in the WHERE clause of my DELETE statement, it deleted rows that contain the same value in lowercase letters. Why did this happen?
You must be aware of the default sort order that was selected when SQL Server was installed. Microsoft SQL Server installs with a case-insensitive sort order and this is why the DELETE statement didn't distinguish between upper- and lowercase. If you want all your subsequent DELETE statementsas well as UPDATE statementsto be case-sensitive, you might want to update SQL Server using SQL Setup from the SQL Server program group to effectively reinstall SQL Server with a case-sensitive sort order specified.
See the Appendix A section entitled "Installing Server and Client Software"
You can use a DELETE FROM statement to remove multiple rows as well as individual rows. However, use the DELETE FROM statement carefully. If you don't use a WHERE clause in a DELETE FROM statement, all table rows are removedleaving you with an empty table. You'll receive no warning before the DELETE FROM statement is executed. In the following example, all rows of the specified table are deleted:
delete from employees
You should define a transaction when you use statements such as DELETE FROM and UPDATE so that you can later change your mind and undo the removal or change of rows. See Chapter 12, "Understanding Transactions and Locking" to learn how to define and undo a transaction. It's often desirable to keep your transactions to a few statements as necessary, so that other users can access the latest information about a table.
I began entering a DELETE FROM statement and got as far as the name of the table when I accidentally executed the query. When I try to read the table, all the rows are gone. Other products ask me to confirm an unrecoverable action such as the deletion of all table rows. I'm surprised that SQL Server doesn't do this. How can I prevent this from happening again?
Because SQL Server won't ask you to confirm the deletion or update of one or all table rows, you must be sure that you want the action to be performed before you allow it to execute.
The execution of a DELETE statement without a WHERE clause that removes all rows of a table is most often an accident. If you want to delete all rows of a table, but still keep the table intact, you should use the TRUNCATE statement. The syntax of TRUNCATE table is:
Truncate table_name
The advantage of using a TRUNCATE TABLE statement is that the removal of rows is completed faster than an equivalent DELETE statement. The truncate statement is faster because it removes pages of information that contain multiple tables rows at a time rather than the delete statement, which removes individual rows at a time. However, you can't recover table rows with the TRUNCATE TABLE statement. Unlike the DELETE statement, the TRUNCATE statement does not maintain a copy of the deleted rows even if it's part of a defined transaction.
TRUNCATE TABLE and DELETE TABLE retain the database table. If you want to permanently remove a table, as well all rows that it contains, you can use the DROP TABLE statement (which uses the following syntax):
DROP TABLE table_name
After you drop a table, you can't recover the rows that it contained except from a previously made backup copy of the table.
Another advantage of the truncate statement is that it won't log the removal of the information in the transaction log. If you have a situation where your transaction log has become full, you can still use the TRUNCATE statement to remove rows and free up space in the database.
Other implementations of SQL do allow changes even to the datatype of existing columns to be performed through the ALTER TABLE statement. The ALTER TABLE statement in SQL Server and Transact-SQL, however, doesn't permit datatype changes to existing rows. In Microsoft SQL Server, you must create a new table, read the rows out of the old table and into the new table to effect such a change.
For more information, see the sections entitled Changing the Width of a Table Column and Removing a Column from a Table later in this chapter.
The syntax of the ALTER TABLE statement is as follows:
ALTER TABLE [[<database.>]<owner.>]<table_name>
ADD <column_name> <datatype> NULL [constraint],
<column_name> <datatype> NULL...] [constraint]
[WITH NOCHECK]
[DROP [CONSTRAINT]
constraint_name [..., constraint_name_n]]
When ALTER TABLE is executed, it doesn't expand existing rows. It changes only the internal description of the added columns in the system tables. Each time an existing row is read from the disk, SQL Server adds the additional null entry for the new column or columns before it's available to a user.
When a new row is written to the disk, SQL Server creates the new row with the additional column and its value. SQL Server writes the row with the additional column unless no value is specified for the new row and its value remains a null. In the following example, sp_help is used to display the existing characteristics of a table in which three columns are defined:
sp_help employees3
Name Owner Type
------------------------------ --------------------------------------
employees3 dbo user table
Data_located_on_segment When_created
------------------------------ ---------------------------
default Jul 5 1994 10:08PM
Column_name Type Length Nulls
Default_name Rule_name
--------------- --------------- ------ ----- --------------- ------
name char 30 0 (null) (null)
department char 30 0 (null) (null)
badge int 4 0 (null) (null)
Object does not have any indexes.
No defined keys for this object.
ALTER TABLE is used to add a new column to the table. You use the sp_help procedure to verify that the new columns have been added to the table. In the following example, SELECT displays all rows of the new table, including nulls in the new column for all rows:
alter table employees3
add wageclass char(2) null
sp_help employees3
Name Owner Type
------------------------------ ------------------------------ ----------------
employees3 dbo user table
Data_located_on_segment When_created
------------------------------ ---------------------------
default Jul 5 1994 10:08PM
Column_name Type Length Nulls
Default_name Rule_name
--------------- --------------- ------ ----- --------------- ---------------
name char 30 0 (null) (null)
department char 30 0 (null) (null)
badge int 4 0 (null) (null)
wageclass char 2 1 (null) (null)
Object does not have any indexes.
No defined keys for this object.
select * from employees3
name department badge wageclass
------------------------------ ------------------------------ ------- ---------
Stan Humphries Field Service 3211 (null)
Fred Stanhope Field Service 6732 (null)
Sue Sommers Logistics 4411 (null)
Lance Finepoint Library 5522 (null)
Mark McGuire Field Service 1997 (null)
Sally Springer Sales 9998 (null)
Ludmilla Valencia Software 7773 (null)
Barbara Lint Field Service 8883 (null)
Jeffrey Vickers Mailroom 8005 (null)
Jim Walker Unit Manager 7779 (null)
Bob Smith SALES 1234 (null)
(11 row(s) affected)
You can use an UPDATE statement to define values for new columns that are added to a table with ALTER TABLE.
The null values are inserted when a new column is added to the table with the ALTER TABLE statement. In the following example, all table rows have a new value added to the column that was added with an earlier UPDATE TABLE statement. A subsequent SELECT statement is used to display all rows of the table, which includes the new column values.
update employees3
set wageclass='w4'
(11 row(s) affected)
select * from employees3
name department badge wageclass
---------------------- ---------------------- ----------- ---------
Stan Humphries Field Service 3211 w4
Fred Stanhope Field Service 6732 w4
Sue Sommers Logistics 4411 w4
Lance Finepoint Library 5522 w4
Mark McGuire Field Service 1997 w4
Sally Springer Sales 9998 w4
Ludmilla Valencia Software 7773 w4
Barbara Lint Field Service 8883 w4
Jeffrey Vickers Mailroom 8005 w4
Jim Walker Unit Manager 7779 w4
Bob Smith SALES 1234 w4
(11 row(s) affected)
You can also define a new column that you've added to a table with the identity characteristic. In Chapter 5, "Creating Database Tables and Using Datatypes" you learned that the identify characteristic permits you to define an initial value for the first row of the table (the seed) and a value that's added to each successive column to automatically generate a new column value (the increment).
You can't assign the identity characteristic to an existing column. Only new columns that are added to a table with the ALTER TABLE command can be defined with the identity characteristic. Also, if the value automatically generated for a new column by the identity mechanism exceeds the allowable values for the column's datatype, the ALTER TABLE statement fails and an error is displayed.
See the Chapter 5 section entitled "identity Property."
In the following example, an additional column is added to table pays, which is defined with the identity characteristic and can be subsequently used as a row number.
Alter table pays
add row_number identity(1,1)
You can also add one or more columns to a table using the SQL Enterprise Manager. To add a column to an existing table through the SQL Enterprise Manager, follow these steps:
Fig. 7.1 - You can't deselect the null property on a column added to an existing table.
You can also double-click a table to open it and edit its structure. This might be the case where you want to add a new column to the table.
In the following example, a new table is created in which the name and datatype of the first column is identical to the first column in an existing table, but the first column in the new table is smaller in size. In the new table, the second column is defined as VARCHAR instead of CHAR, as it's defined in the second column of the existing table. The new table's third column is defined as SMALLINT instead of INT, as it's defined in the corresponding Badge column in the existing table. The SMALLINT datatype uses half the storage space of INT.
After the new table is created, all rows of the older table are loaded into the new table with an INSERT statement. A SELECT statement is then used to display the rows of the new table.
create table employees4
(name char(15), department varchar(20),badge smallint)
insert into employees4
select name,department,badge from employees
(11 row(s) affected)
select * from employees4
name department badge
--------------- -------------------- ------
Stan Humphries Field Service 3211
Fred Stanhope Field Service 6732
Sue Sommers Logistics 4411
Lance Finepoint Library 5522
Mark McGuire Field Service 1997
Sally Springer Sales 9998
L. Valencia Software 7773
Barbara Lint Field Service 8883
Jeffrey Vickers Mailroom 8005
Jim Walker Unit Manager 7779
Bob Smith SALES 1234
(11 row(s) affected)
The sp_help procedure shows the difference between datatypes in the corresponding columns of the two tables used in the example. The example shows only the relevant parts of the display returned by sp_help.
sp_help employees
Name Owner Type
------------------------------ ------------------------------ ------
employees dbo user table
...
name char 20 0 (null) (null)
department char 20 0 deptdefault (null)
badge int 4 0 (null) (null)
...
sp_help employees4
Name Owner Type
------------------------------ ------------------------------ ------
employees4 dbo user table
--------------- --------------- ------ ----- --------------- --------
...
name char 15 0 (null) (null)
department varchar 20 0 (null) (null)
badge smallint 2 0 (null) (null)
...
The INSERT table statement successfully completes because the data from the earlier table is compatible with the columns defined for the new table. If the data isn't compatible between the tables, you'll receive an error. In the following example, a new table is created that defines a column as a character datatype. The attempted insertion of the corresponding column from one table results in an error because the datatypes can't be implicitly converted.
create table onecolumn
(badge char(4))
insert into onecolumn
select badge from employees
Msg 257, Level 16, State 1
Implicit conversion from datatype 'int' to 'char' is not allowed.
Use the CONVERT function to run this query.
If you're transferring a large number of rows between tables, you can first set a database option called select into/bulkcopy. If the select into/bulkcopy option is set, your rows are copied into a new table faster because SQL Server keeps less information in its transaction logs about your operation. The lack of complete log information about your operation, which prevents an undo or rollback operation to be done later, is probably not important because you still have the rows in the original table should the need arise to undo any operations.
From an ISQL/W command line, the select into/bulk copy option can be set on or off by issuing the following command:
sp_dboption database_name, 'select into/bulkcopy', TRUE|FALSE
For example, the following command turns on select into/bulkcopy for the database employees:
sp_dboption database_employees, 'select into/bulkcopy', true
You can also change a database option using the graphical interface of the SQL Enterprise Manager rather than a command line.
In the following example, a new table is defined that contains only two of the three columns defined in an existing table. INSERT is used with a SELECT statement that references only two of the three columns of the original table in the SELECT clause.
create table employees5
(name char(20), badge int))
insert into employees5
select name,badge from employees
See the Chapter 5 section entitled Creating and Using Constraints
In the following example, a unique CONSTRAINT is added to the badge column for the table employees2.
ALTER TABLE employees2
ADD
CONSTRAINT badgeunc UNIQUE NONCLUSTERED (badge)
Microsoft added a number of additional options to the ALTER TABLE statement in version 6 of SQL Server. All the additions were characteristics that were made to a table in other ways before version 6. You can continue to use the older and more direct ways of changing table characteristics. For example, an index, default, or rule can be defined and subsequently associated with a table using CREATE INDEX, CREATE RULE, or CREATE DEFAULT commands.
The changes that were made to the ALTER TABLE statement, as well as many other statements, allow Transact-SQL to meet the specifications of a standardized specification of SQL, ANSI SQL. The additions for ANSI compatibility result in multiple ways of performing the same operations, sometimes using different keywords or syntax.
You can easily drop a constraint from a table using the drop constraint clause of the alter table statement. You simply specify the name of the constraint to be removed from a table after the keywords drop constraint. For example, to remove a default constraint on the department column for the employees table, enter the following statement:
alter table employees
drop constraint department_default
You can't use WITH NOCHECK to override the initial checking of PRIMARY KEY and UNIQUE constraints. By default, SQL Server applies the constraints to existing rows in the table as well as new rows that are added or changed later. You'll receive an error message and the ALTER TABLE statement will fail if existing data violates your constraint.
You can also specify that a CHECK constraint that's added to a table through the ALTER TABLE statement isn't applied to the existing rows of a table through the NOT FOR REPLICATION clause. NOT FOR REPLICATION operates as though the WITH NOCHECK clause was added to the ALTER TABLE statement. The NOT FOR REPLICATION clause is added to an ALTER TABLE statement for a different purpose than the WITH NOCHECK clause.
If you set up the automatic copying of a table and table rows from one server system to another, the actual work of ensuring the server system that receives a copy of the data is done by an intermediate server. The NOT FOR REPLICATION clause is added to an ALTER TABLE statement to prevent the table copy on the intermediate server from being checked, an unnecessary operation.
See the Chapter 20 section entitled "Setting Up and Managing Replication"
Click the Advanced Features tool on the toolbar. You can click the Primary Key/Identity, Foreign Keys, Unique Constraints, or Check Constraints tabs to create each type of constraint.
Fig. 7.2. - The Not for Replication check box can also be checked when the Check Constraint option is defined.
You perform a natural join or equi-join by matching equal values for rows in shared columns between multiple tables. You must define one of the two tables so that a column from one of the tables is duplicated in the second table. The column from the original table can be its primary key if it also is a column with values that make the rows of the table unique. The duplicate column that's added to a second table is referred to as a foreign key.
You define a foreign key to permit rows from different tables to be related. In a sense, the matching columns are used to form virtual rows that span a database table. Although each table is limited to 250 columns, matching columns to combine rows from multiple tables can result in almost an unlimited number of columns that can be combined across tables.
You use a standard SELECT statement with a WHERE clause to retrieve data from two or more tables. The syntax of a SELECT statement that's used to join two tables is as follows:
SELECT column_name_1,...column_name_n
FROM table_name_1, table_name_2
WHERE primary_key_column=
foreign_key_column
The following create table and insert statements are used to create a new table in the same database to be used for subsequent relational join examples. Each row that's added to the pays table matches one row of the employees table.
create table pays
(hours_worked int, rate int,badge int)
go
insert into pays
values (40,10,3211);
go
insert into pays
values (40,9,6732);
go
insert into pays
values (52,10,4411);
go
insert into pays
values (39,11,5522);
go
insert into pays
values (51,10,1997);
go
insert into pays
values (40,8,9998);
go
insert into pays
values (55,10,7773);
insert into pays
values (40,9,8883);
go
insert into pays
values (60,7,8005);
go
insert into pays
values (37,11,7779);
go
In the following example, three columns from two tables are displayed after the Badge column is used to combine the rows that have matching Badge numbers.
select name, department,hours_worked
from employees,pays
where employees.badge=pays.badge
name department
hours_worked
-------------------- -------------------- --
Stan Humphries Field Service 40
Fred Stanhope Field Service 40
Sue Sommers Logistics 52
Lance Finepoint Library 39
Mark McGuire Field Service 51
Sally Springer Sales 40
Ludmilla Valencia Software 55
Barbara Lint Field Service 40
Jeffrey Vickers Mailroom 60
Jim Walker Unit Manager 37
(10 row(s) affected)
An equi-join doesn't eliminate any of the table columns from the temporary tables that are formed by the join of tables. You must use a WHERE clause to match the corresponding rows of the tables. You also shouldn't use the asterisk wildcard character to reference all columns of the combined tables. If you use an asterisk, the columns with matching values are displayed twice.
In the following example, the rows of two tables are accessed without using a WHERE clause. SQL Server forms a cross-product of the rows in both tables. If you don't try to combine the rows using matching columns, each row of the second table in the FROM clause is added to every row of the first table. The Badge column is displayed from both tables because the asterisk wildcard is used in the SELECT clause.
select *
from employees,pays
name department badge hours_worked rate badge
------------------ -------------- -------- ------------ ------ -----
Stan Humphries Field Service 3211 40 9 6732
Stan Humphries Field Service 3211 40 10 3211
Stan Humphries Field Service 3211 52 10 4411
Stan Humphries Field Service 3211 39 11 5522
Stan Humphries Field Service 3211 51 10 1997
Stan Humphries Field Service 3211 40 8 9998
Stan Humphries Field Service 3211 55 10 7773
Stan Humphries Field Service 3211 40 9 8883
Stan Humphries Field Service 3211 60 7 8005
Stan Humphries Field Service 3211 37 11 7779
Fred Stanhope Field Service 6732 40 9 6732
...
(100 row(s) affected)
The combination of all rows of the second table with the first table results in a cross-product, also called a Cartesian Product, of the two tables. In the example, the employees table and the pays table each contains 10 rows, so the resultant cross-product creates 100 rows in the temporary table. However, only 10 of the 100 rows belong together. The badge numbers match in one out of every 10 rows between the two tables.
I joined together two tables that didn't have matching columns with a SELECT statement that didn't contain a WHERE clause. I was surprised that SQL Server returned a huge table.
The SELECT statement operates on multiple tables whether or not they were designed to be combined with a relational join. It's important to always use a WHERE clause, which eliminates rows that don't have matching column values. If you don't use a WHERE clause, you'll receive a temporary table that contains the cross-product of the number of rows in the first table multiplied by the number of rows in the second table. For example, two tables that each contains only 100 rows joined without a WHERE clause will return 10,000 rows.
If you reference one of the columns used to match the rows across both tables, you must indicate the table in which the column is defined. Any time you reference a column that has the same name in multiple tables, you must somehow specify which column from which table to prevent ambiguity. The following example displays an error because SQL Server doesn't know from which table to display the Badge column.
select badge
from employees,pays
where employees.badge=pays.badge
Msg 209, Level 16, State 1
Ambiguous column name badge
To avoid ambiguity, the table columns used for matching rows are preceded by the table in which they're defined and separated by a period (.). In the following example, the Badge column is displayed from the first table by preceding the name of the Badge column with its table name.
select employees.badge
from employees,pays
where employees.badge=pays.badge
badge
-----------
3211
6732
4411
5522
1997
9998
7773
8883
8005
7779
(10 row(s) affected)
Other dialects of SQL refer to a range variable as an alias.
Range variables are symbolic references for tables that are specified in the FROM clause of a SELECT statement. You can use a range variable in a preceding clause. such as the SELECT clause, or in a clause that comes after the FROM clause, such has a WHERE clause. Define a range variable by specifying a character constant following the name of a table in the FROM clause of a SELECT statement, as in the following syntax:
...
From table_name_1 range_name_1, ...,table_name_n range_name_n
...
You can define a range variable for each table that's specified in the FROM clause. You can use as many as 30 charactersthe limit for any permanent or temporary object in Transact-SQLto define the range variable. A range variable can be defined to provide a shorter reference to a table in a SELECT statement.In the following example, a range variable is defined for each table. The range variables are used in both the SELECT and WHERE clauses.
select e.badge,p.id
from employees e,pays p
where e.badge=p.id
badge id
----------- ----
3211 3211
3211 3211
3211 3211
3211 3211
6732 6732
4411 4411
5522 5522
1997 1997
9998 9998
7773 7773
8883 8883
8005 8005
7779 7779
3211 3211
3211 3211
3211 3211
3211 3211
(17 row(s) affected)
Range variables are called range variables because after they're defined, the symbolic reference applies to, or ranges through, the table. As in the previous example, you can define a range variable to be a single character and use it as a short nickname for a table.
Range variables can be quite handy because several tables with long names can be specified in a SELECT statement. You can combine rows from as many as 16 tables in SQL Server using Transact-SQL. If you don't explicitly define range variables in the FROM clause, they're implicitly created using the complete name of each table.
In the following example, the rows from three tables are combined using the Badge columns that are defined in all the tables. The range variables are implicitly defined to the table names and are used in both the SELECT and WHERE clauses. The WHERE clause first combines the first and second tables; then the first and third tables use the AND Boolean operator.
select pays.badge, name,department, payrate
from employees,pays, salaries
where employees.badge = pays.badge
and employees.badge=salaries.badge
Transact-SQL automatically establishes range variables for use in queries. If you don't specify a range variable for the name of a table in the FROM clause, a range variable is created with the same name as the table, as in the following example:
select name
from employees
is internally rewritten as:
select employees.name
from employees employees
Although it seems unnecessary to create range variables when only a single table is named in the query, they are mandatory when you reference tables that contain columns with the same names in the same query.
In the ensuing examples, rows have been added with identical badge numbers in the employees table. Three employees have been added, each with a last name of Smith and each with the same badge number. In the tables referenced in the following examples, badge isn't defined as a primary key, so duplicate badge numbers can be present. Three employees with last names of Humphries have also been added, each with the same badge number as the original employee, Stan Humphries. The following example shows the rows of the employees table after the additional seven rows are added.
Although you usually define a primary and foreign key using the corresponding columns of tables that you subsequently want to combine rows form for display, you aren't required to define the columns as keys. SQL Server will permit you to perform joins on tables that don't have primary or foreign key definitions. You should realize that the assignment of primary and foreign keys to a table isn't required, though it's often desirable, if you combine data from different tables.
select * from employees
order by badge
name department badge
-------------------- -------------------- ----
Bob Smith SALES 1234
Henry Smith Logistics 1234
Susan Smith Executive 1234
Mark McGuire Field Service 1997
Gertie Humphries Sales 3211
Stan Humphries Field Service 3211
Stan Humphries Jr Sales 3211
Winkie Humphries Mailroom 3211
Sue Sommers Logistics 4411
Lance Finepoint Library 5522
Fred Stanhope Field Service 6732
Ludmilla Valencia Software 7773
Jim Walker Unit Manager 7779
Jeffrey Vickers Mailroom 8005
Barbara Lint Field Service 8883
Sally Springer Sales 9998
(16 row(s) affected)
The pays table is unaltered and contains only the original 10 rows, as shown in the following example:
select * from pays
order by badge
hours_worked rate badge
------------ ----------- -----
51 10 1997
40 10 3211
52 10 4411
39 11 5522
40 9 6732
55 10 7773
37 11 7779
60 7 8005
40 9 8883
40 8 9998
(10 row(s) affected)
You can combine tables that have an unequal number of matching rows. The following example joins the employees table, in which two sets of entries match a single entry for the badge column in the pays table. The join of the employees table with the pays table is called many-to-one.
select name,pays.badge,hours_worked,rate
from employees,pays
where employees.badge=pays.badge
name badge hours_worked rate
---- ----- ------------ ----
Fred Stanhope 6732 40 9
Stan Humphries 3211 40 10
Gertie Humphries 3211 40 10
Stan Humphries Jr 3211 40 10
Winkie Humphries 3211 40 10
Sue Sommers 4411 52 10
Lance Finepoint 5522 39 11
Mark McGuire 1997 51 10
Sally Springer 9998 40 8
Ludmilla Valencia 7773 55 10
Barbara Lint 8883 40 9
Jeffrey Vickers 8005 60 7
Jim Walker 7779 37 11
(13 row(s) affected)
If you switch the order of the joined tables, it becomes a one-to-many join. The following example returns the same rows that were returned in the previous example:
select name,pays.badge,hours_worked,rate
from pays,employees
where pays.badge=employees.badge
name badge hours_worked rate
---- ----- ------------ ----
Fred Stanhope 6732 40 9
Stan Humphries 3211 40 10
Gertie Humphries 3211 40 10
Stan Humphries Jr 3211 40 10
Winkie Humphries 3211 40 10
Sue Sommers 4411 52 10
Lance Finepoint 5522 39 11
Mark McGuire 1997 51 10
Sally Springer 9998 40 8
Ludmilla Valencia 7773 55 10
Barbara Lint 8883 40 9
Jeffrey Vickers 8005 60 7
Jim Walker 7779 37 11
(13 row(s) affected)
select name,pays.badge,hours_worked,rate
from employees,pays
where employees.badge=pays.badge
name badge hours_worked rate
---- ----- ------------ ----
Fred Stanhope 6732 40 9
Stan Humphries 3211 40 10
Gertie Humphries 3211 40 10
Stan Humphries Jr 3211 40 10
Winkie Humphries 3211 40 10
Sue Sommers 4411 52 10
Lance Finepoint 5522 39 11
Mark McGuire 1997 51 10
Sally Springer 9998 40 8
Ludmilla Valencia 7773 55 10
Barbara Lint 8883 40 9
Jeffrey Vickers 8005 60 7
Jim Walker 7779 37 11
Stan Humphries 3211 73 31
Gertie Humphries 3211 73 31
Stan Humphries Jr 3211 73 31
Winkie Humphries 3211 73 31
(17 row(s) affected)
The additional row is added to the temporary table that's displayed. If the row value is restricted to only the badge number 3211, eight rows are returned.
Many-to-many queries are often not desirable and can produce results that are difficult to follow. In most cases, it's best to implement either a one-to-many or a many-to-one relationship, even if it entails adding an intermediary table.
For more information on database design approaches, see Chapter 2, "Data Modeling and Database Design"
select name,pays.badge,hours_worked,rate
from employees,pays
where employees.badge=pays.badge
and pays.badge=3211
name badge hours_worked rate
---- ----- ------------ ----
Stan Humphries 3211 40 10
Gertie Humphries 3211 40 10
Stan Humphries Jr 3211 40 10
Winkie Humphries 3211 40 10
Stan Humphries 3211 73 31
Gertie Humphries 3211 73 31
Stan Humphries Jr 3211 73 31
Winkie Humphries 3211 73 31
(8 row(s) affected)
Previous examples, in which the rows of two tables were joined with a WHERE statement, included all rows of both tables. However, a query that includes all rows from both tables is probably never useful, except to understand the way in which SQL Server combines the rows. You can combine any two or more tables with a WHERE clause and receive a set of rows that were never meant to be combined and thus receive a meaningless result.
In practice, you'll combine only the rows from tables that have been created to be matched together. Tables that are designed to be combined have common columns of information so that a WHERE clause can be included in a query to eliminate the rows that don't belong together (those that have identical values).
You must ensure that the information used to combine tablesthe corresponding values in common columnsremain valid. If the value in one table is changed, the corresponding identical value (or values, if a one-to-many relationship exists) must also be updated in other tables.
Referential integrity involves ensuring that you have valid information in common columns across tables used to join tables. You'll read more about referential integrity in subsequent chapters. Chapter 14, "Creating and Managing Triggers" discusses the mechanism for maintaining referential integrity and Chapter 10, "Managing and Using Indexes and Keys" discusses the common table columns on which joins are based.
Use outer joins to return table rows that have both matching and non-matching values. You may need to return the rows that don't contain matching values in the common table columns for either one table or the other tables specified in the SELECT statement.
If, for example, you join the employees table with the pays table used in the previous examples, you can specify the return of rows with matching values along with rows without matching values. The specification of the outer join is positional, which means that you use a special symbol the precedes or follows the comparison operator in the WHERE clause of a SELECT statement.
An outer join references one of the two tables joined using the table's position in the WHERE clause. A left-outer join specifies the table to the left of a comparison operator and a right-outer join specifies the table to the right of a comparison operator. The following table shows the symbol combination used for outer joins.
Symbol Combination Join
*= Left-outer join
=* Right-outer join
A left-outer join (*=) retains non-matching rows for the table on the left of the symbol combination in a WHERE statement. A right-outer join (=*) retains non-matching rows for the table on the right of the symbol combination.
In the following example, a SELECT statement specifies a join in the WHERE clause to return only rows that contain matching values in a common column for the two tables:
select *
from employees,pays
where employees.badge=pays.badge
name department badge hours_worked rate badge
---------------- -------------- --------- ------------ -------- ------
Stan Humphries Field Service 3211 40 10 3211
Gertie Humphries Sales 3211 40 10 3211
Stan Humphries Jr. Sales 3211 40 10 3211
Winkie Humphries Mailroom 3211 40 10 3211
Fred Stanhope Field Service 6732 40 9 6732
Sue Sommers Logistics 4411 52 10 4411
Lance Finepoint Library 5522 39 11 5522
Mark McGuire Field Service 1997 51 10 1997
Sally Springer Sales 9998 40 8 9998
Ludmilla Valencia Software 7773 55 10 7773
Barbara Lint Field Service 8883 40 9 8883
Jeffrey Vickers Mailroom 8005 60 7 8005
Jim Walker Unit Manager 7779 37 11 7779
Stan Humphries Field Service 3211 73 31 3211
Gertie Humphries Sales 3211 73 31 3211
Stan Humphries Jr. Sales 3211 73 31 3211
Winkie Humphries Mailroom 3211 73 31 3211
(17 row(s) affected)
In the next example, a left-outer join is used in the WHERE clause of a SELECT statement to specify that both rows containing matching values for a common column and the rows from the left table (employees) are included in the rows returned. This might be the case where you needed to find out what employees don't have a pay rate associated with them.
Before the following query was executed, additional rows were added to the employees table that don't have corresponding values in a common column in the pays table:
select *
from employees,pays
where employees.badge*=pays.badge
name department badge hours_worked rate badge
------------------ ------------- --------- ------------ -------- -----
Stan Humphries Field Service 3211 40 10 3211
Stan Humphries Field Servic 3211 73 31 3211
Fred Stanhope Field Service 6732 40 9 6732
Sue Sommers Logistics 4411 52 10 4411
Lance Finepoint Library 5522 39 11 5522
Mark McGuire Field Service 1997 51 10 1997
Sally Springer Sales 9998 40 8 9998
Ludmilla Valencia Software 7773 55 10 7773
Barbara Lint Field Service 8883 40 9 8883
Jeffrey Vickers Mailroom 8005 60 7 8005
Jim Walker Unit Manager 7779 37 11 7779
Bob Smith SALES 1234 (null) (null) (null)
Bob Jones Sales 2223 (null) (null) (null)
Gertie Humphries Sales 3211 40 10 3211
Gertie Humphries Sales 3211 73 31 3211
Stan Humphries Jr. Sales 3211 40 10 3211
Stan Humphries Jr. Sales 3211 73 31 3211
Winkie Humphries Mailroom 3211 40 10 3211
Winkie Humphries Mailroom 3211 73 31 3211
Susan Smith Executive 1234 (null) (null) (null)
Henry Smith Logistics 1234 (null) (null) (null)
(21 row(s) affected)
Recall that null values don't match, so rows that contain nulls in the primary and foreign key columns are displaying only with outer joins and won't be seen with equi-joins.
In the next example, a right-outer join is used in the WHERE clause of a SELECT statement to specify that both rows that contain matching values for a common column and the rows from the right table (pays) are included in the rows returned. Two additional rows are first added to the pays table that don't have corresponding values in a common column in the employees table.
insert into pays
values (40,10,5555)
insert into pays
values (40,10,5555)
select *
from employees,pays
where employees.badge=*pays.id
name department badge hours_worked rate id
------------------ ---------- --------- ------------ -------- -----
Stan Humphries Field Service 3211 40 10 3211
Gertie Humphries Sales 3211 40 10 3211
Stan Humphries Jr. Sales 3211 40 10 3211
Winkie Humphries Mailroom 3211 40 10 3211
Fred Stanhope Field Service 6732 40 9 6732
Sue Sommers Logistics 4411 52 10 4411
Lance Finepoint Library 5522 39 11 5522
Mark McGuire Field Service 1997 51 10 1997
Sally Springer Sales 9998 40 8 9998
Ludmilla Valencia Software 7773 55 10 7773
Barbara Lint Field Service 8883 40 9 8883
Jeffrey Vickers Mailroom 8005 60 7 8005
Jim Walker Unit Manager 7779 37 11 7779
Stan Humphries Field Service 3211 73 31 3211
Gertie Humphries Sales 3211 73 31 3211
Stan Humphries Jr. Sales 3211 73 31 3211
Winkie Humphries Mailroom 3211 73 31 3211
(null) (null) (null) 40 10 5555
(null) (null) (null) 40 10 5555
(19 row(s) affected)
Left- and right-outer joins can be used to shows rows that contain nulls which wouldn't have corresponding entries across tables, and would only be displayed with other non-matching entries.
You should define the queries that contain a UNION clause so that they're compatible. The queries should have the same number of columns and a common column defined for each table. You also can't use a UNION within the definition of a view.
The syntax for queries that include a UNION clause is as follows:
SELECT column_name_1, ..., column_name_n
FROM table_name_1, ... , table_name_n
WHERE column_name comparison_operator value
[GROUP BY...]
[HAVING ...
UNION
SELECT column_name_1, ..., column_name_n
FROM table_name_1, ... , table_name_n
WHERE column_name comparison_operator value
[GROUP BY...]
[HAVING...]
[ORDER BY...]
[COMPUTE...
In the following example, the badge numbers that are common to both tables are displayed using two select statements that are bound with a UNION clause. The ORDER BY clause is used after the last query to order the final results. The ORDER BY clause appears only after the last SELECT statement. Recall that UNION implicitly removes duplicate rows, as defined by the query.
select badge from employees
union
select badge from pays
order by badge
badge
-----------
1234
1997
3211
4411
5522
6732
7773
7779
8005
8883
9998
(11 row(s) affected)
In the following example, the same set of queries is used except the ALL keyword is added to the UNION clause. This retains query-defined duplicates (only the Badge column) in the resultant rows. The duplicate rows from both tables are retained.
select badge from employees
union all
select badge from pays
order by badge
badge
------
1234
1234
1234
1997
1997
3211
3211
3211
3211
3211
3211
4411
4411
5522
5522
6732
6732
7773
7773
7779
7779
8005
8005
8883
8883
9998
9998
(27 row(s) affected)
In the following example, the datatypes referenced in the query for one of the two columns aren't compatible. The execution of the example returns an error because of this.
select name,badge from employees
union
select hours_worked,badge from pay
Msg 257, Level 16, State 1
Implicit conversion from datatype 'char' to 'int' is not allowed.
Use the CONVERT function to run this query.
You can use a UNION clause with queries to combine the rows from two compatible tables and merge the rows into a new third table. To illustrate this merge, in which the results are kept in a permanent table, a new table is created that has the same datatypes as the existing employees table. Several rows are first inserted into the new table.
create table employees2
(name char(20),department char(20),badge int)
go
insert into employees2
values ('Rod Gilbert','Sales',3339)
go
insert into employees2
values ('Jean Ratele','Sales',5551)
go
insert into employees2
values ('Eddie Giacomin','Sales',8888)
Each table now has rows that contain employees records. If you use a UNION clause to combine the SELECT statements along with INSERT INTO, the resultant rows can be retained in a new table.
The SELECT statement that references the employees table uses WHERE to restrict the rows returned to only those with the Sales department. All three rows of the employees2 table are in the Sales department, so no WHERE clause is necessary.
create table employees3
(name char(20),department char(20),badge int))
go
insert into employees3
select * from employees
where department='Sales'
union
select * from employees2
(6 row(s) affected)
select * from employees3
name department badge
---- ---------- -----
Eddie Giacomin Sales 8888
Gertie Humphries Sales 3211
Jean Ratele Sales 5551
Rod Gilbert Sales 3339
Sally Springer Sales 9998
Stan Humphries Jr Sales 3211
(6 row(s) affected)
As in the previous example, you can use UNION to combine multiple tables, or combinations of selected columns and rows from tables, into an existing or new table. You can have tables with identical columns at different office locations in which rows are added throughout the day.
At the end of the work day, you can use a set of SELECT statements with a UNION clause to add the separate collection tables to a master table at a central location. After the rows are copied to the master table, the rows of the collection tables can be removed using a DELETE FROM statement without a WHERE clause, or a TRUNCATE statement as mentioned earlier in this chapter.
You can combine up to 16 SELECT statements by adding additional UNION clauses between each set of SELECT statements. You can use parentheses to control the order of the UNIONs. The SELECT statements within parentheses are performed before those that are outside parentheses. You don't need to use parentheses if all the UNIONs are UNION ALL. You also don't need parentheses if none of the UNIONs is UNION ALL.
In the following example, three tables are combined using a UNION clause. IN is used with the first table to specify multiple forms of one department. The second table doesn't use WHERE and all rows are selected. The third table specifies only a single case-sensitive department name, assuming the sort order is case-sensitive.
You don't need parentheses to control the order of the merges because no UNIONs use ALL. The resultant rows are ordered by Badge and Name using an ORDER BY clause that can appear only after the last SELECT statement.
select name,badge
from employees
where department in ('SALES','Sales','sales')
union
select name,badge
from employees2
union
select name,badge
from employees3
where department='Sales'
order by badge,name
name badge
---- -----
Bob Smith 1234
Gertie Humphries 3211
Stan Humphries Jr 3211
Rod Gilbert 3339
Jean Ratele 5551
Eddie Giacomin 8888
Sally Springer 9998
(7 row(s) affected)