Information that's stored in multiple tables is often combined in a single query. To combine rows logically across tables, the tables must be created with related columns of data. You must be able to issue queries that not only combine but also eliminate rows from multiple tables, so that only the requisite rows appear in the resultant display.
You'll need to use an UPDATE statement to change one (or more) row's column values for a table. You may need to change a row simply to correct information that was incorrectly entered. You'll want to be able to remove rows that are no longer needed in a table. You use a DELETE statement to remove one or more rows from a table.
In addition to changing table rows, you'll find that you may need to add additional columns of information to a database table. You use the ALTER TABLE statement primarily to add one more columns to an already existing table, as well as change other characteristics of the table.
In this chapter, you'll learn to write queries that retrieve rows of related information from multiple tables. You'll also learn to update and delete the rows of a database table. Also, you'll learn to change the characteristics of database tables including how to add columns to a table.
UPDATE table_name SET column_name_1 = value,......column_name_n = value WHERE column_name comparison operator valueYou use the WHERE clause to identify the rows to be changed. The WHERE clause, used as part of a SELECT statement, narrows the scope of your selection of rows that will be returned or affected by the query. In an UPDATE statement, the WHERE clause is used to identify the rows that are changed, rather than the rows to be displayed.
You can use the UPDATE statement to change erroneous entries or misspellings for the column values of existing rows of a table.
In the following example, the values for the department and badge columns of the Employees table are changed for the employee Bob Smith. If more than one row has an employee named Bob Smith, the department and badge number of each row is changed.
update employees set department = 'SALES', badge = 1232 where name = 'Bob Smith'
You can use UPDATE to change multiple rows that match the criteria specified by the WHERE clause. In the following example, all rows that contain the department SALES are changed to MARKETING.
update employees set department = 'MARKETING' where department = 'SALES'
update employees set wageclass = 'W0'There are usually two reasons why an UPDATE statement wouldn't contain a WHERE clause: because the WHERE was inadvertently omitted, or because you purposely want to change a column for all rows of a table. For example, when you add a new column to a table with the ALTER TABLE command, you may have assigned a null values for the new column for all existing rows of the table. You can use an UPDATE statement without a WHERE clause to add a non-null value to the new column for all rows.
You can use a SELECT count(*) statement with the same criteria—specifically, your WHERE clause—that you plan to use in your UPDATE statement to learn the number of rows that will be subsequently changed by your UPDATE statement. By first determining the number of rows that will be affected by your UPDATE, you're more likely to notice any mistakes in your criteria.
See the section in Chapter 8 entitled Using COUNT
In the following example, a raise in the hourly rate is given to all employees by updating the rate column of the pays table.
update pays set rate=rate+2You can use the keyword NULL to change the column value of the specified rows of a table to nulls. The table column that's to be assigned a null value must have been created with the NULL characteristic originally. In the following example, an employee who has been moved out of his current department but not yet assigned to another department has his department changed to a null.
update employees set department=null where name='Bob Smith'You can also use the UPDATE statement to assign a DEFAULT value if a default value has been associated with the table column. In the following example, the department for an employee is changed to the default value which was previously established and associated with the department column.
update employees set department=default where name ='Sally Springer'
If a default doesn't exist for the column and the column permits nulls, the column value is changed to a null.
A second way in which you can change the rows of a table is an indirect or deferred update. In such an update, the change is made by deleting the row to be modified and then inserting the row as a new row with the new values in place. A deferred update is slower because two operations are required to made a change the row of a table.
The conditions under which a direct update can be performed are primarily determined by restrictions set on the database table. The following conditions must be met for a direct update to be performed on a table:
Also, a number of conditions must be met for an update in place to be performed on updates that change a single row:
The following set of conditions must be met for updates that change multiple rows to be performed in place:
See the Chapter 12 section entitled Understanding Locks
See the Chapter 10 section entitled Defining Indexes
See the Chapter 14 section entitled Using INSERT and UPDATE Triggers
If needed—perhaps because you'll be making many subsequent updates on your database tables—you can plan the table design so that all updates are direct. You can consider all the restrictions for direct updates to ensure that your updates are performed as quickly as possible.
You can use the query option SHOWPLAN to determine whether an update was direct or deferred.
DELETE [FROM] table_name WHERE column_name = 'value'You don't need to use the keyword FROM; it's optional in the DELETE statement. You can delete rows from tables,as well as update tables, through views.
In the following example, the operation of the DELETE statement removes all rows that match the criteria specified in the WHERE clause of the DELETE statement. In this case, we're removing all rows that contain the department "SALES".
delete from employees where department = 'SALES'
I executed a DELETE statement and, even though I uppercased all letters to reference the column values in the WHERE clause of my DELETE statement, it deleted rows that contain the same value in lowercase letters. Why did this happen?
You must be aware of the default sort order that was selected when SQL Server was installed. Microsoft SQL Server installs with a case-insensitive sort order and this is why the DELETE statement didn't distinguish between upper- and lowercase. If you want all your subsequent DELETE statements—as well as UPDATE statements—to be case-sensitive, you might want to update SQL Server using SQL Setup from the SQL Server program group to effectively reinstall SQL Server with a case-sensitive sort order specified.
See the Appendix A section entitled "Installing Server and Client Software"
You can use a DELETE FROM statement to remove multiple rows as well as individual rows. However, use the DELETE FROM statement carefully. If you don't use a WHERE clause in a DELETE FROM statement, all table rows are removed—leaving you with an empty table. You'll receive no warning before the DELETE FROM statement is executed. In the following example, all rows of the specified table are deleted:
delete from employees
I began entering a DELETE FROM statement and got as far as the name of the table when I accidentally executed the query. When I try to read the table, all the rows are gone. Other products ask me to confirm an unrecoverable action such as the deletion of all table rows. I'm surprised that SQL Server doesn't do this. How can I prevent this from happening again?
Because SQL Server won't ask you to confirm the deletion or update of one or all table rows, you must be sure that you want the action to be performed before you allow it to execute.
The execution of a DELETE statement without a WHERE clause that removes all rows of a table is most often an accident. If you want to delete all rows of a table, but still keep the table intact, you should use the TRUNCATE statement. The syntax of TRUNCATE table is:
Truncate table_nameThe advantage of using a TRUNCATE TABLE statement is that the removal of rows is completed faster than an equivalent DELETE statement. The truncate statement is faster because it removes pages of information that contain multiple tables rows at a time rather than the delete statement, which removes individual rows at a time. However, you can't recover table rows with the TRUNCATE TABLE statement. Unlike the DELETE statement, the TRUNCATE statement does not maintain a copy of the deleted rows even if it's part of a defined transaction.
TRUNCATE TABLE and DELETE TABLE retain the database table. If you want to permanently remove a table, as well all rows that it contains, you can use the DROP TABLE statement (which uses the following syntax):
DROP TABLE table_nameAfter you drop a table, you can't recover the rows that it contained except from a previously made backup copy of the table.
Another advantage of the truncate statement is that it won't log the removal of the information in the transaction log. If you have a situation where your transaction log has become full, you can still use the TRUNCATE statement to remove rows and free up space in the database.
Other implementations of SQL do allow changes even to the datatype of existing columns to be performed through the ALTER TABLE statement. The ALTER TABLE statement in SQL Server and Transact-SQL, however, doesn't permit datatype changes to existing rows. In Microsoft SQL Server, you must create a new table, read the rows out of the old table and into the new table to effect such a change.
For more information, see the sections entitled Changing the Width of a Table Column and Removing a Column from a Table later in this chapter.
The syntax of the ALTER TABLE statement is as follows:
ALTER TABLE [[<database.>]<owner.>]<table_name> ADD <column_name> <datatype> NULL [constraint], <column_name> <datatype> NULL...] [constraint] [WITH NOCHECK] [DROP [CONSTRAINT] constraint_name [..., constraint_name_n]]When ALTER TABLE is executed, it doesn't expand existing rows. It changes only the internal description of the added columns in the system tables. Each time an existing row is read from the disk, SQL Server adds the additional null entry for the new column or columns before it's available to a user.
When a new row is written to the disk, SQL Server creates the new row with the additional column and its value. SQL Server writes the row with the additional column unless no value is specified for the new row and its value remains a null. In the following example, sp_help is used to display the existing characteristics of a table in which three columns are defined:
sp_help employees3 Name Owner Type ------------------------------ -------------------------------------- employees3 dbo user table Data_located_on_segment When_created ------------------------------ --------------------------- default Jul 5 1994 10:08PM Column_name Type Length Nulls Default_name Rule_name --------------- --------------- ------ ----- --------------- ------ name char 30 0 (null) (null) department char 30 0 (null) (null) badge int 4 0 (null) (null) Object does not have any indexes. No defined keys for this object.ALTER TABLE is used to add a new column to the table. You use the sp_help procedure to verify that the new columns have been added to the table. In the following example, SELECT displays all rows of the new table, including nulls in the new column for all rows:
alter table employees3 add wageclass char(2) null sp_help employees3 Name Owner Type ------------------------------ ------------------------------ ---------------- employees3 dbo user table Data_located_on_segment When_created ------------------------------ --------------------------- default Jul 5 1994 10:08PM Column_name Type Length Nulls Default_name Rule_name --------------- --------------- ------ ----- --------------- --------------- name char 30 0 (null) (null) department char 30 0 (null) (null) badge int 4 0 (null) (null) wageclass char 2 1 (null) (null) Object does not have any indexes. No defined keys for this object. select * from employees3 name department badge wageclass ------------------------------ ------------------------------ ------- --------- Stan Humphries Field Service 3211 (null) Fred Stanhope Field Service 6732 (null) Sue Sommers Logistics 4411 (null) Lance Finepoint Library 5522 (null) Mark McGuire Field Service 1997 (null) Sally Springer Sales 9998 (null) Ludmilla Valencia Software 7773 (null) Barbara Lint Field Service 8883 (null) Jeffrey Vickers Mailroom 8005 (null) Jim Walker Unit Manager 7779 (null) Bob Smith SALES 1234 (null) (11 row(s) affected)You can use an UPDATE statement to define values for new columns that are added to a table with ALTER TABLE.
The null values are inserted when a new column is added to the table with the ALTER TABLE statement. In the following example, all table rows have a new value added to the column that was added with an earlier UPDATE TABLE statement. A subsequent SELECT statement is used to display all rows of the table, which includes the new column values.
update employees3 set wageclass='w4' (11 row(s) affected) select * from employees3 name department badge wageclass ---------------------- ---------------------- ----------- --------- Stan Humphries Field Service 3211 w4 Fred Stanhope Field Service 6732 w4 Sue Sommers Logistics 4411 w4 Lance Finepoint Library 5522 w4 Mark McGuire Field Service 1997 w4 Sally Springer Sales 9998 w4 Ludmilla Valencia Software 7773 w4 Barbara Lint Field Service 8883 w4 Jeffrey Vickers Mailroom 8005 w4 Jim Walker Unit Manager 7779 w4 Bob Smith SALES 1234 w4 (11 row(s) affected)You can also define a new column that you've added to a table with the identity characteristic. In Chapter 5, "Creating Database Tables and Using Datatypes" you learned that the identify characteristic permits you to define an initial value for the first row of the table (the seed) and a value that's added to each successive column to automatically generate a new column value (the increment).
You can't assign the identity characteristic to an existing column. Only new columns that are added to a table with the ALTER TABLE command can be defined with the identity characteristic. Also, if the value automatically generated for a new column by the identity mechanism exceeds the allowable values for the column's datatype, the ALTER TABLE statement fails and an error is displayed.
See the Chapter 5 section entitled "identity Property."
In the following example, an additional column is added to table pays, which is defined with the identity characteristic and can be subsequently used as a row number.
Alter table pays add row_number identity(1,1)You can also add one or more columns to a table using the SQL Enterprise Manager. To add a column to an existing table through the SQL Enterprise Manager, follow these steps:
Fig. 7.1 - You can't deselect the null property on a column added to an existing table.
You can also double-click a table to open it and edit its structure. This might be the case where you want to add a new column to the table.
In the following example, a new table is created in which the name and datatype of the first column is identical to the first column in an existing table, but the first column in the new table is smaller in size. In the new table, the second column is defined as VARCHAR instead of CHAR, as it's defined in the second column of the existing table. The new table's third column is defined as SMALLINT instead of INT, as it's defined in the corresponding Badge column in the existing table. The SMALLINT datatype uses half the storage space of INT.
After the new table is created, all rows of the older table are loaded into the new table with an INSERT statement. A SELECT statement is then used to display the rows of the new table.
create table employees4 (name char(15), department varchar(20),badge smallint) insert into employees4 select name,department,badge from employees (11 row(s) affected) select * from employees4 name department badge --------------- -------------------- ------ Stan Humphries Field Service 3211 Fred Stanhope Field Service 6732 Sue Sommers Logistics 4411 Lance Finepoint Library 5522 Mark McGuire Field Service 1997 Sally Springer Sales 9998 L. Valencia Software 7773 Barbara Lint Field Service 8883 Jeffrey Vickers Mailroom 8005 Jim Walker Unit Manager 7779 Bob Smith SALES 1234 (11 row(s) affected)The sp_help procedure shows the difference between datatypes in the corresponding columns of the two tables used in the example. The example shows only the relevant parts of the display returned by sp_help.
sp_help employees Name Owner Type ------------------------------ ------------------------------ ------ employees dbo user table ... name char 20 0 (null) (null) department char 20 0 deptdefault (null) badge int 4 0 (null) (null) ... sp_help employees4 Name Owner Type ------------------------------ ------------------------------ ------ employees4 dbo user table --------------- --------------- ------ ----- --------------- -------- ... name char 15 0 (null) (null) department varchar 20 0 (null) (null) badge smallint 2 0 (null) (null) ...The INSERT table statement successfully completes because the data from the earlier table is compatible with the columns defined for the new table. If the data isn't compatible between the tables, you'll receive an error. In the following example, a new table is created that defines a column as a character datatype. The attempted insertion of the corresponding column from one table results in an error because the datatypes can't be implicitly converted.
create table onecolumn (badge char(4)) insert into onecolumn select badge from employees Msg 257, Level 16, State 1 Implicit conversion from datatype 'int' to 'char' is not allowed. Use the CONVERT function to run this query.
From an ISQL/W command line, the select into/bulk copy option can be set on or off by issuing the following command:
sp_dboption database_name, 'select into/bulkcopy', TRUE|FALSEFor example, the following command turns on select into/bulkcopy for the database employees:
sp_dboption database_employees, 'select into/bulkcopy', true
In the following example, a new table is defined that contains only two of the three columns defined in an existing table. INSERT is used with a SELECT statement that references only two of the three columns of the original table in the SELECT clause.
create table employees5 (name char(20), badge int)) insert into employees5 select name,badge from employees
See the Chapter 5 section entitled Creating and Using Constraints
In the following example, a unique CONSTRAINT is added to the badge column for the table employees2.
ALTER TABLE employees2 ADD CONSTRAINT badgeunc UNIQUE NONCLUSTERED (badge)
The changes that were made to the ALTER TABLE statement, as well as many other statements, allow Transact-SQL to meet the specifications of a standardized specification of SQL, ANSI SQL. The additions for ANSI compatibility result in multiple ways of performing the same operations, sometimes using different keywords or syntax.
You can easily drop a constraint from a table using the drop constraint clause of the alter table statement. You simply specify the name of the constraint to be removed from a table after the keywords drop constraint. For example, to remove a default constraint on the department column for the employees table, enter the following statement:
alter table employees drop constraint department_default
You can't use WITH NOCHECK to override the initial checking of PRIMARY KEY and UNIQUE constraints. By default, SQL Server applies the constraints to existing rows in the table as well as new rows that are added or changed later. You'll receive an error message and the ALTER TABLE statement will fail if existing data violates your constraint.
You can also specify that a CHECK constraint that's added to a table through the ALTER TABLE statement isn't applied to the existing rows of a table through the NOT FOR REPLICATION clause. NOT FOR REPLICATION operates as though the WITH NOCHECK clause was added to the ALTER TABLE statement. The NOT FOR REPLICATION clause is added to an ALTER TABLE statement for a different purpose than the WITH NOCHECK clause.
If you set up the automatic copying of a table and table rows from one server system to another, the actual work of ensuring the server system that receives a copy of the data is done by an intermediate server. The NOT FOR REPLICATION clause is added to an ALTER TABLE statement to prevent the table copy on the intermediate server from being checked, an unnecessary operation.
See the Chapter 20 section entitled "Setting Up and Managing Replication"
Click the Advanced Features tool on the toolbar. You can click the Primary Key/Identity, Foreign Keys, Unique Constraints, or Check Constraints tabs to create each type of constraint.
Fig. 7.2. - The Not for Replication check box can also be checked when the Check Constraint option is defined.
You perform a natural join or equi-join by matching equal values for rows in shared columns between multiple tables. You must define one of the two tables so that a column from one of the tables is duplicated in the second table. The column from the original table can be its primary key if it also is a column with values that make the rows of the table unique. The duplicate column that's added to a second table is referred to as a foreign key.
You define a foreign key to permit rows from different tables to be related. In a sense, the matching columns are used to form virtual rows that span a database table. Although each table is limited to 250 columns, matching columns to combine rows from multiple tables can result in almost an unlimited number of columns that can be combined across tables.
You use a standard SELECT statement with a WHERE clause to retrieve data from two or more tables. The syntax of a SELECT statement that's used to join two tables is as follows:
SELECT column_name_1,...column_name_n FROM table_name_1, table_name_2 WHERE primary_key_column= foreign_key_columnThe following create table and insert statements are used to create a new table in the same database to be used for subsequent relational join examples. Each row that's added to the pays table matches one row of the employees table.
create table pays (hours_worked int, rate int,badge int) go insert into pays values (40,10,3211); go insert into pays values (40,9,6732); go insert into pays values (52,10,4411); go insert into pays values (39,11,5522); go insert into pays values (51,10,1997); go insert into pays values (40,8,9998); go insert into pays values (55,10,7773); insert into pays values (40,9,8883); go insert into pays values (60,7,8005); go insert into pays values (37,11,7779); goIn the following example, three columns from two tables are displayed after the Badge column is used to combine the rows that have matching Badge numbers.
select name, department,hours_worked from employees,pays where employees.badge=pays.badge name department hours_worked -------------------- -------------------- -- Stan Humphries Field Service 40 Fred Stanhope Field Service 40 Sue Sommers Logistics 52 Lance Finepoint Library 39 Mark McGuire Field Service 51 Sally Springer Sales 40 Ludmilla Valencia Software 55 Barbara Lint Field Service 40 Jeffrey Vickers Mailroom 60 Jim Walker Unit Manager 37 (10 row(s) affected)An equi-join doesn't eliminate any of the table columns from the temporary tables that are formed by the join of tables. You must use a WHERE clause to match the corresponding rows of the tables. You also shouldn't use the asterisk wildcard character to reference all columns of the combined tables. If you use an asterisk, the columns with matching values are displayed twice.
In the following example, the rows of two tables are accessed without using a WHERE clause. SQL Server forms a cross-product of the rows in both tables. If you don't try to combine the rows using matching columns, each row of the second table in the FROM clause is added to every row of the first table. The Badge column is displayed from both tables because the asterisk wildcard is used in the SELECT clause.
select * from employees,pays name department badge hours_worked rate badge ------------------ -------------- -------- ------------ ------ ----- Stan Humphries Field Service 3211 40 9 6732 Stan Humphries Field Service 3211 40 10 3211 Stan Humphries Field Service 3211 52 10 4411 Stan Humphries Field Service 3211 39 11 5522 Stan Humphries Field Service 3211 51 10 1997 Stan Humphries Field Service 3211 40 8 9998 Stan Humphries Field Service 3211 55 10 7773 Stan Humphries Field Service 3211 40 9 8883 Stan Humphries Field Service 3211 60 7 8005 Stan Humphries Field Service 3211 37 11 7779 Fred Stanhope Field Service 6732 40 9 6732 ... (100 row(s) affected)The combination of all rows of the second table with the first table results in a cross-product, also called a Cartesian Product, of the two tables. In the example, the employees table and the pays table each contains 10 rows, so the resultant cross-product creates 100 rows in the temporary table. However, only 10 of the 100 rows belong together. The badge numbers match in one out of every 10 rows between the two tables.
I joined together two tables that didn't have matching columns with a SELECT statement that didn't contain a WHERE clause. I was surprised that SQL Server returned a huge table.
The SELECT statement operates on multiple tables whether or not they were designed to be combined with a relational join. It's important to always use a WHERE clause, which eliminates rows that don't have matching column values. If you don't use a WHERE clause, you'll receive a temporary table that contains the cross-product of the number of rows in the first table multiplied by the number of rows in the second table. For example, two tables that each contains only 100 rows joined without a WHERE clause will return 10,000 rows.
If you reference one of the columns used to match the rows across both tables, you must indicate the table in which the column is defined. Any time you reference a column that has the same name in multiple tables, you must somehow specify which column from which table to prevent ambiguity. The following example displays an error because SQL Server doesn't know from which table to display the Badge column.
select badge from employees,pays where employees.badge=pays.badge Msg 209, Level 16, State 1 Ambiguous column name badgeTo avoid ambiguity, the table columns used for matching rows are preceded by the table in which they're defined and separated by a period (.). In the following example, the Badge column is displayed from the first table by preceding the name of the Badge column with its table name.
select employees.badge from employees,pays where employees.badge=pays.badge badge ----------- 3211 6732 4411 5522 1997 9998 7773 8883 8005 7779 (10 row(s) affected)
Other dialects of SQL refer to a range variable as an alias.
Range variables are symbolic references for tables that are specified in the FROM clause of a SELECT statement. You can use a range variable in a preceding clause. such as the SELECT clause, or in a clause that comes after the FROM clause, such has a WHERE clause. Define a range variable by specifying a character constant following the name of a table in the FROM clause of a SELECT statement, as in the following syntax:
... From table_name_1 range_name_1, ...,table_name_n range_name_n ...You can define a range variable for each table that's specified in the FROM clause. You can use as many as 30 characters—the limit for any permanent or temporary object in Transact-SQL—to define the range variable. A range variable can be defined to provide a shorter reference to a table in a SELECT statement.In the following example, a range variable is defined for each table. The range variables are used in both the SELECT and WHERE clauses.
select e.badge,p.id from employees e,pays p where e.badge=p.id badge id ----------- ---- 3211 3211 3211 3211 3211 3211 3211 3211 6732 6732 4411 4411 5522 5522 1997 1997 9998 9998 7773 7773 8883 8883 8005 8005 7779 7779 3211 3211 3211 3211 3211 3211 3211 3211 (17 row(s) affected)Range variables are called range variables because after they're defined, the symbolic reference applies to, or ranges through, the table. As in the previous example, you can define a range variable to be a single character and use it as a short nickname for a table.
Range variables can be quite handy because several tables with long names can be specified in a SELECT statement. You can combine rows from as many as 16 tables in SQL Server using Transact-SQL. If you don't explicitly define range variables in the FROM clause, they're implicitly created using the complete name of each table.
In the following example, the rows from three tables are combined using the Badge columns that are defined in all the tables. The range variables are implicitly defined to the table names and are used in both the SELECT and WHERE clauses. The WHERE clause first combines the first and second tables; then the first and third tables use the AND Boolean operator.
select pays.badge, name,department, payrate from employees,pays, salaries where employees.badge = pays.badge and employees.badge=salaries.badge
select name from employeesis internally rewritten as:
select employees.name from employees employeesAlthough it seems unnecessary to create range variables when only a single table is named in the query, they are mandatory when you reference tables that contain columns with the same names in the same query.
In the ensuing examples, rows have been added with identical badge numbers in the employees table. Three employees have been added, each with a last name of Smith and each with the same badge number. In the tables referenced in the following examples, badge isn't defined as a primary key, so duplicate badge numbers can be present. Three employees with last names of Humphries have also been added, each with the same badge number as the original employee, Stan Humphries. The following example shows the rows of the employees table after the additional seven rows are added.
Although you usually define a primary and foreign key using the corresponding columns of tables that you subsequently want to combine rows form for display, you aren't required to define the columns as keys. SQL Server will permit you to perform joins on tables that don't have primary or foreign key definitions. You should realize that the assignment of primary and foreign keys to a table isn't required, though it's often desirable, if you combine data from different tables.
select * from employees order by badge name department badge -------------------- -------------------- ---- Bob Smith SALES 1234 Henry Smith Logistics 1234 Susan Smith Executive 1234 Mark McGuire Field Service 1997 Gertie Humphries Sales 3211 Stan Humphries Field Service 3211 Stan Humphries Jr Sales 3211 Winkie Humphries Mailroom 3211 Sue Sommers Logistics 4411 Lance Finepoint Library 5522 Fred Stanhope Field Service 6732 Ludmilla Valencia Software 7773 Jim Walker Unit Manager 7779 Jeffrey Vickers Mailroom 8005 Barbara Lint Field Service 8883 Sally Springer Sales 9998 (16 row(s) affected)The pays table is unaltered and contains only the original 10 rows, as shown in the following example:
select * from pays order by badge hours_worked rate badge ------------ ----------- ----- 51 10 1997 40 10 3211 52 10 4411 39 11 5522 40 9 6732 55 10 7773 37 11 7779 60 7 8005 40 9 8883 40 8 9998 (10 row(s) affected)You can combine tables that have an unequal number of matching rows. The following example joins the employees table, in which two sets of entries match a single entry for the badge column in the pays table. The join of the employees table with the pays table is called many-to-one.
select name,pays.badge,hours_worked,rate from employees,pays where employees.badge=pays.badge name badge hours_worked rate ---- ----- ------------ ---- Fred Stanhope 6732 40 9 Stan Humphries 3211 40 10 Gertie Humphries 3211 40 10 Stan Humphries Jr 3211 40 10 Winkie Humphries 3211 40 10 Sue Sommers 4411 52 10 Lance Finepoint 5522 39 11 Mark McGuire 1997 51 10 Sally Springer 9998 40 8 Ludmilla Valencia 7773 55 10 Barbara Lint 8883 40 9 Jeffrey Vickers 8005 60 7 Jim Walker 7779 37 11 (13 row(s) affected)If you switch the order of the joined tables, it becomes a one-to-many join. The following example returns the same rows that were returned in the previous example:
select name,pays.badge,hours_worked,rate from pays,employees where pays.badge=employees.badge name badge hours_worked rate ---- ----- ------------ ---- Fred Stanhope 6732 40 9 Stan Humphries 3211 40 10 Gertie Humphries 3211 40 10 Stan Humphries Jr 3211 40 10 Winkie Humphries 3211 40 10 Sue Sommers 4411 52 10 Lance Finepoint 5522 39 11 Mark McGuire 1997 51 10 Sally Springer 9998 40 8 Ludmilla Valencia 7773 55 10 Barbara Lint 8883 40 9 Jeffrey Vickers 8005 60 7 Jim Walker 7779 37 11 (13 row(s) affected)
select name,pays.badge,hours_worked,rate from employees,pays where employees.badge=pays.badge name badge hours_worked rate ---- ----- ------------ ---- Fred Stanhope 6732 40 9 Stan Humphries 3211 40 10 Gertie Humphries 3211 40 10 Stan Humphries Jr 3211 40 10 Winkie Humphries 3211 40 10 Sue Sommers 4411 52 10 Lance Finepoint 5522 39 11 Mark McGuire 1997 51 10 Sally Springer 9998 40 8 Ludmilla Valencia 7773 55 10 Barbara Lint 8883 40 9 Jeffrey Vickers 8005 60 7 Jim Walker 7779 37 11 Stan Humphries 3211 73 31 Gertie Humphries 3211 73 31 Stan Humphries Jr 3211 73 31 Winkie Humphries 3211 73 31 (17 row(s) affected)The additional row is added to the temporary table that's displayed. If the row value is restricted to only the badge number 3211, eight rows are returned.
Many-to-many queries are often not desirable and can produce results that are difficult to follow. In most cases, it's best to implement either a one-to-many or a many-to-one relationship, even if it entails adding an intermediary table.
For more information on database design approaches, see Chapter 2, "Data Modeling and Database Design"
select name,pays.badge,hours_worked,rate from employees,pays where employees.badge=pays.badge and pays.badge=3211 name badge hours_worked rate ---- ----- ------------ ---- Stan Humphries 3211 40 10 Gertie Humphries 3211 40 10 Stan Humphries Jr 3211 40 10 Winkie Humphries 3211 40 10 Stan Humphries 3211 73 31 Gertie Humphries 3211 73 31 Stan Humphries Jr 3211 73 31 Winkie Humphries 3211 73 31 (8 row(s) affected)
Previous examples, in which the rows of two tables were joined with a WHERE statement, included all rows of both tables. However, a query that includes all rows from both tables is probably never useful, except to understand the way in which SQL Server combines the rows. You can combine any two or more tables with a WHERE clause and receive a set of rows that were never meant to be combined and thus receive a meaningless result.
In practice, you'll combine only the rows from tables that have been created to be matched together. Tables that are designed to be combined have common columns of information so that a WHERE clause can be included in a query to eliminate the rows that don't belong together (those that have identical values).
You must ensure that the information used to combine tables—the corresponding values in common columns—remain valid. If the value in one table is changed, the corresponding identical value (or values, if a one-to-many relationship exists) must also be updated in other tables.
Referential integrity involves ensuring that you have valid information in common columns across tables used to join tables. You'll read more about referential integrity in subsequent chapters. Chapter 14, "Creating and Managing Triggers" discusses the mechanism for maintaining referential integrity and Chapter 10, "Managing and Using Indexes and Keys" discusses the common table columns on which joins are based.
Use outer joins to return table rows that have both matching and non-matching values. You may need to return the rows that don't contain matching values in the common table columns for either one table or the other tables specified in the SELECT statement.
If, for example, you join the employees table with the pays table used in the previous examples, you can specify the return of rows with matching values along with rows without matching values. The specification of the outer join is positional, which means that you use a special symbol the precedes or follows the comparison operator in the WHERE clause of a SELECT statement.
An outer join references one of the two tables joined using the table's position in the WHERE clause. A left-outer join specifies the table to the left of a comparison operator and a right-outer join specifies the table to the right of a comparison operator. The following table shows the symbol combination used for outer joins.
Symbol Combination Join
*= Left-outer join
=* Right-outer join
A left-outer join (*=) retains non-matching rows for the table on the left of the symbol combination in a WHERE statement. A right-outer join (=*) retains non-matching rows for the table on the right of the symbol combination.
In the following example, a SELECT statement specifies a join in the WHERE clause to return only rows that contain matching values in a common column for the two tables:
select * from employees,pays where employees.badge=pays.badge name department badge hours_worked rate badge ---------------- -------------- --------- ------------ -------- ------ Stan Humphries Field Service 3211 40 10 3211 Gertie Humphries Sales 3211 40 10 3211 Stan Humphries Jr. Sales 3211 40 10 3211 Winkie Humphries Mailroom 3211 40 10 3211 Fred Stanhope Field Service 6732 40 9 6732 Sue Sommers Logistics 4411 52 10 4411 Lance Finepoint Library 5522 39 11 5522 Mark McGuire Field Service 1997 51 10 1997 Sally Springer Sales 9998 40 8 9998 Ludmilla Valencia Software 7773 55 10 7773 Barbara Lint Field Service 8883 40 9 8883 Jeffrey Vickers Mailroom 8005 60 7 8005 Jim Walker Unit Manager 7779 37 11 7779 Stan Humphries Field Service 3211 73 31 3211 Gertie Humphries Sales 3211 73 31 3211 Stan Humphries Jr. Sales 3211 73 31 3211 Winkie Humphries Mailroom 3211 73 31 3211 (17 row(s) affected)In the next example, a left-outer join is used in the WHERE clause of a SELECT statement to specify that both rows containing matching values for a common column and the rows from the left table (employees) are included in the rows returned. This might be the case where you needed to find out what employees don't have a pay rate associated with them.
Before the following query was executed, additional rows were added to the employees table that don't have corresponding values in a common column in the pays table:
select * from employees,pays where employees.badge*=pays.badge name department badge hours_worked rate badge ------------------ ------------- --------- ------------ -------- ----- Stan Humphries Field Service 3211 40 10 3211 Stan Humphries Field Servic 3211 73 31 3211 Fred Stanhope Field Service 6732 40 9 6732 Sue Sommers Logistics 4411 52 10 4411 Lance Finepoint Library 5522 39 11 5522 Mark McGuire Field Service 1997 51 10 1997 Sally Springer Sales 9998 40 8 9998 Ludmilla Valencia Software 7773 55 10 7773 Barbara Lint Field Service 8883 40 9 8883 Jeffrey Vickers Mailroom 8005 60 7 8005 Jim Walker Unit Manager 7779 37 11 7779 Bob Smith SALES 1234 (null) (null) (null) Bob Jones Sales 2223 (null) (null) (null) Gertie Humphries Sales 3211 40 10 3211 Gertie Humphries Sales 3211 73 31 3211 Stan Humphries Jr. Sales 3211 40 10 3211 Stan Humphries Jr. Sales 3211 73 31 3211 Winkie Humphries Mailroom 3211 40 10 3211 Winkie Humphries Mailroom 3211 73 31 3211 Susan Smith Executive 1234 (null) (null) (null) Henry Smith Logistics 1234 (null) (null) (null) (21 row(s) affected)
In the next example, a right-outer join is used in the WHERE clause of a SELECT statement to specify that both rows that contain matching values for a common column and the rows from the right table (pays) are included in the rows returned. Two additional rows are first added to the pays table that don't have corresponding values in a common column in the employees table.
insert into pays values (40,10,5555) insert into pays values (40,10,5555) select * from employees,pays where employees.badge=*pays.id name department badge hours_worked rate id ------------------ ---------- --------- ------------ -------- ----- Stan Humphries Field Service 3211 40 10 3211 Gertie Humphries Sales 3211 40 10 3211 Stan Humphries Jr. Sales 3211 40 10 3211 Winkie Humphries Mailroom 3211 40 10 3211 Fred Stanhope Field Service 6732 40 9 6732 Sue Sommers Logistics 4411 52 10 4411 Lance Finepoint Library 5522 39 11 5522 Mark McGuire Field Service 1997 51 10 1997 Sally Springer Sales 9998 40 8 9998 Ludmilla Valencia Software 7773 55 10 7773 Barbara Lint Field Service 8883 40 9 8883 Jeffrey Vickers Mailroom 8005 60 7 8005 Jim Walker Unit Manager 7779 37 11 7779 Stan Humphries Field Service 3211 73 31 3211 Gertie Humphries Sales 3211 73 31 3211 Stan Humphries Jr. Sales 3211 73 31 3211 Winkie Humphries Mailroom 3211 73 31 3211 (null) (null) (null) 40 10 5555 (null) (null) (null) 40 10 5555 (19 row(s) affected)
You should define the queries that contain a UNION clause so that they're compatible. The queries should have the same number of columns and a common column defined for each table. You also can't use a UNION within the definition of a view.
The syntax for queries that include a UNION clause is as follows:
SELECT column_name_1, ..., column_name_n FROM table_name_1, ... , table_name_n WHERE column_name comparison_operator value [GROUP BY...] [HAVING ... UNION SELECT column_name_1, ..., column_name_n FROM table_name_1, ... , table_name_n WHERE column_name comparison_operator value [GROUP BY...] [HAVING...] [ORDER BY...] [COMPUTE...In the following example, the badge numbers that are common to both tables are displayed using two select statements that are bound with a UNION clause. The ORDER BY clause is used after the last query to order the final results. The ORDER BY clause appears only after the last SELECT statement. Recall that UNION implicitly removes duplicate rows, as defined by the query.
select badge from employees union select badge from pays order by badge badge ----------- 1234 1997 3211 4411 5522 6732 7773 7779 8005 8883 9998 (11 row(s) affected)In the following example, the same set of queries is used except the ALL keyword is added to the UNION clause. This retains query-defined duplicates (only the Badge column) in the resultant rows. The duplicate rows from both tables are retained.
select badge from employees union all select badge from pays order by badge badge ------ 1234 1234 1234 1997 1997 3211 3211 3211 3211 3211 3211 4411 4411 5522 5522 6732 6732 7773 7773 7779 7779 8005 8005 8883 8883 9998 9998 (27 row(s) affected)In the following example, the datatypes referenced in the query for one of the two columns aren't compatible. The execution of the example returns an error because of this.
select name,badge from employees union select hours_worked,badge from pay Msg 257, Level 16, State 1 Implicit conversion from datatype 'char' to 'int' is not allowed.
Use the CONVERT function to run this query.You can use a UNION clause with queries to combine the rows from two compatible tables and merge the rows into a new third table. To illustrate this merge, in which the results are kept in a permanent table, a new table is created that has the same datatypes as the existing employees table. Several rows are first inserted into the new table.
create table employees2 (name char(20),department char(20),badge int) go insert into employees2 values ('Rod Gilbert','Sales',3339) go insert into employees2 values ('Jean Ratele','Sales',5551) go insert into employees2 values ('Eddie Giacomin','Sales',8888)Each table now has rows that contain employees records. If you use a UNION clause to combine the SELECT statements along with INSERT INTO, the resultant rows can be retained in a new table.
The SELECT statement that references the employees table uses WHERE to restrict the rows returned to only those with the Sales department. All three rows of the employees2 table are in the Sales department, so no WHERE clause is necessary.
create table employees3 (name char(20),department char(20),badge int)) go insert into employees3 select * from employees where department='Sales' union select * from employees2 (6 row(s) affected) select * from employees3 name department badge ---- ---------- ----- Eddie Giacomin Sales 8888 Gertie Humphries Sales 3211 Jean Ratele Sales 5551 Rod Gilbert Sales 3339 Sally Springer Sales 9998 Stan Humphries Jr Sales 3211 (6 row(s) affected)As in the previous example, you can use UNION to combine multiple tables, or combinations of selected columns and rows from tables, into an existing or new table. You can have tables with identical columns at different office locations in which rows are added throughout the day.
At the end of the work day, you can use a set of SELECT statements with a UNION clause to add the separate collection tables to a master table at a central location. After the rows are copied to the master table, the rows of the collection tables can be removed using a DELETE FROM statement without a WHERE clause, or a TRUNCATE statement as mentioned earlier in this chapter.
You can combine up to 16 SELECT statements by adding additional UNION clauses between each set of SELECT statements. You can use parentheses to control the order of the UNIONs. The SELECT statements within parentheses are performed before those that are outside parentheses. You don't need to use parentheses if all the UNIONs are UNION ALL. You also don't need parentheses if none of the UNIONs is UNION ALL.
In the following example, three tables are combined using a UNION clause. IN is used with the first table to specify multiple forms of one department. The second table doesn't use WHERE and all rows are selected. The third table specifies only a single case-sensitive department name, assuming the sort order is case-sensitive.
You don't need parentheses to control the order of the merges because no UNIONs use ALL. The resultant rows are ordered by Badge and Name using an ORDER BY clause that can appear only after the last SELECT statement.
select name,badge from employees where department in ('SALES','Sales','sales') union select name,badge from employees2 union select name,badge from employees3 where department='Sales' order by badge,name name badge ---- ----- Bob Smith 1234 Gertie Humphries 3211 Stan Humphries Jr 3211 Rod Gilbert 3339 Jean Ratele 5551 Eddie Giacomin 8888 Sally Springer 9998 (7 row(s) affected)