Chapter 06 - Retrieving Data with Transact-SQL![]()
![]() You usually don't want to access and display all the data stored in a database in each query or report. You may want some—but not all—of the rows and columns of data. Although you can access all the information, you probably don't need to display all rows and columns simply because it's too much information to examine at one time. In Chapter 1, "Introducing Microsoft SQL Server" and Chapter 2, "Understanding Data Modeling and Database Design," you learned that the information stored in a relational database is always accessed in the form of a table. If you reference a printed table of information, you usually don't read all the rows and columns. You probably look at only part of the table to obtain the information you need. The table exists in a printed form only because it's a primitive way of storing information. If you can reconsider your requests for information from the database, you can start to eliminate the queries that produce unwanted or unneeded results. In these cases, you can produce output that presents exactly what's needed and nothing more.
Understanding Relevant Storage Characteristics for RetrievalThe data stored on your database—the disk of your computer system—is analogous to a set of printed tables. You don't need to retrieve an entire table when you issue queries to display information from the database. You construct a query using a Transact-SQL statement that returns only the relevant portion of the column or rows of your database tables.Table 6.1 shows an example table structure and its data, which will be used for several examples in this chapter. For information on creating tables, see Chapter 4, "Creating Devices, Databases, and Transaction Logs" and Chapter 5, "Creating Database Tables and Using Datatypes"
The table is limited to 12 rows to make it easier to work with the examples. The typical size of a table for a production database might have more columns of information and will nearly always have more rows of information. The size of the table won't make any difference in showing the operation of Transact-SQL statements. The statements work identically regardless of the size of the tables operated on. The examples in this chapter are easier to understand if a small number of rows and columns are present in the table used to show SQL operations.
Retrieving Data from a Table with SELECTUnless you retrieve all the columns and rows from all tables, your queries of a database are a selection process that narrows the information retrieved from the database. Your goal as you work with tables should always be to return only the information needed to fulfill the user's request. Any more information, and the user will be required to wait for a longer period of time than needed. Any less information than is needed will result in additional queries against the database. This modeling of sets of data is always a balancing act that requires continued refinement.The Transact-SQL SELECT statement is used for the selection process. Various parts of a SELECT statement target some—but not all—of the data in the database tables. The complete syntax of the SELECT statement is as follows:
SELECT [ALL | DISTINCT] select_list [INTO [new_table_name]] [FROM {table_name | view_name}[(optimizer_hints)] [[, {table_name2 | view_name2}[(optimizer_hints)] [..., {table_name16 | view_name16}[(optimizer_hints)]]] [WHERE clause] [GROUP BY clause] [HAVING clause] [ORDER BY clause] [COMPUTE clause] [FOR BROWSE]A SELECT statement is like a triangle superimposed on a database table. Using SQL keywords, the database is narrowed to target the columns and rows that are to be retrieved in a query. In the triangle comparison shown in Figure 6.1, the widest part of the triangle selects all the rows and columns of a database for retrieval. The point of the triangle opposite its wide base selects the least amount of data that can be retrieved from a table (a single row consisting of one column). Fig. 6.1 - SELECT queries are used to target specific columns and rows of a database. You can also use the SELECT statement to extract a single column or row, or all columns and all rows from the database. Most SQL queries retrieve rows and columns that are narrower than the entire table, represented by the base of the triangle in the Figure, but wider than the single row and column as shown in the point of the triangle opposite the triangle's base. You'll typically need to retrieve more than a single row and column but less than all the rows and columns of the database.
Specifying the Table with FROMDifferent parts of the SELECT statement are used to specify the data to be returned from the database. The first part of the selection process occurs when fewer than all the database tables are referenced. You can retrieve data stored in the database separately by referencing some tables, but not others, in a SELECT statement.A SELECT statement uses the FROM clause to target the tables from which rows and columns are included in a query. The syntax of the FROM clause is
[FROM {table_name | view_name}[(optimizer_hints)] [[, {table_name2 | view_name2}[(optimizer_hints)] [..., {table_name16 | view_name16}[(optimizer_hints)]]]In the following complete SELECT statement, the FROM clause specifies that the returned data should include only data from the employee table:
SELECT * FROM employee ![]() In the examples shown in this chapter, the Transact-SQL keywords used to form clauses are written in uppercase. You can, however, use lowercase keywords. However, if you installed Microsoft SQL Server with the default binary sort order, the names of your database objects—including the names of tables and columns—must match in case. ![]() You can also specify multiple tables in the FROM clause, as in the following example:
FROM table_name_1,...,table_name_nEach table is separated from the names of other tables with a comma, a separator used with lists of information in FROM and other Transact-SQL clauses. The list in a FROM clause often specifies multiple tables rather than a single table. In the following example of a SELECT statement, the FROM clause references the data from two tables:
SELECT * FROM employee,payThe employee and pay tables are targeted, from which all rows and columns are retrieved.
In a relational database, you must provide instructions within the SELECT statement to match the rows from two or more tables together. To learn how to match, or join rows from multiple tables, see Chapter 7, "Performing Operations on Tables" Transact-SQL allows you to choose tables from different databases. You can specify the name of the database in which the table is located by inserting the database name to the left of the table name. Next, place a period, the database owner name, and another period, followed by the table you need to work with:
database_name.owner.table_nameIn the following example, the employee table in the database company and the owner dbo is specified:
SELECT * FROM company.dbo.employee ![]() The DBO keyword specifies the database owner. You can refer to the dbo at any time. SQL will know that you're referring to the owner of the specific database. ![]() In the previous example, the table was created by using the system administrator's account, so the owner is dbo. If you omit the name of the database and owner when you reference a table, SQL Server looks for the table or tables that you specified in the FROM clause in the current database. You must enter the name of the database in which a table was created, along with its owner, to include the rows and columns from tables in different databases.
Specifying Columns with SELECTThe columns of values returned from database tables are specified as part of the SELECT clause immediately following the SELECT keyword. One or more of the columns are entered as a list. Each column, like the tables in a FROM clause, is separated by a comma:
SELECT column_name_1,...column_name_n FROM table_name_1,...table_name_nIn the following code example, the name and badge columns are selected for retrieval. The results are shown in Figure 6.2. This example uses the SQL Windows application utility ISQL/w to perform the retrieval of rows.
Select name, badge from employeeFig. 6.2 - Select statements are used to retrieve data from the database..
Follow these steps to begin a ISQL/w session:
The example in Figure 6.4 shows the result of the execution of the USE command to set the database to employees. Use the Erase button on the toolbar to delete the previous command before you enter and execute a new command. Fig. 6.4 - You can also define the database to which a user is automatically positioned to when he starts a SQL Server session during the creation of the user logon. In Chapter 1, "Introducing Microsoft SQL Server" and Chapter 2, "Data Modeling and Database Design" you learned that one of the basic tenets of a relational database is that operations on database tables always return another table. The rows and columns of database tables that are targeted for retrieval are always assembled into a temporary table. In most cases, this table is maintained only until the data is provided to the requesting client. The new temporary table shown in Figure 6.2 was constructed from the three-column employee table. According to the SELECT statement, "the temporary table targets all rows of the permanent table's three columns and eliminates the second column, Department. The temporary table is deleted after the rows are provided to the requesting client. Figure 6.5 shows the query with the employee table performed with the command-line form of ISQL. As you can see, you must enter the default command-line terminator go at the end of each statement. Fig. 6.5 - Transact SQL will produce the same results from a command-line prompt that it will in the GUI environment.
You can display table columns in a different order than you originally defined. To change the default order for the display of table columns, simply list the names of the columns in the order in which you want the columns displayed. In the following example—the results are shown in Figure 6.6—the order of display for the columns of the employee table are reversed from the order in which they were defined.
select badge, department, name from employeeFig. 6.6 - The SELECT clause determines the order columns are listed. Changing the order of the displayed columns of a database table is consistent with the characteristics of a relational database. You learned in Chapter 2, "Understanding Data Modeling and Database Design," that the access of data from a relational database doesn't depend on the manner in which the data is physically stored. You simply specify the names of the columns in the order in which you want them returned in the SELECT clause of the SELECT statement.
Using a Wild Card in the SELECT ClauseYou can use an asterisk (*) in the SELECT clause to specify all columns for inclusion in the retrieval. The following code shows a query that uses an asterisk to reference all columns of the employee table with the results shown in Figure 6.7. The name, department, and badge columns are displayed in the query results.
Select * from employeeFig. 6.7 - You can use the asterisk wild-card character in the SELECT clause of a SELECT statement.
You shouldn't use an asterisk with a production database because you probably need to access only some of the table columns in a query rather than all of them. Eliminating some table columns can dramatically reduce the time it takes to retrieve the rows when several rows are retrieved.
You can specify the column names even if all the columns should be retrieved so that the query is more descriptive. If the query is saved and later in need of revision, the columns and rows that the query retrieves will be easy to determine by reviewing the query.
Specifying the Rows of a TableIn the previous examples, all the table rows of a database are retrieved. Your goal often will be to retrieve only certain rows rather than all rows. For example, if you have tables that contain millions of rows, you'll probably never execute a query to retrieve all rows from the tables. Every query that you execute specifies a specific results set because it's impractical to retrieve or manipulate all rows in a single query.The WHERE keyword is used to form a clause that you add to a SELECT statement to specify the rows of a table to be retrieved. A WHERE clause uses the following syntax:
SELECT column_name_n,...column_name_n FROM table_name_1,...table_name_n WHERE column_name comparison_operator valueA WHERE clause forms a row-selection expression that specifies, as narrowly as possible, the rows that should be included in the query. A SELECT statement that includes a WHERE clause may return a single row or even no rows if none of the rows matches the criteria specified in the SELECT statement. In the example from the following code—the result is shown in Figure 6.8—a WHERE clause specifies that only the rows with the sales department are retrieved. All rows that contain the sales department, without regard for the case, are displayed.
select * from employee where department = "sales"Fig. 6.8 - The sort order that you defined during the installation of Microsoft SQL Server determines case-sensitivity.
The following code—the result is shown in Figure 6.9—shows the SELECT statement returns a single row because only one row contains the mailroom department.
select * from employee where department = "mailroom"Fig. 6.9 - The count message shows "row(s)" but only a single row is selected. The query result for a row that contains the personnel department retrieves no rows, as shown in the following code and in Figure 6.10. The count line, which displays the number of rows retrieved, is at the bottom of the output window. A retrieval in which no rows match the criteria of the SELECT statement doesn't return an error; instead, the message (0 row(s) affected) appears.
select * from employee where department = "personnel"Fig. 6.10 - You can enclose the value in the WHERE clause in single or double quotation marks. (Double quotes are used here.) You can refer to the @@ERROR system symbol, which is called a global variable, to learn whether the previous operation was successful. Microsoft SQL Server returns a zero (0) to @@ERROR if the previous operation was successful. After you execute a SELECT query that retrieves zero rows, @@ERROR contains a zero (0), indicating that no error occurred (see fig. 6.11).
select @@errorFig. 6.11 - Conditional statements can be used to check @@ERROR and perform additional statements, if needed. A SELECT statement is used in Figure 6.11 to display the contents of @@ERROR, so the results are returned in the form of rows retrieved from a table. The dashes (–) are displayed underneath the location of where a column header would appear if a column—rather than @@ERROR—was specified. A count message is also displayed for the one value (0) retrieved from the global variable. Error conditions and global variables are discussed in Chapter 13, "Managing Stored Procedures and Using Flow-Control Statements."
Using Comparison Operators in a WHERE ClauseThe syntax for the WHERE clause allows the use of a comparison operator following the name of a table column and before a column value. In the earlier examples, only the comparison operator = (equal) was used. Additional comparison operators may be used to retrieve different rows. Table 6.2 lists the comparison operators that you can use in the WHERE clause.
The syntax for a WHERE clause that uses a comparison operator is as follows:
SELECT column_name_1,...column_name_n FROM table_name_1,...table_name_n WHERE column_name comparison_operator value ![]() You can optionally use spaces around the comparison operations if you wish. You query will execute correctly with or without spaces around the comparison operators. ![]() In addition to the = (equal) comparison operator that was used in the preceding section (refer to figs. 6.8 through 6.11), you can use the <> (not equal) operator. You can use the not-equal operator to retrieve all rows except those that contain the value to the right of the <> operator. The following code—the result is shown in Figure 6.12—shows a SELECT statement that contains a WHERE clause for all rows from the employee table except those that contain the sales department.
select * from employee where department <> "sales"Fig. 6.12 - You can the comparison operators <> or != for 'not equal'. The < (less than) comparison operator can be used to retrieve rows that are less than the value specified for the column in the WHERE clause. In the following code—the result is shown in Figure 6.13—the rows that contain a badge number less than 5000 are retrieved from the employee table.
select * from employee where badge < 5000Fig. 6.13 - The results include rows with badge numbers less than 5000. The > (greater than) comparison operator retrieves rows that contain a value greater than the value used in the WHERE clause. In the following code, rows with badge numbers greater than 8000 are retrieved from the employee table (see fig. 6.14).
select * from employee where badge > 8000Fig. 6.14 - The results include rows with badge numbers greater than 8000. The <= (less than or equal to) comparison operator returns rows that have a value equal to or greater than the value in the WHERE statement. The following code returns rows that contain the value less than or equal to badge number 3211 (see fig. 6.15).
select * from employee where badge <= 3211Fig. 6.15 - The results include rows with badge numbers less than or equal to 3211. The >= (greater than or equal to) comparison operator returns rows that are greater than or equal to the value in the WHERE clause. Comparison operators can be used with columns that contain alphabetic values as well as numeric values. The following code uses >= in the WHERE clause to retrieve rows that are alphabetically greater than or equal to software for the department column (see fig. 6.16).
select * from employee where department >= "software"Fig. 6.16 - The results include rows that contain the software department.
For values that are more than a single character, each character is successively compared using the binary representation.
Using the Comparison Operator LIKEThe last of the comparison operators is a keyword—rather than one or two special symbols. The LIKE keyword is followed by a value fragment rather than a complete column value. The example query in the following code retrieves all rows that contain a department name beginning with the alphabetic character S (see fig. 6.17). A wild-card character such as the percent sign (%) can follow the letter S. This wild card is used to match any number of characters up to the size of the column, minus the number of characters that precede the percent sign.
select * from employee where department like "s%"Fig. 6.17 - The query retrieves only the rows that contain a department that starts with the letter s. An underscore (_) is another wild card that you can use to specify a value fragment. Each underscore used in the specification of a value fragment can match any one character. The example shown in the following code uses four underscores following the S to match any rows that contain a column value that begins with an s followed by any four characters (see fig. 6.18). Unlike the example shown in Figure 6.17, the query retrieves only the rows that contain Sales or SALES; it doesn't retrieve the rows that include Software.
select * from employee where department like "s____"Fig. 6.18 - You can use the underscore (_) wild card along with the percent sign (%) in a WHERE clause. You can use square brackets ([]) as wild cards in a WHERE clause that uses the LIKE comparison operator. The square brackets specify a range of values. In the following code, the brackets are used to specify a range of any upper- or lowercase characters as the first character of the department column (see fig. 6.19).
select * from employee where department like "[a-zA-Z]%"Fig. 6.19 - You can use any wild-card combination in the value fragment. In Figure 6.19, % and [] are combined to specify that the rows for retrieval have any upper- or lowercase letter as their first character, as well as any additional characters up to the width of the column. The department column was created wide enough to store 20 characters. The Figure shows that you can combine wild cards to specify a value fragment. You can also use a caret (^) after the left bracket to specify a range of values to be excluded from the rows retrieved in a SELECT statement. For example, the SELECT statement shown in the following code retrieves all rows from the employee table except those with first characters that fall within the range F through M (see fig. 6.20).
select * from employee where department like "[^F-M]%"Fig. 6.20 - A SELECT statement that excludes a range of values from F through M. You can use wild cards only in a WHERE clause that uses the LIKE keyword. If you use the asterisk (*), underscore (_), brackets ([]), or caret (^) with any of the other comparison-operator symbols, they're treated as literal column values. For example, the following code contains the same query issued in Figure 6.20, but an equal sign (=) has been substituted for the LIKE query (see fig. 6.21). The identical query with an equal comparison operator rather than LIKE doesn't retrieve any rows.
select * from employee where department = "[^F-M]%"Fig. 6.21 - When you use wild cards with comparison operators other than LIKE, they're treated as literal column values.
Selecting Columns and Rows with the WHERE ClauseYou can retrieve a subset of a table's columns and rows in a SELECT statement by combining the use of specific, called-out columns and the use of a constricting WHERE clause. In the following code, only two columns, name and department, are selected, and only for the rows that contain the 'Field Service' value in the department column (see fig. 6.22).
select name, department from employee where department = "field service"Fig. 6.22 - A SELECT statement can limit both the rows and columns retrieved. In Chapter 1, "Introducing Microsoft SQL Server" you learned that you can create more than 250 columns in a table and an unlimited number of rows. It's almost always impractical—except for small, simple tables—to try to retrieve all columns and rows of a table. If you correctly construct your query, you can also reference columns from multiple tables in a single query. The SELECT clause is descriptive because it invokes a selection operation on a table's rows and columns. Keep in mind the analogy of the data-retrieval triangle introduced earlier in this chapter (refer to fig. 6.1). The SELECT statement effectively superimposes a triangle over a table to retrieve some (but not all) columns and some (but not all) rows.
SQL Server doesn't return an error for a query that results in no returned rows. Although this is not true of other software products—for example, some programming languages expect an error when no records are retrieved from a file—Microsoft SQL Server and Transact-SQL consider a query that returns no rows a valid query. I executed a SELECT statement that resulted in no rows returned, but I'm certain information is in the database. How can I debug what went wrong?
Try modifying your WHERE clause so only one condition is applied. Run the query with only the first condition. If no rows are returned, you can examine the WHERE clause to determine the problem. If information is returned, add the next portion or portions of the original WHERE clause, running the query after each. You should be able to quickly narrow down which portion of the constricting clause is going awry.
Using Boolean Operators and Other Keywords in a WHERE ClauseYou can use Boolean operators to retrieve table rows that are based on multiple conditions specified in the WHERE clause. Booleans are used the way conjunctions are used in the English language. Boolean operators are used to form multiple row-retrieval criteria. Use Boolean operators to closely control the rows that are retrieved.The syntax for the use of a Boolean is as follows:
SELECT column_name_1,...column_name_n FROM table_name_1,...table_name_n WHERE column_name comparison_operator value Boolean_operator column_name comparison operator Using the OR OperatorThe first of the Boolean operators is OR, which you can use to select multiple values for the same column. In the following code, OR is used to form a WHERE clause to retrieve rows containing two column values (see fig. 6.23). Continue to add ORs to the WHERE clause to select additional values for the same column.
select * from employee where department = "field service" or department = "logistics"Fig. 6.23 - You can use any number of ORs with different comparison operators in each comparison. The following query retrieves the rows of the employee table that contain three column values (see fig. 6.24).
select * from employee where department = "field service" or department = "logistics" or department = "software"Fig. 6.24 - You can use additional ORs to select more than two values of the same column. You can specify different columns in the WHERE clause of a SELECT statement that uses an OR. The query in following code retrieves rows that either are members of the Field Service department (with any badge number) or have a badge number that's less than 6000 (but are members of any department). See Figure 6.25.
select * from employee where department = "field service" or badge < 6000Fig. 6.25 - You can use multiple Boolean operators to specify a criteria for the rows to be returned by a SELECT statement.
Using the AND OperatorUse the AND Boolean operator if you want the rows returned by the query to meet both comparisons specified in the WHERE clause.In the following code, a query is used to retrieve a row that contains a specific Name and badge combination (see fig. 6.26). If two rows in the table contained Bob Smith, the boolean AND is used to specify a criteria that requires the row to also contain the badge value 1834. Multiple rows would be returned only if more than one row contained the values Bob Smith and 1834.
select * from employee where name = "bob smith" and badge = 1834Fig. 6.26 - You can also use AND (as shown here) and OR together in a WHERE clause. Populating one column of the table with unique values allows individual rows to be retrieved.A unique row is returned if one of the specified columns is the column that contains unique values.
Using the NOT OperatorNOT is an additional Boolean operator that you can use as part of a WHERE clause. Use NOT to specify negation. Use NOT before the column name and a comparison operator, such as = (equal), after the column name and before the value.For example, the following query retrieves all rows of the employee table that contain any department except Field Service (see fig. 6.27).
select * from employee where not department = "field service"Fig. 6.27 - You can use NOT for negation the same way the not-equal comparison operators (!= and <>) are used. You can also use NOT in a WHERE clause in combination with AND and OR. In the following code, NOT is used to retrieve all rows of the employee table that are members of the Field Service department, except for Mark McGuire (see fig. 6.28).
select * from employee where department = "field service" and not name = "mark mcguire"Fig. 6.28 - You can use OR as well as AND with the NOT Boolean operator.
Using BETWEEN to Select a Range of ValuesAlthough you can use a number of ORs in a WHERE clause to specify the selection of multiple rows, another construction is available in Transact-SQL. You can use the BETWEEN keyword with AND to specify a range of column values to be retrieved.In the following code, a WHERE clause that includes BETWEEN is used to specify a range of badge values to be retrieved from the employee table (see fig. 6.29). Use BETWEEN after the name of the column, followed by one end of the range of values, the AND keyword, and the other end of the range of values.
select * from employee where badge between 2000 and 7000Fig. 6.29 - The table doesn't have to contain rows that are identical to the column values used to specify the range of values referenced by BETWEEN. The two numbers that form the range of values for retrieval don't actually need to be stored in the table. For example, in Figure 6.29, badge numbers 2000 and 7000 don't have to be stored in the table. Those numbers simply specify a range. Also, a successful query (one that returns an error code of 0) can return zero rows within the range specified by the WHERE clause that contains a BETWEEN. No rows need to be stored in the table for the query to execute correctly.
Using IN to Specify a List of ValuesYou can't always use a WHERE clause with BETWEEN to specify the rows that you want to retrieve from a table in place of a WHERE clause that contains multiple ORs. The rows that contain the column values specified within the range of values will include rows that you don't want. However, you can use the IN keyword in a WHERE clause to specify multiple rows more easily than if you use multiple ORs with a WHERE clause.A statement that uses IN uses the following syntax:
SELECT column_name_1,...column_name_n FROM table_name_1,...table_name_n WHERE column_name IN (value_1, ...value_n)In the following code, IN is followed by a list of values to specify the rows to be retrieved (see fig. 6.30).
select * from employee where badge in (3211,6732,4411,5522)Fig. 6.30 - A WHERE clause that contains IN is simpler to write than a WHERE clause that contains multiple ORs and is more specific than using the BETWEEN operator.
Using an ORDER BY ClauseIn Chapter 1, "Introducing Microsoft SQL Server" you learned that the rows of a relational database are unordered. As part of your SELECT statement, you can specify the order in which you want the rows retrieved and displayed: Add an ORDER BY clause to sort the table rows that are retrieved by a SELECT statement.
The syntax of a SELECT statement that contains an ORDER BY clause is as follows:
SELECT column_name_1,...column_name_n FROM table_name_1,...table_name_n ORDER BY column_name_1,...column_name_nThe following code shows a SELECT statement in which the rows are ordered by department (see fig. 6.31). The rows of the employee table are retrieved in ascending order by default.
select * from employee order by departmentFig. 6.31 - You can use multiple columns to determine the order of rows retrieved. badgeIf you include the name of a second column after the name of the first column, the second column orders the rows that are duplicates of the first column. In the following code, ORDER BY is used to order the rows of the employee table first by the department and then by the badge column (see fig. 6.32).
select * from employee order by department, badge descFig. 6.32 - You can reference a third column if you have the need. Notice that the DESC keyword is added after the second named column to order the badge numbers in descending orderbadge. You can also use the keyword ASC to specify explicitly that a column's order in the ORDER BY clause be ascending. However, it's unnecessary because the order of a column is ascending by default. When you use ORDER BY in a SELECT statement, you can specify the columns in the ORDER BY clause by their order number in the SELECT clause. In the following code, the department and badge columns from the employee table are referenced in the ORDER BY clause by their order of occurrence from left to right in the SELECT clause (see fig. 6.33).
select department, badge from employee order by 1, 2Fig. 6.33 - A SELECT statement that references the columns using column numbers rather than names.
You must understand why SQL Server doesn't use the order of the column defined in a table such as employee in the ORDER BY clause. In a relational database, the syntax of a query language that references database data should be independent of the manner in which the data is stored. If Transact-SQL used the order of columns as they are defined in the table, the column number would be based as a physical characteristic of the stored data. It's more appropriate to reference the columns using the relative order of the columns in the SELECT clause. Another problem exists in trying to reference table columns by the order in which the columns are defined in a table. You can specify columns from different tables in the SELECT clause. You can't, for example, reference two columns that are both defined as the third column in two tables, because the column numbers are identical. You have complete control over specifying columns in the SELECT clause of a SELECT statement. You can reference the same column more than once in a SELECT clause, and the column values will be displayed multiple times. In the following code, the badge column is referenced twice in the SELECT clause (see fig. 6.34).
select badge, name, department, badge from employeeFig. 6.34 - Multiple copies of the same column can be used to improve readability. You may not immediately see a reason for displaying the values of a column more than once. If a listing is wide enough, it can be convenient to display a column—often the unique identifier for each row—as the first and last columns of a display, as Figure 6.34 shows. A train schedule is an example of a wide output in which the stations often are displayed in the first, last, and center columns to make the output display easier to read.
Using DISTINCT to Retrieve Unique Column ValuesYou can construct your database table so that you never allow duplicate rows to be stored or to allow duplicate rows. Unless you define a constraint on your table, such as a unique key, you can store duplicate rows in the table. Although you can disallow duplicate rows in the table, you may allow duplicates for some columns. You may want to find the unique entries that exist in a table column. The DISTINCT keyword is used to return the unique values of a column.The following code shows the different departments of the employee table (see fig. 6.35).
select distinct department from employeeFig. 6.35 - You can use DISTINCT with multiple columns to return the unique values of a column. If you use DISTINCT with multiple columns, the rows retrieved are unique by the combination of the columns specified after the DISTINCT keyword. The combination of the values from the Department and badge columns must return all rows. Whenever you combine a column that contains non-duplicate values (such as badge) with a column that contains duplicate values (such as Department), the combination of the two is non-duplicate.
Using Arithmetic OperatorsYou can use arithmetic operators to form expressions in Transact-SQL. Expressions are evaluated within the statements in which they appear. You need arithmetic operators to manipulate the data retrieved from tables. You can use the arithmetic operators in the SELECT clause to add, subtract, multiply, and divide data from columns that store numeric data.Table 6.3 shows the arithmetic operators you can use in Transact-SQL.
You can form expressions by using arithmetic operators on columns defined with the datatypes TINYINT, SMALLINT, INT, FLOAT, REAL, SMALLMONEY, and MONEY. You can't use the modulo operator (%) on columns defined with the MONEY, SMALLMONEY, FLOAT, or REAL datatypes.
You can use arithmetic operators to form expressions in the SELECT clause of a SELECT statement with both numeric constants and columns. In the following code, an expression is used to increment the badge numbers by 5 (see fig. 6.36).
select badge, badge + 5 from employeeFig. 6.36 - You can use multiple arithmetic operators to operate on column names, constants, or a combination of column names and constants. When you use an expression in a SELECT clause, the display for the evaluation of the expression doesn't have a column header. You can specify a column header for the expression by preceding the expression with a text string followed by an equal sign (=). For example, the following code shows the expression preceded by the specified column header in the SELECT clause of the SELECT statement (see fig. 6.37). You can enclose the text string in single or double quotation marks to retain embedded spaces.
select badge, "badge + 5" = badge + 5 from employeeFig. 6.37 - You can also specify an alternate column header for any column without using the column name in an expression. Computed columns are not stored in the database. They may exist in a temporary table that is created during the execution of the SELECT statement. Since the data is not in the database, there is no way to directly verify the results of a computed column. One easy way to compute a value incorrectly is to ignore operator precedence. Arithmetic operators are performed in predetermined order unless parenthesis are used to force otherwise. Table 6.4 shows the order in which arithmetic operators are executed.
If you use multiple arithmetic operators with the same order of precedence, the expressions are evaluated from left to right. You can use parentheses to control the order of execution. Expressions in parentheses are evaluated before any other expressions. Use parentheses to evaluate expressions that contain addition and subtraction before the expressions that contain multiplication, division, and modulo operators. The following code shows the use of parentheses in the SELECT clause. The constant 5 is added to each value of the badge column. After the constant is added to badge, the sum is multiplied by 2. See Figure 6.38.
select "badge + 5 * 2" = (badge + 5) * 2 from employeeFig. 6.38 - You can use parentheses to make the evaluation order more descriptive, even if the parentheses are unnecessary. You can perform arithmetic operations on different numeric datatypes in the same expression—a procedure called mixed mode arithmetic. The datatype of the result is determined by the rank of the datatype code stored in a column of a system table. You can use a SELECT statement to retrieve the datatype names and their code numbers from the systypes system table. Table 6.5 shows the codes for the numeric datatypes.
When you write expressions using different datatypes, the results are returned in the datatype of the highest ranked datatype. For example, the values of a column that's defined as either a TINYINT or a SMALLINT datatype is converted to INT if they're evaluated in a expression that contains an INT datatype. One exception to the datatype code rule applies to expressions that include columns with the FLOAT and MONEY datatypes. The evaluation of an expression that contains FLOAT and Money is returned as the MONEY datatype, even though the code number for MONEY (60) is lower than FLOAT (62). You can retrieve the code numbers for all the Transact-SQL datatypes by using the query in following code.
select name, type from systypes order by type descThis statement retrieves the names and codes for all Transact-SQL datatypes in order by the highest code numbers.
Using a GROUP BY ClauseThe GROUP BY clause divides a table into groups of rows. The rows in each group have the same value for a specified column. Duplicate values for each different value are placed in the same group. Grouping allows you to perform the same functions on groups of rows.You can group by any number of columns in a statement. Columns in the select list must be in the GROUP BY clause or have a function used on it. The syntax of a SELECT statement that contains a GROUP BY clause is as follows:
SELECT column 1,...column n FROM tablename GROUP BY columnname 1, columnname nGROUP BY targets only unique column values after sorting by ascending column value (default). GROUP BY is unlike the ORDER BY clause, which though it also sorts records in ascending order, it doesn't remove duplicate column values. The example query shown in the following code groups the rows by the department column (see fig. 6.39). The departments are first sorted to group them together. The duplicate departments aren't displayed because the purpose of the GROUP BY clause in a SELECT statement is to form groups of rows for subsequent action by other clauses.
select department, "headcount" = count(*) from employee group by departmentFig. 6.39 - A SELECT statement containing a GROUP BY clause that sorts rows by the column referenced in the GROUP BY clause.
For example, you can select specific groups with a HAVING clause, which compares some property of the group with a constant value. If a group satisfies the logical expression in the HAVING clause, it's included in the query result. The syntax of a SELECT statement with a HAVING clause is
SELECT column 1,...column n FROM tablename GROUP BY columnname HAVING expressionThe HAVING clause is used to determine the groups to be displayed in the output of the SELECT statement. The following code shows the use of a HAVING clause (see fig. 6.40).
select department, "headcount" = count(*) from employee group by department having count(*) = 1Fig. 6.40 - A SELECT statement containing a HAVING clause that limits the returned rows.
Using a COMPUTE Clause in a SELECt StatementYou can use a COMPUTE clause in a SELECT statement with functions such as SUM(), AVG(), MIN(), MAX(), and COUNT(). The COMPUTE clause generates summary values that are displayed as additional rows. The COMPUTE clause works like a so-called control break, a mechanism used in applications called report writers. You can use the COMPUTE clause to produce summary values for groups as well as calculate values using more than one function for the same group.
The general syntax of the COMPUTE clause is as follows:
COMPUTE row_aggregate(column name) [,row_aggregate(column name,...] [BY column name [,column name...] ![]() Several restrictions apply to the use of a COMPUTE clause in a SELECT statement. The following list summarizes the COMPUTE clause restrictions:
You can use a clause that contains the keyword COMPUTE without BY to display grand totals or counts. You can also use both a COMPUTE and a COMPUTE BY clause in a SELECT statement. ![]()
Using SubqueriesYou can nest a complete SELECT statement within another SELECT statement. A SELECT statement that's nested within another SELECT statement is called a subquery. The nested or inner SELECT statement is evaluated and the result is available to the outer SELECT statement. To use a subquery, enclose a SELECT statement within parentheses to specify that it should be evaluated before the outer query.The row or rows returned by the SELECT statement in parentheses are used by the outer SELECT statement. The rows returned by the inner SELECT statement are used in the position of the value in the WHERE clause of the outer SELECT statement. For example, in the following code, all rows are retrieved for the employee table, where the department is equal to the same department to which Bob Smith is a member (see fig. 6.41).
select * from employee where department = ( select department from employee where name = "bob smith")Fig. 6.41 - You can nest a subquery within the subquery by using an additional set of parentheses around an enclosed SELECT statement. Some restrictions apply to the use of subqueries. The SELECT list of a subquery must return either a single value or one or more rows (if an EXISTS is used) unless IN is used in the outer query. This would result in the following error if a comparison operator such as = (equal to) was used in the WHERE clause of the outer query:
Msg 512, Level 16, State 1 Subquery returned more than 1 value. This is illegal when the subquery follows =, !=, <, <= , >, >=, or when the subquery is used as an expression. Command has been aborted. ![]() Us a NOT IN to eliminate rows that match the results of a subquery. ![]() You're restricted in the choice of datatypes within subqueries. You can't use either the IMAGE or TEXT datatypes in the SELECT clause of the subquery. Also, the datatype of the value(s) returned by the subquery must match the datatype used in the WHERE clause of the outer query.
Your nested SELECT statement probably returned more than one row value to the WHERE clause of your outer SELECT statement. Unless you're sure that the inner SELECT statement will return only a single row value, you should use an IN in the WHERE clause of the outer SELECT statement.
Using ANY and ALLYou can use ANY and ALL to modify the comparison operators that precede a subquery. In Transact-SQL, ANY and ALL don't have the same meaning that they do in the English language. For example, when > (greater than) is followed by ALL, it's interpreted as greater than all values—including the maximum value returned by a subquery. When > (greater than) is followed by ANY, it's interpreted as greater than at least one—the minimum.
Specifying Optimizer HintsThe optimizer hints clause is somewhat misleading. You use the optimizer clause of the SELECT statement to override the data-retrieval methods that are automatically chosen by the query optimizer. When a query is executed, a portion of SQL Server, called the query optimizer, determines how the retrieval of the data from the database is performed.For example, although an index may exist on a table that's referenced in a query, the query optimizer may determine that it would be faster to retrieve the rows from the table without using the index. The query optimizer may not use an index because the number of rows requested by the query are few in number, and it would be faster to directly access the data rows rather than both the index rows and then data rows. If multiple indexes exist on a table, the query optimizer will chose to return the rows of the table using the index that would result in the fastest retrieval of information. You may, however, want to override the way in which the retrieval of rows will be done. For example, if two indexes—one clustered and one non-clustered—exist on a table, the query optimizer may choose to use the non-clustered index for the retrieval of the table rows. You can use the optimizer hints clause in a SELECT statement to force a retrieval using the clustered index. You want the rows to be retrieval by the clustered index because they will automatically be returned in ascending sorted order by the column or columns on which the clustered index was created. For example, you can use the following optimizer hints clause in a SELECT clause to specify that the rows of a table will be retrieved by the clustered index:
select * from employee (index=0)This statement specifies the index name or ID to use for that table. 0 forces the use of a clustered index (if one exists). If you use a optimizer hint of 1, a non-clustered index is used for retrieval of rows targeted in the SELECT statement.
You can use a second set of optimizer hints to control the synchronization, or locking of the tables in your SELECT statement. You can use the locking optimizer hints to override the way in which SQL Server normally controls access to data from multiple clients.
Using NOLOCKYou use the NOLOCK optimizer hint to permit you to read rows that SQL Server would normally not permit you to access. For example, if you use NOLOCK in a SELECT statement, you can read uncommitted rows.
Using HOLDLOCKYou use HOLDLOCK to prevent other clients from changing rows that are part of your SELECT clause until your transaction is complete. Normally, other clients can modify the rows as soon as they're displayed. One restriction of HOLDLOCK is that you can't use it in a SELECT clause that also contains a FOR BROWSE clause.
Using UPDLOCKYou use an UPDLOCK like HOLDLOCK to prevent other clients from changing rows that are part of your SELECT clause. UPDLOCK releases the rows of your table and the end of the command or next transaction, rather than at the end of the transaction only.
Using TABLOCKYou use TABLOCK like HOLDLOCK to prevent other clients from changing rows. TABLOCK, unlike HOLDLOCK, acts on the entire table, rather than just on the rows of your table. You can use TABLOCK along with HOLDLOCK to prevent other clients from changing rows of you entire table until your transaction completes.
Using PAGLOCKYou use PAGLOCK like HOLDLOCK to prevent other clients from changing rows. PAGLOCK prevents other clients from changing rows a table page at a time, rather than the entire table.
Using TABLOCKxYou use TABLOCKx to prevent other clients from displaying as well as changing an entire table referenced in your SELECT clause until your command or transaction is complete.
Using FASTFIRSTROWYou use FASTFIRSTROW to retrieve the rows of a table using a non-clustered index. Unlike the index=1 optimizer hint, the first row of the query is returned more quickly through use of optimized read techniques. The total time that it takes to perform the query may be longer than if the non-clustered index were used with the FASTFIRSTROW option. You use FASTFIRSTROW to get better response time by returning the initial results of your query faster.
Using the FOR BROWSE optionYou can use the FOR BROWSE clause in a SELECT statement to read a table that another client is now adding, deleting, or updating rows within. Normally, SQL Server won't permit you to read a table while pending updates, deletes, or inserts are uncommitted. There are restrictions on what other clauses your SELECT statement can contain to use the FOR BROWSE clause.To use the FOR BROWSE clause in a SELECT statement, the SELECT statement must contain a table with a timestamp column and a unique index. To use the FOR BROWSE clause in a SELECT statement, the SELECT statement can't contain a UNION clause. FOR BROWSE should be the last clause of a SELECT statement. Unique indexes and timestamp columns are required attributes of tables to be used with the FOR BROWSE clause. If a table doesn't meet these requirements, a retrieval statement executes as if the FOR BROWSE clause weren't present in the SELECT statement. A SELECT statement that tries to read the rows from a table that's being modified waits up the default query timeout interval of 5 minutes. If the modification is completed within that time, the rows are displaying by the waiting query. If the pending modification doesn't complete within the timeout interval, the query fails. The FOR BROWSE clause in a SELECT statement permits you to read rows of a table while they're being changed.
From Here...In this chapter you've learned to write queries for the retrieval of data from a database contains only the required rows or columns. In addition, you've learned to manipulate the returned data, performing arithmetic operations and sorting the rows. Finally, you've learned to use optimization techniques to override the default actions of SQL Server to retrieve data faster or in different ways.For more information about the topics mentioned in this chapter, see the following chapters:
|