Chapter 10 - Managing and Using Indexes and Keys

What indexes are for and how to create them - Indexes are used to make it easier and faster to retrieve data. They can be created at the same time a table is created or they can be added to the table at a later time.
The different kinds of indexes that can be created - There are several types of indexes. Each index tracks data and stores that tracking information in different ways. Some indexes also affect the way the data is physically stored in the database.
What keys are for and how to create them - Rows of data contain unique values that can be used to identify each individual row. The columns that contain these unique values are called key columns. For key columns to be guaranteed unique, they must be declared when creating a table. Key column definitions can also be added or modified after a table is created.

One of the most important responsibilities of a database designer is to correctly define a database table for optimal performance. SQL Server's basic design of a table doesn't in any way define how data is to be accessed or stored physically, beyond the data-type constraints and any referential constraint placed on a column or columns designated as PRIMARY KEY. Instead, SQL Server provides a mechanism of indexes or keys to a table that help SQL Server optimize responses to queries.

Without an index, SQL Server must table scan, or read every row in a table, before it can know the answer to any given query. In large tables this is obviously an expensive option for the server to take. Indexes provide a way for SQL Server to organize pointers to the data required. An index in a database works the same way as an index in a reference book. Like an index in a book, an index in a database is a list of "important" values that contain references to pages in the database table that contain the information that matches the index value. This allows the database to read from a (usually) smaller list of index pages that will in turn point to the data that will answer any given request.

Defining Indexes

Indexes are SQL Server's internal method of organizing the data in a table in such a way that it can be retrieved in an optimal fashion. Optimal, in this case, refers to the quickest way. Indexes are collections of unique values in a given table and their corresponding list of pointers to the pages of data where those values are physically represented in a table.

At a high level, indexes are a shorthand way of the database recording information that it's storing in tables. Indexes are just another kind of object in the database and have storage needs like tables. Just as tables require pages of data to store their rows in, indexes require pages to store their summary data in. The advantage of an index is that, generally, it reduces the number of I/Os required to reach any given piece of data in a table.

When you create an index in SQL Server, you tell the database to scan the table, gather the discrete values in the particular column or column(s) being indexed, and then write a list of data pages (and row identifiers) to the index page that match the value being indexed. This allows the server to scan a list of index pages, before choosing to scan the whole table, looking for matching data.

Creating Indexes

SQL Server has two methods of creating indexes: a graphical method provided in SQL Enterprise Manager, and a Transact-SQL interface using the CREATE INDEX statement. Only the table's owner can create an index on a table.

SQL Enterprise Manager has a limitation when you create indexes through it in which you can't specify a data segment for the index to be created on. Moving indexes on to different data segments can significantly improve performance on non-clustered indexes because multiple I/O threads can be used to read the data from the index and data pages concurrently. Use the CREATE INDEX statement in ISQL/w to create an index if you need to specify a segment for the index data.

Creating an Index with SQL Enterprise Manager

To create an index using SQL Enterprise Manager, follow these steps:

Run SQL Enterprise Manager from the Microsoft SQL Server 6.5 group. Figure 10.1 shows the main screen shortly after the startup of SQL Enterprise Manager.
Fig. 10.1 - SQL Enterprise Manager's Explorer view after having just being started. Note that no server is selected.
Select the server, database, and table that you want to work on (see fig. 10.2).
Fig. 10.2 - The Explorer view of SQL Enterprise Manager, with the authors table selected from the pubs database.
From the Manage menu choose Indexes. The Manage Indexes dialog box appears (see fig. 10.3).
Fig. 10.3 - The authors table is selected in the top-left combo box, and the aunmind index is selected in the top-right combo box.
Select the value (New Index) from the Index combo box. This clears the Index combo box, enabling you to enter the new index name; the Columns In Index list also is cleared.
Enter fk_au_id as the index name, highlight the au_id column, and click the Add button. Select the check boxes Unique Keys and Ignore Duplicate Keys (see fig. 10.4).
Fig. 10.4 - The Manage Indexes dialog box is ready to build a new index.

Prefixing index names with pk for primary key or fk for foreign key makes it easier to identify the index type without having to inspect its properties.
Click the Build button to build the index. A message box appears, asking whether the index should be built now or scheduled as a task to run later (see fig. 10.5). Choose the option that's appropriate for your environment.
Fig. 10.5 - The Index Build message box seeks confirmation of whether the index should be built immediately or scheduled as a task.

Creating an Index with CREATE INDEX

The Transact-SQL command CREATE INDEX is used by SQL Enterprise Manager to perform the index creation when the Build button is clicked in the Manage Indexes dialog box. The syntax for CREATE INDEX is as follows:

CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX index_name
      ON [[database.]owner.]table_name (column_name [, column_name]...)
[WITH
      [FILLFACTOR = x]
      [[,] IGNORE_DUP_KEY]
      [[,] {SORTED_DATA | SORTED_DATA_REORG}]
      [[,] {IGNORE_DUP_ROW | ALLOW_DUP_ROW}]]
[ON segment_name]

The options for the Transact-SQL command CREATE INDEX follow.

UNIQUE

If an index is created as UNIQUE, SQL Server disallows duplicate values in the index and therefore stops a client from inserting a record into the base table. This is the most common use of an index to enforce integrity on a table. Unique indexes can't be created on tables that have duplicate values in the columns being indexed--the duplicate data must be removed first. If enabled, the IGNORE_DUP_KEY option (described later) allows UPDATE or INSERT statements, affecting several rows, that modify index keys to complete, even if the new index key values become duplicates. The duplicate values will be rolled back and the transaction will continue—no error will be generated.

CLUSTERED

A clustered index is a special index that forces SQL Server to store the table data in the exact order of the index. Using a clustered index to physically store the data in the table in a particular way can greatly improve access performance to the table. Data requested from tables that are scanned repeatedly by the index key value for an individual record or set of records in a range can be found very quickly, because SQL Server knows that the data for the index page is right next to it. Any further values are guaranteed to be in the following data pages.

If CLUSTERED isn't specified in the CREATE INDEX statement, the index is assumed to be NONCLUSTERED.

There can be only one clustered index per table because the data can be in only one physical order.

Specifying a segment for a clustered index to be placed on will actually move the table data too. Be careful using the ON segment_name keyword when creating a clustered index. You must have approximately 1.2 times the space required for the entire table available on the target segment for the clustered index and data; otherwise, the segment free space will be filled and a new segment will need to be created. SQL Server won't warn you before index creation that the segment space is inadequate because it has no way of knowing or accurately estimating the size required for the index.

NONCLUSTERED

This is the default index type and means that SQL Server will create an index whose pages of index data contain pointers to the actual pages of table data in the database. You can create up to 249 non-clustered indexes in a table.

index_name

An index name must be unique by table—that is, the same index name can be given to two indexes, provided that they're indexing different base tables in the database. Index names must follow standard SQL Server naming conventions for objects.

table_name

table_name is the table that's going to be indexed.

column_name

The column that's being indexed. If more than one column is placed here, a composite or compound index is created. Multiple columns should be separated by spaces. You can specify up to 16 columns to create a composite key; however, the maximum width of the data types being combined can't exceed 256 bytes.

FILLFACTOR = x

Specifying a FILLFACTOR on a index tells SQL Server how to "pack" the index data into the index data pages. FILLFACTOR tells SQL Server to preserve space on the index data page for other similar rows that are expected for the same index keys or similar index key values.

The FILLFACTOR should rarely be used. It is included solely for fine-tuning purposes. Even for fine-tuning, it should only be used if future changes in the data can be made with accuracy.

Specifying a FILLFACTOR for frequently inserted database tables can improve performance because SQL Server won't have to split data onto separate data pages when the index information is too different to fit on a single page. Page splitting is a costly operation (in terms of I/O) and should be avoided, if possible.

The number of the FILLFACTOR refers to the percentage of free space that should be preserved on each index page.

A small FILLFACTOR is useful for creating indexes for tables that haven't got their complete dataset yet. For example, if you know that a table is going to have many more values than it does now and you want SQL Server to preallocate space in the index pages for those values so that it won't need to page split, specify a low FILLFACTOR of about 10. A page split occurs when the index fills up such that no further values will fit in the current 2K data page. Consequently SQL Server splits the page in two and puts references to the newly created page in the original page.

A high FILLFACTOR will force more frequent page splits because SQL Server will have no room on the index page to add any additional values that may be necessary if a record is inserted into the table. A FILLFACTOR of 100 will force SQL Server to completely fill the index pages. This option is good for highly concurrent, read-only tables. It's extremely bad, however, for tables that are inserted or updated frequently; every insert will cause a page split, and many of the updates (if key values are updated) will also cause page splits.

If no FILLFACTOR is specified, the server default (usually 0) is used. To change the server default, use the system-stored procedure sp_configure.

See the section entitled Displaying and Setting Server Options in Chapter 16

Be careful when specifying a FILLFACTOR for a clustered index—it will directly affect the amount of space required for the storage of the table data. Because a clustered index is bound to a table, (the physical order of the table data is mapped to the order of the clustered index), a FILLFACTOR on the index will space each data page of the table apart from the others according to the value requested. This can consume substantial amounts of disk space if the FILLFACTOR is sparse.

Specifying a FILLFACTOR when creating an index on a table without data has no effect because SQL Server has no way of placing the data in the index correctly. For tables that have dynamic data sets that need to be indexed with an index specifying a FILLFACTOR, you should rebuild indexes periodically to make sure that SQL Server is actually populating the index pages correctly.

IGNORE_DUP_KEY

When SQL is executed, this option controls SQL Server's behavior that causes duplicate records to exist in a table with a unique index defined on it. By default, SQL Server will always reject a duplicate record and return an error. This option allows you to get SQL Server to continue processing as though this isn't an error condition.

This configuration option can be useful in highly accessed tables where the general trend of the data is more important than the actual specifics. It shouldn't be used for tables where each individual record is important, however, unless application code is providing appropriate referential constraints to the data. If multiple records are affected by an update or insert statement, and the statement causes some records to create duplicates, the statement will be allowed to continue. Those records that created duplicates will be rolled back with no error returned.

When enabling IGNORE_DUP_KEY, be careful that you don't lose required data due to unwanted updates occurring. If IGNORE_DUP_KEY is enabled for a unique index and an update is done to data that causes duplicate records to exist, not only will the duplicates be rejected by the update, but the original records will also be removed. This is because of the way SQL Server performs updates—by deleting the record and then reinserting it. The reinsertion will fail (due to the duplicity of the record), so neither the original record nor the updated record will exist.

SORTED_DATA

SQL Server uses the SORTED_DATA keyword to speed up index creation for clustered indexes. By specifying SORTED_DATA, you're saying that the data to be indexed is already physically sorted in the order of the index. SQL Server will verify that the order is indeed correct during index creation by checking that each indexed item is greater than the previous item. If any item isn't found to be sorted, SQL Server will report an error and abort index creation. If this option isn't specified, SQL Server will sort the data for you as it would do normally.

Using the SORTED_DATA keyword greatly reduces the amount of time and space required to create a clustered index. The time is reduced because SQL Server doesn't spend any time ordering the data; the required space is reduced because SQL Server no longer needs to create a temporary workspace to place the sorted values before creating the index.

SORTED_DATA_REORG

SORTED_DATA_REORG is similar to SORTED_DATA in that it helps SQL Server's overall performance by making the data physically reside in the database table in the order of the index. The SORTED_DATA_REORG keyword tells SQL Server to physically reorder the data in the order of the index. This can be especially useful on non-clustered indexed tables that you want to reduce the amount of page splits on due to data no longer being in adjacent data pages. This will help the data be physically adjacent in the database and will reduce the number of non-sequential physical I/Os required to fetch data, which in turn improves performance.

IGNORE_DUP_ROW

This option is for creating a non-unique clustered index. If enabled at index creation time on a table with duplicate data in it, SQL Server will

Create the index
Delete the duplicate values
Return an error message to the calling process indicating the failure. At this point the calling process should initiate a ROLLBACK to restore the data.

If data is inserted into or updated in the table after the index is created, SQL Server will

Accept any non-duplicate values
Delete the duplicate values (and possibly the original record, if a duplicate occurs during an update)
Return an error message to the calling process indicating the failure. At this point the calling process should initiate a ROLLBACK to restore the data.

See the section entitled "ROLLBACK TRAN," in Chapter 12

This option has no effect on non-clustered indexes. SQL Server internally assigns identifiers to the records being indexed and doesn't have to manage the physical order of the data according to the clustering.

ALLOW_DUP_ROW

This option can't be set on an index that's allowed to IGNORE_DUP_ROW. It controls behavior for inserting or updating records in a non-unique clustered index. If ALLOW_DUP_ROW is enabled, no errors are returned and no data is affected if multiple duplicate records are created in a clustered index.

ON segment_name

Specifying a segment for the index to reside on allows the placement of an index on a different segment of the data. This will improve performance of non-clustered indexes because multiple I/O handlers can be used to read and write from the index and data segments concurrently.

Clustered indexes that have a segment name specified will move the data that's being indexed, as well as the index to the indicated segment.

Understanding Statistics

The value of an index for helping SQL Server resolve a query largely depends on how accurately the index's data reflects the actual data in the database. SQL Server maintains heuristical or trend statistics on the data that the index contains to help it choose the appropriate index that will yield the least number of I/Os to get to the actual table data. Clustered indexes (when available) will almost always be chosen as a valid key over non-clustered indexes. Clustered indexes are favored because there is no physical I/O after the data is found in the index because it's on the same page.

SQL Server's statistics-gathering engine is based on an as-needed basis, in that the statistics are maintained only when an index is built or the statistics on those indexes are forced to be updated by the UPDATE STATISTICS statement. SQL Server doesn't maintain statistics on the fly purely for performance reasons. The additional overhead of maintaining the statistics dynamically generally isn't considered advantageous to fetching the data because it will consume (on average) more resources than would have been saved by the additional data. SQL Server's statistics indicate trends in the data and don't necessarily represent every key data element. These trends are what SQL Server uses to determine the best index to use.

The exception of where dynamic index statistic management would be beneficial is when an index is created on a table with no data in it. SQL Server has no way of knowing what the trends in the data are and, as a result, makes very basic assumptions of normal distribution. These assumptions are often very wrong after a number of records are added to the table; thus, SQL Server will table scan even though there's an appropriate index. In this situation, perform an UPDATE STATISTICS on the table, and the index's distribution information will be updated.

To determine the last time statistics information was updated on a given index, the DBCC command SHOW_STATISTICS can be used. An example of its use follows:

/*-----------------------------
dbcc show_statistics( authors, fk_au_id )
-----------------------------*/
Updated              Rows        Steps       Density
-------------------- ----------- ----------- ---------
Dec 10 1995  2:20PM  23          22          0.0434783

(1 row(s) affected)

All density              Columns
------------------------ -------
0.0434783                au_id

(1 row(s) affected)

Steps
-----------
172-32-1176
213-46-8915
238-95-7766
267-41-2394
274-80-9391
341-22-1782
409-56-7008
427-17-2319
472-27-2349
486-29-1786
527-72-3246
648-92-1872
672-71-3249
712-45-1867
722-51-5454
724-08-9931
724-80-9391
756-30-7391
807-91-6654
846-92-7186
893-72-1158
899-46-2035

(22 row(s) affected)

DBCC execution completed. If DBCC printed error messages, see your 
System Administrator.

In the output from the DBCC SHOW STATISTICS command, you can see that the statistics have been kept up-to-date as of December 10th. In the second example, shown in Listing 10.1, the authors2 table is re-created with the same structures as authors and the data are copied to it.

Listing 10.1 10_01.SQL—Creating the Table authors2 and its Indexes

CREATE TABLE authors2
(
      au_id id NOT NULL ,
      au_lname varchar (40) NOT NULL ,
      au_fname varchar (20) NOT NULL ,
      phone char (12) NOT NULL ,
      address varchar (40) NULL ,
      city varchar (20) NULL ,
      state char (2) NULL ,
      zip char (5) NULL ,
      contract bit NOT NULL
)
GO

 CREATE  INDEX aunmind ON dbo.authors2(au_lname, au_fname) 
 WITH  FILLFACTOR = 5
GO

 CREATE  UNIQUE  INDEX barny ON dbo.authors2(au_id) WITH  IGNORE_DUP_KEY
GO

 CREATE  UNIQUE  INDEX fk_au_id ON dbo.authors2(au_id) WITH  IGNORE_DUP_KEY
GO

insert into authors2 select * from authors
GO

The previous listing was generated by doing a SQL script generation using SQL Enterprise Manager and then changing the table name from authors to authors2. Here's the same DBCC command reporting very different results, because the STATISTICS are out of date:

/*-----------------------------
dbcc show_statistics( authors2, fk_au_id )
-----------------------------*/
Updated              Rows        Steps       Density
-------------------- ----------- ----------- -------
                NULL 23          0           0.0

(1 row(s) affected)

All density              Columns
------------------------ -------
0.0                      au_id

(1 row(s) affected)

Steps
-----------

(0 row(s) affected)

DBCC execution completed. If DBCC printed error messages, see your 
System Administrator.

Performing an UPDATE STATISTICS on the authors2 table and then doing the DBCC command yields the identical results to the initial DBCC on the authors table:

/*-----------------------------
update statistics authors2
go
dbcc show_statistics( authors2, fk_au_id )
-----------------------------*/
Updated              Rows        Steps       Density
-------------------- ----------- ----------- ---------
Dec 11 1995 12:08PM  23          22          0.0434783

(1 row(s) affected)

All density              Columns
------------------------ -------
0.0434783                au_id

(1 row(s) affected)

Steps
-----------
172-32-1176
213-46-8915
238-95-7766
267-41-2394
274-80-9391
341-22-1782
409-56-7008
427-17-2319
472-27-2349
486-29-1786
527-72-3246
648-92-1872
672-71-3249
712-45-1867
722-51-5454
724-08-9931
724-80-9391
756-30-7391
807-91-6654
846-92-7186
893-72-1158
899-46-2035

(22 row(s) affected)

DBCC execution completed. If DBCC printed error messages, see your 
System Administrator.

Updating Statistics

SQL Server has two ways of updating the statistics that relate to a table. The graphical way can be performed using SQL Enterprise Manager. The Transact-SQL command is UPDATE STATISTICS.

Using SQL Enterprise Manager to Update Statistics

To use SQL Enterprise Manager to update statistics on a table, follow these steps:

Launch SQL Enterprise Manager from the Microsoft SQL Server 6.5 group.
Select the server, database, and table that you want to work on.
From the Manage menu choose Indexes. The Manage Indexes dialog box appears, (see fig. 10.6).
Fig. 10.6 - The authors table is selected in the Table combo box, and the aunmind index is selected in the Index combo box.
Click the Distribution button to view the Index Distribution Statistics dialog box (see fig. 10.7).
Fig. 10.7 - This dialog box uses the output of DBCC SHOW_STATISTICS to display the information.
Click the Update button to update the statistics. A message box appears, asking whether the statistics should be updated now or scheduled as a task to run later (see fig. 10.8). Choose the option that's appropriate for your environment.
Fig. 10.8 - The Update Distributing Statistics message box confirms whether the index statistics should be updated immediately or scheduled as a task.

By default, only the index highlighted in the Manage Indexes dialog box will have its statistics updated. If you want to update all of the indexes on the table selecte, check the option Apply to ALL Indexes Of check box before either scheduling the task or executing it.

Using UPDATE STATISTICS

The UPDATE STATISTICS statement is used to update the index statistics on a table or index. The syntax for UPDATE STATISTICS is as follows:

UPDATE STATISTICS [[database.]owner.]table_name [index_name]

The options for the Transact-SQL command UPDATE STATISTICS are as follows:

table_name—The name of the table that the index resides on. If no index is specified, all the indexes on the table are updated at the same time.
index_name—The name of the index that's to have its statistics updated.

Performing an UPDATE STATISTICS on a database table can affect the plan that a stored procedure has generated for accessing data. Because stored procedures are compiled and stored in the procedure cache when they're first executed, they can store invalid access paths to data based on the index statistics at the time the procedure was first run. To force a stored procedure to refresh its access path, use the system-stored procedure sp_recompile and pass the table that was updated as a parameter. For example, sp_recompile authors will force all the procedures that use the authors table to be recompiled the next time they're executed.

Forcing the Use of a Particular Index

If SQL Server fails to pick an index that you know should provide better performance than the index it chose, you can force the use of an index by specifying it in the FROM clause. To force an index, use the optimizer hints or (INDEX = ...) section of the SELECT statement's syntax. In simplified syntax, here's a SELECT statement:

SELECT ...
FROM table_name (INDEX = n) /* optimizer hints are placed after the table */
...

For more information on the syntax of SELECT statements, see Chapter 6, "Retrieving Data with Transact-SQL"

The INDEX keyword tells SQL Server to use the index specified by the numeric n. If n equals 0, SQL Server will table scan. If n equals 1, SQL Server will use the clustered index if one is in the table. The other values of n are determined by the number of indexes on the table.

An index name can also be used in the optimizer hint instead of an identifying id number.

Listing 10.2 shows SQL Server using the optimizer hints when selecting from the authors table.

Listing 10.2 10_02.SQL—The Same SELECT Statement Performed Four Times to Demonstrate the Use of a Forced Index.

/* Turn on statistics IO, so that the results can be seen */
set statistics io on
go

/* Basic Select with no hints to show the optimizer
   choosing the clustered index */
Select      AU_ID, AU_FNAME
From  AUTHORS
Where AU_ID between '172-32-1176' and '238-95-7766'
Order By AU_ID
go

/* Force a table scan */
Select     AU_ID, AU_FNAME
From  AUTHORS (INDEX = 0)
Where AU_ID between '172-32-1176' and '238-95-7766'
Order By AU_ID
go

/* Force the clustered index */
Select      AU_ID, AU_FNAME
From  AUTHORS (INDEX = 1)
Where AU_ID between '172-32-1176' and '238-95-7766'
Order By AU_ID
go

/* Force the first alternate index */
Select      AU_ID, AU_FNAME
From  AUTHORS (INDEX = 2)
Where AU_ID between '172-32-1176' and '238-95-7766'
Order By AU_ID
go

The output is as follows:

AU_ID       AU_FNAME
----------- --------------------
172-32-1176 Johnson
213-46-8915 Marjorie
238-95-7766 Cheryl

(3 row(s) affected)

Table: authors  scan count 1,  logical reads: 1,  physical reads: 0
AU_ID       AU_FNAME
----------- --------------------
172-32-1176 Johnson
213-46-8915 Marjorie
238-95-7766 Cheryl

(3 row(s) affected)

Table: authors  scan count 1,  logical reads: 1,  physical reads: 0
Table: Worktable  scan count 0,  logical reads: 4,  physical reads: 0
AU_ID       AU_FNAME
----------- --------------------
172-32-1176 Johnson
213-46-8915 Marjorie
238-95-7766 Cheryl

(3 row(s) affected)

Table: authors  scan count 1,  logical reads: 2,  physical reads: 0
AU_ID       AU_FNAME
----------- --------------------
172-32-1176 Johnson
213-46-8915 Marjorie
238-95-7766 Cheryl

(3 row(s) affected)

Table: authors  scan count 1,  logical reads: 29,  physical reads: 0
Table: Worktable  scan count 0,  logical reads: 4,  physical reads: 0

The effects of forcing an index are clearly shown in these examples. The last example shows an extremely expensive option being forced on the server. You can cause major performance problems by forcing index use, and so it's generally not recommended that you update the indexes.

Forcing index selection in a query is also dangerous if the application code is left unchanged and the indexes are changed or rebuilt. Changing the indexes may cause severe performance degradation due to the forcing of indexes that no longer provide optimal performance.

If you must resort to index forcing and believe that the optimizer should have chosen a different index, it's recommended that you call Microsoft or your local support provider and log a bug with the query optimizer.

Displaying Index Information

SQL Server has two ways to show information about indexes. The graphical method is via SQL Enterprise Manager's Index Manager; the command-line method is via the system-stored procedure sp_helpindex and the ODBC stored procedure sp_statistics.

SQL Enterprise Manager's Index Manager has been discussed in detail in previous sections in this chapter. Please refer to the section Using SQL Enterprise Manager to Update Statistics for instructions on how to view the statistics associated with an index.

sp_helpindex

The system-stored procedure sp_helpindex has been provided to get information about indexes. The syntax for the procedure's use is

sp_helpindex table_name

table_name should be replaced with an unqualified table name. If the table you want to inquire on isn't in the current database, you must change to the required database before executing this procedure.

sp_helpindex will return the first 8 indexes that are found on a database table. In the following example, sp_helpindex shows all the indexes on the authors table:

/*-----------------------------
sp_helpindex authors
-----------------------------*/
index_name           index_description
index_keys
--------------------------------------
UPKCL_auidind        clustered, unique, primary key located on default
au_id
aunmind              nonclustered located on default                      
au_lname, au_fname
barny                nonclustered, ignore duplicate key, unique located on
default au_id
fk_au_id             nonclustered, ignore duplicate key, unique located on
default au_id

(1 row(s) affected)

sp_statistics
sp_statistics is a special stored procedure that has been created to help Microsoft "publish" information for the ODBC interface to the database. Microsoft created this stored procedure so that an ODBC driver could retrieve all the relevant information about an index from a single call to the database. The information returned can be gathered in a number of other ways; however, it's often convenient to use sp_statistics to summarize all the relevant information on a table. The syntax for sp_statistics is as follows:

sp_statistics table_name [, table_owner] [, table_qualifier]
      [, index_name] [, is_unique]

The options for the system-stored procedure sp_statistics are as follows:

table_name—This is the name of the table that you require the index information on.
table_owner—This is the owner of the table.
table_qualifier—This is the name of the database in which the table resides.
index_name—This is the specific index that's being requested.
is_unique—If this parameter is set to 'Y', SQL Server will return only unique indexes on the table.

Many stored procedures have many of parameters. To save time, rather than specify all the parameters, you can indicate a particular one by placing an @ sign in front of the parameter name--for example,

sp_statistics authors, @is_unique = 'Y'

Dropping Indexes

SQL Server has two ways of dropping indexes on a table. The graphical way can be performed by using SQL Enterprise Manager. The command-line way is by using the SQL statement DROP INDEX.

Using SQL Enterprise Manager to Drop an Index

To use SQL Enterprise Manager to drop an index, follow these steps:

Launch SQL Enterprise Manager from the Microsoft SQL Server 6.5 group.
Select the server, database, and table that you want to work on.
From the Manage menu choose Indexes. The Manage Indexes dialog box appears.
Click the Remove button to drop the index required. A message box appears, asking whether the index should be removed now or scheduled as a task to run later (see fig. 10.9). Choose the option that's appropriate for your environment.
Fig. 10.9 - The Index Removal message box confirms whether you want the index removed immediately or scheduled as a task.

Using the DROP INDEX Command

To remove an index using Transact-SQL, use the DROP INDEX statement. The syntax for DROP INDEX is as follows:

DROP INDEX [owner.]table_name.index_name
[, [owner.]table_name.index_name...]

The options for the Transact-SQL command DROP INDEX are as follows:

table_name—The name of the table that the index resides on. If the user running DROP INDEX is the DBO or SA and the table isn't owned by that user, table_name can be prefixed by the owner of the table.
index_name—The name of the index to be removed. You can remove multiple indexes by indicating them in the same statement, separated by commas.

The following example drops the barny index on the authors table:

Drop Index authors.barny

No output is generated after executing this command.

Defining Keys

Keys and indexes are often synonymous in databases; however, in SQL Server a slight difference exists between them. In SQL Server, keys can be defined on tables and then can be used as referential integrity constraints in the same fashion as the ANSI standard for SQL.

A primary key is a unique column or set of columns that defines the rows in the database table. In this sense, a primary key performs the same integrity role as a unique index on a table, except that notionally SQL Server allows only one primary key to be defined for a table; on the other hand, there can be many unique indexes. Primary keys enforce uniqueness by creating a unique index on the table on which they're placed.

Foreign keys are columns in a table that correspond to primary keys in other tables. The relationship of a primary key to a foreign key defines the domain of values permissible in the foreign key. The domain of values is equivalent to a distinct list of values in the corresponding primary key. This foreign key domain integrity is a useful way of enforcing referential integrity between associated sets of columns. Foreign keys don't create indexes on the table when the key is created.

Starting in version 6.0, primary and foreign keys in SQL Server offer much of the functionality that previously had to be coded with triggers in prior versions of SQL Server. In prior versions of SQL Server, primary and foreign keys weren't much more than documentation and were useful to third-party programs that needed to know key information about a table. Keys provide needed functionality and should be used as a referential integrity enforcer.

Adding Primary and Foreign Keys

In SQL Server you can add primary and foreign keys in two ways. The graphical method is performed by using SQL Enterprise Manager. The command-line method is done by using Transact-SQL commands ALTER TABLE...ADD CONSTRAINT, or by specifying PRIMARY/FOREIGN KEY in the CREATE TABLE statement.

Using SQL Enterprise Manager to Add Primary and Foreign Keys

To use SQL Enterprise Manager to add a primary key, follow these steps:

Launch SQL Enterprise Manager from the Microsoft SQL Server 6.5 group.
Select the server, database, and table that you want to work on.
From the Manage menu choose Tables. The Manage Tables screen appears (see fig. 10.10).
Fig. 10.10 - The authors table is selected in the top combo box, and a key icon is in the Key column of the au_id row.
Click the Advanced Features toolbar button (with a green plus sign on it) to show the advanced options at the bottom of the window (see fig. 10.11).
Fig. 10.11 - This figure shows SQL Enterprise Manager's Manage Tables window with the Advanced Features visible and the Primary Key/Identity page active.
Remove the existing primary key by clicking the Remove button and then re-enter the information (see fig. 10.12).
Fig. 10.12 - The Primary Key has been reentered and the Add button is now enabled to allow you to create the Primary Key on the table.
Click the Add button to add the primary key. No message will be shown, but the key icon will return to the table.
Click the Save button to save the changes to the table. For the purposes of this exercise, you'll get an error because a foreign key was defined referencing this table. Ignore the error and close the window without saving changes.

Using CREATE TABLE...PRIMARY KEY

The CREATE TABLE syntax has a place for adding a PRIMARY KEY or a FOREIGN KEY in the CONSTRAINT section. A simplified syntax of the CREATE TABLE is shown as follows:

CREATE TABLE table_name
( column_name data_type CONSTRAINT ...,...)

In Listing 10.3, you'll see tables created in different styles with different types of CONSTRAINTs.

Listing 10.3 10_03.SQL—Creating Tables with Different CONSTRAINTs.

/* create a table where the primary key name is not
   specified, and the database will assign it */

Create TABLE TABLE_A
( COLUMN_A smallint PRIMARY KEY)
go

/* Now create a primary key specifying the name */
Create TABLE TABLE_B
( COLUMN_B smallint CONSTRAINT PK_COLUMN_B PRIMARY KEY)
go

/* Now create a foreign key referencing TABLE_A */
Create TABLE TABLE_C
( COLUMN_C smallint FOREIGN KEY (COLUMN_C) REFERENCES TABLE_A(COLUMN_A))
go

/* Now Create a multi-column primary key */
Create TABLE TABLE_D
( COLUMN_D1 smallint CONSTRAINT PK_D_COLUMNS PRIMARY KEY 
(COLUMN_D1, COLUMN_D2), COLUMN_D2 smallint)

go

/* now create a foreign key referencing the multi-column
   primary key */
Create TABLE TABLE_E
( COLUMN_E1 smallint FOREIGN KEY (COLUMN_E1, COLUMN_E2)
                  REFERENCES TABLE_D( COLUMN_D1, COLUMN_D2),
  COLUMN_E2 smallint)
go

When you add a PRIMARY KEY to a table with the ALTER TABLE...ADD CONSTRAINT syntax or in the CREATE TABLE statement, if you don't indicate any parameters for the key, a clustered, unique index is created on the table. To specify a non-clustered index, add NONCLUSTERED immediately after PRIMARY KEY to the statement.

Using ALTER TABLE...ADD CONSTRAINT

The ALTER TABLE...ADD CONSTRAINT syntax is very similar to the CREATE TABLE logic. In Listing 10.4, the same tables are created, but the ALTER TABLE syntax is used to add the keys.

Listing 10.4 10_04.SQL—Altering Tables to Add Primary and Foreign Keys

/* create the table */
Create TABLE TABLE_A
( COLUMN_A smallint)
go

/* add the basic primary key without specifying the name */
Alter Table TABLE_A ADD PRIMARY KEY (COLUMN_A)
go

/* create the table */
Create TABLE TABLE_B
( COLUMN_B smallint)
go

/* add the primary key specifying the name */
Alter Table TABLE_B ADD CONSTRAINT PK_COLUMN_B PRIMARY KEY (COLUMN_B)
go

/* create the table */
Create TABLE TABLE_C
( COLUMN_C smallint)
go

/* Now create a foreign key referencing TABLE_A */
Alter Table TABLE_C ADD FOREIGN KEY (COLUMN_C) REFERENCES TABLE_A(COLUMN_A)
go

/* create the table */
Create TABLE TABLE_D
( COLUMN_D1 smallint,
  COLUMN_D2 smallint)
go

/* Now add the multi-column primary key */
Alter Table TABLE_D ADD CONSTRAINT PK_D_COLUMNS PRIMARY KEY 
(COLUMN_D1, COLUMN_D2)
go

/* create the table */
Create TABLE TABLE_E
( COLUMN_E1 smallint,
  COLUMN_E2 smallint)
go

/* now add the foreign key referencing the multi-column
   primary key */
Alter Table TABLE_E ADD CONSTRAINT FK_E_COLUMNS FOREIGN KEY 
(COLUMN_E1, COLUMN_E2)
REFERENCES TABLE_D( COLUMN_D1, COLUMN_D2)
go

SQL Server 6.5 adds two new options to the ALTER TABLE...ADD CONSTRAINT syntax: WITH CHECK | NOCHECK, and NOT FOR REPLICATION. The WITH NOCHECK option is provided so that a constraint can be added without checking the existing data for referential integrity constraints. Microsoft added the NOT FOR REPLICATION option to allow replication to occur without requiring constraints to be dropped and re-added after the replication took place.

See the section entitled Constraints and Checks in Chapter 5

Displaying Key Information

SQL Server has two ways to show information about keys. The graphical method is via SQL Enterprise Manager's Table Manager. The command-line method is via the system-stored procedures sp_help and sp_helpconstraints, and the ODBC stored procedures sp_pkeys and sp_fkeys.

SQL Enterprise Manager's Table Manager has been discussed in detail in previous sections in this chapter. Please refer to the section entitled Using SQL Enterprise Manager To Add Primary And Foreign Keys for information on how to view the constraints on a table.

sp_helpconstraint

SQL Server's primary way of displaying information about keys is through the system-stored procedure sp_helpconstraint. Its syntax is as follows:

sp_helpconstraint table_name

sp_help
sp_help is a generic system-stored procedure that returns information about database tables. Part of the output from sp_help is information on keys on a table. The syntax for sp_help is

sp_help table_name

sp_pkeys and sp_fkeys
SQL Server provides two system-stored procedures, sp_pkeys and sp_fkeys, that can be used to view key information stored in the database. sp_pkeys and sp_fkeys are procedures that have been created to help ODBC implementers access SQL Server's system catalog tables easily.

The syntax for the two procedures is identical and is as follows:

sp_pkeys | sp_fkeys table_name

table_name is the table for which the keys need to be found.

Examples of Using System-Stored Procedures to View Primary and Foreign Keys

Here are some examples of the output from sp_pkeys, sp_fkeys, and sp_help:

/*-----------------------------
sp_helpconstraint TABLE_D
-----------------------------*/
Object Name
------------------------
TABLE_D

constraint_type          constraint_name      constraint_keys
------------------------ -----------------    --------------------
PRIMARY KEY (clustered)  PK_D_COLUMNS         COLUMN_D1, COLUMN_D2

Table is referenced by
------------------------------------------------------------------
pubs.dbo.TABLE_E: FK_COLUMNS
/*-----------------------------
sp_help table_d
-----------------------------*/
Name        Owner          Type       When_created
--------------------------------------------------
TABLE_D     dbo            user table Dec 11 1995  7:42PM

Data_located_on_segment
------------------------------
default

Column_name   Type                           Length Prec  Scale Nullable
------------------------------------------------------------------------
COLUMN_D1     smallint                       2      5     0     no
COLUMN_D2     smallint                       2      5     0     no

Identity    Seed    Increment
-----------------------------
No identity column defined.    (null)  (null)

index_name   index_description    index_keys
--------------------------------------------
PK_D_COLUMNS clustered, unique, primary key located on default
       COLUMN_D1, COLUMN_D2

constraint_type     constraint_name     constraint_keys
-------------------------------------------------------
PRIMARY KEY (clustered)   PK_D_COLUMNS  COLUMN_D1, COLUMN_D2

Table is referenced by
----------------------
pubs.dbo.TABLE_E: FK_COLUMNS
/*-----------------------------
sp_pkeys table_d
-----------------------------*/
table_qualifier   table_owner   table_name   column_name      key_seq
pk_name
---------------------------------------------------------------------
pubs              dbo           TABLE_D      COLUMN_D1        1      
PK_D_COLUMNS
pubs              dbo           TABLE_D      COLUMN_D2        2       
PK_D_COLUMNS

(2 row(s) affected)
/*-----------------------------
sp_fkeys table_d
-----------------------------*/
pktable_qualifier      pktable_owner       pktable_name            pkcolumn_name
fktable_qualifier      fktable_owner       fktable_name            fkcolumn_name
              
key_seq update_rule delete_rule fk_name                          pk_name
------------------------------ -------------------------- --------------
pubs                   dbo                 TABLE_D                 COLUMN_D1 
pubs                   dbo                 TABLE_E                 COLUMN_E1 
1       1           1           FK_COLUMNS                       PK_D_COLUMNS
pubs                   dbo                 TABLE_D                 COLUMN_D2 
pubs                   dbo                 TABLE_E                 COLUMN_E2 
2       1           1           FK_COLUMNS                       PK_D_COLUMNS

Dropping Keys

SQL Server has two methods for dropping primary and foreign keys. The graphical method is performed by using SQL Enterprise Manager. The command-line method is done by using the Transact-SQL command ALTER TABLE...DROP CONSTRAINT.

Using SQL Enterprise Manager

To use SQL Enterprise Manager to drop a key, follow these steps:

Launch SQL Enterprise Manager from the Microsoft SQL Server 6.5 group.
Select the server, database, and table that you want to work on.
From the Manage menu choose Tables. The Manage Tables dialog box appears.
Click the Advanced Features toolbar button (with a green plus sign on it) to show the advanced options at the bottom of the window. Refer to figure 10.11 for a picture of the Manage Tables dialog box.
Remove the existing primary key by clicking the Remove button; or click the Foreign Keys tab and remove the required foreign key.
Click the Save button (looks like a diskette) to save the changes to the table. If you get any referential constraint errors, you'll need to go to those tables and remove the primary/foreign keys that are causing the problem.

Using ALTER TABLE...DROP CONSTRAINT

To drop a foreign key using SQL, use the ALTER TABLE...DROP CONSTRAINT statement. The syntax for this SQL statement is as follows:

ALTER TABLE table_name DROP CONSTRAINT constraint_name

The table_name is the name of the table that the constraint applies to. The constraint_name is the name of the constraint.

You can't drop a primary key if other tables reference it as a foreign key. You must drop those foreign keys first.

From Here...

In this chapter you learned how to create, view, and manage indexes on your data tables. This information is very important to help you create an optimized database that won't be bogged down by user queries that force table scans.

From here you should look at the following chapters for more information:

Chapter 6, "Retrieving Data with Transact-SQL" shows how you can examine your queries to make sure that they're hitting the indexes defined on the tables.
Chapter 16, "Understanding Server, Database, and Query Options" explains how to make changes to your global server, database, and query configurations to further optimize your queries.

QUE Home Page

For technical support For our books And software contact support@mcp.com

Table of Contents

09 - Managing and Using Views

11 - Managing and Using Rules and Defaults