Chapter 17 - Optimizing Performance
- How to approximately size a database and how to estimate the amount of disk space required - Sizing a SQL Server database can make a difference to how you choose to buy hardware for your server
- How to size the procedure cache for optimal performance - An optimally sized procedure cache will substantially improve performance because frequently accessed procedures will not need to be recompiled.
- How to use Windows NT's Performance Monitor with SQL Server - Windows NT's Performance Monitor provides complete and up to date statistics to help you manage and monitor the performance of your SQL Server.
Performance tuning in the client-server world is something of a magical art. So many factors combine to make an application perform well that knowing where to focus your time is what matters most.
The most critical part of optimizing performance is to have good documentation. Document statistically how the system works or performs before even starting any performance tuning. As the performance tuning cycle begins, monitor and document the effects of all the changes so that it's easy to determine the changes that were positive and those that were negative. Never assume that all the changes made for one application automatically apply to another application. Remember you're ultimately tuning a product that a user is using, not just a database that's being accessed by some unknown client.
Estimating the size of a SQL Server database is relatively straightforward and can be done with a good level of accuracy. The principle of space calculation is to add together all the bytes of data per row of a table, along with the associated overhead per row and page, and then divide the page size (2K) by that row size to determine how many rows of data will fit on a page.
The actual available space on a page is 2016 bytes because 32 bytes are reserved for fixed overhead to manage the rows on the page. In general terms, these calculations are affected by the use of FILL FACTOR on indexes and by whether a clustered index is on the table.
Each SQL Server datatype consumes a certain number of bytes based on the storage of the data. The following list defines the amount of storage that each datatype uses:
|Char/Binary|The size indicated in the definition|
|VarChar/VarBinary|The actual data size (use an average estimate)|
|Float(b)|4 bytes (numbers with precision of 1-7 digits)|
|Float(b)|8 bytes (numbers with precision of 8-15 digits)|
|Double Precision|8 bytes|
|Decimal/Numeric|2-17 bytes, depending on the precision|
|Text/Image|16 bytes per table row plus at least one 2K page per NOT NULL column|
SQL Server internally defines any NULLABLE column as a var datatype. So a Char(12) NULL column is actually a Varchar(12) column. Therefore, for any columns that permit NULL values the average expected column size should be used.
Decimal and numeric precision affects the amount of storage required for these datatypes: depending on the declared precision, between 2 and 17 bytes are required for each value.
The method of calculating a table's space requirements differs based on whether the table has a clustered index or not. Both calculation methods are shown here, and examples will be drawn from the Pubs database to illustrate their use.
Some things to be aware of when calculating table and index sizes are that:
- Performing UPDATE STATISTICS on an index adds an extra page for that index to store the distribution statistics of the data that it contains. Performing UPDATE STATISTICS on the table will add one data distribution page per index on the table.
- For tables with variable-length columns, you should try to average the length of the row by estimating the anticipated average size of the columns on the table.
- SQL Server won't store more than 256 rows on a page, even if the rows are very short. So if your row is 7 bytes or less in size, the number of data pages required for N rows of data is N / 256 (rounded up).
- Text and image data takes up a minimum of 2K (one page) per row unless the value for the column is specified as NULL when the row is inserted.
The Publishers table has a clustered index. This example will estimate the space required for 5,000,000 rows, and will assume that the average length of the varchar columns is 60 percent of the defined length:
- Calculate the row length. If the row contains only fixed-length, NOT NULL columns, the formula is
2 + (Sum of column sizes in bytes) = Row Size
If the row contains mixed variable-length fields and/or NULL columns, the formula is
2 + (Sum of fixed-length column sizes in bytes) + (Sum of average
of variable-length columns) = Subtotal
Subtotal * ((Subtotal / 256) + 1) + (Number of variable-length
columns + 1) + 2 = Row Size
For the Publishers table, the second formula is required. The fixed-length pub_id column uses 4 bytes, and the variable-length columns total 92 bytes:
2 + 4 + (60% of 92) = 61.2
61.2 * ((61.2 / 256) + 1) + (4 + 1) + 2 = 83 (rounded up)
- Calculate the number of rows that will fit on a page. The formula is
2016 / (Row Size) = Number of rows per page
In this case,
2016 / 83 = 24
For more accurate calculations, round down any calculations for the number of rows per page.
- Number of rows required / Number of rows per page = Number of 2K data pages:
In this case,
5,000,000 / 24 = 208,334
For more accurate calculations, round up any calculations for the number of pages required.
- Next calculate the space required for the clustered index. The size of the clustered index depends on whether the key columns are variable-length or fixed-length. For fixed-length keys, use this formula:
5 + (Sum of column sizes in bytes) = Clustered index row size
For variable-length keys, use this formula:
5 + (Sum of fixed-length column sizes in bytes) + (Sum of average
of variable-length columns) = Subtotal
Subtotal * ((Subtotal / 256) + 1) + (Number of variable-length
columns + 1) + 2 = Clustered index row size
For Publishers, the key is a single fixed-length column, therefore
5 + 4 = 9
- Now calculate the number of clustered index rows that will fit on a page. The formula is
(2016 / (Clustered index row size)) - 2 = Number of index rows per page
In this case,
(2016 / 9) - 2 = 222
- Next calculate the number of index pages by using the following formula:
(Number of data pages) /
(Number of clustered index rows per page) =
Number of index pages at index level N
For this example,
208,334 / 222 = 939
Index pages exist at multiple levels. To compute all the levels of the index, continue to divide the resulting number of index pages by the number of index rows per page until the result is 1 or less. In this case,
939 / 222 = 5 and 5 / 222 = 1, which means that one index page is at the top of the index, five pages sit at the intermediate level, and the 939 pages at the bottom level point to the data pages.
- Compute the total number of 2K pages required for the database table:
Data pages: 208,334
Index pages (level 2): 1
Index pages (level 1): 5
Index pages (level 0): 939
Total number of 2K pages: 209,279 (or about 409MB)
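The page arithmetic above can be sketched as a small calculator. This is a sketch only, using the chapter's rules (2016 usable bytes per 2K page, the 256-row cap, 2 bytes of row overhead, 5 bytes of clustered index row overhead); the table figures in the demonstration at the bottom are hypothetical, not the Publishers example:

```python
import math

PAGE_SIZE = 2016          # usable bytes per 2K page (32 bytes of fixed overhead)
MAX_ROWS_PER_PAGE = 256   # SQL Server won't store more rows on one page

def fixed_row_size(fixed_bytes):
    # 2 bytes of row overhead plus the fixed-length column bytes
    return 2 + fixed_bytes

def data_pages(row_size, row_count):
    # Round rows per page down, capped at 256; round pages up
    rows_per_page = min(PAGE_SIZE // row_size, MAX_ROWS_PER_PAGE)
    return math.ceil(row_count / rows_per_page)

def clustered_index_pages(key_size, num_data_pages):
    # Fixed-length keys: 5 bytes of overhead per clustered index row;
    # 2 rows per index page are reserved for page growth
    index_row = 5 + key_size
    rows_per_page = (PAGE_SIZE // index_row) - 2
    levels, pages = [], num_data_pages
    while pages > 1:
        pages = math.ceil(pages / rows_per_page)
        levels.append(pages)
    return levels  # pages per index level, bottom level first

# Hypothetical table: 58 bytes of fixed-length data, 5,000,000 rows,
# clustered on a 4-byte key
row = fixed_row_size(58)                  # 60-byte rows
pages = data_pages(row, 5_000_000)        # 151,516 data pages
index = clustered_index_pages(4, pages)   # [683, 4, 1]
```

The total page count is then the data pages plus the sum of the index levels.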
Tables with nonclustered indexes are sized the same way as tables with a clustered index, except for the sizing of the index itself. In this example, assume that a nonclustered index has been added to the Roysched table on the title_id column and that the table holds 7,000,000 rows. The following steps will help you size a nonclustered index:
- The first step is to calculate the length of the leaf row in the index. A leaf row is the bottom row of an index tree and points to the data page. The leaf row's size is the size of the index's columns summed together and is affected by variable or fixed-length columns. Use this formula if you have only fixed-length columns in the index:
7 + (Sum of fixed-length keys) = Size of index row
Use this formula if you have fixed and variable-length columns in the index:
9 + (Sum of length of fixed-length keys) + (Sum of length of
variable-length keys) + (Number of variable-length keys)
+ 1 = Subtotal
(Subtotal) + ((Subtotal / 256) + 1) = (Size of leaf index row)
In the roysched table, the primary key is fixed-length and isn't null, therefore:
7 + 6 = 13
- Next calculate the number of leaf pages that will be required by using the following formulae:
2016 / (Size of leaf index row) = Number of leaf rows per page
e.g.: 2016 / 13 = 155
(Number of rows in table) / (Number of leaf rows per page) =
Number of leaf pages
e.g.: 7,000,000 / 155 = 45,162
- Next calculate the size of the non-leaf row and calculate the number of non-leaf pages. The size of non-leaf row is calculated according to this formula:
(Size of leaf index row) + 4 = Size of nonleaf row
that is, 13+4=17
(2016 / Size of nonleaf row) - 2 = Number of nonleaf index rows per page
In this example, (2016 / 17) - 2 = 116
(Number of leaf pages / Number of nonleaf index rows per page)
= Number of index pages at level N
or 45,162 / 116 = 390 pages at level 1
Like the clustered index, the levels of the index are determined by repeated division until the result is 1 or less:
390 / 116 = 4 pages at level 2.
4 / 116 = 1 page at level 3.
- Finally, compute the size of the index by summing the number of pages at the various levels of the index:
Leaf pages: 45,162
Level 1 pages: 390
Level 2 pages: 4
Level 3 pages: 1
Total number of 2K pages: 45,557 (or about 89MB)
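The nonclustered sizing steps can be sketched the same way (fixed-length keys only; rounding choices matter here, and this sketch rounds the nonleaf rows per page down, so level counts can differ by a page or two from other roundings):

```python
import math

PAGE_SIZE = 2016  # usable bytes per 2K page

def nonclustered_index_pages(key_size, row_count):
    """Pages per level of a nonclustered index with fixed-length keys,
    leaf level first."""
    leaf_row = 7 + key_size                       # 7 bytes of leaf row overhead
    leaf_rows_per_page = PAGE_SIZE // leaf_row    # round down
    leaf_pages = math.ceil(row_count / leaf_rows_per_page)
    nonleaf_row = leaf_row + 4                    # nonleaf rows carry a 4-byte pointer
    nonleaf_rows_per_page = (PAGE_SIZE // nonleaf_row) - 2
    levels, pages = [leaf_pages], leaf_pages
    while pages > 1:
        pages = math.ceil(pages / nonleaf_rows_per_page)
        levels.append(pages)
    return levels

# Roysched figures from the text: 6-byte key, 7,000,000 rows
levels = nonclustered_index_pages(6, 7_000_000)
total_pages = sum(levels)
```

Summing the levels gives the total 2K pages consumed by the index.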
FILL FACTOR alters the number of rows that SQL Server will place on a page. The most common configuration of FILL FACTOR assumes that the table's data set will never change, so FILL FACTOR is set to 100 percent to maximize the use of data pages. This affects the calculations by increasing the number of rows that can fit on an index page by 2, because the rows normally reserved for page growth hold data instead.
If you're sizing an index with a FILL FACTOR of 100 percent, don't subtract 2 from the result of the number of rows per page because SQL Server won't preallocate these rows for page growth but will instead put user data there.
Any other value of FILL FACTOR alters the usable size of the page itself. For example, a FILL FACTOR of 70 percent reduces the amount of available space on the page to 1412 bytes.
Sizing the Procedure Cache in SQL Server is basically a case of trial and error. Microsoft documents an approximation based on the following formula:
Procedure Cache =
(Maximum Concurrent Users) *
(Size of Largest Plan) * 1.25
To determine the size of a plan in memory, issue the DBCC MEMUSAGE command in ISQL. The following listing illustrates the output from DBCC MEMUSAGE:
Buffer Cache, Top 20:
Procedure Cache, Top 12:
Procedure Name: sp_help
Database Id: 1
Object Id: 1888009757
Type: stored procedure
Number of trees: 0
Size of trees: 0.000000 Mb, 0.000000 bytes, 0 pages
Number of plans: 1
Size of plans: 0.051249 Mb, 53738.000000 bytes, 27 pages
Assuming that sp_help was the largest procedure to be run on a server and that there were to be 150 concurrent users, then
150 * 27 * 2 * 1.25 = 10,125K (about 10MB)
Memory in the procedure cache is managed as a set of 2K pages; therefore, the number of pages reported by DBCC MEMUSAGE is multiplied by 2K to derive the amount of memory that the plan actually consumes.
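The calculation above can be expressed as a one-line helper (a sketch only; in practice the plan-page count comes from DBCC MEMUSAGE):

```python
def procedure_cache_kb(concurrent_users, largest_plan_pages):
    # Each plan page is 2K; the 1.25 factor adds 25 percent headroom
    return concurrent_users * largest_plan_pages * 2 * 1.25

# 150 users and a 27-page plan (the sp_help figure above)
cache_kb = procedure_cache_kb(150, 27)  # 10125.0K, about 10MB
```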
An alternative sizing can be estimated based on the need to stop SQL Server from recompiling procedures that frequently fall out of the cache. The procedure cache, like the data cache, works on a Least Recently Used (LRU) algorithm; procedures that are used infrequently are pushed out of the cache if there's no more room to compile a procedure that's requested by a user process.
Therefore, work out a list of the critical or most frequently accessed procedures and execute each one, analyzing the memory used as reported by DBCC MEMUSAGE. Based on the total memory calculated, the size of the procedure cache can be determined.
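One way to tally such a list (the procedure names and page counts below are hypothetical; in practice each count comes from the DBCC MEMUSAGE output for that procedure):

```python
# Hypothetical plan sizes in 2K pages, as would be reported by DBCC MEMUSAGE
plan_pages = {
    "proc_get_orders": 12,
    "proc_post_invoice": 20,
    "sp_help": 27,
}

# 2K per page gives the cache needed to hold one copy of each plan
cache_kb = sum(plan_pages.values()) * 2
```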
Ultimately, the only true judge of an accurate size of procedure cache is to test, test, test an application and monitor the effects of altering the amount of cache available.
It's possible to run out of procedure cache if the number of active procedures in use (and their combined plan sizes) is greater than the cache available. In this case, you'll receive error 701, and the calling process that was trying to execute a procedure will be rolled back. If you receive this message you should resize the procedure cache to a higher percentage of available memory.
Windows NT's Performance Monitor is an advanced tool that provides statistics about the operation of the NT environment. One of the unique properties of the performance monitor is its capability to install performance heuristics and callbacks from other executables in the Windows NT system and report their statistics.
SQL Server's performance monitor is just a set of hooks for the core Windows NT Performance Monitor to call. SQL Server groups the statistics that can be displayed into what it calls Objects. These objects group logically similar statistics.
SQL Server gathers statistics in one of two different modes:
- In "Direct Response" mode, the Performance Monitor will wait for SQL Server to gather statistics and place them in statistics tables. These tables are available at any time for browsing but refer to the last period that the server gathered information and may not necessarily accurately reflect the current server operating level.
- In "On Demand" mode, the Performance Monitor will force SQL Server to gather statistics and return them whenever a period elapses in the Performance Monitor. You should take care not to saturate SQL Server with requests for statistics because this is a relatively costly operation and may in turn skew the results of the statistics themselves.
To configure either "Direct Response" or "On Demand" mode, follow these steps:
- First run SQL Enterprise Manager from the Microsoft SQL Server 6.5 group and select the server to be configured (see fig. 17.1).
Fig. 17.1 - SQL Enterprise Manager after just being started. Note that the (LOCAL) server is selected.
- From the menu choose Server, SQL Server, Configure (see fig. 17.2).
Fig. 17.2 - SQL Enterprise Manager's Server Configuration/Options dialog box.
- Select Direct Response Mode or On Demand Mode and click OK.
The following sections list the objects that SQL Server Performance Monitor can report on. Some of the counters within the objects can be applied to particular instances of activity on the server. For example, the User Object's "Physical I/O" counter can be applied to each user connected to the system. This process of configuring instances to the particular counter allows you to customize the statistics that you want to monitor.
The SQL Server Object is the master object and provides a wide variety of statistics. The items that are monitored here are the high-level statistics that are very important to everyday management of your server. Normal monitoring of SQL Server typically watches at least the I/O and cache statistics to make sure that the physical I/O and memory subsystems aren't being flooded with requests.
They're grouped as follows:
Cache Statistics: These statistics allow you to monitor the performance of the lazywriter and the number of free data pages available in the data cache. The cache-hit ratio can also be monitored; it's a useful statistic because it tells you how frequently physical I/O has to be performed to satisfy a user process's request.
I/O Statistics: These statistics report the amount of work being performed by the physical I/O subsystems of SQL Server. They can provide useful information and will allow you to monitor saturation on the I/O devices if the Outstanding Reads/Writes gets high. Careful analysis of the I/O statistics can yield an excellent set of data that will help diagnose disk problems in your environment.
NET Statistics: These statistics report the number of reads and writes that SQL Server is placing out on Windows NT's network processing queues. These NET statistics are relatively unimportant and NT's statistics should be monitored instead.
RA Manager Statistics: The Read Ahead (RA) Manager provides a number of statistics that allow you to monitor the RA Manager's performance. Generally, this information will help you tune the server using sp_configure.
User Connections: The number of users currently connected to the server.
The Replication-Published DB Object is provided to allow monitoring of the publication of transaction log information from a source or publishing database. It is highly recommended that you monitor the performance of this object if you are using replication in your environment. These statistics will help you determine how efficiently data is being published by your server.
The statistics that are available are as follows:
Replicated Transactions: The number of transactions in the transaction log of the published database that have been marked for replication but have yet to be placed in the distribution database.
Replicated Transactions/sec: The relative performance, presented in transactions per second, of the rate of reading out of the published database's transaction log and placing the items in the distribution database for replication.
Replication Latency (sec): The average number of seconds that elapse between the time that a transaction is placed in the transaction log and the time it's placed in the distribution database.
The Replication-Subscriber Object is provided to allow monitoring of the replication that's updating tables on the subscribing server/database. Like the Publishing Object previously described, careful monitoring of this object is essential to the management of effective replication.
The following information is provided:
Delivered Transactions: The number of transactions that have been executed on the destination database.
Delivered Transactions/sec: The relative performance indicated in transactions per second of delivery of data.
Delivered Latency (Sec): The time, in seconds, that it takes for a transaction to be executed after it's placed in the distribution database.
Undelivered Transactions: The number of transactions sitting in the distribution database that have yet to be executed on the subscribing databases.
The Locks Object is provided to allow central management of locks in the database. The counters provided in this object detail the totals of the different types of locks: Extent, Intent, Page and Table. Additional counters are provided to indicate the number of blocking locks.
See Chapter 12, "Understanding Transactions and Locking" for more information about locks.
The Log Object is provided so that alerts can be placed on the size of and space free on the transaction log associated with a database. This will allow you to dump the logs at required intervals:
Log Size: The size in megabytes of the transaction log.
Space Free: The percentage of the log that's free and available for transactions to be placed in.
The Users Object has the following counters available that track the statistics about user activity on the server:
CPU Time: The amount of CPU consumed by a particular user.
Locks Held: The current number of open locks held by a particular user.
Memory (in 2K Pages): The amount of memory that a particular user is consuming on the server. This memory will include any memory allocated to resolve queries.
Physical I/O: The amount of physical reads and writes of data pages that have occurred as a result of the user's most recent query.
Chart views are often the easiest statistical viewing mechanism for a database administrator. With a chart you can track periodic performance across a number of criteria and see prior history in the same view.
To use the SQL Performance Monitor to chart statistics for view, follow these steps:
- First run SQL Performance Monitor from the Microsoft SQL Server 6.5 group (see fig. 17.3).
Fig. 17.3 - Windows NT's Performance Monitor running the SQL Server control file (SQLCTRS.PMC) with the default objects being monitored.
- From the View menu, select Chart.
- From the Edit menu, select Add To Chart. This opens the Add To Chart dialog box shown in fig. 17.4.
Fig. 17.4 - The Add To Chart dialog box with the SQL Server object selected.
Click the Explain button to get a short explanation of each counter as you select it.
- Select the Object from which you want to monitor a counter.
- Select one or many counters from the Counter list box and, if necessary, specify the instances that you want to apply the counters on in the Instance list box.
- Specify the required line attributes for the item(s) being added (color, line style, and so on).
- Click Add to add the items to the current Performance Monitor Chart (see fig. 17.5).
Fig. 17.5 - The Add To Chart dialog box is ready to add SQL Server-Log counters for the master and pubs databases.
- Add any other statistics that you want to chart and then click Done to close the dialog box. (The Cancel button changes to a Done button after you add the first item to the chart.)
Performance Monitor can create basic reports of data that's being gathered. These reports show the current (or most recent) values fetched from the statistics of the selected counters.
To use the SQL Performance Monitor to create a report of system statistics, follow these steps:
- First run SQL Performance Monitor from the Microsoft SQL Server 6.5 group.
- From the View menu, choose Report.
- From the Edit menu, select Add To Report (see fig. 17.6).
Fig. 17.6 - The Add To Report dialog box has the SQL Server Replication-Published DB object selected, as well as all the counters to be added to the report.
- Select the Object that you want to report a counter from.
- Select one or many counters from the Counter list box and, if necessary, specify the instances on which you want to apply the counters in the Instance list box.
- Click Add to add the items to the current Performance Monitor Report.
- Add any other statistics that you want to report on and click Done to close the dialog box. (The Cancel button changes to a Done button after you add the first item to the report.)
Alerts are one of the most useful features of Performance Monitor. Performance Monitor not only can gather statistics from various system objects, but it can also monitor the values of those statistics. If they reach predetermined levels, Performance Monitor can execute a program that can, for example, dial a pager or alert the DBA in some other fashion.
To use the SQL Performance Monitor to create a statistical alert, follow these steps:
- Run SQL Performance Monitor from the Microsoft SQL Server 6.5 group.
- From the View menu, select Alert.
- From the Edit menu, select Add To Alert (see fig. 17.7).
Fig. 17.7 - The Add To Alert dialog box is ready to add an alert that will send a mail message to the DBA if the used log space on the pubs database exceeds 75 percent.
- Select the Object that you want to add an alert counter from.
- Select one or many counters from the Counter list box. If necessary, go to Instance and specify the instances on which you want to apply the counters.
- Click Add to add the items to the current Performance Monitor Alert list.
- Add any other statistics for which you want to create alerts and click Done to close the dialog box. (The Cancel button changes to a Done button after you add the first item to the alert list.)
In this chapter you learned how to identify and manage statistics that will help you determine the performance characteristics of your server. You also learned how to size a database and how to size the procedure cache.
Consider looking at the following chapters to further develop your SQL Server knowledge:
16 - Understanding Server, Database, and Query Options
18 - SQL Server Administration
Copyright © 1996, Que Corporation