Special Edition, Using Microsoft BackOffice, Ch. 17

17 - Understanding SQL Server

by Greg Sullivan and Greg Blankenship

  • Overview of SQL Server - Gain an understanding of the relational database management system offered as a part of Microsoft BackOffice - SQL Server.

  • Features and tools of SQL Server - Explore the significant features of SQL Server and become acquainted with the tools included with this BackOffice component.

  • Definition of a relational database management system - Learn the fundamental characteristics of a relational database management system (RDBMS) and how SQL Server satisfies these characteristics.

  • The role of the database administrator - See what the roles and responsibilities of a database administrator are with respect to relational database management.

  • How database design impacts SQL Server - Discover how the effectiveness of your database design impacts your success with SQL Server.


Microsoft SQL Server is the database component of BackOffice. It is a relational database management system (RDBMS) designed to allow organizations to effectively manage information and build powerful database applications for deployment in multiuser networked environments.

Characteristics of Microsoft SQL Server

Microsoft SQL Server is based on client-server architecture, which divides processing into two components - a front-end, or client component, and a back-end, or server component. SQL Server itself constitutes the back-end database server, with which many different front-end client applications communicate, typically over a local area network (LAN). Its built-in data replication, powerful management tools, and open system architecture provide a platform for delivering cost-effective information solutions for organizations of all sizes. SQL Server provides a complete and integrated system of database management software to meet the challenges facing today's organizations when deploying large-scale distributed information systems.

Microsoft SQL Server has the following characteristics:

  • Relational database management system (RDBMS). SQL Server structures data according to the relational data model, and operations proceed according to the rules of relational algebra, originally put forth by E.F. Codd in 1970.

  • SQL-based. Administrators, users, and application programs use the Structured Query Language (SQL) to interact with SQL Server.

  • Scalable. You can add processors to a SQL Server computer (provided the computer supports multiple processors) to achieve performance gains and greater throughput without any significant reconfiguration of SQL Server itself.

  • High performance. Microsoft SQL Server has consistently performed well in benchmark testing conducted by the computer industry trade press and various testing laboratories. It is among the top performers available.

In 1988, Microsoft, working with industry partners Ashton-Tate and Sybase, introduced its first version of SQL Server, designed to run on the OS/2 operating system. Subsequent to its initial release, Microsoft decided to move the SQL Server platform from OS/2 to Windows NT. This change required a major rewrite of the SQL Server kernel, but it produced a powerful, multiprocessing RDBMS for the Windows NT environment. By 1992, Microsoft had begun to recognize that its priorities differed from those of Sybase and began to exert more control over the Microsoft version of SQL Server. Microsoft and Sybase eventually ended their cooperative development agreement, and Microsoft gained complete control of SQL Server development. Microsoft has since added the following features:

  • Support for RISC platforms

  • A MAPI interface for developing message-enabled database applications

  • Data migration tools

  • Integration with OLE object technology and the Microsoft Visual Basic programming system

  • Language enhancements that include declarative referential integrity (DRI) and powerful server-based cursor support

Surveying Features in Microsoft SQL Server

This section introduces you to some of the features available in Microsoft SQL Server 6.0, included in BackOffice 2.0. It also acquaints you with features found in the latest release of SQL Server, version 6.5. SQL Server 6.5 has not yet been added to the BackOffice product, but is planned for inclusion in the next release. You may be eligible for an upgrade to the SQL Server included in BackOffice.

Enterprise Wide Administration

SQL Server 6.0 provides enterprise-wide system administration through an integrated management framework of objects, services, and components. To manage the enterprise, SQL Server utilizes the SQL Enterprise Manager, a graphical administration tool that provides easy management of a multiple-server environment including:

  • Scheduling

  • Administrator alerts

  • Built-in replication management interface

SQL Enterprise Manager also makes it easier for the database administrator (DBA) to manage:

  • Logins

  • Access privileges

  • User groups

  • Devices and databases

  • Script development

  • Database and transaction log backups

  • Database components (tables, views, stored procedures, triggers, indexes, rules, defaults, and user-defined data types)

SQL Enterprise Manager also contains a new component called SQL Executive, which replaces the SQL Monitor from version 4.2. SQL Executive enables replication and provides scheduling of SQL Server events, DBCC operations, and other administrative maintenance tasks.

The SQL Enterprise Manager has been enhanced in version 6.5 to include a Database Maintenance Plan wizard. This wizard guides you through the process of creating a plan for routine maintenance and scheduling automated tasks controlled by the SQL Executive.

In addition to working interactively in SQL Enterprise Manager, you can automate repetitive tasks in two convenient ways. First, you can create SQL scripts that specify maintenance operations. You could, for example, run the Database Consistency Checker (DBCC) against all the tables in a database to verify their basic integrity. Second, you can create Microsoft Visual Basic for Applications (VBA) macros (using Microsoft Excel, for example) that interact with SQL Server through SQL Distributed Management Objects (SQL-DMO), SQL Server's OLE interface. This feature was first implemented in version 6.0 and enhanced in version 6.5. An appropriate use of this capability would be to create a graph showing the size of each table in a database.
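As a minimal sketch of the first method, the following Transact-SQL script runs DBCC statements that exist in SQL Server 6.x; the pubs sample database is used here only for illustration.

    /* Hypothetical maintenance script: check the integrity of the pubs sample database. */
    USE pubs
    GO
    DBCC CHECKDB        -- verifies the structural integrity of every table in the database
    GO
    DBCC CHECKCATALOG   -- verifies consistency within and between the system tables
    GO
    DBCC NEWALLOC       -- checks data and index page allocation
    GO

A script such as this can be scheduled as a task under SQL Executive so that it runs unattended at regular intervals.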

Some new tools have been added to SQL Server 6.5, including the following:

  • SQL Web Page Wizard. This wizard guides you through the process of publishing HTML pages on the World Wide Web that use a SQL Server database.

  • SQLTrace. This graphical utility allows you to record and monitor database activity for benchmarking and troubleshooting.

  • Microsoft Query. This easy-to-use query tool has been available in the past as part of the Microsoft Excel product, but it can also be used as a stand-alone tool with SQL Server.

Data Integrity

In a client-server database environment, the server enforces integrity rules automatically, providing a reliable foundation for client-server computing. SQL Server uses several mechanisms to enforce data integrity. Starting with version 6.0, SQL Server provides declarative referential integrity (DRI), which allows you to define data integrity restrictions for a table and relationships between tables, and to ensure that related key values remain consistent. DRI enforces referential and business-rule integrity, ensuring that cross-references among tables are valid and that changes to the database are consistent with business rules. To provide entity integrity, SQL Server uses unique indexes, which guarantee that the values in key columns are unique so that every row can be identified unambiguously. SQL Server also uses defaults and rules to enforce domain integrity, ensuring that the values in particular columns are legal.
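The following Transact-SQL sketch illustrates these integrity mechanisms using the declarative constraints available since version 6.0; the customers and orders tables, and all column names, are hypothetical.

    /* Entity integrity: the PRIMARY KEY constraint builds a unique index on cust_id. */
    CREATE TABLE customers
    (
        cust_id    int          NOT NULL PRIMARY KEY,
        cust_name  varchar(40)  NOT NULL,
        region     char(2)      NOT NULL DEFAULT 'NA',          -- domain integrity via a default
        credit_cap money        NULL CHECK (credit_cap >= 0)    -- domain integrity via a check constraint
    )
    GO

    /* Referential integrity (DRI): every order must reference an existing customer. */
    CREATE TABLE orders
    (
        order_id int NOT NULL PRIMARY KEY,
        cust_id  int NOT NULL REFERENCES customers (cust_id)
    )
    GO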

Distributed Transaction Coordinator

An important new feature added to SQL Server 6.5 is the Microsoft Distributed Transaction Coordinator (MS DTC). The MS DTC combines the Component Object Model (COM) with transaction technology and the two-phase commit protocol. Using this new functionality, software developers can create powerful new applications that create transaction objects and enlist the help of resource managers (SQL Server, for example) to complete the work of the transaction. The status of transactions can be monitored using new features that have been added to the SQL Enterprise Manager (see fig. 17.1).

Fig. 17.1 - Use the SQL Enterprise Manager to monitor the status of transactions using the MS DTC.
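As a sketch of how an application might use this functionality from Transact-SQL in version 6.5, the following batch starts a distributed transaction and calls a remote stored procedure; the server name accounting2, the accounts table, and the credit_account procedure are all hypothetical.

    BEGIN DISTRIBUTED TRANSACTION
        /* Local work, performed by this SQL Server. */
        UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1
        /* Remote work; MS DTC enlists the remote server in the same transaction. */
        EXECUTE accounting2.finance.dbo.credit_account 2, 100
    COMMIT TRANSACTION

MS DTC coordinates the two-phase commit so that either both servers commit their work or neither does.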

Replication

Microsoft SQL Server 6.0 includes data replication as a standard feature of the RDBMS. Through data replication, you can automatically disseminate copies of transactional data across the enterprise from a single source server to one or more destination servers at remote locations. To provide high availability of synchronized, distributed data, replication uses continuous store-and-forward updates. Configuration and administration are accomplished directly through the SQL Enterprise Manager application; the replication stored procedures also provide the capability of manual configuration.

In version 6.5, the replication features have gained an important new capability: it is now possible to replicate to ODBC subscribers, including Microsoft Access and Oracle. In addition, you can now replicate text and image data types, something that was not possible with SQL Server 6.0.

Performance Features

Microsoft SQL Server 6.0 provides powerful new features, as well as expanded versions of existing system tools and procedures. SQL Server 6.0 extends the standard conventions set forth by the American National Standards Institute (ANSI) by including scrollable, server-based cursors that support forward, backward, absolute, and relative positioning (a brief cursor example follows the list below). SQL Server 6.0 also includes:

  • Enhanced database consistency checker (DBCC)

  • Advanced configuration options

  • Parallel data scans (asynchronous read ahead) on sequential page operations

  • Added keywords and reserved words

  • Query optimizations

  • Enhanced RAISERROR statement

  • Session settings

  • New system procedures
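The following fragment is a sketch of the scrollable, server-based cursor support mentioned above; the employees table and its columns are hypothetical.

    DECLARE emp_cursor SCROLL CURSOR FOR
        SELECT emp_id, emp_name FROM employees
    OPEN emp_cursor
    FETCH LAST FROM emp_cursor          -- position at the final row
    FETCH ABSOLUTE 1 FROM emp_cursor    -- jump directly to the first row
    FETCH RELATIVE 2 FROM emp_cursor    -- move forward two rows from the current position
    FETCH PRIOR FROM emp_cursor         -- move backward one row
    CLOSE emp_cursor
    DEALLOCATE emp_cursor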

Language Features

SQL Server 6.0 includes enhanced language features that allow ANSI-standard CASE expressions, as well as expanding the capabilities of Transact-SQL. SQL Server includes a double-quote delimited identifier feature, which allows object names that contain spaces or match reserved words to be enclosed in double quotes. Also included is a new Data Definition Language (DDL), which uses Transact-SQL statements to define databases and database objects and to manage their properties. An enhanced Data Manipulation Language (DML) uses Transact-SQL statements to manipulate database objects, and a Data Control Language (DCL) uses Transact-SQL statements to control access to database objects.
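For instance, an ANSI-standard CASE expression can be written as in the following sketch; the employees table and its columns are hypothetical.

    SELECT emp_name,
           pay_band = CASE
                          WHEN salary < 30000 THEN 'junior'
                          WHEN salary < 60000 THEN 'senior'
                          ELSE 'executive'
                      END
    FROM employees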

In the next few sections, you learn about the general principles of RDBMSs, the role of the DBA, and the basic principles of database design. Then you begin to explore the tools used to manage SQL Server.

With version 6.5, SQL Server has been enhanced to meet the ANSI SQL-92 standard. SQL Server 6.5 has been certified as compliant with Federal Information Processing Standard (FIPS) 127-2, created by the National Institute of Standards and Technology (NIST). Specific enhancements to Transact-SQL include the following:

  • The CREATE SCHEMA statement

  • ANSI-standard joins in the SELECT statement

  • The GRANT WITH GRANT OPTION and REVOKE GRANT OPTION FOR statements

  • ANSI-standard NULL and DEFAULT support

  • The DUMP TABLE and LOAD TABLE statements, which allow a single table to be backed up and restored (a brief example follows this list)
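The following sketch shows two of these enhancements; the employees table, the dept_manager user name, and the file path are hypothetical.

    /* Let a department manager query the table and pass the privilege along to others. */
    GRANT SELECT ON employees TO dept_manager WITH GRANT OPTION

    /* Back up and restore a single table rather than the entire database. */
    DUMP TABLE employees TO DISK = 'c:\mssql\backup\employees.dat'
    LOAD TABLE employees FROM DISK = 'c:\mssql\backup\employees.dat'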

What Is a Relational Database Management System?

A relational database management system (RDBMS) is a software product that structures data in accordance with the relational data model and permits data manipulation based on relational algebra. Users are protected from the low-level technical details associated with manipulating the data because the physical aspects of data management are handled by the RDBMS. See "The Relational Data Model," later in this chapter for more information on data modeling.

The framework of an RDBMS is a collection of tables (each consisting of rows and columns), along with views, indexes, and other objects that hold related data. SQL provides all the data definition and update capabilities needed in any database language. With SQL statements, you can create new tables and views, add new data, modify existing data, and perform other functions. Figure 17.2 shows examples of views on relational database tables.

Fig. 17.2 - A relational database is a collection of tables.

The first, and most important, function of an RDBMS is to provide a means for storing, retrieving, and updating data. Database management systems, in general, were originally conceived to provide an organized way to structure data. By structuring data, the job of managing interaction with the data was simplified.

The basic premise of RDBMS products is to separate data management functions from application functions. This concept works well when combined with the client-server process model. The job of data management is isolated to the RDBMS, which in the client-server sense is a service provider, or server. It is implemented on a Windows NT platform as a server-based service. Applications request data management services from the RDBMS and process the response. Data management services include, at a minimum, the capability to define and manipulate data.

Data Definition and Manipulation

With any database management system, you need the capability to perform the following tasks:

  • Define the structure of a database

  • Manipulate the database

To define and manipulate a database, it is necessary to have a means for communicating instructions to the database management system. This facility is typically provided via a programming language. Two types of language are necessary to support these activities:

  • Data Definition Language (DDL)

  • Data Manipulation Language (DML)

In the case of RDBMS products, the DDL and the DML are both provided by the same language - Structured Query Language (SQL). SQL uses relations, and the rules of relational algebra, to transform input into the desired output.

What Is SQL?

SQL is the industry-standard language by which applications communicate with an RDBMS, and it is the de facto standard on all computing platforms, from mainframes to PCs.

SQL has a long history dating back to the late 1970s. Although many commercial versions of SQL were available during the 1980s, the first SQL standard was not developed until 1986. ANSI and the International Standards Organization (ISO) combined to create this first standard. It was enhanced in 1989 and took on the name SQL-89. ISO then released a new standard in 1992, known as SQL-92 or SQL2. Currently under development is SQL3, which is significantly more comprehensive than SQL2. SQL has also been formally adopted as an international standard by the International Electrotechnical Commission (IEC) and as a FIPS for the United States federal government.

SQL is a language of relationships, and these relationships transform user inputs into desired outputs. As an RDBMS communication language, SQL is considered a high-level language because its queries are English-like in form. SQL queries are also described as non-procedural because the user specifies what information to retrieve, not how to retrieve it. The RDBMS controls manipulation of the database and determines the best data retrieval method. SQL queries operate on sets of data, unlike record-oriented languages that can return only one record at a time.

A database language standard provides portability of database definitions and database application programs among conforming environments. Where database information is interchanged between systems, using a database language standard is appropriate. The standard defines the structures and operations an RDBMS implements to support the relational data model, along with additional components that support data definition, data access, security, programming language interfaces, and data administration. SQL is the standard for database applications requiring flexibility in data structures and access paths to the database.

SQL supports data definition by allowing you to communicate the structure of a relational database to the RDBMS. Data definition provides the capability to define the following:

  • Table names

  • Column names

  • Data types

  • User privileges

  • Valid or allowable data values

SQL also provides facilities for data manipulation (a short example follows this list), including:

  • Querying the database for data retrieval

  • Adding data to the database

  • Updating data in the database

  • Removing data from the database
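The following Transact-SQL statements sketch these four kinds of manipulation against a hypothetical employees table.

    INSERT INTO employees (emp_id, emp_name, salary)    -- add data to the database
        VALUES (1001, 'A. Jones', 32000)
    SELECT emp_name, salary FROM employees
        WHERE salary > 30000                             -- query the database for data retrieval
    UPDATE employees SET salary = salary * 1.05
        WHERE emp_id = 1001                              -- update data in the database
    DELETE FROM employees WHERE emp_id = 1001            -- remove data from the database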

Tables can be combined to produce new tables through unions, intersections, or joins. A union of two tables creates a third table containing all rows that appear in the first table, the second table, or both. An intersection produces a table containing the rows of the first table that are identical to rows in the second table. To produce a join table, rows in the two tables that have identical values in corresponding columns are selected (see fig. 17.3).

Fig. 17.3 - Tables can be joined by linking identical values in columns that contain corresponding information.
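A join such as the one shown in figure 17.3 can be sketched in SQL as follows, assuming hypothetical customers and orders tables linked on a cust_id column.

    SELECT c.cust_name, o.order_id
    FROM customers c, orders o
    WHERE c.cust_id = o.cust_id   -- rows with identical values in corresponding columns are joined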

SQL Server provides these basic functions through a language called Transact-SQL. This language is an extension of SQL. As such, it contains all the features of SQL along with several enhancements in support of specific SQL Server features.

Extensions to SQL

Transact-SQL is the language at the heart of Microsoft SQL Server. Transact-SQL provides a control-of-flow language enhancement, designed to extend the power of ANSI-standard SQL. Transact-SQL supports the requirements of mission-critical applications and reduces the need for a more complex programming language to accomplish a desired task. The enhancements provide a powerful programming language to boost system performance, provide stringent data integrity, and efficiently perform complex data manipulation.
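The following fragment is a minimal sketch of Transact-SQL's control-of-flow constructs; the employees table and the target average salary are hypothetical.

    DECLARE @row_count int
    SELECT @row_count = COUNT(*) FROM employees
    IF @row_count = 0
        PRINT 'The employees table is empty.'
    ELSE
    BEGIN
        PRINT 'The employees table contains rows.'
        /* Raise every salary by 5 percent until the average reaches the target. */
        WHILE (SELECT AVG(salary) FROM employees) < 40000
        BEGIN
            UPDATE employees SET salary = salary * 1.05
        END
    END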

ACID

Users interact with the RDBMS by making requests for database management services. Each request equates to a unit of work for the RDBMS. These units of work are referred to as transactions. Each transaction typically requires multiple operations on the database to complete in its entirety.

A single unit of work, or transaction, can be handled easily by any commercial RDBMS. Complexities arise in situations where there are multiple, concurrent users of the same database. In this situation, it is possible that users will simultaneously request operations on the same elements of data. An important property of the RDBMS is to ensure the successful completion of each transaction while providing protection of the data.

This property is often referred to as ACID. In 1983, Andreas Reuter used this term to describe:

  • Atomicity

  • Consistency

  • Isolation

  • Durability

RDBMS products must possess all these characteristics.

Atomicity

When users submit transactions to the RDBMS, they are interested in knowing whether all the operations associated with the transaction completed successfully. If any one operation fails, the user would prefer to be notified of the failure and have the database returned to its state before any of the transaction's operations were performed. This all-or-nothing property is referred to as atomicity.

To provide this capability, RDBMS products record each operation associated with a transaction in a log file. At the beginning of a transaction, a marker is placed in the log file indicating that a transaction is beginning. Thereafter, each operation is recorded in the log file until all operations are complete. As soon as the last operation is complete, the RDBMS marks the transaction as complete, or committed. If a problem occurs, the transaction log is used to roll back the transaction.
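The following sketch shows how an application can keep its work atomic in Transact-SQL; the accounts table and the account numbers are hypothetical.

    BEGIN TRANSACTION funds_transfer
        UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1
        IF @@error <> 0
        BEGIN
            ROLLBACK TRANSACTION funds_transfer   -- undo everything logged so far
            RETURN
        END
        UPDATE accounts SET balance = balance + 100 WHERE acct_id = 2
        IF @@error <> 0
        BEGIN
            ROLLBACK TRANSACTION funds_transfer
            RETURN
        END
    COMMIT TRANSACTION funds_transfer             -- both updates succeed, or neither does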

Consistency

If at any time during the execution of transaction operations a failure occurs, the transaction will not be able to complete successfully. In this situation, the RDBMS must undo the operations it has completed up to the point of failure in the transaction. Ensuring that the database is never left in a corrupt state, even in the event of a transaction failure, is known as consistency.

Transaction logging and recovery provide the means for fulfilling this requirement of an RDBMS. If all operations of a transaction are not completed, the RDBMS goes back through the log file, up to the beginning-of-transaction marker, and undoes each operation. This action is known as rollback and recovery.

Isolation

The most typical use of an RDBMS today involves multiple, simultaneous users. These users are concurrently attached to the RDBMS requesting services at the same time. Occasionally, an operation is requested on an element of data by more than one user at the same time. The characteristic of RDBMS products that protects users from operating on the same data at the same time is referred to as isolation.

The way an RDBMS addresses the problem is by locking the element of data for which the operation is being requested. The data element is locked prior to the operation being performed and subsequently unlocked after the operation is completed.

The lock is typically a logical construct that prevents other users from performing some operation on the element of data for the duration of the lock. In this manner, users are protected from operating on data that is in an undefined state because it is in the middle of being modified.
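For example, an application can explicitly hold a lock for the life of a transaction with the HOLDLOCK keyword, as in the following sketch; the accounts table is hypothetical.

    BEGIN TRANSACTION
        /* The shared lock acquired by this SELECT is held until the transaction ends,
           so no other user can change the row in the meantime. */
        SELECT balance FROM accounts HOLDLOCK WHERE acct_id = 1
        UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1
    COMMIT TRANSACTION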

Deep inside every RDBMS is a set of processes that manages concurrency through the implementation of a locking scheme. The job of the RDBMS is to avoid situations that may arise due to concurrent use of the same data.

One such example is a deadlock. This situation arises when one user is waiting for a lock to clear on an element of data held by a second user, while the second user is waiting for a lock to clear on an element of data held by the first. Left alone, these two users would wait forever for each other's locks to clear. The RDBMS concurrency management function detects this situation, allows one user to proceed, and returns an error to the other user. The RDBMS handles other such situations as well, allowing users to successfully complete each of their desired operations.

Although this may seem like a simple idea, it is fraught with complexities. Many of the complexities lie in the physical aspects of protecting data stored in the database. Other complications arise out of the fact that transactions typically involve operations on multiple elements of data at the same time.

Isolation is the property of RDBMS products that ensures that each transaction is performed as if the requesting user is the only user of the RDBMS. Furthermore, each transaction must be performed without interfering with any other transactions being completed at the same time.

Durability

This property of an RDBMS states that the effects of a transaction are permanent. If the RDBMS fails for any reason, including a system crash, the database should include the impact of all completed transactions after restart. Incomplete transactions should not be reflected in any way.

Security

Every organization has an interest in protecting the data stored in its databases. Some data contains confidential information and should not be viewable by certain individuals or groups of individuals. Other data may not be as confidential, but may be critical information that must be safe from change or damage.

SQL Server provides a means to protect data in these situations. Protection is offered at various levels of the relational data model. In SQL Server, it is possible to secure each identifiable element of data, including tables, rows, columns, and individual fields.

Four types of privileges can be granted:

  • Read

  • Add

  • Update

  • Delete

For each element of data, you can assign any, or all, of these privileges to any given user. In doing so, you establish a security matrix: the rows represent users, the columns represent data elements, and the cells indicate the permitted operations.
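In SQL Server, these privileges correspond to the SELECT, INSERT, UPDATE, and DELETE permissions, which are assigned with GRANT and removed with REVOKE. In the following sketch, the customers table and the clerk1 user name are hypothetical.

    GRANT SELECT, UPDATE ON customers TO clerk1   -- read and change rows
    GRANT INSERT, DELETE ON customers TO clerk1   -- add and remove rows
    REVOKE DELETE ON customers FROM clerk1        -- withdraw one privilege later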

The security matrix should be created and managed by the administrator responsible for security. This most likely will be the database administrator. See "What Does a DBA Do?" later in the chapter for a thorough discussion of the DBA's responsibilities.

See "Using SQL Server Security," (Ch. 19)

Distribution Services

Many organizations must physically distribute data across several locations or interact with data from external sources. Usually, you might think of an RDBMS as managing the data within the databases it controls. Given the need to distribute data, however, this notion is obsolete. See "Data Distribution" later in the chapter for more information.

One of the more important features expected from modern day RDBMS products is the capability to manage distributed data. It may be necessary to spread the data across physically separate servers for capacity or performance reasons. It may also be necessary to distribute copies of data to get it closer to the users who need it. Many complications arise when databases are distributed to different physical servers, in potentially very different locations.

Consequently, it is desirable for an RDBMS to be able to synchronize with one or more instances of a similar RDBMS. In this sense, synchronization refers to copying an element of data to the required additional server(s) whenever it changes. If an RDBMS provides this service, it is said to be able to replicate data.

Replication is an important feature of an RDBMS. It is also desirable for a given RDBMS product to be able to replicate to and from competing RDBMS products. In this way, your RDBMS can exist in a heterogeneous computing environment and share data with other database management systems.


The capability to distribute data through replication in a manner transparent to users is an important characteristic of modern RDBMS products. Although it is nice to be able to distribute data across multiple servers, it may not be useful if users are required to know the location of every element of data they desire to view or modify. Transparency implies that the data is available to users as if it all existed on the same RDBMS. In other words, it should be "transparent" to the user that data is located in multiple physical locations.

Administration Utilities

Many activities are associated with the operation and management of an RDBMS. Database administrators perform these activities, as well as determine and set appropriate configuration parameters for the RDBMS.

To facilitate the convenient administration of an RDBMS, it is customary for RDBMS products to be delivered with a set of administration tools. These tools exist in the form of applications that interact with the RDBMS. The activities supported by these administration tools include the following:

  • Configuring the RDBMS

  • Setting up users

  • Granting user privileges

  • Monitoring and performance tuning

  • Troubleshooting problems

Much of the administration can be performed from client PCs. This enables administrators to manage the operation of their database management system using a desktop computer. In addition to the administration tools provided by the RDBMS vendor, the open architecture of RDBMS products allows the development of third-party administration tools.

What Does a DBA Do?

The database administrator (DBA) is responsible for managing the organization's data, including the hardware, software, tools, and human resources associated with managing data. The DBA typically reports to top-level information systems management and is given the authority to make decisions about the application of resources, human and capital, to the business of managing the organization's data. The DBA's domain of responsibility for data management normally encompasses the entire organization.

In small organizations, it is possible that the DBA role is allocated only a portion of an individual's time. In such cases, the DBA may also be the network administrator, among other duties. It is most important that someone be assigned responsibility for carrying out DBA functions. In larger organizations, the DBA role is performed by a team of people. For purposes of this discussion, the term DBA applies equally to both situations. Regardless of the size of the organization, all the responsibilities of the DBA must be carried out by someone or some group of people.


Someone in the organization must perform the role of the DBA. Many organizations overlook the significance of database administration and, therefore, assume that it can be performed by off-site staff or via external support. This attitude will eventually lead to serious problems as the organization relies more heavily on the RDBMS to support production systems.

DBA roles and responsibilities can be divided into two major categories:

  • Administrative

  • Technical

Administrative functions focus on those aspects of database administration external to the physical act of managing the database, whereas technical functions center on the hardware, software, and data.

Administrative Functions

The DBA is responsible for preparing the database environment, as well as managing its operation. Preparing for a database environment includes dealing with the organization's personnel and technical vendors.

Standards and Procedures

Database management and usage should be performed in accordance with a set of standards and procedures. It is the DBA's responsibility to develop these policies. In this manner, all database administrators and users will follow a consistent set of guidelines and operate under a similar set of expectations. Some examples of standards and procedures that the DBA should develop are as follows:

  • Security policies

  • Backup and archival procedures

  • Disaster recovery procedures

RDBMS, Tool, and Vendor Selection

The DBA should be responsible for selecting which RDBMS products will be used in the organization. In addition to selecting the RDBMS, the DBA will select associated tools, such as administration and design packages. It may also be the DBA's responsibility to select product vendors and carry out negotiations for the acquisition of these products.

Security

The DBA is responsible for the security of the database. This is one of the most important DBA responsibilities. Management typically views the DBA as the "protector of the data." As such, the DBA assigns user privileges with respect to the database.

Capacity Management

As more applications utilize the RDBMS and more users manipulate the database, some aspect of RDBMS operation will be taxed. It may be that the database has grown beyond the size of available storage capacity. It may be that the number of users has increased to the point that not all users can be serviced in a reasonable amount of time.

In either case, it becomes apparent that some aspect of the RDBMS has exceeded its capacity. In the first example, the storage device restricts the growth of the database. In the second example, the RDBMS server memory or processor utilization has exceeded reasonable limits of operation.

The DBA has the responsibility of repairing these situations when they occur. More important, the DBA must make sure that these situations never arise by anticipating future needs and configuring servers accordingly.

Backup and Recovery

After an organization becomes dependent on the RDBMS, users expect it to work all the time. Unfortunately, the RDBMS, or the hardware upon which it operates, will eventually fail. When this occurs, it is necessary to return to normal operation as quickly as possible.

To accommodate this situation, the DBA must make a copy of the database at regular intervals. This activity is known as backing up the database. The frequency of backups should be dictated by the need for recovery in case of failure. At the time of failure, the DBA can simply restore the database to its state at the time of the last backup.

Copies of the database should be stored in off-site locations, as well as at the same physical location as the RDBMS. This allows the DBA to restore normal operations even in the event of a disaster in which the entire computing environment is damaged. This responsibility of the DBA is referred to as disaster recovery.

Technical Functions

In addition to preparing the database environment, the DBA must manage its operation. These types of activities tend to be hands-on.

Development, Test, and Production Environments

Applications usually exist in various states throughout their life-cycle. After an application is designed, it is placed into development. During its creation, the application is in the hands of a developer, or a team of developers, who builds the application. Similar to how contractors construct a building, application developers first assemble the foundation, next create the structure, and finally, complete each module. While an application is under construction, developers will need access to the RDBMS for testing purposes. The RDBMS designated for application development is known as the development server.

After the application has been assembled and all its individual components have been tested, it is advisable to test the entire application. This testing should be performed in a computing environment similar to the environment in which the application will eventually be placed into operation. The RDBMS designated for application testing is known as the test server.

Finally, if the application has passed all tests, it can be placed into production usage. It then will be installed onto the production computing environment for operation. The RDBMS designated for live application operation is known as the production server.


It is necessary to establish separate computing environments for application development, application testing, and application operation.

Many organizations overlook the significance of setting up separate RDBMS servers for each of these situations. Application users will be annoyed if application testers take down the RDBMS to test the outcome of a different server configuration. Application testers may become annoyed with developer experimentation on the RDBMS. These three different groups of people will frequently interfere with one another if they use the same RDBMS; therefore, it is advisable to establish three physically distinct computers for these tasks.

Support

Application developers and users require assistance from time to time when using SQL Server. An important role of the DBA is to provide support and guidance to developers and users as their needs dictate. At a minimum, the DBA should be responsible for suggesting resources for developers and users to consult for an answer to their question.

Performance Measurement and Tuning

After an application is placed into production, users frequently encounter unacceptable levels of performance. Such problems usually surface the first time the application is tested and recur as characteristics of the database, such as its size, change over time. When the performance of the database diminishes, the DBA is called upon to adjust the operational characteristics of the RDBMS to improve performance. This activity is known as performance tuning.

As applications are being tested for the first time, developers require the support of the DBA to "get the most" out of the RDBMS. This is a normal part of the process for developing client-server applications using RDBMS products. The DBA is normally involved in this process.

After applications are placed into production, it is common for developers to move on to other projects or, in some cases, to other jobs. In such cases, the original developers are no longer available to adjust performance characteristics of an application. Because most applications encounter a performance problem sooner or later, the DBA must get involved. In these situations, the DBA inspects the behavior of the application and adjusts the RDBMS to, hopefully, improve performance. At a minimum, the DBA should make recommendations for application improvements that developers can later address.

See "Tuning SQL Server," (Ch. 19)

Database Design

The topic of whether a DBA should perform database design is frequently debated. The ideal DBA should be capable of doing database design. It is essential that the DBA be qualified in this area to effectively administer the RDBMS.

Nevertheless, many organizations assign DBA responsibilities to individuals with no database design experience. In these situations, conflicts arise between application developers and the DBA as to how the RDBMS should be configured. Moreover, the DBA can potentially lose control of his primary responsibility - the data.

Ideally, your DBA will not only be qualified in database design, but also will be the chief database designer in your organization.

Process Model Architecture

Just as the DBA should be qualified for database design, the DBA must also be qualified in the area of process model architecture. To fill this role, the DBA must be an expert on the client-server process model and fully understand relational database concepts. See "Technical Underpinnings" later in the chapter for a complete discussion of these topics.

When client-server applications are developed, some of the processes execute on the RDBMS server. In some situations, the DBA is responsible for the development of these server-based processes. At a minimum, the DBA should be responsible for the design decisions that dictate which processes will execute on the RDBMS server and which should execute on the client workstation.

Ultimately, the DBA is responsible for performance of the RDBMS server. If the server is overburdened with application processes, then it will have insufficient processing power to manage the database. Again, it is the DBA's responsibility to understand, manage, and possibly develop the server processes.

Designing Databases

As database technology advances and open-market competition continues, more sophisticated uses of data and information become possible and desirable. Without modeling the organization's use of its data, and keeping that model current, significant losses are possible: the organization remains constrained by currently installed database technologies even when newer technology would allow a wider and more profitable use of the data. In addition, if you wait until after system upgrades are implemented to construct and update data models, database structure changes become extremely costly and difficult to implement. For these reasons, database design and data modeling should be a key focus when planning strategies to meet organizational objectives.

Your success with SQL Server will be driven by the effectiveness of your database designs. Now that you have learned what to expect from an RDBMS and the specific features and tools of SQL Server, it is time to explore database design. Database design, and in particular relational database design, is a widely published topic covered in many excellent books and periodicals.

It is beyond the scope of this book to provide a comprehensive coverage of relational database design. However, it is useful to understand relational database design issues as they apply to SQL Server before jumping into the product itself. If you are a relational database expert, this will serve as a review. If you are new to relational databases, this will serve as an introduction. You are encouraged to pursue these important topics apart from this book.

Technical Underpinnings

The most successful database designers understand the principles of underlying technologies. Database designers must fully comprehend the principles of:

  • The client-server process model

  • Relational database management systems

  • Relational database design

  • Distributed systems architecture

Current and aspiring database designers should also be well-versed in object-oriented technology. The popular RDBMS vendors, including Microsoft, move closer to object-oriented database management with each release. Although you will rarely hear a major RDBMS vendor refer to its product as an Object-Oriented Database Management System (OODBMS), you will notice that RDBMS products include more and more object-oriented features with each new release.

Relational Data Model

Relational database concepts are based on a branch of mathematics known as relational algebra. It is not necessary to be an expert in relational algebra to be successful with relational database design and implementation. However, those adept at mathematics tend to traverse the learning curves associated with applying relational database concepts more quickly.

The idea of a relational database first originated in IBM research laboratories in the late 1960s. A researcher, E.F. Codd, set about to address database issues of the day by applying mathematics to the problem. The result of his efforts spawned the popular idea of a relational data model for solving data management problems.


Your knowledge of the principles of the relational data model will influence your success with SQL Server.

In its purest form, the relational data model is simply a foundation for logical modeling. What is commonly misunderstood is that a logical model of the data lies at the root of relational database issues. Confusion about the relational data model usually centers on the physical aspects of relational database implementation. These physical implementation issues, such as performance, have little to do with the underlying foundation defined by the relational data model.

Codd's work manifests itself in the form of a relational data model. Relational databases consist of tables with rows and columns, and these tables represent sets of information. By applying set theory, Codd devised a means of defining which operations can be performed on these sets of data. This work, in turn, manifests itself in what we know today as SQL.


Do not confuse Codd's relational data model with your logical data models. Your logical data models are simply an implementation of the theoretical foundation put in place by his work.

SQL Server is an implementation of the relational data model. Microsoft continues to apply the formal aspects of relational theory to the physical implementation of its relational database product. For this reason, it is useful to understand the underlying principles of the relational data model when applying SQL Server to solve your data management problems.

Client-Server Process Model

Contemporary RDBMS products rely heavily on the client-server process model. It is common for the two to be confused. Many people fail to understand that RDBMS and client-server are two distinct aspects of technology today. RDBMS just happens to be one of the most successful examples of products based on the client-server process model.

In Chapter 1, "Understanding Microsoft BackOffice," you learned that the fundamental concept of client-server is to distribute process execution across the computing enterprise. The processes associated with data management are handled by the RDBMS. Popular RDBMS products, such as SQL Server, are delivered as server-based applications.

In turn, the applications you build or purchase that rely on an RDBMS are also based on the client-server process model. This concept manifests itself as process architecture. SQL Server, like the other popular RDBMS products, provides a means for creating processes that execute on the RDBMS server. The most common type of process that executes on the RDBMS is known as a stored procedure.


A stored procedure is simply a process that executes on the RDBMS. In this manner, it is possible to take advantage of the client-server process model by distributing some of the processing to the server application, which in this case is the RDBMS.

Triggers, or triggered procedures, are another example of a type of process supported by SQL Server and other RDBMS products. Triggers, like stored procedures, are processes that execute on the RDBMS server application. The difference is that a trigger's execution is tied to an event: a triggered procedure begins executing when a defined event occurs, such as a change to a data value or an update to a table.

Stored procedures and triggers provide a means to place processes on the RDBMS server application. Your applications will also contain processes that execute on client PCs. These processes interact with the RDBMS for data management services. As such, your applications rely on the client-server process model and RDBMS services.
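The following Transact-SQL sketch shows a simple stored procedure and a simple trigger; all object names are hypothetical.

    /* A stored procedure: a process stored in, and executed by, the RDBMS server. */
    CREATE PROCEDURE usp_raise_salary
        @emp_id int,
        @percent float
    AS
        UPDATE employees
        SET salary = salary * (1 + @percent / 100)
        WHERE emp_id = @emp_id
    GO

    /* A trigger: a process that executes automatically when the defined event occurs. */
    CREATE TRIGGER tr_orders_insert ON orders
    FOR INSERT
    AS
        IF NOT EXISTS (SELECT 1 FROM customers c, inserted i
                       WHERE c.cust_id = i.cust_id)
        BEGIN
            RAISERROR ('The order refers to an unknown customer.', 16, 1)
            ROLLBACK TRANSACTION
        END
    GO

A client application would run the procedure with a statement such as EXECUTE usp_raise_salary 1001, 5, leaving the data manipulation itself to the server.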

Architectural Issues

Successful database designers give ample consideration not only to how the database is structured, but also to how the data is processed and moved. These issues external to the database itself are often referred to as architectural issues.

When architects design buildings, they are concerned with, among other things, the functions of the building and the amount of traffic it will bear. They contemplate how many rooms to build, how the rooms should be arranged with respect to one another, how traffic will flow from room to room, and the size of the rooms in comparison to the overall size of the structure.

Database designers have to make similar architectural decisions. In the case of database design, this amounts to determining:

  • Where, when, and how the data should be processed

  • Where, when, and how the data should be distributed

  • For what purpose the data exists

Process Model

During database design it is important to understand the client-server process model. You have already seen how SQL Server is based on the client-server process model and how this process model manifests itself in SQL Server.

From the perspective of the database designer, process modeling amounts to determining where the data will be processed. On an information network, processes may exist on server computers, such as the computer running SQL Server, and on the client PCs. Some of the important issues that deserve consideration when determining the process model for your application include the following:

  • Performance. Typically, server computers possess much more processing capacity than desktop PCs. Because SQL Server is scalable across multiple processors, it is common for the server computers for SQL Server to contain multiple processors. This additional processing power provides some incentive for placing processes on the server computer. On the other side of the coin, you must take into account the number of client PCs that may request the execution of a stored procedure or trigger at the same time. Too many simultaneous requests to execute a server process may overburden the server.

    See "Tuning SQL Server," (Ch. 19)

  • Network Bandwidth. Application processes communicate with each other across the network by passing messages back and forth. Client processes generate requests by sending a message to a server process. The server process receives the message, processes the request, and responds with a result. These messages back and forth generate traffic on the network. Consideration must be given to the amount of network traffic created by the interaction defined within your process model. In particular, it is important to understand the amount of network traffic you will generate during peak operating periods. This task is a bit difficult because most networks perform functions other than sending your application's messages back and forth. Your network administrator can assist you in understanding the network implications of your process model.

  • Data Integrity. Isolating data management rules to processes on the RDBMS is an effective way to preserve the integrity of your data. Those processes that contain the rules by which your organization manages or interprets data are best located in the RDBMS as stored procedures or triggers. This ensures that the rules will be applied consistently because they exist in one place only. Of course, consideration should also be given to the performance implications of placing too many rules on the server.

  • Maintainability. A good reason to isolate some processes to the server is that it makes it easier to maintain the application in the future. By placing processes on the RDBMS, as opposed to designing them into each of the client processes, there is only one place to go to make changes. This makes sense so long as each client expects the same behavior of the process. Care must be taken when isolating processes to the server not to penalize client processes by reducing flexibility. Nevertheless, isolating processes to the server can reduce the cost of maintenance.

Even after the most careful consideration of these issues, it is difficult to completely understand exactly how the application will behave once it is constructed and placed into production. Some techniques are available for predicting performance of a given process model while it still exists on paper. Most of these techniques are somewhat theoretical and difficult to apply.

It is common for application development teams to make their best guess on the process model during design phases. These guesses are based on their experience in similar situations and their understanding of the fundamentals of the underlying technologies. No matter how good the guess, there is always a need to adjust the process model during testing - and sometimes even after the application is put into production. These adjustments should be allowed for in your work plan during the performance tuning phase.


The process model almost always changes during the performance tuning phase of application development. This should be expected and, therefore, accommodated in the development schedule. This phase is frequently overlooked when creating work plans for building client-server applications, even though it can sometimes take as long as, or longer than, the software development phase.

Data Distribution

Where should the data be located? Unfortunately, the answer to this question is not as simple as it may seem. In today's world of departmentalized organizations, data exists in many locations and in many forms. Moreover, the same data may exist in multiple locations at the same time. Processed data is frequently sent on to other locations for management reporting purposes.

Few applications today have the luxury of existing in their own world with no need to distribute data. Therefore, it is imperative that you take into account where, when, and how the data will be distributed when designing a database. Some of the issues to consider when contemplating data distribution are as follows:

  • Capacity. Some databases grow so large that they outstrip the storage capacity of the server computer. In other situations, the server computer can accommodate the size of the database but does not perform adequately with so much data to manage. In either case, it is useful to split up the data and physically distribute it across multiple server computers. Although this can be implemented transparently to the users of the data, it is more difficult to administer. Take this into account when contemplating separating data for capacity reasons.

  • Delivery. Clearly, data should be stored in locations that facilitate its convenient delivery at the time of request. The definition of convenient delivery may differ depending on your perspective. A user will prefer to have the data located nearby so that it can be retrieved as quickly as possible, whereas an administrator will prefer to have the data located in a secure location within the physical domain of the administrator's control. Due to delivery performance requirements, it is possible that users in different physical locations will need the same data to be located at their sites.

    Storing the data in a central location is more convenient for the administrator, but it creates a situation in which some users have to wait longer for the data to arrive than others. This is especially true in wide area networks with users spread across multiple geographic locations. In these cases, it makes sense to place a copy of the data at enough locations to accommodate convenient delivery. Therefore, when designing your database, you must decide how many places it must exist to get it where it needs to go as fast as it needs to get there.

  • Redundancy. Delivery requirements frequently lead to the need to create copies of the data in other physical locations. How is it possible to have redundant data while preserving the integrity and consistency of the data? The answer lies in a concept known as replication. SQL Server, as well as other popular RDBMS products, possesses the capability to "replicate" data to other locations. SQL Server has a built-in replication service that can be applied to any, or all, of your data. When you design your database, you must determine how the replication service will manage the distribution of your data in a manner that preserves its integrity and consistency.

Just as with process modeling, it is difficult to accurately predict how the data distribution plan will affect the behavior of an application. Time should be allowed during the testing phases to make adjustments to when, where, and how the data is distributed.


As your application undergoes changes over time, you must contemplate architectural changes, as well. It is a mistake to make significant adjustments to an application without also considering how the data distribution plan will affect its behavior. Sometimes it is appropriate to change the manner in which data is distributed in response to the changing needs of the users.

See "Data Replication," (Ch. 20)

On-line Transaction Processing and On-line Analytical Processing

Presumably, you are storing data in a database for a reason. It makes sense to store and manage the data differently depending on the purpose of the data and the reason for its existence. Today, databases are commonly broken down into two types:

  • On-line Transaction Processing (OLTP) databases

  • On-line Analytical Processing (OLAP) databases

These two types of databases support distinctly different purposes and, therefore, are designed differently.

OLTP databases are designed and built to support the operations of an organization. Sometimes these types of systems are referred to as production systems. The distinguishing feature of a production system is its transactional nature. These databases receive a constant flow of new or updated data with occasional, large bursts of new data or changes to existing data.

OLAP databases, on the other hand, are designed to support decision-making processes. The common term for an OLAP database today is data warehouse. A data warehouse is built to accommodate user reporting and ad-hoc queries. The applications that use data warehouses to support decision making and data analysis are frequently referred to as decision support systems (DSS).


One of the most important principles of database design is the separation of OLAP databases from OLTP databases.

Many designers overlook the significance of these two types of databases when designing their applications. OLTP databases should be designed to accommodate the quickest possible input of data into the database. OLAP databases should be designed to accommodate the quickest possible output of data from the database. Remember, OLTP is designed to support production systems (data input); whereas OLAP is designed to support reporting and ad-hoc query activities (data output).

These two types of databases have many differences, which should be accounted for in design. These topics are thoroughly covered in numerous books and a variety of periodicals. For the sake of clarity, it may help you to understand one of the most significant differences.

Decision makers frequently analyze historical data as a part of their thought processes. Retrieving data about the past is necessary to support this task. Therefore, it is often prudent for an organization to retain a fair amount of historical data. It may even make sense to store this data in summarized form, as well as in its original detail, to facilitate faster reporting of summary information.
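
For example, if the production system captures individual sales in a detail table, a summary table for the warehouse can be built with a single query. The table and column names below (sales_detail, sales_summary, and so on) are hypothetical and offered only as a sketch; note also that SELECT INTO requires the select into/bulkcopy option to be enabled for the target database.

    -- Build a monthly summary table from hypothetical detail rows
    SELECT product_id,
           DATEPART(year, sale_date)  AS sale_year,
           DATEPART(month, sale_date) AS sale_month,
           SUM(amount)                AS total_amount
    INTO   sales_summary
    FROM   sales_detail
    GROUP BY product_id, DATEPART(year, sale_date), DATEPART(month, sale_date)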

Historical data and summary data tend to clutter production system databases. The larger a database grows, the less quickly it can accept new data. Those involved with the organization's operations expect their transactions to be accepted quickly and painlessly. As the production database grows, these times tend to increase.

Consequently, it is not wise to burden a production database (OLTP) with all this historical and summary data. It is also not advisable to burden the production database with all the requests for information from the decision makers. The queries against the database can also affect performance for the operations staff.

A better location for historical and summary data is in a separate database (OLAP). This database can be designed and built for the sole purpose of facilitating the convenient retrieval of data. Decision makers can query this data warehouse without concern for the impact of their query on the production systems. Moreover, the data can be stored in a form more conducive to human analysis.

This example demonstrates one of the important differences between OLTP and OLAP databases. Again, there are many other important reasons for separating production data from reporting data. You are encouraged to pursue this topic to ensure the best chance for success with your SQL Server database.

Detailed Design Issues

After architects lay out the design for a building, designers arrive to address issues such as appearance and functionality. Such is the case for database designers, as well. After the architectural decisions have been put into place, it is necessary to move into the details:

  • Which design tools will be used?

  • To what extent will normalization be enforced?

  • How will data integrity be preserved?

  • How will the data be modeled?

  • Which indexing techniques will be applied?

Design Tools

Much like a writer uses a word processor to record ideas, a database designer uses a database design tool. Database design tools are sometimes referred to as computer-aided software engineering (CASE) tools. CASE tools exist for activities other than database design, as well.

Database designers use CASE tools to record their ideas and produce design documentation. In relational database design, this implies that the design tool is capable of generating a diagram of the data entities and their corresponding relationships. This is known as an entity-relationship (E-R) diagram. Most popular database design tools support the creation of E-R diagrams, among other design documents.

In addition to design documentation, some CASE tools will prepare the source code to create or update the physical database. This source code is expressed in the data definition language (DDL) of the database of your choice. This function is also known as schema generation.
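
For example, the script generated for a simple customer entity might resemble the following. The table and column names are hypothetical and are not tied to any particular CASE tool.

    -- Schema generated for a hypothetical customer entity
    CREATE TABLE customer
    (
        customer_id   int         NOT NULL,
        customer_name varchar(40) NOT NULL,
        city          varchar(30) NULL,
        CONSTRAINT pk_customer PRIMARY KEY CLUSTERED (customer_id)
    )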

Other CASE tools go far beyond design documentation and schema generation by producing application source code. Given an appropriate amount of information, the tool may be able to create header file definitions for database classes to be used in the application source code. Such a tool may also go so far as to generate source code for stored procedures and triggers based on the rules it interprets from the relationships you have defined in your data model.

After the data model is designed in a CASE tool and the schema is generated, you can create your database in SQL Server. As time goes on, you may choose to alter the database layout in SQL Server. If you only change it in SQL Server, the database will become out of sync with the data model in your CASE tool.


Always keep your data model in sync with your SQL Server database.

Most CASE tools do not support incremental changes to the data model. If your CASE tool does not automatically re-synchronize the data model with the physical database, you must do so yourself. Regrettably, the task of updating the data model in the CASE tool while keeping it in sync with the database is "manual." To do this job yourself, you must change the data model in the CASE tool, generate the new database definition, export the data, create a new database based on the new definition, and reload the data from the previous database. Some sophisticated CASE tools will automate portions of this task. Other CASE tools promise to provide this type of function as they become more tightly coupled with RDBMS products in the future.

Database design tools come in many sizes and flavors. Regardless of your level of expertise, if you are designing a database for SQL Server, you should be using a database design tool for the job. At the very least, it will serve as a documentation aid in support of your design efforts.

Normalization

Relational database designers must be knowledgeable about normalization. It is beyond the scope of this book to define normalization. However, it is important to explain the significance of normalization as it relates to your SQL Server implementation.

One measure of the effectiveness of a logical data model is the level to which it is normalized. First normal form (1NF) is the base level of normalization in the relational data model; second normal form (2NF) and third normal form (3NF) build upon it. Database designs at any of these levels can be deployed in SQL Server.

For the most part, the minimum level of acceptance for an OLTP database is 3NF. Denormalizing a 3NF data model for the sake of performance usually leads to difficulty somewhere down the road. You are encouraged to maintain at least 3NF with your physical data models. Having said that, you should recognize that this is not always possible in real-world projects. In some cases, logical data models are taken to 3NF, but a denormalized form is implemented.

OLAP databases, however, tend to be somewhat denormalized. In support of user reporting and ad-hoc queries, it is acceptable to intentionally model the database to include repeating groups, for example. Less than 3NF is acceptable in this case because the database is almost never updated by users (it is populated by production databases), and the data is frequently retrieved in groups.
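
As a brief illustration, using hypothetical table names, the same sales information might be held in a normalized form for the OLTP database and in an intentionally denormalized form for the OLAP database:

    -- Normalized (OLTP) form: one row per product per month
    CREATE TABLE product_sales
    (
        product_id int           NOT NULL,
        sale_month smalldatetime NOT NULL,
        amount     money         NOT NULL,
        CONSTRAINT pk_product_sales PRIMARY KEY CLUSTERED (product_id, sale_month)
    )

    -- Denormalized (OLAP) form: a repeating group of monthly columns for fast reporting
    CREATE TABLE product_sales_report
    (
        product_id int   NOT NULL,
        sale_year  int   NOT NULL,
        jan_amount money NULL,
        feb_amount money NULL,
        mar_amount money NULL,
        -- remaining months omitted for brevity
        CONSTRAINT pk_product_sales_report PRIMARY KEY CLUSTERED (product_id, sale_year)
    )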

Integrity

Normalized relational databases, by definition, model organizational processes by breaking them down into logical components suitable for processing by SQL Server. A concept at the application level may be modeled in the database design as several components, or entities. It is important that the relationships these entities have with one another are preserved as the data is processed.

For a moment, imagine that you have in your hand an order for several items from one of your customers. In front of you exists a single piece of paper, which you refer to as an order. SQL Server, on the other hand, will likely (assuming that 3NF is achieved) capture the order as two or more entities. At a minimum, there will exist an order header table that contains attributes such as customer name, ship-to address, order date, and order number; and an order detail table that contains information such as item ordered, quantity ordered, item price, and extended price.

While you mentally process this as one object (the order), SQL Server processes this as two objects (the order header table and the order detail table). One aspect of integrity is the preservation of this important concept. If you "tear up" the order, it is entirely eliminated. If you ask SQL Server to delete the order, you must be careful to instruct it to remove both the order header information and the order detail information. Furthermore, you must ensure that either both sets of information are completely removed or that nothing is removed.
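
A minimal sketch in Transact-SQL, assuming hypothetical order_header and order_detail tables keyed on an order_no column, shows how a transaction keeps the two pieces together; production code would also check @@error after each statement and roll back on failure.

    -- Remove order 1001 atomically: both tables change or neither does
    BEGIN TRANSACTION
    DELETE order_detail WHERE order_no = 1001
    DELETE order_header WHERE order_no = 1001
    COMMIT TRANSACTION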

Accounting for this type of situation is critical to the success of your SQL Server databases. You must familiarize yourself with the various types of integrity and at what level you will enforce their compliance.


You must enforce integrity compliance to ensure accurate and consistent data.

Logical Data Model

The first step in database design is to map the application requirements to the database. The database designer, or data modeler, analyzes the requirements and constructs a conceptual model for fulfilling those requirements. In this case, the model is composed of relational database components. These components go by many names depending on which school of thought you follow. Regardless of the naming scheme used, the same basic concepts apply.


Application requirements are not the only possible source of originating information for the data modeler. It is common in today's world for a data modeler to begin from a conceptual object model. Several methodologies exist for mapping object models to relational data models. In fact, there are methodologies that support mapping from relational data models to object models. Design tools are available today that aid in the development of an object model and a data model at the same time, regardless of the direction your methodology takes you.

You must follow the principles of relational database technology in designing databases for SQL Server. The process of "mapping" requirements to the database results in the identification of the logical database entities. For each entity, you must define its attributes and how it relates to the other entities. As the entity relationships are defined, the keys are identified. See "The Relational Data Model" earlier in the chapter for more information.


You are encouraged to study the principles of relational database design if you are not intimately familiar with this topic.

The pictorial representation of how entities relate to each other is known as an Entity-Relationship (E-R) diagram. The end result of any logical data modeling exercise is the creation of the E-R diagram. All popular relational data modeling tools support the creation and maintenance of E-R diagrams. Other diagrams that support the logical data model are also available from these tools.

Creating a logical data model is a necessary step in every database design project. It is with the logical data model that the database designers, application architects, and application designers perform mental exercises to validate system behavior. Even after the logical data model is transformed into a physical database, the logical model must remain in sync with the physical aspects of the database to support future decision making.

Physical Data Model

After the logical data model is prepared, analyzed, and approved, it is time to create the physical database. Often the physical database will be a different representation of the logical model. Why might the physical database be different from the logical model? The answer lies in the fact that humans analyze the model differently from how the RDBMS processes the data.

Logical data models are designed to accommodate the thought processes of a database designer. The representation of the physical database, also known as the physical data model, is designed to accommodate the best possible scenario for processing the data by the RDBMS.

Physical data models are concerned with exactly how the data is stored on the storage media. Database designers and database administrators may implement a physical database differently than depicted by the logical data model. This is necessary in most instances due to performance implications. It may also be necessary to accommodate environmental aspects such as the configuration of your network, servers, or client PCs.

After applications are tested and implemented, it is normal for the physical database to undergo changes to improve performance or repair a deficiency in the operational characteristics of the application. This process is known as performance tuning.


Performance tuning exercises frequently result in changes to the physical database. This should be expected as a part of optimizing application performance after testing begins.

A common mistake made during this work step is to let the logical data model get out of sync with the physical database. If you allow this to happen, you will lose the ability to analyze the database on its logical characteristics. Because this is the best basis from which humans can contemplate such issues, you run the risk of ending up with an inadequate data model. In the real world, damage at this level is often irreparable because there is rarely time to start over.


You must keep your logical data model in sync with your physical database.

Indexing

One of the most important decisions regarding the physical implementation of the database is how the indexes will be built. Selecting which way to build an index is a function of how you believe it will be used. SQL Server offers two ways to build indexes:

  • Clustered

  • Non-clustered

As the name implies, clustered indexing creates indexes such that the indexed order of the rows is the same as the physical order of the rows. Additionally, the lowest level of a clustered index contains the actual data. Because clustered indexes have such a direct relationship to the physical data, they provide the fastest form of data retrieval.

Also, because clustered indexing implies something about the physical arrangement of the data, there can only be one clustered index per table. If a clustered index is desired, the question becomes for which columns the clustered index should be created. In most cases, the clustered index is generated for the columns that uniquely identify each row; that is, the primary key. This certainly makes sense in situations where most searches are performed on the primary key or when the primary key is used exclusively in WHERE clauses.
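
As a simple sketch, the clustered index on a hypothetical order_header table would normally be placed on its primary key column:

    -- Create a clustered index on the primary key of a hypothetical order_header table
    CREATE TABLE order_header
    (
        order_no     int         NOT NULL,
        cust_name    varchar(40) NOT NULL,
        ship_to_addr varchar(60) NOT NULL,
        order_date   datetime    NOT NULL
    )

    CREATE UNIQUE CLUSTERED INDEX order_header_pk ON order_header (order_no)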

However, in some situations it makes sense to apply the only clustered index for a table to columns other than those representing the primary key. These situations include the following:

  • Searching columns sequentially, or by a range of values, is best supported by a clustered index.

  • Applying a clustered index to a column that is frequently sorted improves ORDER BY or GROUP BY performance.

  • Static columns and join columns benefit from clustered indexes.


When selecting the indexing method, it is important to know how the table will be queried in most situations. It is also important to know how SQL Server will optimize index usage. For the former, you need only know the application and how the users will use it. For the latter, the best way to gain this knowledge is through training and experience.

Non-clustered indexes typically outperform clustered indexes in situations where key values are being added or updated. On the other hand, non-clustered indexes usually cannot match the retrieval speed of a clustered index. This is because non-clustered indexes are physically one level removed from the data. In particular, the following issues are relevant when considering non-clustered indexes (a brief example follows the list):

  • It is possible to have multiple non-clustered indexes per table. However, you should contemplate update performance if too many indexes are added. This is not a problem in situations where the database will be queried far more than updated, such as in OLAP databases.

  • Because non-clustered indexes tend to be larger than clustered indexes, the performance benefit they provide should be weighed against the cost of the extra storage capacity they require.

  • Non-clustered indexes are of less value in cases where many duplicate index values exist.
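
Continuing the hypothetical order_header example, non-clustered indexes might be added to support frequent lookups by customer name or by order date:

    -- Non-clustered indexes to support common query paths
    CREATE NONCLUSTERED INDEX order_header_cust_idx ON order_header (cust_name)
    CREATE NONCLUSTERED INDEX order_header_date_idx ON order_header (order_date)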

These indexing issues, among others, are important aspects of the physical design of your database. Database designers and database administrators will find their time is well spent exploring these issues. Tools like DBCC and the SHOWPLAN option for query analysis are instrumental in helping you develop and refine your database designs. Experience is especially valuable in these situations.
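
For instance, from an ISQL/w session you can ask SQL Server to display the optimizer's plan and I/O statistics for a query before deciding whether an index is being used effectively. The statements below use the SQL Server 6.x option names (later releases use SET SHOWPLAN_TEXT ON instead), and order_header is the hypothetical table from the preceding examples.

    -- Show the query plan and I/O statistics for the query that follows
    SET SHOWPLAN ON
    SET STATISTICS IO ON

    SELECT cust_name, order_date
    FROM order_header
    WHERE order_no = 1001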

Enterprise Data Model

Another important topic that deserves discussion before leaving the topic of database design is enterprise data modeling. An enterprise data model is a database design that supports most, if not all, of an organization's production systems. It may also include one or more data warehouse components in support of the user reporting and query requirements.

The concept of building a single, large database for the entire organization is not new. However, only recently could RDBMS products such as SQL Server deal with the complexities and size of enterprise databases. Because there are many advantages to modeling the enterprise on the whole and because SQL Server can accommodate this approach, many organizations are now at various stages of enterprise data model development or implementation.

Enterprise data models are typically built by a single individual or team of database designers. It should not be a surprise that better data models are developed when the job stays in one group, as opposed to being spread across the organization and into the application development teams.


Enterprise data models do not eliminate the need for, or the benefit of, departmental databases. In fact, a well-designed enterprise data model includes rules and processes for synchronizing with departmental databases.

Most enterprise data models are built in increments. This design approach is used primarily because it is difficult to design the entire data model all at once. Rarely is there enough time to finish the entire design before the first application goes into production. Consequently, the enterprise data model tends to grow as each new application is brought into existence. The bigger the enterprise data model gets, the easier it is to build new applications. Each new application has the benefit of the existing data model and associated processes, including stored procedures and other aspects of the database.

Overall, the stability of applications is improved due to the consistency within their foundation - the enterprise data model. For this reason, the up-front investment in enterprise data modeling pays off over time.

Surveying the Available Tools

Now that you have been briefly introduced to SQL Server and learned a little about the technology upon which it is based, it is time to learn about some of the actual tools and utilities used to configure and manage SQL Server. This section introduces you to the SQL Enterprise Manager, the main tool used by DBAs to work with SQL Server. It also reviews the other special-purpose tools that complement the SQL Enterprise Manager.

The SQL Enterprise Manager

This tool is an updated combination of the older SQL Administrator and SQL Object Manager tools offered in version 4.2. The SQL Enterprise Manager, as shown in figure 17.4, is a feature-rich management tool that allows DBAs to manage SQL Servers anywhere on the enterprise network from a single computer running Windows NT or Windows 95.

Fig. 17.4 - The SQL Enterprise Manager connected to the SQL Server named DATASRV.

The SQL Enterprise Manager allows each DBA to create server groups to organize the SQL Servers for which he or she has administrative responsibility. Another administrator running the SQL Enterprise Manager on a different computer has the freedom to create a different set of server groups reflecting a different view of the network appropriate to a different set of responsibilities.

Many different tasks can now be performed using the SQL Enterprise Manager. Most of the things you learn how to do in this chapter and the two that follow are done using the SQL Enterprise Manager. The following is a short list of the operations you can perform with this tool:

  • List the devices defined on a SQL Server

  • List the databases defined on a SQL Server

  • List the tables, indexes, stored procedures, and other objects defined for a database

  • View the structure of a table

  • List the definition of a stored procedure or trigger

  • Display the replication topology showing the relationship between publishing and subscribing servers

  • Display the tasks, schedule, and status of the SQL Executive

  • Initiate a transfer of data and associated objects from one SQL Server to another using a capability very similar to the stand-alone SQL Transfer Manager tool

This list is by no means comprehensive, but offers a glimpse at the depth and breadth of the SQL Enterprise Manager's capabilities. You can see that it has been designed to configure and control nearly every aspect of SQL Server operation.

SQL Security Manager

SQL Server supports three different security models, which are described in more detail later. Two of the models offer the capability to use Windows NT user accounts for access to SQL Server. If you have selected one of these models, the SQL Security Manager is used to help manage accounts. Figure 17.5 depicts a typical Security Manager display showing the Windows NT accounts that have been granted User privilege on this SQL Server.

Fig. 17.5 - The SQL Security Manager displaying Windows NT accounts that have been granted User privilege. The Account Detail dialog box is displayed by double-clicking the user's account name.

See "Choosing a Security Model," (Ch. 18)

See "Using SQL Server Security," (Ch. 19)

SQL Transfer Manager

SQL Transfer Manager is an administrative tool from SQL Server 4.2 and 6.0 that has been replaced by a transfer management interface in the SQL Enterprise Manager. SQL Transfer Manager can still be used, however, and provides a graphical interface to control the transfer of data and associated objects into and out of SQL Server. This utility is an easy to understand and convenient alternative to the older bulk copy program known as bcp. You can use the SQL Transfer Manager to move information between SQL Servers running on dissimilar platforms. For example, you can transfer information from a SQL Server running on a computer with a MIPS processor to a computer using an Alpha AXP processor or an Intel processor.

The transfer management interface in the SQL Enterprise Manager provides much of the same functionality, but can only be used when the destination server is running SQL Server version 6.5. It is opened by choosing Object, Transfer from the menu after selecting the source database. If you have a mixed server environment, the SQL Transfer Manager may still be useful. If the source server for your transfer is running SQL Server version 6.0, the OBJECT60.SQL script must be run on the source server for the SQL Transfer Manager to function properly.

SQL Administrator

The SQL Administrator is an administrative tool from SQL Server 4.2 that has been replaced by the SQL Enterprise Manager. The SQL Administrator can still be used, however, if you run a SQL script (ADMIN60.SQL) that is provided with SQL Server to set up support for this older utility. Most administrators prefer the more powerful and comprehensive SQL Enterprise Manager. Figure 17.6 depicts a typical SQL Administrator session showing the device and database management windows.

Fig. 17.6 - The SQL Administrator is an older tool from SQL Server version 4.2 that can still be useful in mixed server environments.

Object Manager

The Object Manager is another SQL Server 4.2 tool that has been replaced by the SQL Enterprise Manager. A corresponding script (OBJECT60.SQL) can be run to allow this tool to be used on SQL Server 6.0. This tool is sometimes preferred by software developers who need a tool that allows them to view the structure and definition of a database, its tables, and associated objects. This is provided by the Object Manager without including the administrative functions for which a developer has no use. A typical Object Manager display is shown in figure 17.7. The Manage Tables window has been opened to show the definition for the authors table.

Fig. 17.7 - The Object Manager is a SQL Server 4.2 tool that is still useful, especially for software developers.

isql and ISQL/w

Traditionally, all management of a database management system was done from a command-line prompt by entering individual commands or executing a prepared script of commands created to perform a specific task. The first versions of SQL Server included a command-line utility called isql (for interactive SQL). That utility is still available. An updated version of this old standby features multiple windows in a graphical, Windows-based utility called ISQL/w (see fig. 17.8). This tool still allows you to enter individual commands or run scripts, but it also offers some useful capabilities for tracking the behavior of queries in addition to the advantages of a multiple windowed environment.

Fig. 17.8 - ISQL/w is a Windows-based utility for entering command lines or executing SQL scripts.

Some DBAs prefer to use a command-line interface and scripts of SQL commands to perform maintenance tasks on their servers. Others prefer the SQL Enterprise Manager with its graphical user interface (GUI), dialog boxes, toolbar buttons, and visual, graphical feedback. It is fair to say that if a repetitive maintenance task needs to be performed on many servers, it can be faster and more reliable to create a script of commands, test them, and then execute them on the target servers. For an operation on a small number of servers, the speed and ease of directly specifying an operation using the SQL Enterprise Manager is often a better choice. Of course, a balanced approach that uses all the available tools is ideal.
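
As an illustration, a tested maintenance script could be run against each server with a command line like the following; the server name, login ID, and file names are examples only, and isql prompts for the password because none is supplied.

    isql -S DATASRV -U sa -i weekly_maint.sql -o weekly_maint.out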

See "Using SQL Scripts and Stored Procedures for Maintenance Tasks," (Ch. 19)

bcp

The command-line utility traditionally used to get large amounts of data into and out of SQL Server is the bulk copy program (bcp), as shown in figure 17.9. It is still available in SQL Server 6.0, although many people will find the SQL Transfer Manager or the data transfer capabilities of the SQL Enterprise Manager to be a preferable replacement.

Fig. 17.9 - A bcp command being entered from a Windows NT command prompt. The command shown is using the /? flag to display the full syntax of the command.
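
For example, the following command copies the authors table from the pubs sample database out to a character-format file; the server name is an example, and bcp prompts for a password because the -P flag is omitted.

    bcp pubs..authors out authors.txt -c -S DATASRV -U sa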

See "Importing and Exporting Data," (Ch. 19)

DBCC

The Database Consistency Checker is a command that plays several special roles in the administration of SQL Server databases. Its primary function is to perform a check on the internal consistency of databases or individual tables. It compares the information in SQL Server's internal system tables with the actual physical database to see if the information contained in the two locations matches. If the physical database does not match the internal description of its size and location, DBCC can be used to attempt repairs. The DBCC command is also used to set trace flags, an advanced feature of SQL Server that is useful in some troubleshooting scenarios. A typical use of the DBCC command is shown in figure 17.10.

Fig. 17.10 - A DBCC command being entered in an ISQL/w window. A DBCC SHOW_STATISTICS command has just been executed. The results are displayed on the Results tab of the Query window.
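
As a brief example, the following statements check the consistency of the pubs sample database and turn a trace flag on and off again; trace flag 3604, which directs trace output to the client, is shown only to illustrate the syntax.

    -- Check the allocation and structural consistency of the pubs database
    DBCC CHECKDB (pubs)

    -- Turn a trace flag on and back off
    DBCC TRACEON (3604)
    DBCC TRACEOFF (3604)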

System Procedures

In addition to the commands available in the SQL language, there is a collection of predefined system procedures. System procedures are also called system stored procedures because they are, in fact, a collection of useful stored procedures created by Microsoft that are provided with the product. These procedures can be used to perform many administrative tasks for SQL Server such as defining users, creating login IDs, setting configuration parameters, and creating devices. Almost everything that can be done with system procedures can also be done using the SQL Enterprise Manager. These commands have names beginning with sp_. For example, the system procedure to add a login ID is sp_addlogin. For more information see "Stored Procedures-System Stored Procedures" in the Transact-SQL Reference Manual.
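
As a short sketch, the following statements create a login ID and then add it as a user of the pubs sample database; the login name and password are examples only.

    -- Create a login ID with pubs as its default database, then add it as a user of pubs
    EXECUTE sp_addlogin 'maryd', 'secret', 'pubs'
    USE pubs
    EXECUTE sp_adduser 'maryd'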

See "Stored Procedures," (Ch. 18)

From Here...

In this chapter, you were introduced to SQL Server and its underlying technologies. It is important to understand the technical foundation upon which SQL Server is built to be successful in its use. Specifically, you learned the characteristics of SQL Server, the characteristics of an RDBMS, the roles and responsibilities of a DBA, and how database design influences your success with SQL Server.

