Appendix E - Case Study: New York Metropolitan Museum of Art

By Jay Hoffman, Gallery Systems, New York, New York

I began developing software for the art world in 1981 with my first application, The Museum System, which is used to catalog art objects. The Museum System stores information about art, including a picture of the object that can be displayed on a PC monitor.

The Museum System is uses at more than 75 art galleries in the United States and Europe. Institutions in the United States that use the Museum System include The Detroit Institute of Arts, The United States Holocaust Memorial Museum, The United States Supreme Court, The American Craft Museum (New York), and New York University's Grey Art Gallery.

I had decided to develop a similar, but more robust, system that would store information in a Microsoft SQL Server database when the Antonio Ratti Textile Center of The Metropolitan Museum of Art contacted him. The Met wanted me to develop an application that would combine the tracking of movement and preparation of textiles for storage in their new textile facility.


Objects in museums—even objects that are publicly displayed—are often temporarily removed by curators or art historians for research purposes. In addition, a number of objects can be removed from display to be lent to other museums. As well as the obvious information that must be kept about art objects, such as who the artist is and when the object was created, the current location of objects must also be kept.

The new application that I was to develop for the Met included a read-only version running on PC workstations in an area of the museum that is accessible to visitors. Visitors would be able to explore the textile database and view images and related data about the textiles.

Selecting Microsoft SQL Server for Windows NT

Many departments (including mine) within the Metropolitan Museum of Art that were already using other Microsoft products were pleased with the features and performance of the Microsoft products. The Museum staff's experience with Microsoft's word processor, spreadsheet, and PC database is the reason that I initially considered Microsoft SQL Server.

I also needed a database that could eventually store information about a nearly unlimited number of art objects, though the immediate needs only required the cataloging of less than 100,000 art objects for the textile project. Microsoft SQL Server looked as if it could meet the current needs of the textile project, as well be used to store much more data about additional art objects in the future.

Anticipating Time Required for the Project

There are several phases in the textile project, with the total time for completion specified to be one year and nine months. The first phase was completed on time, and the second phase is moving along nicely. The textile project will probably evolve into a second project with a much longer time frame.

The museum recognizes that it must eventually catalog all objects within the museum's collection so that the information about art objects can be easily and quickly retrieved. Such a project would also have to be completed in phases and would take considerably longer. It's easy to project that the museum should eventually catalog on a computer system information about it's entire collection, which is far more vast than the textile information.

Choosing a Computer System

I selected a Sequent 3000 with two Pentium processors for the database server platform. The Sequent system permits him to add additional processors. He anticipates that there would only be a need to add two additional processors in the future when the museum requires additional processing capabilities.


A product such as Microsoft SQL Server is designed to leverage the power of the Windows NT operating system (see Chapter 3, "Understanding the Underlying Operating System, Windows NT"). SQL Server is written in sections that execute as separate threads under Windows NT. The greater the number of processors within the computer system, the greater the number of threads, application code, that can execute simultaneously. Purchasing a computer system to which you can add processors as the processing demands of the system increase over time makes it less likely that you'll have to uproot your applications and move them to a larger computer.

Number of PC Client Workstations Supported

At this point in the project, the exact number isn't determined, but the number of PC workstations should be somewhere in the range of 12 to 60. The client workstations will be primarily used by the curitorial and conservation staff for scholarly purposes.

The PC workstations will be located on desktops in offices of the museum as well as a number that will be located in a public area. The size of the Sequent server and it's expansion capability should permit the server to easily satisfy the number of clients within the range specified.

Part of the plan is to provide up to six workstations that are connected to the server in a public access area that can be used by visiting scholars or the general public. A visiting scholar or a member of the public will be able to use the PC workstation to retrieve information on textile objects in the database, using one of the public workstations. A museum visitor can make the same query that a member of the museum research staff could make to obtain information about the textiles.

For example, a query could be made based on the culture or period of an object, and all matching objects which would be displayed in a thumb-nail sketch of the object found. The access to objects for the PC workstations in the public area will be limited to read-only. Information stored about the textiles cannot be changed from the PC systems that are located in the public area.

Structuring the Database

I allocated a half a gigabyte of storage for the database. I calculated the initial allocation of space for the database knowing that I had to catalog data for 42,000 art objects, the tapestries. I didn't require half a gigabyte of space for the storage of information about the existing objects. I wanted to ensure that I had plenty of room for the addition of data for more objects that the museum would subsequently acquire. In addition, the museum hadn't completely defined the amount of information that would be cataloged for each tapestry.

The amount of space that's used for the storage of actual data is only 200 megabytes, and I expect that the final amount of space required for data about the 42,000 textiles will be 300 to 400 megabytes. I defined the storage for the database and the transaction log on different logical and physical devices, which is the usual recommendation for the best performance.

Actually, 300 to 400 megabytes is a relatively small amount of space to use for the storage of data about 42,000 art objects. The reason that a relatively small amount of space is required for the storage of the information about the tapestries is that the images of the tapestries are stored externally to the SQL Server database, as separate files.

Rather than defining the column of a database table as an image datatype (an appropriate datatype for the storage of pictures), I chose a format that permitted the greatest flexibility for subsequent access by other applications. By storing the pictures of the tapestries external to the SQL Server database, I achieved maximum flexibility in their use: their access from other applications and retrieval systems that may require use of the pictures.

The format I chose for the storage of the pictures is the Kodak Photo CD format, which actually stores multiple copies of each graphic of a tapestry, each in a different resolution. This is one of the advantages of the Photo CD format. Subsequent applications, which may be written, can choose one of the formats to display the graphic on the screen.

The current retrieval and display application normally displays the graphics with 4 million colors, which is required for scholarly research and proper representation and identification of the art objects. The objects also look more pleasing at a resolution that provides several million colors rather than the 256 colors that are more commonly used for graphics in computer applications.

I built a table that contains a column to link the object number to the slide or image number of the picture of the tapestry. The Photo CD format actually provides five virtual resolutions, though the application primarily uses two: one of the Photo CD formats and an additional black and white format. The application uses a black and white TIFF format that permits 256 shades of gray and the 24-bit Photo CD format that provides millions of colors.

An outside vendor scans the pictures of the tapestries and delivers the scanned images to me on compact disks (CDs). The multiple resolutions of the Photo CD format are also useful because they permit the use of client systems that support different hardware, which may vary in the resolutions that they can display because of different video cards. I'm trying to define a standard client system that permits a resolution of 1024 by 768 and displays images in at least 32,000 colors.

One of the capabilities of the retrieval system permits the display of up to twelve small so-called thumbnail representations of the tapestry pictures at a time on the screen. The thumbnail display supports only 256 colors, though the full-screen display of the tapestry permits a display of several million.

All the client systems are Intel-based PCs that are currently running Windows for Workgroups. Both the PCs and the operating system were already in place at the museum when the tapestry project was conceived, and it was decided to use them as the client system rather than replace them.

The network connectivity that is a part of the Windows for Workgroups configuration permitted a simple conversion to a server-based NT domain system. The existing Windows for Workgroups configuration was one of the reasons for implementing the storage of information for the tapestry project using Microsoft SQL Server on Windows NT.

I'm considering upgrading the client system to run Windows NT Workstation instead of Windows for Workgroups. I'll probably do this using Visual Basic 4 in order to take advantage of the more powerful client system that NT Workstation provides. Alternately, it's possible that some of the clients will be upgraded to Windows 95, which, like Windows for Workgroups or Windows NT Workstation, permits a simple connection to the database server.

Using Application Packages to Define the Database

The application that I'm creating for the tapestry collection at the Met is a direct evolution of my existing DOS-based application, The Museum System. As a result, I was also aware of the design issues involved in the type of database required. The Museum System effectively served as a prototype for the tapestry application.

Number of Tables Defined in the Database

I should need 50-75 tables in the database, though some of the tables are quite small in keeping with the type of information in the tables. Many of the small tables are authority tables in which information stored about the tapestry varies depending on the department of the museum.

Islamic art is different from Asian, twentieth century art is different from European sculpture and decorative art, and the information that is kept about each tapestry is different depending upon it's origin (and hence the department that has ownership of it).

Even the same type of information that is sorted about an art object can vary. For example, the dynasties for China are different than the dynasties for Egypt, and though they are similar information, different information must be stored for each.

The differences in the information recorded for art objects also affect the entries that appear in the user interface for the tapestry application. For the example, the names of dynasties on drop-down menus must be different for Chinese and Egyptian dynasties.

Formally Defining Primary and Foreign Keys for the Database Tables

I feel strongly that the database is defined correctly, and it does make sense to define primary and foreign keys. Though SQL Server will permit relational joins and triggers to be performed in the absence of key definitions, such definitions provide a descriptive structure to the database. Moreover, the formal definition of keys anticipates what may become a requirement of Microsoft SQL Server in subsequent versions for the definition of other objects dependent upon keys, such as triggers.

Defining Indexes on Tables

Both clustered and non-clustered indexes are defined for the databases tables. Generally, the smaller tables that are referenced less frequently have as few as one or two indexes defined for them. The main tables, which are referenced frequently, have several indexes defined for faster joins and retrieval.

Using Storage Optimization Techniques

The Sequent server that we're using has a disk array, and we're currently making use of volume sets to optimize data access. I'm also considering using other optimization techniques, such as striping without parity to improve performance, which I may or may not implement later in the project.

Using Fault-Tolerant Techniques

The database is backed up daily. The images are stored on CDs, so there's no need to backup this non-volatile media. The database information isn't considered critical enough to make use of additional fault-tolerant mechanisms, such as mirroring, either at the database, operating system, or hardware level. Although not a fault-tolerant mechanism, replication may be later used to make the information about the art objects available at other institutions around the world.

Allocating the Temporary Database in RAM to Optimize Performance

I initially tried doing allocating the temporary database tempdb on a logical device that was defined in memory. For the type of queries and access that we're doing, it didn't seem to provide much of an improvement in performance. I've gone back to using the hard disk for the temporary database.

Using Triggers the Database

I use triggers in the usual way: to maintain the referential integrity of the database tables. In addition, I use triggers to maintain an audit trail about changes that are made to data about the art objects. Some columns, of course, can't be changed by anyone, and appropriate security is provided to prevent unauthorized updates.

I also use a trigger to generate a row ID for entries in some tables to uniquely identify the rows, retrieving a value from another table. I also considered generating a unique row number using the timestamp datatype or the max function to return the largest row number in the table and then incrementing the value to use as a new row number.

The Front-End Query Application

I'm writing the application in Visual Basic using ODBC to access the database. I'm using ODBC because I want the platform independence that it provides. Visual Basic makes it easy for me to pull up the 50 to 500 objects that a user may want to browse through once they've chosen an initial object.

I'll also permit the SQL Server database to be accessible from front-end products, such as Microsoft Access. I've already permitted the access of the database using the Attach Table feature of Microsoft Access. Users have local Access tables on their clients in which they can initially store information about objects.

Through Microsoft Access, they can also attach to the SQL Server database and update the data rows. They may also need to combine the data that they've stored within the local Access tables with information from the SQL Server database.

We're also testing the use of the report writer Crystal Reports to be used to access the SQL Server database as well as Access databases. To simplify the users access, I've set up views in SQL Server. I use the views to permit users to access multiple tables—as many as twelve tables—without having to be aware of the structure of the SQL Server database and the SQL syntax required to reference so many data sources.

Administrating the Database

The server will be located in the Textile Center, which is under construction; so for convenience, the administration of the database will be performed through the provided client administrative applications that come with the Microsoft SQL Server product.

Archiving Data

Currently, there's no plan to perform on-site or off-site storage of archives of the data. The concerns of the museum are different than those of a commercial environment. In a sense, the graphics images of the art objects are already a backup of the original art object itself, the tapestries.

If the CDs that contain the representations of the objects were lost, they could always be exactly recreated from the art objects. The focus to this point in time, as it should be, has been on the security and preservation of the objects themselves.

As time goes on, the museum will probably implement an archive strategy for safeguarding the information that is kept about the objects. We will be replicating the data to other institutions, which can also be used as backup copies of the data.

Existing Repositories of Information

The storage of data about art objects was started in 1988, and data was stored in dBASE III. Within the museum as a whole, each department has started committing the information that they have cataloged about art objects onto a computer systems. Other PC databases were and are in use for the storage of data throughout the museum.

I've sometimes had to write custom programs for the conversion of the data from the PC databases to SQL Server. Fortunately, I often have been able to use Microsoft Access to read the data from another vendor's PC database and write it out to the SQL Server database.

Dial-Up Access to the Database

Initially, there won't be any dial-up access provided to the database, although it will be provided at a later time.


The main window of the retrieval portion of the textile project application will display a miniature view of the tapestry in a rectangle that can be seen in the upper-right portion of the window. Information that is often retrieved about a tapestry is automatically displayed in several fields of the window.

For example, the name of the tapestry is shown in a Title field, which is defined as a list box. Multiple lines of information can be stored and subsequently displayed within the Title list box field. The classification of the art object will subsequently permit other art objects to be recorded and displayed when the application is used for cataloging other objects within the museum.

A Medium Label Copy field displays information about the characteristics of the tapestry (such as its composition). Two remaining fields that display data are the Date/Period and the CreditLine (which shows the manner in which the museum acquired the tapestry).

The image of the tapestry itself can be displayed in an enlarged view in a high resolution (several million colors), permitting as detailed a view of the tapestry as permitted by current technology. The image also can be displayed in a small representation using the Make Thumbnail. In addition, the contrast and brightness of the tapestry can be adjusted to permit alternate representations of the image for research purposes.


QUE Home Page

For technical support For our books And software contact support@mcp.com

Copyright © 1996, Que Corporation

Table of Contents

Appendix D - Redundant Arrays of Inexpensive Drives (RAID)

Appendix F - What's on the CD?