append
field
read()
record
seekg()
write()
random file access
sequential file access
Most "real-world" computer applications need to hold data long after the computer is turned off. So far, all the programs in this book have used data that was either placed in literals or entered through the keyboard. You've used expressions
to move the data around inside the program and the standard input and output classes, cin and cout, to converse with the user.
Nearly all useful programs remember data from previous times they have been run, or they can read data from other programs. In the PC world, nearly all the data is held in the form of disk storage. This unit looks at how you can use files.
Having used the standard input and output routines, you will be surprised how easy it is to read and write data to and from disk in a very similar way.
Random file access enables you to read or write any data in your disk file without having to read or write every piece of data that precedes it. You can quickly search for, add, retrieve, change, and delete information in a random-access file. Although
you need to learn a few new functions to access files randomly, the extra effort pays off in flexibility, power, and the speed of disk access.
This unit ends the basics of programming in Visual C++. When you began reading this book, you might have been a beginning programmer. After this unit, you will be past the rank of beginning programmer and can begin writing extremely powerful programs.
Disks hold lots of data for a long time.
Nearly all computers have much less memory (RAM) than hard disk storage. Your disk drives hold much more data than your computer can hold in RAM. Also, if you turn off your PC, the disk memory is remembered, whereas RAM is lost and the data is
forgotten.
Disk is the perfect place to store programs. That's where all of your code is stored. It's also a good place to hold data. You can have several disks if you want, so you can increase your storage to massive amounts. Hard disk memory is relatively cheap:
A gigabyte (1,000,000,000 characters) of disk costs roughly the same as 8 megabytes (8,000,000 characters) of memory, and more than one hundred times cheaper is worth worrying about!
When you run lots of programs, they all must share RAM, which is limited. The more you can keep on disk, the better the PC can run your programs and efficiently share memory between them. If you can keep most of your data on disk, there is less for the
PC to hold in RAM. If you need lots of data, you can always throw some data away out of RAM and get it back from disk a bit later when you need it again.
Disks are not as limited as RAM for memory. You can store vast amounts of data and it will still be there after you have switched off the PC.
There is no Stop and Type section due to the textual nature of this section.
Sequential data is read from start to finish in order, but you can read random files in any order you like.
You can access files in two ways: using sequential access or random access. What your application needs to do determines the method you will use. The access mode of a file determines how you are allowed to read, change, and delete data from a file. Most
files can be accessed in both ways as long as the data lends itself to both kinds of access.
Sequential files have to be accessed in the same order as they were written. This is like cassette tapes: You play music in the same order as it was recorded. You can skip forward or backward, ignoring the music, but the order of the songs on the tape
dictates how you have to wind through the tape. You can't insert new songs in between songs already on the tape. The only way to add or remove data is to copy the data to a new file.
It might seem that sequential files are limiting, but their ease of processing means that they can be used effectively in many applications. In fact, in C++ the screen and the keyboard are treated as sequential files because that is an easy model to
process.
You can access random-access files in any order you like. In some ways, they are much more like a compact disc or an old-fashioned record because you can hop around playing parts of a track, switching to another part nearly instantaneously. The order of
play is not limited to the order of recording. Random-access files take more effort to program, but the result can be a more flexible system.
The type of file you use, sequential or random, depends on your application's requirements. Not all programs need random-access files.
There are two basic operations you can do to a sequential file: read it and write it. A special form of writing is adding to an existing file.
Only three operations are possible with sequential files.
If you are making a file for the first time, you first create the file, and then you add the data to it. Suppose you want to create a customer data file. You create a new file and write your current customers to that file. The data might be in arrays or
structures, or in lots of variables. Over time, as your customer list grows, you could add them to the file. When you add to the end of a file, it is called appending to the file. As the customers contact you, you could read the file to look for
information about the customer.
A sequential file is like that music cassette. You can't easily change the music in the middle of the tape. Likewise, not all applications are good for sequential processing. If you need to change the customer's details, you need to copy the whole file.
Similarly, if you want to delete a customer, you need to copy the whole file but skip over that customer. Unlike a music cassette, because the data is digital, you can copy a file over and over and it is still as good as the original.
The primary approach to sequential file updating is to create a new file from the old one. Don't worry about updating a file directly until you review random-access files later in this section.
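The copy-and-skip approach can be sketched as follows. This is only an illustration, written with the standard <fstream> header rather than fstream.h; the function name copyWithoutLine and the one-customer-per-line file layout are assumptions, not something this unit defines.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Copies every line of oldName into newName except lines equal to skip.
// Returns the number of lines copied, or -1 if either file failed to open.
int copyWithoutLine(const std::string& oldName,
                    const std::string& newName,
                    const std::string& skip)
{
    assert(oldName != newName);      // the update must go to a different file

    std::ifstream in(oldName.c_str());
    std::ofstream out(newName.c_str());
    if (!in || !out)
        return -1;

    int copied = 0;
    std::string line;
    while (std::getline(in, line))   // read the old file from front to back
    {
        if (line == skip)
            continue;                // "delete" the customer by not copying
        out << line << '\n';
        ++copied;
    }
    return copied;
}
```

Calling copyWithoutLine("old.txt", "new.txt", "Bob") leaves old.txt untouched and produces new.txt without Bob's line. Changing a customer works the same way, except that you write the altered line instead of skipping it.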
Programs must open files before they can access data. After your program has accessed the data, it should close the files.
Before you can use a file, you must open it. This is like opening a book before reading it. When you are finished with the book, you close it again. You must also close a file when you have finished with it.
When you open a file, you must tell Visual C++ what the file is called, regardless of whether you are reading it or creating it. Visual C++ uses the operating system to help it prepare the file for use, making sure that there is an entry on the disk's
index of files and a space for the file to occupy. When a file is closed, any remaining data is written to disk and the file information is updated.
File accessing in C++ is done with streams. To make a stream, you make an object. There are two ways to make a stream. You can make a stream without a name and then open it, or you can open a stream with a filename.
There are two kinds of sequential file streams: input file streams (or ifstream), and output file streams (or ofstream). The file streams live in the fstream.h header. You don't need to worry too much about the inner workings of streams. You use various
member functions to find out about them, such as eof() to find whether you are at the end of an input file and is_open() to find whether you successfully opened the file. Here is a statement to open file vc.txt for input:
ifstream input("vc.txt");
Here is a snippet of code to open a file for output and check that it was opened correctly:
ofstream output("vcout.txt");
if (output.is_open())
  {
    // The file opened correctly, so it is safe to write to output here
  }
Files can be opened to work in different ways. The default file type is a text file. You can make the file a binary file. If you write a number to a text file, it appears as it would if you wrote it to cout. In fact, cout is a special case of a text
file, so anything you write to a screen will appear in the same way in a text file. If you open a binary file such as
ofstream output("binary.out",ios::binary);
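Visual C++ reads and writes the characters without translation (in text mode, each newline is converted to and from a carriage return/line feed pair on disk). The << operator still converts numbers to characters in either mode; to store a number in binary form, you use write(), which copies the bytes exactly as they are held in memory. The binary format is more compact, but it is specific to the program. You can't guarantee that other programs can read your data. It mainly differs in the handling of numbers. If you write a number to a text file with <<, it gets converted to characters. With write(), Visual C++ writes the bytes of data that represent the number (such as two bytes for an integer) instead of five characters for 12345.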
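To see the size difference for yourself, you could try something like the following sketch. It assumes the standard <fstream> header; the helper names fileSize and writeBothWays are made up for this example. The text file holds the five digit characters, and the binary file holds however many bytes sizeof(int) is on your compiler.

```cpp
#include <cassert>
#include <fstream>

// Returns the size in bytes of the named file, or -1 if it cannot be opened.
long fileSize(const char* name)
{
    std::ifstream in(name, std::ios::binary | std::ios::ate);  // start at the end
    if (!in)
        return -1;
    return static_cast<long>(in.tellg());
}

// Writes value as text digits to textName and as raw bytes to binName.
void writeBothWays(int value, const char* textName, const char* binName)
{
    std::ofstream text(textName);
    text << value;                          // formatted: one character per digit

    std::ofstream bin(binName, std::ios::binary);
    bin.write(reinterpret_cast<const char*>(&value), sizeof value);  // raw bytes
}   // both streams close here
```

After writeBothWays(12345, "num.txt", "num.bin"), fileSize("num.txt") reports 5, and fileSize("num.bin") reports sizeof(int).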
The second parameter of the stream constructor or open member function sets the access mode of the file. Table 24.1 lists them.
Mode | Description |
ios::app | Opens a file for appending (adding to) |
ios::ate | Seeks to end of file on opening it |
ios::in | Opens a file for reading |
ios::out | Opens a file for writing |
ios::binary | Opens a file in binary mode |
ios::trunc | Discards the contents if the file exists |
ios::nocreate | If file does not exist, open fails |
ios::noreplace | If file exists, open fails unless appending or seeking to the end of file on opening. |
To combine these flags (in order to have a binary file with no create), you need to use the bitwise OR operator (|). Here's an example:
ifstream input("infile.bin",ios::nocreate | ios::binary);
You can combine as many flags as you like in this way. If you want to know more about bitwise operators, refer to the bonus chapter, "Advancing with Bitwise Operators," on the disk.
When you try to open a file, you should always check whether you have been successful by using is_open() or testing that the stream variable is nonzero. (A clever thing with C++ is that the class can be made to return different types of values.) When
the file is open, you never need to refer to the filename again.
Because the input and output is a stream, the same rules apply as for screen input. You use the >> and << operators for input and output respectively. When you read with >>, each value is delimited by whitespace. (The following program might
help you clarify what goes on with screen input.) Whitespace is not automatically added to output, so you must write a space or a newline between values yourself. (The fill() function sets only the padding character used to fill out a field width; it does not insert delimiters.) You can also use get (get a
single character), put (put a single character), getline (get a whole line), and write (put a block of characters) to more carefully control the characters read and written.
Often, you develop a class that owns a file, normally as a data member. Rather than performing low-level reading and writing, you will develop functions to write out structures of data. To successfully use stream input, you should know what has been
written. If you can predict the data types on the input stream (they must have been written in a particular order), you can read into the correct data type as you go. If you don't know what is coming, normally you will read the input into a character
string using getline (or read for binary data) and then examine it, perhaps using the library functions of the previous unit.
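Here is a small sketch of reading data back in the order it was written. The Customer structure and the two function names are hypothetical, and the code uses the standard <fstream> header. Notice that the output statement has to write the separating spaces itself.

```cpp
#include <cassert>
#include <fstream>
#include <string>

struct Customer           // hypothetical record layout for illustration
{
    std::string name;     // one word, no embedded spaces
    int         age;
    double      balance;
};

// Writes the fields in a fixed order, separated by whitespace.
bool saveCustomer(const char* file, const Customer& c)
{
    std::ofstream out(file);
    out << c.name << ' ' << c.age << ' ' << c.balance << '\n';
    return out.good();
}

// Reads the fields back in exactly the order they were written.
bool loadCustomer(const char* file, Customer& c)
{
    std::ifstream in(file);
    in >> c.name >> c.age >> c.balance;
    return !in.fail();
}
```

If loadCustomer() read the age before the name, the extraction would fail because the first item in the file is not a number. That is why the reader must know the order the writer used.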
There is not a function that checks for a file's existence. To do this, open a file for input with the ios::nocreate flag set. This will fail if the file does not exist. If the file does exist, you can immediately close the file again.
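If your compiler uses the standard <fstream> header, ios::nocreate is not available, but it is also not needed: opening an ifstream never creates a file. A minimal sketch under that assumption (fileExists is a made-up name) looks like this:

```cpp
#include <cassert>
#include <fstream>

// Returns true if the named file can be opened for reading.
// Opening a std::ifstream never creates a file, so a failure here means
// the file is missing (or cannot be read).
bool fileExists(const char* name)
{
    std::ifstream probe(name);
    return probe.is_open();
}   // probe closes itself here
```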
Files should be closed as soon as you have finished using them. A file is closed automatically by the stream destructor (when you use delete or when the object goes out of local scope), or you can close it explicitly by calling close():
output.close();
Closing does more than just close the file. It first ensures that any data loitering about in the program is written out too.
You can use as many files as you like in your program. (That is not strictly true. Different operating systems have limits on the number of files they can handle at once.)
Listing 24.1 shows a program that reads itself onto the screen.
1:  // File name: PROGREAD.CPP
2:  // A simple text reading program
3:  //
4:  #include <fstream.h>
5:  void main()
6:    {
7:      char buffer[255];                    // A place to read into
8:      char inputFile[]="PROGREAD.CPP";     // The file name
9:
10:     ifstream input(inputFile);           // Declare an input stream
11:
12:     if (!input)                          // Has it opened OK?
13:       cout << "Error in file name";      // Oops!
14:     else
15:       {                                  // Ok
16:         cout << "File: " << inputFile << endl;
17:         while (!input.eof())             // While more data
18:           {
19:             input.getline(buffer,255);   // Get line of input
20:             cout << buffer << endl;      // Output input(!)
21:           }
22:       }
23:   }   // input closes here due to scope & destructor
Output
File: PROGREAD.CPP
// File name: PROGREAD.CPP
// A simple text reading program
//
#include <fstream.h>
void main()
  {
    char buffer[255];                    // A place to read into
    char inputFile[]="PROGREAD.CPP";     // The file name

    ifstream input(inputFile);           // Declare an input stream

    if (!input)                          // Has it opened OK?
      cout << "Error in file name";      // Oops!
    else
      {                                  // Ok
        cout << "File: " << inputFile << endl;
        while (!input.eof())             // While more data
          {
            input.getline(buffer,255);   // Get line of input
            cout << buffer << endl;      // Output input(!)
          }
      }
  }   // input closes here due to scope & destructor
Analysis
Aside from being a quick way to fill a few pages of this book (just joking, honest!), this shows how easy file handling is now that you're used to screen output.
In line 10, the program finds the file and opens it in one simple statement. In line 12, the file variable is tested to see whether the file opened correctly. If the file did open, you want to output the contents to the screen line by line. Text files
have the same concept of lines that the screen has. Therefore, you can simply read a whole line (not to exceed 254 characters, because getline() keeps one element of the 255-character array for the terminating null) into a character string, and then immediately output it again. This demonstrates an important principle. This program
could handle a file of thousands of lines, but it only needs the RAM to hold a few work variables and 255 characters of the file at a time.
To specify a DOS path name, you need to use the double backslash, because \ is a special character to C++. So, to print AUTOEXEC.BAT from any directory or drive would require:
ifstream autoexec("C:\\AUTOEXEC.BAT");  // auto is a reserved word, so it can't be a variable name
Line 17 contains a very significant statement. The while test uses the function eof() to find out whether there is any more file to be read. Inside the stream processing, when C++ reads the file and finds that there is no more data to be read, it sets
the eof status, which is one of several status flags maintained for a stream. By using the eof() function, the program can retrieve the setting of this status to decide when to stop. After a status is set, it is not automatically unset. Therefore, if
further processing is required on the file, clear() needs to be called in order to reset the status. Another test you can make is to check the value of bad(), which returns a nonzero value if there has been a severe I/O error (such as removing a floppy
disk from the drive halfway through an executing program).
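The need for clear() is easiest to see when you read the same file twice through one stream. The following is a sketch only, using the standard <fstream> header; countLines and secondPassLineCount are invented names. It also tests the result of getline() directly instead of calling eof(), which is a common alternative way to end the loop.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Counts the lines in a stream by reading until no more lines can be read.
int countLines(std::istream& in)
{
    int n = 0;
    std::string line;
    while (std::getline(in, line))
        ++n;
    return n;
}

// Reads the file once, resets the stream, and reads it again.
// Returns the line count from the second pass, or -1 if the open failed.
int secondPassLineCount(const char* name)
{
    std::ifstream in(name);
    if (!in)
        return -1;

    countLines(in);               // first pass leaves the eof status set
    in.clear();                   // reset the status flags...
    in.seekg(0, std::ios::beg);   // ...then go back to the first byte
    return countLines(in);        // second pass sees the whole file again
}
```

Without the call to clear(), the stream would still be in its failed state, the seekg() would do nothing, and the second pass would count no lines.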
It's a snap to write to an output file. You've been doing it for the whole book!
As was mentioned earlier, the commands you use with cout work just the same with a file stream, except the result ends up on the disk. You can overwrite an existing file or append to a file. You control this by setting the ios::app
flag for appending. The program in Listing 24.2 adds some data to a file every time it is run.
1:  // Filename: WR1.CPP
2:  // Writes names to a file
3:  #include <fstream.h>
4:
5:  void main()
6:  {                                    //Create or append to a file
7:    ofstream fp("C:\\NAMES.DAT", ios::app);
8:
9:    if (!fp)                           // Exit on error
10:     return;
11:   fp << "Kim Spencer" << endl;
12:   fp << "Michael Spencer" << endl;
13:   fp << "Derek Spencer" << endl;
14:   fp << "Jane Spencer" << endl;
15:   fp << "---------------" << endl;
16: }                                    // Close file automatically
Output
After three runs, the file looks like this:
Kim Spencer
Michael Spencer
Derek Spencer
Jane Spencer
---------------
Kim Spencer
Michael Spencer
Derek Spencer
Jane Spencer
---------------
Kim Spencer
Michael Spencer
Derek Spencer
Jane Spencer
---------------
Analysis
The file is opened in line 7. The flag tells it to append to an existing file; hence the output is tripled by running three times. (You can check the output by opening the file with the Visual C++ Workbench editor.)
Note that the only reason that the output has one name per line is that you wrote the newline (endl) character sequence into the file. When writing to files, keep in mind that you have to read the data later. You have to use "mirror image"
input functions to read data that you output to files.
You can read and write random-access files that give you the ability to move around in the file (appropriately) randomly.
Sequential file processing can be slow unless you read the entire file into arrays and process them in memory. As explained in the previous section, you have much more disk space than RAM, and some disk files won't fit into RAM because they are simply
bigger than the available memory. Therefore, you need a way to quickly read individual pieces of data from a file in any order and process them one at a time.
definition
A record is a structure stored in a disk file.
definition
A field is one or more values in a record that are analogous to a structure member.
You usually read and write files one record at a time. Records contain fields that you read and write to disk. Generally, you store data in structures and write the structures to disk, where they are then called records. When you read a record from
disk, you generally read that record into a structure variable and process it with your program.
Unlike with most programming languages, not all disk data for Visual C++ programs must be stored in record format. Typically, you write a stream of characters to a disk file and access that data either sequentially or randomly by reading it into
variables and structures.
Using random-access data in a file is simple. Think about the data files of a large credit card organization. When you make a purchase, the store calls the credit card company to receive authorization. Millions of names are in the credit card company's
files. There is no quick way that the company can read every record sequentially from the disk that comes before yours. Sequential files do not lend themselves to quick access. It is not feasible, in many situations, to look up individual records in a data
file with sequential access.
Use random file access when you must instruct the program to go directly to your record, just as you go directly to a song on a compact disc or a record album. The functions that you use are different from the sequential functions, but the power that
results from learning the added functions is worth the effort.
Your random-access file is like a big array on the disk. You know that with arrays you can add, print, or remove values in any order. You do not have to start with the first array element, sequentially looking at the next one until you get the element
you need. You can view your random-access file in the same way, by accessing the data in any order.
definition
A fixed-length record file contains records that are all the same length.
Often, random-access files contain fixed-length records. Each record (usually a row in the file) takes the same amount of disk space. Most of the sequential files you read and wrote in the previous unit were variable-length records.
When you are reading or writing sequentially, there is no need for fixed-length records because you input each value one character, word, string, or number at a time, and look for the data you want. With fixed-length records, your computer can better
calculate where on the disk the desired record is located.
Although you waste some disk space with fixed-length records (because of the spaces that pad some of the fields), the advantages of random file access compensate for the "wasted" disk space (when the data does not actually fill the structure
size).
Random-access files enable you to read or write records in any order. Even if you want to perform sequential reading or writing of the file, you can use random-access processing and randomly read or write the file sequentially from the first record to the last.
Because all variables of the same structure are the same size, when you read and write structure variables, you are reading and writing fixed-length data values. Working with fixed-length (all the same size) data values gives you the capability to move
around in the file, reading and writing any structure in the file. If structure variables (derived from the same structure definition) were variable-length, you would never know how far to move in the file when you wanted to skip ahead five records.
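The arithmetic behind fixed-length records is simple enough to sketch. The Executive structure below is hypothetical, and records are counted from 0 here so that the first record sits at byte 0.

```cpp
#include <cassert>

struct Executive          // hypothetical fixed-length record
{
    char name[30];        // unused bytes are wasted when the name is short
    int  yearsOfService;
};

// Byte offset of record number index (0 = first record in the file).
long recordOffset(long index)
{
    assert(index >= 0);
    return index * static_cast<long>(sizeof(Executive));
}

// Number of whole records in a file that is fileBytes long.
long recordCount(long fileBytes)
{
    assert(fileBytes >= 0);
    return fileBytes / static_cast<long>(sizeof(Executive));
}
```

Skipping ahead five records means moving forward by recordOffset(5) bytes. That calculation works only because every Executive occupies exactly sizeof(Executive) bytes.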
Files contain records, which are usually structure variables stored back to back in the file.
There is no Stop and Type section here due to this section's textual nature.
The open function's access mode tells Visual C++ that you want to access a file randomly.
As with sequential files, you must open random-access files before reading or writing to them. You can use any of the read access modes if you are only going to read a file randomly. To update a file, you should use the fstream stream. This insists on a
mode (from Table 24.1). To update, you would open the file like this:
fstream update("update.txt",ios::in | ios::out);
The difference between random-access and sequential files is not physical, but lies in the method that you use to access them and update them. Suppose that you want to write a program to create a file of the names of your company's top five executives.
The following open() function call suffices, assuming that fp is declared as an fstream object:
fp.open("EXECS.DAT", ios::out);
if (!fp)
  {
    cout << "*** Cannot open file ***";
    return;    // Exits the program if an error occurs
  }
No update open() access mode is needed if you are only creating the file. The !fp check ensures that the file opened properly. If an error occurred (such as an open disk drive door for a floppy disk file) and the file did not open properly, the program
could issue an error message.
However, what if you wanted to create the file, write names to it, and give the user a chance to change any of the names before closing the file? You then would have to open the file like this:
fp.open("EXECS.DAT", ios::in | ios::out);
if (!fp)
  {
    cout << "*** Cannot open file ***";
    return;    // Exits the program if an error occurs
  }
This code enables you to create the file, and then change data you wrote to the file. The | symbol is the bitwise OR operator that you saw earlier. Use the bitwise OR operator between two or more of the modes you want Visual C++ to use for open().
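One caution if you move this code to a compiler that uses the standard <fstream> header: there, opening with ios::in | ios::out fails when the file does not already exist. A possible workaround, sketched below with the invented name openForUpdate, is to fall back to adding ios::trunc, which creates an empty file.

```cpp
#include <cassert>
#include <fstream>

// Opens file for both reading and writing.
// An existing file keeps its contents; a missing file is created empty.
bool openForUpdate(std::fstream& fs, const char* file)
{
    fs.open(file, std::ios::in | std::ios::out);
    if (!fs.is_open())
    {
        fs.clear();   // forget the failed attempt before trying again
        fs.open(file, std::ios::in | std::ios::out | std::ios::trunc);
    }
    return fs.is_open();
}
```

Trying the plain update mode first matters. If you always added ios::trunc, you would wipe out the existing data every time you opened the file.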
As with sequential files, the only difference between using a binary open() access mode and a text mode is that the file you create is more compact and saves disk space.
Remember that you can't open a read-only file for output. You get an error if you try to open a CD-ROM file with ios::out. Files on hard disks or diskettes can be set to read only too.
You can move the random file pointer with seekg().
When you read and write disk files, Visual C++ keeps track of a file pointer. When you first open the file for reading or writing, the file pointer points to the very first byte in the file. When you open the file for appending, the file pointer points
to the end of the file (so that you can write to the end of it). You do not always want to begin reading or writing at the location of the file pointer. (If you did, you probably would be using a sequential mode.)
Visual C++ provides the seekg() function, which enables you to move the file pointer to a specific point in a random-access data file. The format of seekg() is
filestream.seekg(longNum, origin);
filestream is the stream object that you want to access, initialized with an open() statement or the constructor. longNum is the number of bytes you want to skip in the file. Visual C++ does not read this many bytes, but it literally skips the data by
the number of bytes specified in longNum. Skipping the bytes on the disk is much faster than reading them. If longNum is negative, Visual C++ skips backward in the file (which allows for rereading of data several times). Because data files can be large,
you must declare longNum as a long integer to hold a large number of bytes.
origin is a value that tells Visual C++ where to begin the skipping of bytes specified by longNum. origin can be any of the three values shown in Table 24.2.
Visual C++ Name | Description |
ios::beg | Beginning of file |
ios::cur | Current file position |
ios::end | End of file |
The names ios::beg, ios::cur, and ios::end are defined in the FSTREAM.H header file.
No matter how far into a file you have read, the following seekg() function positions the file pointer at the beginning of a file:
fp.seekg(0L, ios::beg); // Positions file pointer at beginning
The constant 0L passes a long integer 0 to the seekg() function. Without the L, C++ passes a regular integer. This does not match the prototype for seekg() that is located in FSTREAM.H. This seekg() function literally reads "Move the file pointer 0
bytes from the beginning of the file."
The following seekg() function positions the file pointer at the 30th byte from the end of the file:
filePtr.seekg(-30L, ios::end);   // Positions file pointer at the
                                 // 30th byte from the end
This seekg() function literally reads "Move the file pointer 30 bytes from the end of the file."
If you write structures to a file, you can quickly seek any structure in the file using the sizeof operator. Suppose that you want the 123rd occurrence of the structure named Inventory. Because the first structure starts at byte 0, you skip the 122 structures that precede the one you want, using the following seekg() function:
filestream.seekg((122L * sizeof(Inventory)), ios::beg);
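Seeking to a record and then reading it can be wrapped in one small function. This is a sketch under assumptions: the layout of Inventory is invented (this unit never shows it), the function name readNthRecord is made up, and the code uses the standard <fstream> header. Records are numbered from 1, so record n starts (n - 1) records into the file.

```cpp
#include <cassert>
#include <fstream>

struct Inventory              // hypothetical layout for illustration
{
    char partName[20];
    int  quantity;
};

// Reads record number n (1 = first record) into item.
// Returns false if the seek failed or the file is too short.
bool readNthRecord(std::istream& fs, long n, Inventory& item)
{
    assert(n >= 1);
    fs.seekg((n - 1) * static_cast<long>(sizeof(Inventory)), std::ios::beg);
    fs.read(reinterpret_cast<char*>(&item), sizeof(Inventory));
    return !fs.fail();
}
```

The file should be opened in binary mode so that byte positions in the file match the byte counts you calculate.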
You'll see a demonstration of a structure read in the Project 12 program.
I use the term structure carefully. The structure should not have any pointers to other data types because these would not be saved. Also you can't simply create a class by copying data directly into it. The hidden internals will not be set.
To point to the end of a data file, you can use the seekg() function to position the file pointer at the last byte. Subsequent seekg()s should then use a negative longNum value to skip backwards in the file. The following seekg() function makes the file
pointer point to the end of the file:
filePtr.seekg(0L, ios::end);     // Positions file
                                 // pointer at the end
This seekg() function literally reads "Move the file pointer 0 bytes from the end of the file." The file pointer now points to the end-of-file marker, but you can seekg() backwards to find other data in the file.
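Seeking backward from the end can be sketched like this. The function name byteFromEnd is invented, and the code assumes the standard <fstream> header, where tellg() reports the current position of the file pointer.

```cpp
#include <cassert>
#include <fstream>

// Returns the byte that lies n positions before the end of the file
// (n = 1 is the last byte), or '\0' if the file is missing or too short.
char byteFromEnd(const char* file, long n)
{
    assert(n >= 1);
    std::ifstream in(file, std::ios::binary);

    in.seekg(0, std::ios::end);           // pointer now sits after the last byte
    std::streamoff size = in.tellg();     // so its position is the file size
    if (!in || size < n)
        return '\0';

    in.seekg(-n, std::ios::end);          // step back n bytes from the end
    char ch = '\0';
    in.get(ch);
    return ch;
}
```

With a file holding the 26 letters A through Z, byteFromEnd(file, 1) returns Z and byteFromEnd(file, 26) returns A.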
Listing 24.3 contains a program that writes data to a new file and then reads randomly from that file. The program writes the letters of the alphabet to a file called ALPH.TXT. The seekg() function is then used to read and display
the ninth and seventeenth letters (I and Q).
Use seekg() to position the file pointer at whatever location you want to read or write next.
1:  // Filename: ALPH.CPP
2:  // Stores the alphabet in a file, then reads
3:  // two letters from it
4:
5:  #include <fstream.h>
6:  #include <stdlib.h>
7:
8:
9:  void main()
10: {
11:   char ch;    // Holds A through Z
12:
13:   // Opens in update mode so that you can
14:   // read file after writing to it
15:   fstream fp("alph.txt", ios::in | ios::out);
16:   if (!fp)
17:   {
18:     cout << endl << "*** Error opening file ***";
19:     return;
20:   }
21:
22:   for (ch = 'A'; ch <= 'Z'; ch++)
23:     fp << ch;    // Writes letters
24:
25:   fp.seekg(8L, ios::beg);    // Skips eight letters, points to I
26:   fp >> ch;
27:   cout << "The first character is " << ch << endl;
28:
29:   fp.seekg(16L, ios::beg);   // Skips 16 letters, points to Q
30:   fp >> ch;
31:   cout << "The second character is " << ch << endl;
32: }
Output
The first character is I
The second character is Q
Analysis
Line 15's fstream constructor opens the file named ALPH.TXT for both input and output. The if in line 16 ensures that the open worked properly.
The for loop in lines 22 and 23 writes the letters A through Z to the open file. Immediately after writing the data, line 25 repositions the file pointer so that the pointer points to the ninth letter, I. Line 26 then reads a character from the file, at
the file pointer's location, so that the I is read into ch.
Line 29 positions the file pointer to the Q, and line 30 reads the Q into ch. The file closes at the program's termination with the destruction of the fstream object.
The preceding program forms the basis of a more complete data file management program. After you master the seekg() functions and become more familiar with disk data files, you'll begin to write programs that store more advanced data structures and
access them.
There are lots of helpful file I/O functions that you can put in your programming bag of tricks.
Several more disk I/O functions that you might find useful are available. They are mentioned here for completeness. As you perform more powerful disk I/O, you might find a use for many of these functions.
The following function is part of a larger program that receives inventory data in an array of structures from the user. This function is passed the structure. The write() function then writes the structure to the file stream fs.
void WriteStructure(Inventory& item)   // reference for efficiency
{
  fs.write((char *)&item, sizeof(Inventory));
}
The (char *) typecast operator is required by write(). You must typecast the data being written to a string pointed to by a character pointer as shown here.
write() and its mirror image, read(), are extremely powerful. If you have an array of 1,000 Inventory elements, a single write() call that passes the array's address and the size of the whole array (1,000 * sizeof(Inventory) bytes) writes the entire array to the disk file! Likewise, you could use the read() function to read an entire array of structures from the disk in a single function call.
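Writing and reading a whole array at once might look like the following sketch. The Inventory layout, the constant kCount, and the function names saveAll and loadAll are all assumptions made for the example, and the code uses the standard <fstream> header. The functions take the array by reference so that sizeof(items) is the size of the entire array, not of one element.

```cpp
#include <cassert>
#include <fstream>

struct Inventory          // hypothetical layout; plain data only, no pointers
{
    char partName[20];
    int  quantity;
};

const int kCount = 1000;  // number of records in the example array

// Writes all kCount records with a single write() call.
bool saveAll(const char* file, const Inventory (&items)[kCount])
{
    std::ofstream out(file, std::ios::binary);
    out.write(reinterpret_cast<const char*>(items), sizeof(items));
    out.close();          // flush to disk so that any error shows up now
    return !out.fail();
}

// Reads all kCount records back with a single read() call.
bool loadAll(const char* file, Inventory (&items)[kCount])
{
    std::ifstream in(file, std::ios::binary);
    in.read(reinterpret_cast<char*>(items), sizeof(items));
    return !in.fail();
}
```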
Listing 24.4 contains a program that requests a filename from the user and erases the file from the disk using the remove() function.
The I/O functions help you work with data files from your program.
1:  // Filename: ERAS.CPP
2:  // Erases the file specified by the user
3:
4:  #include <stdio.h>
5:  #include <iostream.h>
6:
7:  void main()
8:  {
9:    char filename[255];
10:
11:   cout << "What is the filename you want me to erase? ";
12:   cin >> filename;
13:   if (remove(filename))
14:     cout << "*** I could not remove the file ***";
15:   else
16:     cout << "The file " << filename
17:          << " is now removed";
18: }
Output
What is the filename you want me to erase? c:\names.dat
The file c:\names.dat is now removed
Analysis
Line 9 defines a 255-element character array to hold the filename that the user wants to delete. The 255 elements give enough room for a pathname if needed. After asking for the name of the file, line 13 attempts to remove the file with the remove()
function and also checks the return value to see that the deletion worked.
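The return value of remove() is easy to get backward: it is 0 when the file was deleted and nonzero when it was not. The sketch below, which assumes the standard <cstdio> and <fstream> headers and uses the invented name createThenErase, makes a scratch file and then erases it.

```cpp
#include <cassert>
#include <cstdio>     // std::remove
#include <fstream>

// Creates an empty scratch file and then deletes it.
// Returns true only if the file existed before remove() and is gone after.
bool createThenErase(const char* name)
{
    { std::ofstream scratch(name); }          // create (or empty) the file
    bool existedBefore = std::ifstream(name).is_open();

    int result = std::remove(name);           // 0 means the file was removed
    bool goneAfter = !std::ifstream(name).is_open();

    return existedBefore && result == 0 && goneAfter;
}
```

Each stream in this function is closed before remove() is called. Some operating systems refuse to delete a file that is still open.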
There is no What's the Output? section for this unit.
ifstream input(fileName);
if (!fp)
  {
    "Disaster! File not found!";
    return;
  }
while (input.get(inChar))
  cout >> inChar;
remove(fileIn);
// Filename: CHANGBUG.CPP
// Stores the alphabet in a file, reads two letters from it,
// and changes those letters to x
#include <fstream.h>
void main()
{
  char ch;    // Holds A through Z

  // Opens in update mode so that you can
  // read file after writing to it
  fstream fp("alph.txt", ios::in | ios::out);
  if (!fp)
  {
    cout << "*** Error opening file ***";
    return;
  }

  for (ch = 'A'; ch <= 'Z'; ch++)
    { fp << ch; }    // Writes letters

  fp.seekg(8L, ios::beg);    // Skips eight letters, points to I
  fp >> ch;
  // Changes the I to an x
  fp.seekg(-2L, ios::cur);
  fp << 'x';
  cout << "The first character is " << ch << endl;

  fp.seekg(16L, ios::beg);   // Skips 16 letters, points to Q
  fp >> ch;
  cout << "The second character is " << ch << endl;
  // Changes the Q to an x
  fp.seekg(-2L, ios::cur);
  fp << 'x';
}