Objectives
In this chapter you’ll learn:
• To create, read, write and update files.
• To use class File to retrieve information about files and directories.
• The Java input/output stream class hierarchy.
• The differences between text files and binary files.
• Sequential-access file processing.
• To use classes Scanner and Formatter to process text files.
• To use the FileInputStream and FileOutputStream classes.
• To use a JFileChooser dialog.
• To use the ObjectInputStream and ObjectOutputStream classes.
I can only assume that a “Do Not File” document is filed in a “Do Not File” file.
—Senator Frank Church Senate Intelligence Subcommittee Hearing, 1975
Consciousness ... does not appear to itself chopped up in bits.... A “river” or a “stream” are the metaphors by which it is most naturally described.
—William James
I read part of it all the way through.
—Samuel Goldwyn
A great memory does not make a philosopher, any more than a dictionary can be called grammar.
—John Henry, Cardinal Newman
Outline
14.1 Introduction
14.2 Data Hierarchy
14.3 Files and Streams
14.4 Class File
14.5 Sequential-Access Text Files
14.5.1 Creating a Sequential-Access Text File
14.5.2 Reading Data from a Sequential-Access Text File
14.5.3 Case Study: A Credit-Inquiry Program
14.5.4 Updating Sequential-Access Files
14.6 Object Serialization
14.6.1 Creating a Sequential-Access File Using Object Serialization
14.6.2 Reading and Deserializing Data from a Sequential-Access File
14.7 Additional java.io Classes
14.8 Opening Files with JFileChooser
14.9 Wrap-Up
Storage of data in variables and arrays is temporary—the data is lost when a local variable goes out of scope or when the program terminates. Computers use files for long-term retention of large amounts of data, even after the programs that created the data terminate. You use files every day for tasks such as writing an essay or creating a spreadsheet. We refer to data maintained in files as persistent data because it exists beyond the duration of program execution. Computers store files on secondary storage devices such as hard disks, optical disks and magnetic tapes. In this chapter, we explain how Java programs create, update and process files.
File processing is one of the most important capabilities a language must have to support commercial applications, which typically store and process massive amounts of persistent data. In this chapter, we discuss Java’s powerful file-processing and stream input/output features. The term “stream” refers to ordered data that is read from or written to a file. We discuss streams in more detail in Section 14.3. File processing is a subset of Java’s stream-processing capabilities, which enable a program to read and write data in memory, in files and over network connections. We have two goals in this chapter—to introduce file-processing concepts (making the reader more comfortable with using files programmatically) and to provide the reader with sufficient stream-processing capabilities to support the networking features introduced in Chapter 19, Networking. Java provides substantial stream-processing capabilities—far more than we can cover in one chapter. We discuss two forms of file processing here—text-file processing and object serialization.
We begin by discussing the hierarchy of data contained in files. We then cover Java’s architecture for handling files programmatically by discussing several classes in package java.io. Next we explain that data can be stored in two different types of files (text files and binary files) and cover the differences between them. We demonstrate retrieving information about a file or directory using class File and then devote several sections to the different mechanisms for writing data to and reading data from files. First we demonstrate creating and manipulating sequential-access text files. Working with text files allows the reader to quickly and easily start manipulating files. As you’ll learn, however, it is difficult to read data from text files back into object form. Fortunately, many object-oriented languages (including Java) provide ways to write objects to and read objects from files (known as object serialization and deserialization). To demonstrate this, we recreate some of the sequential-access programs that used text files, this time by storing objects in binary files.
Ultimately, a computer processes all data items as combinations of zeros and ones, because it is simple and economical for engineers to build electronic devices that can assume two stable states, one representing 0 and the other representing 1. It is remarkable that the impressive functions performed by computers involve only the most fundamental manipulations of 0s and 1s.
The smallest data item in a computer can assume the value 0 or 1. Such a data item is called a bit (short for “binary digit,” a digit that can assume one of two values). Computer circuitry performs various simple bit manipulations, such as examining the value of a bit, setting the value of a bit and reversing the value of a bit (from 1 to 0 or from 0 to 1).
It is cumbersome for programmers to work with data in the low-level form of bits. Instead, they prefer to work with data in such forms as decimal digits (0–9), letters (A–Z and a–z), and special symbols (e.g., $, @, %, &, *, (, ), –, +, ", :, ? and /). Digits, letters and special symbols are known as characters. The computer’s character set is the set of all the characters used to write programs and represent data items. Computers process only 1s and 0s, so a computer’s character set represents every character as a pattern of 1s and 0s. Characters in Java are Unicode characters composed of two bytes, each composed of eight bits. Java contains a data type, byte, that can be used to represent byte data. The Unicode character set contains characters for many of the world’s languages. See Appendix B, ASCII Character Set, for more information on the ASCII (American Standard Code for Information Interchange) character set, a subset of the Unicode character set that represents uppercase and lowercase letters, digits and various common special characters.
Just as characters are composed of bits, fields are composed of characters or bytes. A field is a group of characters or bytes that conveys meaning. For example, a field consisting of uppercase and lowercase letters can be used to represent a person’s name.
Data items processed by computers form a data hierarchy that becomes larger and more complex in structure as we progress from bits to characters to fields, and so on.
Typically, several fields compose a record (implemented as a class in Java). In a payroll system, for example, the record for an employee might consist of the following fields (possible types for these fields are shown in parentheses):
• Employee identification number (int)
• Name (String)
• Address (String)
• Hourly pay rate (double)
• Number of exemptions claimed (int)
• Year-to-date earnings (int or double)
• Amount of taxes withheld (int or double)
Thus, a record is a group of related fields. In the preceding example, all the fields belong to the same employee. Of course, a company might have many employees and thus have a payroll record for each employee. A file is a group of related records. [Note: More generally, a file contains arbitrary data in arbitrary formats. In some operating systems, a file is viewed as nothing more than a collection of bytes—any organization of the bytes in a file (e.g., organizing the data into records) is a view created by the applications programmer.] A company’s payroll file normally contains one record for each employee. Thus, a payroll file for a small company might contain only 22 records, whereas one for a large company might contain 100,000 records. It is not unusual for a company to have many files, some containing billions, or even trillions, of characters of information. Figure 14.1 illustrates a portion of the data hierarchy.
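The payroll record described above could be sketched as a small Java class. This is only an illustration: the class name PayrollRecord, its member names and the sample values are hypothetical and do not come from the chapter's figures.

```java
// A minimal sketch of a payroll record; PayrollRecord and its members are hypothetical.
class PayrollRecord {
    private final int employeeId;            // would serve as the record key
    private final String name;
    private final String address;
    private final double hourlyPayRate;
    private final int exemptionsClaimed;
    private final double yearToDateEarnings;
    private final double taxesWithheld;

    PayrollRecord(int employeeId, String name, String address, double hourlyPayRate,
                  int exemptionsClaimed, double yearToDateEarnings, double taxesWithheld) {
        this.employeeId = employeeId;
        this.name = name;
        this.address = address;
        this.hourlyPayRate = hourlyPayRate;
        this.exemptionsClaimed = exemptionsClaimed;
        this.yearToDateEarnings = yearToDateEarnings;
        this.taxesWithheld = taxesWithheld;
    }

    int getEmployeeId() { return employeeId; }

    String getName() { return name; }

    public static void main(String[] args) {
        // Made-up sample values, used only to show one record being created.
        PayrollRecord record = new PayrollRecord(
            1001, "Sarah Miller", "12 Main Street", 22.50, 2, 18000.00, 2700.00);
        System.out.println(record.getEmployeeId() + " " + record.getName());
    }
}
```

A payroll file would then hold one such record per employee.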
Fig. 14.1. Data hierarchy.
To facilitate the retrieval of specific records from a file, at least one field in each record is chosen as a record key. A record key identifies a record as belonging to a particular person or entity and is unique to each record. This field typically is used to search and sort records. In the payroll record described previously, the employee identification number normally would be chosen as the record key.
There are many ways to organize records in a file. The most common is called a sequential file, in which records are stored in order by the record-key field. In a payroll file, records are placed in ascending order by employee identification number.
Most businesses store data in many different files. For example, companies might have payroll files, accounts receivable files (listing money due from clients), accounts payable files (listing money due to suppliers), inventory files (listing facts about all the items handled by the business) and many others. Often, a group of related files is called a database. A collection of programs designed to create and manage databases is called a database management system (DBMS). We discuss this topic in Chapter 20, Accessing Databases with JDBC.
Java views each file as a sequential stream of bytes (Fig. 14.2). Every operating system provides a mechanism to determine the end of a file, such as an end-of-file marker or a count of the total bytes in the file that is recorded in a system-maintained administrative data structure. A Java program processing a stream of bytes simply receives an indication from the operating system when it reaches the end of the stream—the program does not need to know how the underlying platform represents files or streams. In some cases, the end-of-file indication occurs as an exception. In other cases, the indication is a return value from a method invoked on a stream-processing object.
Fig. 14.2. Java’s view of a file of n bytes.
File streams can be used to input and output data as either characters or bytes. Streams that input and output bytes to files are known as byte-based streams, storing data in its binary format. Streams that input and output characters to files are known as character-based streams, storing data as a sequence of characters. For instance, if the value 5 were being stored using a byte-based stream, it would be stored in the binary format of the numeric value 5, or 101. If the value 5 were being stored using a character-based stream, it would be stored in the binary format of the character 5, or 00000000 00110101 (this is the binary for the numeric value 53, which indicates the character 5 in the Unicode character set). The difference between the numeric value 5 and the character 5 is that the numeric value can be used as an integer in calculations, whereas the character 5 is simply a character that can be used in a string of text, as in "Sarah Miller is 15 years old". Files that are created using byte-based streams are referred to as binary files, while files created using character-based streams are referred to as text files. Text files can be read by text editors, while binary files are read by a program that converts the data to a human-readable format.
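The difference between storing the number 5 and the character 5 can be sketched as follows. The class name FiveAsByteAndChar is hypothetical. Two choices here are assumptions made for illustration: the sketch writes to an in-memory ByteArrayOutputStream rather than a file, and it selects the UTF-16BE encoding explicitly so that the character occupies two bytes, as in the discussion above. A character stream that uses a different encoding may store the character differently.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: the value 5 written as a byte versus as a character.
class FiveAsByteAndChar {
    // Writes the numeric value 5 through a byte-based stream.
    static byte[] asByte() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(5);
        return out.toByteArray();
    }

    // Writes the character '5' through a character-based stream (UTF-16BE is an assumption).
    static byte[] asChar() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_16BE)) {
            writer.write('5');
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Integer.toBinaryString(asByte()[0]));  // binary form of the number 5
        System.out.println(Integer.toBinaryString(asChar()[1]));  // binary form of the character code 53
    }
}
```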
A Java program opens a file by creating an object and associating a stream of bytes or characters with it. The classes used to create these objects are discussed shortly. Java can also associate streams with different devices. In fact, Java creates three stream objects that are associated with devices when a Java program begins executing: System.in, System.out and System.err. Object System.in (the standard input stream object) normally enables a program to input bytes from the keyboard; object System.out (the standard output stream object) normally enables a program to output data to the screen; and object System.err (the standard error stream object) normally enables a program to output error messages to the screen. Each of these streams can be redirected. For System.in, this capability enables the program to read bytes from a different source. For System.out and System.err, this capability enables the output to be sent to a different location, such as a file on disk. Class System provides methods setIn, setOut and setErr to redirect the standard input, output and error streams, respectively.
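As a rough sketch of redirection, the following code uses System.setOut to send standard output to an in-memory buffer and then restores the original stream. The class name RedirectSketch and the method capture are hypothetical, and a file-based PrintStream could be substituted for the buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical sketch of redirecting System.out with System.setOut.
class RedirectSketch {
    // Runs the task with System.out redirected to a buffer and returns the captured text.
    static String capture(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        System.setOut(new PrintStream(buffer, true));
        try {
            task.run();
        } finally {
            System.out.flush();
            System.setOut(original); // always restore the standard output stream
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String text = capture(() -> System.out.print("hello"));
        System.out.println("Captured: " + text);
    }
}
```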
Java programs perform file processing by using classes from package java.io. This package includes definitions for stream classes, such as FileInputStream (for byte-based input from a file), FileOutputStream (for byte-based output to a file), FileReader (for character-based input from a file) and FileWriter (for character-based output to a file). Files are opened by creating objects of these stream classes, which inherit from classes InputStream, OutputStream, Reader and Writer, respectively (these classes will be discussed later in this chapter). Thus, the methods of these stream classes can all be applied to file streams as well.
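The four file stream classes named above can be exercised with a short sketch like the one below. The class name FileStreamsSketch and its method names are hypothetical. Note how the read methods signal the end of the stream with the return value -1, one of the end-of-file indications mentioned earlier.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

// Hypothetical sketch of byte-based and character-based file streams.
class FileStreamsSketch {
    // Writes one byte with FileOutputStream and reads it back with FileInputStream.
    static int byteRoundTrip(File file, int value) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(value);
        }
        try (FileInputStream in = new FileInputStream(file)) {
            return in.read();
        }
    }

    // Writes text with FileWriter and reads it back with FileReader.
    static String textRoundTrip(File file, String text) throws IOException {
        try (FileWriter writer = new FileWriter(file)) {
            writer.write(text);
        }
        StringBuilder result = new StringBuilder();
        try (FileReader reader = new FileReader(file)) {
            int c;
            while ((c = reader.read()) != -1) { // -1 indicates the end of the stream
                result.append((char) c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("streams", ".dat");
        file.deleteOnExit();
        System.out.println(byteRoundTrip(file, 65));
        System.out.println(textRoundTrip(file, "abc"));
    }
}
```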
Java contains classes that enable the programmer to perform input and output of objects or variables of primitive data types. The data will still be stored as bytes or characters behind the scenes, allowing the programmer to read or write data in the form of integers, strings, or other data types without having to worry about the details of converting such values to byte format. To perform such input and output, objects of classes ObjectInputStream and ObjectOutputStream can be used together with the byte-based file stream classes FileInputStream and FileOutputStream (these classes will be discussed in more detail shortly). The complete hierarchy of classes in package java.io can be viewed in the online documentation at
java.sun.com/javase/6/docs/api/java/io/package-tree.html
Each indentation level in the hierarchy indicates that the indented class extends the class under which it is indented. For example, class InputStream is a subclass of Object. Click a class’s name in the hierarchy to view the details of the class.
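A minimal sketch of pairing ObjectOutputStream and ObjectInputStream with the byte-based file streams follows. The class name ObjectStreamsSketch and the method roundTrip are hypothetical. The sketch writes an int and a String and reads them back in the same order.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical sketch: object streams layered on top of file streams.
class ObjectStreamsSketch {
    // Writes a primitive and an object, then reads both back in the same order.
    static String roundTrip(File file, int number, String text)
            throws IOException, ClassNotFoundException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeInt(number);
            out.writeObject(text);
        }
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            int n = in.readInt();
            String s = (String) in.readObject();
            return n + " " + s;
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("objects", ".ser");
        file.deleteOnExit();
        System.out.println(roundTrip(file, 42, "hi"));
    }
}
```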
As you can see in the hierarchy, Java offers many classes for performing input/output operations. We use several of these classes in this chapter to implement file-processing programs that create and manipulate sequential-access files. We also include a detailed example on class File, which is useful for obtaining information about files and directories. In Chapter 19, Networking, we use stream classes extensively to implement networking applications. Several other classes in the java.io package that we do not use in this chapter are discussed briefly in Section 14.7.
In addition to the classes in this package, character-based input and output can be performed with classes Scanner and Formatter. Class Scanner is used extensively to input data from the keyboard. As we’ll see, this class can also read data from a file. Class Formatter enables formatted data to be output to the screen or to a file in a manner similar to System.out.printf. Chapter 24, Formatted Output, presents the details of formatted output with System.out.printf. All these features can be used to format text files as well.
Class File
This section presents class File, which is particularly useful for retrieving information about files or directories from disk. Objects of class File do not open files or provide any file-processing capabilities. However, File objects are used frequently with objects of other java.io classes to specify files or directories to manipulate.
File Objects
Class File provides four constructors. The constructor
public File( String name )
specifies the name of a file or directory to associate with the File object. The name can contain path information as well as a file or directory name. A file or directory’s path specifies its location on disk. The path includes some or all of the directories leading to the file or directory. An absolute path contains all the directories, starting with the root directory, that lead to a specific file or directory. Every file or directory on a particular disk drive has the same root directory in its path. A relative path normally starts from the directory in which the application began executing, and is therefore a path that is “relative” to the current directory.
The constructor
public File( String pathToName, String name )
uses argument pathToName (an absolute or relative path) to locate the file or directory specified by name.
The constructor
public File( File directory, String name )
uses an existing File object directory (an absolute or relative path) to locate the file or directory specified by name. Figure 14.3 lists some common File methods. The complete list can be viewed at java.sun.com/javase/6/docs/api/java/io/File.html.
Fig. 14.3. File methods.
The fourth constructor,
public File( URI uri )
uses the given URI object to locate the file. A Uniform Resource Identifier (URI) is a more general form of the Uniform Resource Locators (URLs) that are used to locate websites. For example, http://www.deitel.com/ is the URL for the Deitel & Associates website. URIs for locating files vary across operating systems. On Windows platforms, the URI
file:/C:/data.txt
identifies the file data.txt stored in the root directory of the C: drive. On UNIX/Linux platforms, the URI
file:/home/student/data.txt
identifies the file data.txt stored in the home directory of the user student.
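The four constructors can be compared side by side in a short sketch. The class name FileConstructorsSketch and the directory name docs are hypothetical. Creating a File object does not require the file to exist, so this code runs even if none of these paths is present on disk.

```java
import java.io.File;
import java.net.URI;

// Hypothetical sketch: one File object per constructor.
class FileConstructorsSketch {
    static File[] build() {
        File directory = new File("docs"); // "docs" is a made-up relative directory
        return new File[] {
            new File("data.txt"),                               // name only
            new File("docs", "data.txt"),                       // path string plus name
            new File(directory, "data.txt"),                    // File directory plus name
            new File(URI.create("file:/home/student/data.txt")) // URI from the text above
        };
    }

    public static void main(String[] args) {
        for (File file : build()) {
            System.out.println(file.getPath());
        }
    }
}
```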
Error-Prevention Tip 14.1
Use File method isFile to determine whether a File object represents a file (not a directory) before attempting to open the file.
File
Figures 14.4–14.5 demonstrate class File. The application prompts the user to enter a file name or directory name, then outputs information about the file name or directory name input.
Fig. 14.4. File class used to obtain file and directory information.
Fig. 14.5. Testing class FileDemonstration.
The program begins by prompting the user for a file or directory (line 12 of Fig. 14.5). Line 13 inputs the file name or directory name and passes it to method analyzePath (lines 8–41 of Fig. 14.4). The method creates a new File object (line 11) and assigns its reference to name. Line 13 invokes File method exists to determine whether the name input by the user exists (either as a file or as a directory) on the disk. If the name input by the user does not exist, control proceeds to lines 37–40 and displays a message to the screen containing the name the user typed, followed by “does not exist.” Otherwise, the body of the if statement (lines 13–36) executes. The program outputs the name of the file or directory (line 18), followed by the results of testing the File object with isFile (line 19), isDirectory (line 20) and isAbsolute (line 22). Next, the program displays the values returned by lastModified (line 24), length (line 24), getPath (line 25), getAbsolutePath (line 26) and getParent (line 26). If the File object represents a directory (line 28), the program obtains a list of the directory’s contents as an array of Strings by using File method list (line 30) and displays the list on the screen.
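Since the figure itself is not reproduced here, the following is only a sketch in the spirit of the walkthrough above, not the chapter's actual listing. The class name PathInfoSketch, the method describe and the exact message wording are assumptions. It returns the description as a String instead of printing it.

```java
import java.io.File;

// Hypothetical sketch of gathering file or directory information with class File.
class PathInfoSketch {
    static String describe(String path) {
        File name = new File(path);
        if (!name.exists()) {
            return path + " does not exist.";
        }
        StringBuilder info = new StringBuilder();
        info.append(name.getName()).append(" exists\n");
        info.append(name.isFile() ? "is a file\n" : "is not a file\n");
        info.append(name.isDirectory() ? "is a directory\n" : "is not a directory\n");
        info.append(name.isAbsolute() ? "is absolute path\n" : "is not absolute path\n");
        info.append("Last modified: ").append(name.lastModified()).append('\n');
        info.append("Length: ").append(name.length()).append('\n');
        info.append("Path: ").append(name.getPath()).append('\n');
        info.append("Absolute path: ").append(name.getAbsolutePath()).append('\n');
        info.append("Parent: ").append(name.getParent()).append('\n');
        if (name.isDirectory()) {
            String[] contents = name.list(); // may be null if the directory cannot be listed
            if (contents != null) {
                info.append("Directory contents:\n");
                for (String entry : contents) {
                    info.append(entry).append('\n');
                }
            }
        }
        return info.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(args.length > 0 ? args[0] : "."));
    }
}
```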
The first output of this program demonstrates a File object associated with the jfc directory from the Java 2 Software Development Kit. The second output demonstrates a File object associated with the readme.txt file from the Java 2D example that comes with the Java 2 Software Development Kit. In both cases, we specified an absolute path on our personal computer.
A separator character is used to separate directories and files in the path. On a Windows computer, the separator character is a backslash (\) character. On a UNIX workstation, it is a forward slash (/) character. Java processes both characters identically in a path name. For example, if we were to use the path
c:\Program Files\Java\jdk1.6.0\demo/jfc
which employs each separator character, Java would still process the path properly. When building strings that represent path information, use File.separator to obtain the local computer’s proper separator character rather than explicitly using / or \. This constant returns a String consisting of one character, the proper separator for the system.
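A path can be assembled with File.separator along these lines. The class name SeparatorSketch, the method join and the sample directory names are hypothetical.

```java
import java.io.File;

// Hypothetical sketch: building a path with the platform's separator.
class SeparatorSketch {
    // Joins the path components with File.separator.
    static String join(String... parts) {
        return String.join(File.separator, parts);
    }

    public static void main(String[] args) {
        // The separator printed depends on the platform running the program.
        System.out.println(join("examples", "ch14", "clients.txt"));
    }
}
```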
Common Programming Error 14.1
Using \ as a directory separator rather than \\ in a string literal is a logic error. A single \ indicates that the \ followed by the next character represents an escape sequence. Use \\ to insert a \ in a string literal.
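The effect of this error can be seen in a small sketch. The class name BackslashSketch and the path are hypothetical. The first literal happens to compile because \t and \n are valid escape sequences, but the resulting string contains a tab and a newline instead of backslashes.

```java
// Hypothetical sketch of the single-backslash logic error.
class BackslashSketch {
    // Compiles, but \t becomes a tab and \n becomes a newline: not the intended path.
    static final String WRONG = "c:\temp\new.txt";

    // Each \\ in the literal produces one backslash in the string.
    static final String RIGHT = "c:\\temp\\new.txt";

    public static void main(String[] args) {
        System.out.println(WRONG.contains("\t")); // the tab is really there
        System.out.println(RIGHT);
    }
}
```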
In this section, we create and manipulate sequential-access files. As mentioned earlier, these are files in which records are stored in order by the record-key field. We first demonstrate sequential-access files using text files, allowing the reader to quickly create and edit human-readable files. In the subsections of this chapter we discuss creating, writing data to, reading data from and updating sequential-access text files. We also include a credit-inquiry program that retrieves specific data from a file.
Java imposes no structure on a file—notions such as a record do not exist as part of the Java language. Therefore, the programmer must structure files to meet the requirements of the intended application. In the following example, we see how to impose a record structure on a file.
The program in Figs. 14.6–14.7 and Fig. 14.9 creates a simple sequential-access file that might be used in an accounts receivable system to help keep track of the amounts owed to a company by its credit clients. For each client, the program obtains from the user an account number, the client’s name and the client’s balance (i.e., the amount the client owes the company for goods and services received). The data obtained for each client constitutes a “record” for that client. The account number is used as the record key in this application—the file will be created and maintained in account-number order. The program assumes that the user enters the records in account-number order. In a comprehensive accounts receivable system (based on sequential-access files), a sorting capability would be provided so that the user could enter the records in any order. The records would then be sorted and written to the file.
Fig. 14.6. AccountRecord maintains information for one account.
Fig. 14.7. Creating a sequential text file.
Fig. 14.8. End-of-file key combinations for various popular operating systems.
Fig. 14.9. Testing the CreateTextFile class.
Class AccountRecord (Fig. 14.6) encapsulates the client record information (i.e., account, first name, and so on) used by the examples in this chapter. The class AccountRecord is declared in package com.deitel.javafp.ch14 (line 3), so that it can be imported into several examples. Class AccountRecord contains private data members account, firstName, lastName and balance (lines 7–10). This class also provides public set and get methods for accessing the private fields.
Compile class AccountRecord as follows:
javac -d c:\examples\ch14 com\deitel\javafp\ch14\AccountRecord.java
This places AccountRecord.class in its package directory structure and places the package in c:\examples\ch14. When you compile class AccountRecord (or any other classes that will be reused in this chapter), you should place them in a common directory (e.g., c:\examples\ch14). When you compile or execute classes that use AccountRecord (e.g., CreateTextFile in Fig. 14.7), you must specify the command-line argument -classpath to both javac and java, as in
javac -classpath .;c:\examples\ch14 CreateTextFile.java
java -classpath .;c:\examples\ch14 CreateTextFile
Note that the current directory (specified with .) is included in the classpath. This ensures that the compiler can locate other classes in the same directory as the class being compiled. The path separator used in the preceding commands should be the one that is appropriate for your platform, for example, a semicolon (;) on Windows and a colon (:) on UNIX/Linux/Mac OS X.
Now let us examine class CreateTextFile (Fig. 14.7). Line 14 declares Formatter variable output. As discussed in Section 14.3, a Formatter object outputs formatted strings, using the same formatting capabilities as method System.out.printf. A Formatter object can output to various locations, such as the screen or a file, as is done here. The Formatter object is instantiated in line 21 in method openFile (lines 17–34). The constructor used in line 21 takes one argument, a String containing the name of the file, including its path. If a path is not specified, as is the case here, the JVM assumes that the file is in the directory from which the program was executed. For text files, we use the .txt file extension. If the file does not exist, it will be created. If an existing file is opened, its contents are truncated: all the data in the file is discarded. At this point the file is open for writing, and the resulting Formatter object can be used to write data to the file. Lines 23–28 handle the SecurityException, which occurs if the user does not have permission to write data to the file. Lines 29–33 handle the FileNotFoundException, which occurs if the file does not exist and a new file cannot be created. This exception may also occur if there is an error opening the file. Note that in both exception handlers, we call static method System.exit and pass the value 1. This method terminates the application. An argument of 0 to method exit indicates successful program termination. A nonzero value, such as 1 in this example, normally indicates that an error has occurred. This value is passed to the command window that executed the program. The argument is useful if the program is executed from a batch file on Windows systems or a shell script on UNIX/Linux/Mac OS X systems. Batch files and shell scripts offer a convenient way of executing several programs in sequence. When the first program ends, the next program begins execution. It is possible to use the argument to method exit in a batch file or shell script to determine whether other programs should execute. For more information on batch files or shell scripts, see your operating system’s documentation.
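The opening step described above might look roughly like the sketch below. It is not the chapter's listing: the class name OpenFileSketch and the message wording are assumptions, and the sketch returns null on failure instead of calling System.exit( 1 ), so that the caller decides how to react.

```java
import java.io.FileNotFoundException;
import java.util.Formatter;

// Hypothetical sketch of opening a text file for writing with a Formatter.
class OpenFileSketch {
    // Returns an open Formatter, or null if the file cannot be opened or created.
    static Formatter open(String fileName) {
        try {
            return new Formatter(fileName); // creates the file or truncates an existing one
        } catch (SecurityException securityException) {
            System.err.println("You do not have write access to this file.");
        } catch (FileNotFoundException fileNotFoundException) {
            System.err.println("Error opening or creating file.");
        }
        return null; // the chapter's version terminates the program at this point instead
    }

    public static void main(String[] args) {
        Formatter output = open("clients.txt");
        if (output == null) {
            System.exit(1); // nonzero status signals an error to the calling script
        }
        output.close();
    }
}
```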
Method addRecords (lines 37–91) prompts the user to enter the various fields for each record or to enter the end-of-file key sequence when data entry is complete. Figure 14.8 lists the key combinations for entering end-of-file for various computer systems.
Line 40 creates an AccountRecord object, which will be used to store the values of the current record entered by the user. Line 42 creates a Scanner object to read input from the user at the keyboard. Lines 44–48 and 50–52 prompt the user for input.
Line 54 uses Scanner method hasNext to determine whether the end-of-file key combination has been entered. The loop executes until hasNext encounters end-of-file.
Lines 59–62 read data from the user, storing the record information in the AccountRecord object. Each statement throws a NoSuchElementException (handled in lines 82–86) if the data is in the wrong format (e.g., a string when an int is expected) or if there is no more data to input. If the account number is greater than 0 (line 64), the record’s information is written to clients.txt (lines 67–69) using method format. This method can perform identical formatting to the System.out.printf method used extensively in earlier chapters. This method outputs a formatted string to the output destination of the Formatter object, in this case the file clients.txt. The format string "%d %s %s %.2f\n" indicates that the current record will be stored as an integer (the account number) followed by a string (the first name), another string (the last name) and a floating-point value (the balance). Each piece of information is separated from the next by a space, and the double value (the balance) is output with two digits to the right of the decimal point. The data in the text file can be viewed with a text editor, or retrieved later by a program designed to read the file (Section 14.5.2). When lines 67–69 execute, if the Formatter object is closed, a FormatterClosedException will be thrown (handled in lines 77–81). [Note: You can also output data to a text file using class java.io.PrintWriter, which also provides method format for outputting formatted data.]
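The record-writing loop can be sketched as follows. This is an approximation of the logic described above rather than the chapter's listing: the class name WriteRecordsSketch, the method writeRecords and the messages are assumptions, and the values are held in local variables instead of an AccountRecord object. The Formatter is passed in, so it could be attached to clients.txt or, as in main below, to an in-memory StringBuilder.

```java
import java.util.Formatter;
import java.util.FormatterClosedException;
import java.util.NoSuchElementException;
import java.util.Scanner;

// Hypothetical sketch of reading records from a Scanner and writing them with a Formatter.
class WriteRecordsSketch {
    // Writes each valid record as one line of text and returns the number of records written.
    static int writeRecords(Scanner input, Formatter output) {
        int written = 0;
        while (input.hasNext()) { // loop until end-of-file
            try {
                int account = input.nextInt();
                String firstName = input.next();
                String lastName = input.next();
                double balance = input.nextDouble();
                if (account > 0) {
                    output.format("%d %s %s %.2f\n", account, firstName, lastName, balance);
                    written++;
                } else {
                    System.err.println("Account number must be greater than 0.");
                }
            } catch (FormatterClosedException formatterClosedException) {
                System.err.println("Error writing to file.");
                break;
            } catch (NoSuchElementException elementException) {
                System.err.println("Invalid input. Please try again.");
                if (input.hasNextLine()) {
                    input.nextLine(); // discard the rest of the bad line
                }
            }
        }
        return written;
    }

    public static void main(String[] args) {
        Formatter output = new Formatter(new StringBuilder());
        int count = writeRecords(new Scanner("100 Bob Jones 25\n200 Ann Lee 0\n"), output);
        System.out.println(count + " records written");
        System.out.print(output); // prints the text accumulated in the StringBuilder
    }
}
```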
Lines 94–98 declare method closeFile, which closes the Formatter and the underlying output file. Line 97 closes the object by simply calling method close. If method close is not called explicitly, the operating system normally will close the file when program execution terminates; this is an example of operating system “housekeeping.”
Figure 14.9 runs the program. Line 8 creates a CreateTextFile object, which is then used to open, add records to and close the file (lines 10–12). The sample data for this application is shown in Fig. 14.10. In the sample execution for this program, the user enters information for five accounts, then enters end-of-file to signal that data entry is complete. The sample execution does not show how the data records actually appear in the file. In the next section, to verify that the file has been created successfully, we present a program that reads the file and prints its contents. Because this is a text file, you can also verify the information by opening the file in a text editor.
Fig. 14.10. Sample data for the program in Fig. 14.7.
Data is stored in files so that it may be retrieved for processing when needed. Section 14.5.1 demonstrated how to create a file for sequential access. This section shows how to read data sequentially from a text file. In this section, we demonstrate how class Scanner can be used to input data from a file rather than the keyboard.
The application in Figs. 14.11 and 14.12 reads records from the file "clients.txt" created by the application of Section 14.5.1 and displays the record contents. Line 13 of Fig. 14.11 declares a Scanner that will be used to retrieve input from the file.
Fig. 14.11. Sequential file reading using a Scanner.
Fig. 14.12. Testing the ReadTextFile class.
Method openFile (lines 16–27) opens the file for reading by instantiating a Scanner object in line 20. We pass a File object to the constructor, which specifies that the Scanner object will read from the file "clients.txt" located in the directory from which the application executes. If the file cannot be found, a FileNotFoundException occurs. The exception is handled in lines 22–26.
Method readRecords (lines 30–64) reads and displays records from the file. Line 33 creates AccountRecord object record to store the current record’s information. Lines 35–36 display headers for the columns in the application’s output. Lines 40–51 read data from the file until the end-of-file marker is reached (in which case, method hasNext will return false at line 40). Lines 42–45 use Scanner methods nextInt, next and nextDouble to input an integer (the account number), two strings (the first and last names) and a double value (the balance). Each record is one line of data in the file. The values are stored in object record. If the information in the file is not properly formed (e.g., there is a last name where there should be a balance), a NoSuchElementException occurs when the record is input. This exception is handled in lines 53–58. If the Scanner was closed before the data was input, an IllegalStateException occurs (handled in lines 59–63). If no exceptions occur, the record’s information is displayed on the screen (lines 48–50). Note in the format string in line 48 that the account number, first name and last name are left justified, while the balance is right justified and output with two digits of precision. Each iteration of the loop inputs one line of text from the text file, which represents one record.
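Reading the records back could be sketched like this. The class name ReadRecordsSketch, the method readRecords, the column widths and the error message are assumptions. The sketch collects formatted rows in a list instead of printing them directly and stops at the first malformed record.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Scanner;

// Hypothetical sketch of reading sequential text records with a Scanner.
class ReadRecordsSketch {
    // Returns one formatted row per record found in the file.
    static List<String> readRecords(File file) throws FileNotFoundException {
        List<String> rows = new ArrayList<>();
        try (Scanner input = new Scanner(file)) {
            while (input.hasNext()) { // false once end-of-file is reached
                int account = input.nextInt();
                String firstName = input.next();
                String lastName = input.next();
                double balance = input.nextDouble();
                // Left-justify the first three columns, right-justify the balance.
                rows.add(String.format("%-10d%-12s%-12s%10.2f",
                    account, firstName, lastName, balance));
            }
        } catch (NoSuchElementException elementException) {
            System.err.println("File improperly formed.");
        }
        return rows;
    }

    public static void main(String[] args) throws FileNotFoundException {
        for (String row : readRecords(new File("clients.txt"))) {
            System.out.println(row);
        }
    }
}
```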
Lines 67–71 define method closeFile, which closes the Scanner. Method main is defined in Fig. 14.12, in lines 6–13. Line 8 creates a ReadTextFile object, which is then used to open, read records from and close the file (lines 10–12).
To retrieve data sequentially from a file, programs normally start reading from the beginning of the file and read all the data consecutively until the desired information is found. It might be necessary to process the file sequentially several times (from the beginning of the file) during the execution of a program. Class Scanner does not provide the ability to reposition to the beginning of the file. If it is necessary to read the file again, the program must close the file and reopen it.
The program in Figs. 14.13–14.15 allows a credit manager to obtain lists of customers with zero balances (i.e., customers who do not owe any money), customers with credit balances (i.e., customers to whom the company owes money) and customers with debit balances (i.e., customers who owe the company money for goods and services received). A credit balance is a negative amount, and a debit balance is a positive amount.
Fig. 14.13. Enumeration for menu options.
Fig. 14.14. Credit-inquiry program.
Fig. 14.15. Testing the CreditInquiry class.
We begin by creating an enum
type (Fig. 14.13) to define the different menu options the user will have. The options and their values are listed in lines 7–10. Method getValue
(lines 19–22) retrieves the value of a specific enum
constant.
Figure 14.14 contains the functionality for the credit-inquiry program, and Fig. 14.15 contains the main
method that executes the program. The program displays a text menu and allows the credit manager to enter one of three options to obtain credit information. Option 1 (ZERO_BALANCE
) produces a list of accounts with zero balances. Option 2 (CREDIT_BALANCE
) produces a list of accounts with credit balances. Option 3 (DEBIT_BALANCE
) produces a list of accounts with debit balances. Option 4 (END
) terminates program execution. A sample output is shown in Fig. 14.16.
Fig. 14.16. Sample output of the credit-inquiry program in Fig. 14.15.
The record information is collected by reading through the entire file and determining whether each record satisfies the criteria for the account type selected by the credit manager. Method processRequests
(lines 116–139 of Fig. 14.14) calls method getRequest
to display the menu options (line 119) and stores the result in MenuOption
variable accountType
. Note that getRequest
translates the number typed by the user into a MenuOption
by using the number to select a MenuOption
from array choices
. Lines 121–138 loop until the user specifies that the program should terminate. The switch
statement in lines 123–134 displays a header for the current set of records to be output to the screen. Line 136 calls method readRecords
(lines 22–67), which loops through the file and reads every record.
Line 30 of method readRecords
opens the file for reading with a Scanner
. Note that the file will be opened for reading with a new Scanner
object each time this method is called, so that we can again read from the beginning of the file. Lines 34–37 read a record. Line 40 calls method shouldDisplay
(lines 70–85) to determine whether the current record satisfies the account type requested. If shouldDisplay
returns true
, the program displays the account information. When the end-of-file marker is reached, the loop terminates and line 65 calls the Scanner
’s close
method to close the Scanner
and the file. Notice that this occurs in a finally
block, which will execute whether or not the file was successfully read. Once all the records have been read, control returns to method processRequests
and getRequest
is again called (line 137) to retrieve the user’s next menu option. Figure 14.15 contains method main
, and calls method processRequests
in line 9.
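A condensed sketch of the approach just described may help. The names InquiryOption, CreditInquirySketch and matchingAccounts are hypothetical stand-ins rather than the code in Figs. 14.13–14.14, and the records come from a String instead of a file. The point is that each request creates a new Scanner, so reading always restarts at the first record, and that the Scanner is closed in a finally block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

enum InquiryOption {
    ZERO_BALANCE(1), CREDIT_BALANCE(2), DEBIT_BALANCE(3), END(4);

    private final int value;

    InquiryOption(int value) { this.value = value; }

    int getValue() { return value; }
}

class CreditInquirySketch {
    // Decides whether a balance belongs to the requested account type.
    static boolean shouldDisplay(InquiryOption type, double balance) {
        switch (type) {
            case ZERO_BALANCE:   return balance == 0;
            case CREDIT_BALANCE: return balance < 0;  // the company owes the customer
            case DEBIT_BALANCE:  return balance > 0;  // the customer owes the company
            default:             return false;
        }
    }

    // Scans every record from the beginning and collects the matching account numbers.
    static List<Integer> matchingAccounts(String records, InquiryOption type) {
        List<Integer> accounts = new ArrayList<>();
        Scanner input = new Scanner(records).useLocale(Locale.US); // new Scanner per request
        try {
            while (input.hasNext()) {
                int account = input.nextInt();
                input.next(); // first name (not needed for the decision)
                input.next(); // last name
                double balance = input.nextDouble();
                if (shouldDisplay(type, balance)) {
                    accounts.add(account);
                }
            }
        } finally {
            input.close(); // runs whether or not reading succeeded
        }
        return accounts;
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        String data = "100 Bob Jones 24.98\n200 Steve Doe -345.67\n300 Pam White 0.00\n";
        System.out.println(matchingAccounts(data, InquiryOption.DEBIT_BALANCE));
    }
}
```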
The data in many sequential files cannot be modified without the risk of destroying other data in the file. For example, if the name “White
” needed to be changed to “Worthington
,” the old name cannot simply be overwritten because the new name requires more space. The record for White
was written to the file as
300 Pam White 0.00
If the record is rewritten beginning at the same location in the file using the new name, the record will be
300 Pam Worthington 0.00
The new record is larger (has more characters) than the original record. The characters beyond the second “o
” in “Worthington
” will overwrite the beginning of the next sequential record in the file. The problem here is that fields in a text file—and hence records—can vary in size. For example, 7, 14, –117, 2074 and 27383 are all int
s stored in the same number of bytes (4) internally, but they are different-sized fields when displayed on the screen or written to a file as text.
Therefore, records in a sequential-access file are not usually updated in place. Instead, the entire file is usually rewritten. To make the preceding name change, the records before 300 Pam White 0.00
would be copied to a new file, the new record (which can be of a different size than the one it replaces) would be written and the records after 300 Pam White 0.00
would be copied to the new file. It is uneconomical to update just one record, but reasonable if a substantial portion of the records needs to be updated.
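The copy-and-replace procedure can be sketched as shown next. The class and method names are hypothetical, and the example reads from a String and writes to a StringBuilder so that it runs anywhere; in a real program the Scanner and Formatter would be constructed from File objects for the old and new files, and the Formatter would be closed when the copy finishes.

```java
import java.util.Formatter;
import java.util.Locale;
import java.util.Scanner;

class SequentialUpdateSketch {
    // Copies every record to newFile, substituting a new last name for one account.
    static void copyWithNewLastName(Scanner oldFile, Formatter newFile,
                                    int targetAccount, String newLastName) {
        oldFile.useLocale(Locale.US); // assumption: '.' is the decimal separator
        while (oldFile.hasNext()) {
            int account = oldFile.nextInt();
            String firstName = oldFile.next();
            String lastName = oldFile.next();
            double balance = oldFile.nextDouble();
            if (account == targetAccount) {
                lastName = newLastName; // the replacement may be longer or shorter
            }
            newFile.format(Locale.US, "%d %s %s %.2f%n",
                    account, firstName, lastName, balance);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the old file.
        Scanner oldFile = new Scanner(
                "100 Bob Jones 24.98\n300 Pam White 0.00\n400 Sam Stone -42.16\n");
        Formatter newFile = new Formatter(new StringBuilder());
        copyWithNewLastName(oldFile, newFile, 300, "Worthington");
        System.out.print(newFile); // Formatter.toString() returns the destination's text
    }
}
```

Because every record is rewritten, the longer name never overlaps the record that follows it.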
In Section 14.5, we demonstrated how to write the individual fields of an AccountRecord
object into a file as text, and how to read those fields from a file and place their values into an AccountRecord
object in memory. In the examples, AccountRecord
was used to aggregate the information for one record. When the instance variables for an AccountRecord
were output to a disk file, certain information was lost, such as the type of each value. For instance, if the value "3"
were read from a file, there is no way to tell whether the value came from an int
, a String
or a double
. We have only data, not type information, on a disk. If the program that is going to read this data “knows” what object type the data corresponds to, then the data is simply read into objects of that type. For example, in Section 14.5.2, we know that we are inputting an int
(the account number), followed by two String
s (the first and last name) and a double
(the balance). We also know that these values are separated by spaces, with only one record on each line. Sometimes we won’t know exactly how the data is stored in a file. In such cases, we would like to read an entire object from a file or write an entire object to it. Java provides such a mechanism, called object serialization. A so-called serialized object is an object represented as a sequence of bytes that includes the object’s data as well as information about the object’s type and the types of data stored in the object. After a serialized object has been written into a file, it can be read from the file and deserialized; that is, the type information and bytes that represent the object and its data can be used to recreate the object in memory.
Classes ObjectInputStream
and ObjectOutputStream
, which respectively implement the ObjectInput
and ObjectOutput
interfaces, enable entire objects to be read from or written to a stream (possibly a file). To use serialization with files, we initialize ObjectInputStream
and ObjectOutputStream
objects with stream objects that read from and write to files—objects of classes FileInputStream
and FileOutputStream
, respectively. Initializing stream objects with other stream objects in this manner is sometimes called wrapping—the new stream object being created wraps the stream object specified as a constructor argument. To wrap a FileInputStream
in an ObjectInputStream
, for instance, we pass the FileInputStream
object to the ObjectInputStream
’s constructor.
The ObjectOutput
interface contains method writeObject
, which takes an Object
that implements interface Serializable
(discussed shortly) as an argument and writes its information to an OutputStream
. Correspondingly, the ObjectInput
interface contains method readObject
, which reads and returns a reference to an Object
from an InputStream
. After an object has been read, its reference can be cast to the object’s actual type. As you’ll see in Chapter 19, Networking, applications that communicate via a network, such as the Internet, can also transmit entire objects across the network.
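As a minimal sketch of wrapping, the following example writes one object to a temporary file and reads it back. The class name WrappingSketch and the temporary-file name are assumptions, and try-with-resources (Java 7 and later) is used for brevity. Checked exceptions are rethrown as an unchecked exception only to keep the example short.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class WrappingSketch {
    // Writes value to a temporary file with writeObject, then reads it back with readObject.
    static Object roundTrip(Serializable value) {
        try {
            File file = File.createTempFile("wrapping-sketch", ".ser");
            file.deleteOnExit();

            // The ObjectOutputStream wraps the FileOutputStream passed to its constructor.
            try (ObjectOutputStream output =
                         new ObjectOutputStream(new FileOutputStream(file))) {
                output.writeObject(value);
            }

            // The ObjectInputStream wraps a FileInputStream in the same way.
            try (ObjectInputStream input =
                         new ObjectInputStream(new FileInputStream(file))) {
                return input.readObject(); // returned as Object; the caller casts
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String copy = (String) roundTrip("300 Pam White 0.00"); // cast to the actual type
        System.out.println(copy);
    }
}
```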
In this section, we create and manipulate sequential-access files using object serialization. Object serialization is performed with byte-based streams, so the sequential files created and manipulated will be binary files. Recall that binary files cannot be viewed in standard text editors. For this reason, we write a separate application that knows how to read and display serialized objects.
We begin by creating and writing serialized objects to a sequential-access file. In this section, we reuse much of the code from Section 14.5, so we focus only on the new features.
AccountRecordSerializable
Class
Let us begin by modifying our AccountRecord
class so that objects of this class can be serialized. Class AccountRecordSerializable
(Fig. 14.17) implements interface Serializable
(line 7), which allows objects of AccountRecordSerializable
to be serialized and deserialized with ObjectOutputStream
s and ObjectInputStream
s. Interface Serializable
is a tagging interface. Such an interface does not contain methods. A class that implements Serializable
is tagged as being a Serializable
object. This is important because an ObjectOutputStream
will not output an object unless it is a Serializable
object, which is the case for any object of a class that implements Serializable
.
Fig. 14.17. AccountRecordSerializable
class for serializable objects.
In a class that implements Serializable
, the programmer must ensure that every instance variable of the class is a Serializable
type. Any instance variable that is not serializable must be declared transient
to indicate that it is not Serializable
and should be ignored during the serialization process. By default, all primitive-type variables are serializable. For variables of reference types, you must check the definition of the class (and possibly its superclasses) to ensure that the type is Serializable
. By default, array objects are serializable. However, if the array contains references to other objects, those objects may or may not be serializable.
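The effect of transient can be seen in a small round trip through memory. The class Entry and its scratch field are hypothetical and are not the AccountRecordSerializable class of Fig. 14.17. The field has type Object, which is not Serializable, so it is declared transient and comes back as null after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientSketch {
    static class Entry implements Serializable {
        private static final long serialVersionUID = 1L;

        int account;              // primitive: serializable by default
        String lastName;          // String implements Serializable
        transient Object scratch; // Object is not Serializable, so it is skipped

        Entry(int account, String lastName) {
            this.account = account;
            this.lastName = lastName;
            this.scratch = new Object();
        }
    }

    // Serializes the entry to a byte array and deserializes a copy from it.
    static Entry copyViaSerialization(Entry original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream output = new ObjectOutputStream(bytes)) {
                output.writeObject(original);
            }
            try (ObjectInputStream input = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Entry) input.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        Entry copy = copyViaSerialization(new Entry(300, "White"));
        System.out.println(copy.account + " " + copy.lastName + " " + copy.scratch);
    }
}
```

Removing the transient keyword from scratch would make writeObject fail with a NotSerializableException, which is a subclass of IOException.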
Class AccountRecordSerializable
contains private
data members account
, firstName
, lastName
and balance
. This class also provides public
get and set methods for accessing the private
fields.
Now let’s discuss the code that creates the sequential-access file (Figs. 14.18–14.19). We concentrate only on new concepts here. As stated in Section 14.3, a program can open a file by creating an object of stream class FileInputStream
or FileOutputStream
. In this example, the file is to be opened for output, so the program creates a FileOutputStream
(line 21 of Fig. 14.18). The string argument that is passed to the FileOutputStream
’s constructor represents the name and path of the file to be opened. Existing files that are opened for output in this manner are truncated. Note that the .ser
file extension is used—we use this file extension for binary files that contain serialized objects.
Fig. 14.18. Sequential file created using ObjectOutputStream
.
Fig. 14.19. Testing class CreateSequentialFile
.
Common Programming Error 14.2
It is a logic error to open an existing file for output when, in fact, the user wishes to preserve the file.
Class FileOutputStream
provides methods for writing byte
arrays and individual byte
s to a file. In this program we wish to write objects to a file—a capability not provided by FileOutputStream
. For this reason, we wrap a FileOutputStream
in an ObjectOutputStream
by passing the new FileOutputStream
object to the ObjectOutputStream
’s constructor (lines 20–21). The ObjectOutputStream
object uses the FileOutputStream
object to write objects into the file. Lines 20–21 might throw an IOException
if a problem occurs while opening the file (e.g., when a file is opened for writing on a drive with insufficient space or when a read-only file is opened for writing). If so, the program displays an error message (lines 23–26). If no exception occurs, the file is open and variable output
can be used to write objects to the file.
This program assumes that data is input correctly and in the proper record-number order. Method addRecords
(lines 30–86) performs the write operation. Lines 62–63 create an AccountRecordSerializable
object from the data entered by the user. Line 64 calls ObjectOutputStream
method writeObject
to write the record
object to the output file. Note that only one statement is required to write the entire object.
Method closeFile
(lines 89–101) closes the file. Method closeFile
calls ObjectOutputStream
method close
on output
to close both the ObjectOutputStream
and its underlying FileOutputStream
(line 94). Note that the call to method close
is contained in a try
block. Method close
throws an IOException
if the file cannot be closed properly. In this case, it is important to notify the user that the information in the file might be corrupted. When using wrapped streams, closing the outermost stream also closes the underlying file.
In the sample execution for the program in Fig. 14.19, we entered information for five accounts—the same information shown in Fig. 14.10. The program does not show how the data records actually appear in the file. Remember that now we are using binary files, which are not humanly readable. To verify that the file has been created successfully, the next section presents a program to read the file’s contents.
As discussed in Section 14.5.2, data is stored in files so that it may be retrieved for processing when needed. The preceding section showed how to create a file for sequential access using object serialization. In this section, we discuss how to read serialized data sequentially from a file.
The program in Figs. 14.20–14.21 reads records from a file created by the program in Section 14.6.1 and displays the contents. The program opens the file for input by creating a FileInputStream
object (line 21). The name of the file to open is specified as an argument to the FileInputStream
constructor. In Fig. 14.18, we wrote objects to the file, using an ObjectOutputStream
object. Data must be read from the file in the same format in which it was written. Therefore, we use an ObjectInputStream
wrapped around a FileInputStream
in this program (lines 20–21). If no exceptions occur when opening the file, variable input
can be used to read objects from the file.
Fig. 14.20. Sequential file read using an ObjectInputStream
.
Fig. 14.21. Testing class ReadSequentialFile
.
The program reads records from the file in method readRecords
(lines 30–60). Line 40 calls ObjectInputStream
method readObject
to read an Object
from the file. To use AccountRecordSerializable
-specific methods, we downcast the returned Object
to type AccountRecordSerializable
. Method readObject
throws an EOFException
(processed at lines 48–51) if an attempt is made to read beyond the end of the file. Method readObject
throws a ClassNotFoundException
if the class for the object being read cannot be located. This might occur if the file is accessed on a computer that does not have the class. Figure 14.21 contains method main
(lines 6–13), which opens the file, calls method readRecords
and closes the file.
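The read-until-end-of-file pattern can be sketched as follows. The names are hypothetical, and the serialized data lives in a byte array rather than a .ser file so the example is self-contained; passing a FileInputStream to readAll would read from a file instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class ReadUntilEofSketch {
    // Helper that produces serialized data to read back.
    static byte[] serializeAll(Serializable... items) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream output = new ObjectOutputStream(bytes)) {
            for (Serializable item : items) {
                output.writeObject(item);
            }
        } catch (IOException e) {
            throw new IllegalStateException("Unable to serialize", e);
        }
        return bytes.toByteArray();
    }

    // Calls readObject repeatedly; EOFException marks the normal end of the data.
    static List<Object> readAll(InputStream source) {
        List<Object> records = new ArrayList<>();
        try (ObjectInputStream input = new ObjectInputStream(source)) {
            while (true) {
                records.add(input.readObject());
            }
        } catch (EOFException endOfFile) {
            // No more objects: fall through and return what was read.
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Unable to read objects", e);
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = serializeAll("Jones", "Doe", "White");
        for (Object record : readAll(new ByteArrayInputStream(data))) {
            String name = (String) record; // downcast to the actual type
            System.out.println(name);
        }
    }
}
```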
java.io
Classes
We now introduce you to other useful classes in the java.io
package. We give an overview of additional interfaces and classes for byte-based input and output streams and character-based input and output streams.
InputStream
and OutputStream
(subclasses of Object
) are abstract
classes that declare methods for performing byte-based input and output, respectively. We used concrete classes FileInputStream
(a subclass of InputStream
) and FileOutputStream
(a subclass of OutputStream
) to manipulate files in this chapter.
Pipes are synchronized communication channels between threads. We discuss threads in Chapter 18, Multithreading. Java provides PipedOutputStream
(a subclass of OutputStream
) and PipedInputStream
(a subclass of InputStream
) to establish pipes between two threads in a program. One thread sends data to another by writing to a PipedOutputStream
. The target thread reads information from the pipe via a PipedInputStream
.
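A small sketch of a pipe between two threads follows. The class and method names are hypothetical; a writer thread sends the bytes of a message, and the calling thread reads them until the writer closes its end. The lambda syntax assumes Java 8 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PipeSketch {
    // Sends message from a writer thread to the calling thread through a pipe.
    static String sendThroughPipe(String message) {
        try {
            PipedOutputStream pipeOut = new PipedOutputStream();
            PipedInputStream pipeIn = new PipedInputStream(pipeOut); // connect both ends

            Thread writer = new Thread(() -> {
                // Closing the output end tells the reader that no more data will arrive.
                try (PipedOutputStream out = pipeOut) {
                    out.write(message.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            writer.start();

            ByteArrayOutputStream received = new ByteArrayOutputStream();
            int value;
            while ((value = pipeIn.read()) != -1) { // blocks until data or end of stream
                received.write(value);
            }
            writer.join();
            pipeIn.close();
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for writer", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendThroughPipe("hello from the writer thread"));
    }
}
```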
A FilterInputStream
filters an InputStream
, and a FilterOutputStream
filters an OutputStream
. Filtering means simply that the filter stream provides additional functionality, such as aggregating data bytes into meaningful primitive-type units. FilterInputStream
and FilterOutputStream
are typically extended, so some of their filtering capabilities are provided by their subclasses.
A PrintStream
(a subclass of FilterOutputStream
) performs text output to the specified stream. Actually, we have been using PrintStream
output throughout the text to this point—System.out
and System.err
are PrintStream
objects.
Reading data as raw bytes is fast, but crude. Usually, programs read data as aggregates of bytes that form int
s, float
s, double
s and so on. Java programs can use several classes to input and output data in aggregate form.
Interface DataInput
describes methods for reading primitive types from an input stream. Classes DataInputStream
and RandomAccessFile
each implement this interface to read sets of bytes and view them as primitive-type values. Interface DataInput
includes methods readLine
(for reading a line of text), readBoolean
, readByte
, readChar
, readDouble
, readFloat
, readFully
(for byte
arrays), readInt
, readLong
, readShort
, readUnsignedByte
, readUnsignedShort
, readUTF
(for reading Unicode characters encoded by Java) and skipBytes
.
Interface DataOutput
describes a set of methods for writing primitive types to an output stream. Classes DataOutputStream
(a subclass of FilterOutputStream
) and RandomAccessFile
each implement this interface to write primitive-type values as bytes. Interface DataOutput
includes overloaded versions of method write
(for a byte
or for a byte
array) and methods writeBoolean
, writeByte
, writeBytes
, writeChar
, writeChars
(for Unicode String
s), writeDouble
, writeFloat
, writeInt
, writeLong
, writeShort
and writeUTF
(to output text modified for Unicode).
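To make the DataOutput and DataInput methods concrete, here is a minimal sketch that writes an int, a String and a double as bytes and reads them back in the same order. The class name and the choice of fields are assumptions; the streams wrap in-memory byte arrays, but they could wrap file streams in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class DataStreamSketch {
    // Encodes the values in binary form: 4 bytes, a length-prefixed string, 8 bytes.
    static byte[] encode(int account, String lastName, double balance) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream output = new DataOutputStream(bytes)) {
            output.writeInt(account);
            output.writeUTF(lastName);
            output.writeDouble(balance);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Values must be read in exactly the order and with the types they were written.
    static String decode(byte[] data) {
        try (DataInputStream input = new DataInputStream(new ByteArrayInputStream(data))) {
            int account = input.readInt();
            String lastName = input.readUTF();
            double balance = input.readDouble();
            return account + " " + lastName + " " + balance;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = encode(300, "White", 0.0);
        System.out.println(data.length + " bytes: " + decode(data));
    }
}
```

Unlike a text file, the int occupies four bytes here regardless of how many digits it has when printed.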
Buffering is an I/O-performance-enhancement technique. With a BufferedOutputStream
(a subclass of class FilterOutputStream
), each output statement does not necessarily result in an actual physical transfer of data to the output device (which is a slow operation compared to processor and main memory speeds). Rather, each output operation is directed to a region in memory called a buffer that is large enough to hold the data of many output operations. Then, actual transfer to the output device is performed in one large physical output operation each time the buffer fills. The output operations directed to the output buffer in memory are often called logical output operations. With a BufferedOutputStream
, a partially filled buffer can be forced out to the device at any time by invoking the stream object’s flush
method.
Buffering can greatly increase an application’s efficiency. Typical I/O operations are extremely slow compared to the speed of accessing computer memory. Buffering reduces the number of I/O operations by first combining smaller outputs together in memory. The number of actual physical I/O operations is small compared with the number of I/O requests issued by the program. Thus, the program that is using buffering is more efficient.
Performance Tip 14.1
Buffered I/O can yield significant performance improvements over unbuffered I/O.
With a BufferedInputStream
(a subclass of class FilterInputStream
), many “logical” chunks of data from a file are read as one large physical input operation into a memory buffer. As a program requests each new chunk of data, it is taken from the buffer. (This procedure is sometimes referred to as a logical input operation.) When the buffer is empty, the next actual physical input operation from the input device is performed to read in the next group of “logical” chunks of data. Thus, the number of actual physical input operations is small compared with the number of read requests issued by the program.
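The difference between logical and physical output operations can be observed with a small experiment. The class CountingOutputStream below is a hypothetical stand-in for a slow device; it only counts how many times it is asked to write. The 8192-byte buffer size is an arbitrary choice for the sketch.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class BufferingSketch {
    // Pretend device: counts each write request it receives.
    static class CountingOutputStream extends OutputStream {
        int physicalWrites = 0;

        @Override
        public void write(int b) {
            physicalWrites++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            physicalWrites++;
        }
    }

    // Performs the given number of one-byte logical writes and reports the physical writes.
    static int physicalWritesFor(int logicalWrites, boolean buffered) {
        CountingOutputStream device = new CountingOutputStream();
        OutputStream output = buffered ? new BufferedOutputStream(device, 8192) : device;
        try {
            for (int i = 0; i < logicalWrites; i++) {
                output.write('x'); // one logical output operation
            }
            output.flush(); // force out a partially filled buffer
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return device.physicalWrites;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + physicalWritesFor(1000, false));
        System.out.println("buffered:   " + physicalWritesFor(1000, true));
    }
}
```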
Java stream I/O includes capabilities for inputting from byte
arrays in memory and outputting to byte
arrays in memory. A ByteArrayInputStream
(a subclass of InputStream
) reads from a byte
array in memory. A ByteArrayOutputStream
(a subclass of OutputStream
) outputs to a byte
array in memory. One use of byte
-array I/O is data validation. A program can input an entire line at a time from the input stream into a byte
array. Then a validation routine can scrutinize the array’s contents and correct the data if necessary. Finally, the program can proceed to input from the byte
array, “knowing” that the input data is in the proper format. Outputting to a byte
array is a nice way to take advantage of the powerful output-formatting capabilities of Java streams. For example, data can be stored in a byte
array, using the same formatting that will be displayed at a later time, and the byte
array can then be output to a disk file to preserve the screen image.
A SequenceInputStream
(a subclass of InputStream
) enables concatenation of several InputStream
s, which means that the program sees the group as one continuous InputStream
. When the program reaches the end of an input stream, that stream closes, and the next stream in the sequence opens.
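A brief sketch of concatenation with SequenceInputStream follows. The class name is hypothetical, and two in-memory streams stand in for two files; the reading loop sees a single continuous stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class SequenceSketch {
    // Reads two sources through one SequenceInputStream and returns the combined text.
    static String concatenate(String first, String second) {
        InputStream a = new ByteArrayInputStream(first.getBytes(StandardCharsets.UTF_8));
        InputStream b = new ByteArrayInputStream(second.getBytes(StandardCharsets.UTF_8));
        try (InputStream joined = new SequenceInputStream(a, b)) {
            ByteArrayOutputStream all = new ByteArrayOutputStream();
            int value;
            while ((value = joined.read()) != -1) { // -1 only after the last stream ends
                all.write(value);
            }
            return new String(all.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(concatenate("100 Bob Jones 24.98\n", "200 Steve Doe -345.67\n"));
    }
}
```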
In addition to the byte-based streams, Java provides the Reader
and Writer abstract
classes, which are Unicode two-byte, character-based streams. Most of the byte-based streams have corresponding character-based concrete Reader
or Writer
classes.
Classes BufferedReader
(a subclass of abstract
class Reader
) and BufferedWriter
(a subclass of abstract
class Writer
) enable buffering for character-based streams. Remember that character-based streams use Unicode characters—such streams can process data in any language that the Unicode character set represents.
Classes CharArrayReader
and CharArrayWriter
read and write, respectively, a stream of characters to a character array. A LineNumberReader
(a subclass of BufferedReader
) is a buffered character stream that keeps track of the number of lines read (a line ends with a newline, a return or a carriage-return–line-feed combination). Keeping track of line numbers can be useful if the program needs to inform the reader of an error on a specific line.
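As an illustration of reporting an error by line, the sketch below uses a LineNumberReader to find the first record that does not have four fields. The class name and the four-field rule are assumptions made for the example.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class LineNumberSketch {
    // Returns the 1-based number of the first line without four fields, or -1 if none.
    static int firstMalformedLine(String text) {
        try (LineNumberReader reader = new LineNumberReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.trim().split("\\s+");
                if (fields.length != 4) {
                    // getLineNumber() counts the lines read so far, including this one.
                    return reader.getLineNumber();
                }
            }
            return -1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data; the second record is missing its balance.
        String records = "100 Bob Jones 24.98\n200 Steve Doe\n300 Pam White 0.00\n";
        System.out.println("First malformed line: " + firstMalformedLine(records));
    }
}
```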
Class FileReader
(a subclass of InputStreamReader
) and class FileWriter
(a subclass of OutputStreamWriter
) read characters from and write characters to a file, respectively. Class PipedReader
and class PipedWriter
implement piped-character streams that can be used to transfer information between threads. Classes StringReader
and StringWriter
read characters from and write characters to String
s, respectively. A PrintWriter
writes characters to a stream.
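The character-based classes combine in the same wrapping style as the byte-based ones. In this minimal sketch (class and method names are hypothetical), a PrintWriter wraps a StringWriter to format a record into a String, and a BufferedReader wraps a StringReader to read lines back.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Locale;

class CharacterStreamSketch {
    // Formats one record into a String by writing characters to a StringWriter.
    static String formatRecord(int account, String first, String last, double balance) {
        StringWriter text = new StringWriter();
        try (PrintWriter writer = new PrintWriter(text)) {
            writer.printf(Locale.US, "%d %s %s %.2f", account, first, last, balance);
        }
        return text.toString();
    }

    // Counts the lines in a String by reading characters from a StringReader.
    static int countLines(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            int count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String record = formatRecord(300, "Pam", "White", 0.0);
        System.out.println(record);
        System.out.println(countLines(record + "\n" + record + "\n"));
    }
}
```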
JFileChooser
Class JFileChooser
displays a dialog (known as the JFileChooser
dialog) that enables the user to easily select files or directories. To demonstrate the JFileChooser
dialog, we enhance the example in Section 14.4, as shown in Figs. 14.22–14.23. The example now contains a graphical user interface, but still displays the same data as before. The constructor calls method analyzePath
in line 34. This method then calls method getFile
in line 68 to retrieve the File
object.
Fig. 14.22. Demonstrating JFileChooser
.
Fig. 14.23. Testing class FileDemonstration
.
Method getFile
is defined in lines 38–62 of Fig. 14.22. Line 41 creates a JFileChooser
and assigns its reference to fileChooser
. Lines 42–43 call method setFileSelectionMode
to specify what the user can select from the fileChooser
. We use JFileChooser static
constant FILES_AND_DIRECTORIES
to indicate that files and directories can be selected. Other static
constants include FILES_ONLY
and DIRECTORIES_ONLY
.
Line 45 calls method showOpenDialog
to display the JFileChooser
dialog titled Open. Argument this
specifies the JFileChooser
dialog’s parent window, which determines the position of the dialog on the screen. If null
is passed, the dialog is displayed in the center of the screen—otherwise, the dialog is centered over the application window (specified by the argument this
). A JFileChooser
dialog is a modal dialog that does not allow the user to interact with any other window in the program until the user closes the JFileChooser
by clicking the Open or Cancel button. The user selects the drive, directory or file name, then clicks Open. Method showOpenDialog
returns an integer specifying which button (Open or Cancel) the user clicked to close the dialog. Line 48 tests whether the user clicked Cancel by comparing the result with static
constant CANCEL_OPTION
. If they are equal, the program terminates. Line 51 retrieves the file the user selected by calling JFileChooser
method getSelectedFile
. The program then displays information about the selected file or directory.
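The dialog logic described above can be condensed into the following sketch. It is not the FileDemonstration class of Fig. 14.22: the class and method names are hypothetical, null is passed as the parent so the dialog is centered on the screen, and the method returns null on Cancel instead of terminating the program. Running main requires a graphical environment.

```java
import java.io.File;
import javax.swing.JFileChooser;

class FileChooserSketch {
    // Maps the dialog's return code to the chosen File, or null if the user cancelled.
    static File selectionOrNull(int result, File selected) {
        return result == JFileChooser.CANCEL_OPTION ? null : selected;
    }

    // Displays a modal Open dialog that accepts both files and directories.
    static File askUser() {
        JFileChooser fileChooser = new JFileChooser();
        fileChooser.setFileSelectionMode(JFileChooser.FILES_AND_DIRECTORIES);
        int result = fileChooser.showOpenDialog(null); // blocks until the dialog closes
        return selectionOrNull(result, fileChooser.getSelectedFile());
    }

    public static void main(String[] args) {
        File chosen = askUser();
        if (chosen == null) {
            System.out.println("Cancelled");
        } else {
            System.out.println((chosen.isDirectory() ? "Directory: " : "File: ")
                    + chosen.getAbsolutePath());
        }
    }
}
```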
In this chapter, you learned how to use file processing to manipulate persistent data. You learned that data is stored in computers as 0
s and 1
s, and that combinations of these values are used to form bytes, fields, records and eventually files. We compared character-based and byte-based streams, and introduced several file-processing classes provided by the java.io
package. You used class File
to retrieve information about a file or directory. You used sequential-access file processing to manipulate records that are stored in order by the record-key field. You learned the differences between text-file processing and object serialization, and used serialization to store and retrieve entire objects. The chapter concluded with an overview of other classes provided by the java.io
package, and a small example of using a JFileChooser
dialog to allow users to easily select files from a GUI. Chapter 15, Generics, presents a mechanism for declaring classes and methods without specific type information so that the classes and methods can be used with many different types. Generics are used extensively in Java’s built-in set of data structures, known as the Collections API, which we discuss in Chapter 16.