Throughout this manual, we have made use of files -- containers on a hard or floppy disk that can be used to store information for long periods of time. For example, each source program that we have written has been stored in a file, and each executable program has also been stored in a file. While they may seem to be the same, files differ from programs in that a program is a sequence of instructions, and a file is a container in which a program (among other things) can be stored.
In the previous exercise, we saw that a different use for files is to store data. That is, where our earlier exercises read data from the keyboard and wrote data to the screen, an alternative approach is to store the data to be read in a file and read it from there. Similarly, there are situations where it is useful to have a program write its data to a file instead of the screen. This approach is particularly useful for problems where the amount of data to be processed is so large that entering it interactively each time the program is executed becomes inconvenient. That inconvenience can be eliminated by storing the data in a file and having the program read from the file instead of the keyboard.
In today's exercise we will take a closer look at the I/O facilities provided by Java.
A data stream is an abstraction of input/output that supports sequential reading/writing of data. For example, with an input stream, the data flows into the program and is handled by it in the order that it occurs in the stream. This notion fits rather closely with the kind of input we have done so far. In fact, Keyboard is a kind of input stream. Similarly, we have been writing to a Screen which is a kind of output stream.
In fact, Java has two kinds of classes that encode the notion of a stream at a very primitive level.
The most basic of the byte stream classes are InputStream and OutputStream. These classes are very limited: though a few other operations are supported, they basically only know how to read and write bytes (8 bits of data).
InputStream | OutputStream |
---|---|
read() - read a byte | write() - write a byte |
close() - close the input stream | close() - close the output stream |
One other disadvantage of these streams is that they perform an I/O operation with every read or write. This can be a serious problem if we are accessing a hard disk, which has slow access times. To improve performance, most systems use some kind of buffering. For writing, data is kept in a buffer in RAM until a large chunk of data has accumulated, and then it is written as a group. For reading, data is read in a big chunk and placed in a buffer in RAM, where it can be read one piece at a time. The goal of both of these processes is to reduce the number of times a program accesses the slower I/O hardware. Java encodes this notion in the two classes BufferedInputStream and BufferedOutputStream. They don't supply any additional operations, but they are more efficient.
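The effect of buffering can be observed directly. The following is a minimal sketch and not part of the lab's files: a ByteArrayOutputStream stands in for the slow device, and the names BufferDemo and deliveredCounts() are our own inventions for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class BufferDemo {
    // Returns { bytes delivered before flush(), bytes delivered after flush() }.
    static int[] deliveredCounts() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for a slow device
        BufferedOutputStream out = new BufferedOutputStream(sink);

        out.write('a');
        out.write('b');
        out.write('c');
        int before = sink.size();   // the three bytes are still held in the buffer

        out.flush();                // push the buffered bytes on to the sink
        int after = sink.size();

        out.close();
        return new int[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        int[] counts = deliveredCounts();
        System.out.println("before flush: " + counts[0]);  // 0
        System.out.println("after flush:  " + counts[1]);  // 3
    }
}
```

The three write() calls cost no I/O operations on the sink; the single flush() delivers all of the bytes as a group.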
If all we could do was to read and write bytes of data, we would not be very happy. Fortunately, we do not have to deal with such primitive capabilities. Java has stream objects that support abilities to write different kinds of data. These are the DataInputStream and the DataOutputStream. They can read and write any of the primitive data types that Java supports. Some of the additional methods that are available are:
DataInputStream | DataOutputStream |
---|---|
readBoolean() - read a boolean | writeBoolean() - write a boolean |
readByte() - read a byte | writeByte() - write a byte |
readChar() - read a char | writeChar() - write a char |
readDouble() - read a double | writeDouble() - write a double |
readFloat() - read a float | writeFloat() - write a float |
readInt() - read an int | writeInt() - write an int |
readLong() - read a long | writeLong() - write a long |
readShort() - read a short | writeShort() - write a short |
These methods have been shown paired for a good reason. The write operations will record the binary representation of the data. The consequence of this is that these operations do not give us human readable output streams. If we want to read the data, we must use the corresponding read operation. If we were to do writeFloat() and then use readDouble() to try and retrieve the data, we will not read what we expect. Even worse, since float and double are different sizes, any further read operations will be out of synch with the subsequent data.
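To make the pairing concrete, here is a small sketch that writes an int and a double in binary form and then reads them back with the matching operations in the same order. It is only an illustration: the class name DataStreamDemo and the method writeValues() are hypothetical, and in-memory byte arrays are used in place of a file so that the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DataStreamDemo {
    // Writes an int and a double in binary form and returns the raw bytes.
    static byte[] writeValues(int i, double d) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(i);      // an int occupies 4 bytes
        out.writeDouble(d);   // a double occupies 8 bytes
        out.close();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeValues(42, 2.5);
        System.out.println("bytes written: " + data.length);  // 12

        // Reading with the corresponding operations, in the same order,
        // recovers the original values.
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int i = in.readInt();
        double d = in.readDouble();
        in.close();
        System.out.println(i + " " + d);  // 42 2.5
    }
}
```

The 12 bytes are not human readable; they only make sense to a program that knows an int was written first and a double second.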
In addition, Java has another set of primitive stream classes for doing I/O that are based on the classes Reader and Writer. In contrast to InputStream and OutputStream, these work with characters. Characters in Java are stored internally as 16-bit Unicode values. The actual representation of the data that is stored in a text file will typically depend on the locale of the machine (more precisely, on its default character encoding).
Reader | Writer |
---|---|
read() - read a character | write() - write a character |
close() - close the character stream | close() - close the character stream |
Just as with byte streams, we have classes that support buffered I/O for character streams. These are BufferedReader and BufferedWriter. These classes support the reading and writing of strings of characters via the following operations.
BufferedReader | BufferedWriter |
---|---|
readLine() - read a String of characters | write() - overloaded to provide writing a String |
Though we will use one InputStream in today's lab, typically we work with character streams and that is where our focus will lie.
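As an illustration of the buffered character streams, the sketch below copies text line by line with readLine() and write(). The class LineCopyDemo and its method numberLines() are hypothetical names, and a StringReader and StringWriter stand in for files so that the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;

class LineCopyDemo {
    // Copies text line by line, prefixing each line with its number.
    static String numberLines(String text) throws IOException {
        BufferedReader in = new BufferedReader(new StringReader(text));
        StringWriter result = new StringWriter();
        BufferedWriter out = new BufferedWriter(result);

        int count = 0;
        String line;
        while ((line = in.readLine()) != null) {  // readLine() returns null at end of input
            count++;
            out.write(count + ": " + line);       // write(String)
            out.write('\n');                      // write(int) writes a single character
        }
        in.close();
        out.close();   // closing flushes the buffer into the StringWriter
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.print(numberLines("first\nsecond\n"));
        // 1: first
        // 2: second
    }
}
```

Note that readLine() strips the line terminator, so the program must write one itself.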
A print stream is a kind of stream that has been extended to support the operations print() and println(). These operations are overloaded so that they can successfully print an Object as well as all the primitive Java data types. In contrast to write(), these operations will print the data in human readable form. This should come as no surprise as you have used these operations many times.
There are two kinds of print streams: PrintStream is a kind of OutputStream and PrintWriter is a kind of Writer.
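The difference between write() and print() can be seen in a short sketch. The class name PrintDemo and the method show() are ours, chosen only for illustration, and the PrintWriter is connected to a StringWriter rather than a file.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class PrintDemo {
    static String show() {
        StringWriter text = new StringWriter();
        PrintWriter out = new PrintWriter(text);

        out.write(65);     // writes the single character whose code is 65: 'A'
        out.print(' ');
        out.print(65);     // prints the digits "65"
        out.print(' ');
        out.print(2.5);    // prints "2.5"
        out.print(' ');
        out.print(true);   // prints "true"

        out.flush();
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(show());   // A 65 2.5 true
    }
}
```

Unlike the other writers we have seen, the print() and println() methods do not throw an IOException, which is why no try/catch is needed here.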
There are three predefined streams in Java: System.in (standard input), System.out (standard output), and System.err (standard error).
While Java programmers will typically use System.out directly, they will rarely do so with System.in. Instead, they usually pass System.in to the constructor of some other class that provides higher-level access.
Take a moment to look at the class definitions in ann.easyio. You'll see that the Screen class doesn't do much beyond invoking print() or println() on System.out. The Keyboard class, on the other hand, adds an operation that reads words and builds on it to provide operations for reading each of the primitive data types.
We want to be able to connect a stream to a data file (that is, to open a file). Java has four basic classes that support this ability. Each of these classes has a constructor that accepts a string as the name of the file one wishes to open.
File class | Is a kind of |
---|---|
FileInputStream | InputStream |
FileOutputStream | OutputStream |
FileReader | Reader |
FileWriter | Writer |
So if we needed a character stream connected to a file for reading, we would use FileReader. Since a FileReader is a kind of Reader, we can use it anywhere that a Reader can be used. If the file that we wanted to open was named "data.text", we would use the constructor
new FileReader("data.text")
to make the connection to the file. Each of the constructors that open a file for reading may throw a FileNotFoundException, so we must put this code inside a try/catch that catches that kind of exception. Similarly, if we open a file for writing, an IOException may be thrown and we must use a try/catch.
Java has a highly refined capability for writing data values in its print stream classes. For example, if we wanted to open a file for writing and use the print() method instead of write(), we could do
    PrintWriter out = null;
    String name = "somewhere.dat";
    try {
        out = new PrintWriter(               // where print() is defined
                  new BufferedWriter(        // buffer the output
                      new FileWriter(name)));    // connect to the file
    } catch (IOException ex) {
        // code to deal with the exception
    }
With input, on the other hand, the best we have are the buffered streams which provide a readLine() facility. So how does one take that string and extract out the meaningful data? The answer is that we use a combination of a StringTokenizer and the classes associated with each primitive data type. The StringTokenizer is used to break the line up into the pieces that we will process individually. Examples of this are shown in GUI Interlude 2 and Lab 6.
The following examples show how one can convert a String into an int and a float:
    String one = "123";
    String two = "1.443";
    int data1 = Integer.parseInt(one);
    float data2 = Float.valueOf(two).floatValue();
    float data3 = Float.parseFloat(two);
For an int, we can use the parseInt() method, and for a float, we can use the parseFloat() method.
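Putting the two pieces together, the following sketch uses a StringTokenizer to break a line into pieces and a parse method to convert each piece into a number. The class TokenDemo and its method sumLine() are hypothetical names used only for this illustration; Integer.parseInt() and Float.parseFloat() could be used on individual tokens in the same way.

```java
import java.util.StringTokenizer;

class TokenDemo {
    // Sums the whitespace-separated numbers found on one line of input.
    static double sumLine(String line) {
        StringTokenizer tokens = new StringTokenizer(line);  // splits on whitespace by default
        double sum = 0.0;
        while (tokens.hasMoreTokens()) {
            sum += Double.parseDouble(tokens.nextToken());
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumLine("123 1.5 -4"));   // 120.5
    }
}
```

If a token is not a valid number, the parse methods throw a NumberFormatException, which a more careful program would catch.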
When many of us were younger, we enjoyed writing secret messages, in which messages were encoded in such a way as to prevent others from reading them, unless they were in possession of a secret that enabled them to decode the message. Coded messages of this sort have a long history. For example, the Caesar cipher is a simple means of encoding messages that dates from Roman times. To illustrate, the Caesar cipher produces the encoded sentence:
Rqh li eb odqg, wzr li eb vhd.
when applied to the historic phrase:
One if by land, two if by sea.
What is the relationship between the letters in the original sentence and those in the encoded sentence?
Today's exercise is to use the Caesar cipher to encode and decode messages stored in files.
Create a new project named Cipher for the files of this exercise, and in it, save copies of the files Encode.java, Decode.java, message.text, and alice.code. Then personalize the documentation in the file Encode.java, and take a few moments to study its contents.
The first part of today's exercise is to write a program that can be used to encode a message that is stored in a file. To demonstrate both input from and output to a file, we will store the encoded message in a second file.
As usual, we will apply object-centered design to solve this problem.
Behavior.
Our program should display a greeting and then prompt for and read the name of the input file. It should then try to connect an input stream to that file so that we can read from it, and print a diagnostic message if the stream does not open correctly. It should then prompt for and read the name of the output file. It should then try to connect an output stream to that file so that we can write to it, and print a diagnostic message if the stream does not open correctly. For each character in the input file, our program should read the character, encode it using the Caesar cipher, and output the encoded character to the output file. Our program should conclude by disconnecting the streams from the files, and then print a "success" message.
Objects. Using this behavioral description, we can identify the following objects:
Description | Type | Kind | Name |
---|---|---|---|
An input stream | BufferedReader | varying | theKeyboard |
An output stream | PrintStream | varying | System.out |
a greeting | String | constant | none |
The name of the input file | String | varying | inFileName |
An input stream | BufferedReader | varying | inStream |
The name of the output file | String | varying | outFileName |
An output stream | BufferedWriter | varying | outStream |
a character from the input file | int | varying | inValue |
an encoded character | char | varying | outChar |
Using this list of objects, we might specify the behavior of our program as follows:
Input(inFileName), a sequence of unencoded characters. Output(outFileName), a sequence of encoded characters.
Operations. From our behavioral description, we have these operations:
No. | Description | Defined? | Name | Package/Class? |
---|---|---|---|---|
1 | Display a String | yes | println() | java.io.PrintStream |
2 | Read a String | yes | readLine() | java.io.BufferedReader |
3 | Connect an input stream to a file | yes | BufferedReader constructor | java.io.BufferedReader |
 | | | FileReader constructor | java.io.FileReader |
4 | Connect an output stream to a file | yes | BufferedWriter constructor | java.io.BufferedWriter |
 | | | FileWriter constructor | java.io.FileWriter |
5 | Check that a stream opened properly | yes | An IOException is thrown if it does not | |
6 | Read a char from an input stream | yes | read() | java.io.BufferedReader |
7 | Encode a char using the Caesar cipher | yes | caesarEncode() | Encode.java |
8 | Write a char to an output stream | yes | write() | java.io.BufferedWriter |
9 | Repeat 6, 7, 8 for each char in the file | yes | input loop | built-in |
10 | Determine when all chars have been read | yes | read() returns -1 when no chars remain to be read | |
11 | Disconnect a stream from a file | yes | close() | java.io.BufferedReader |
Algorithm. We can organize these operations into the following algorithm:
0. Create an input stream to the keyboard using System.in.
1. Display a greeting.
2. Prompt for and read inFileName, the name of the input file.
3. Create a BufferedReader named inStream connecting our program to inFileName.
4. Check that inStream opened correctly.
5. Prompt for and read outFileName, the name of the output file.
6. Create a BufferedWriter named outStream connecting our program to outFileName.
7. Check that outStream opened correctly.
8. Loop through the following steps:
   a. read a character from the input file.
   b. if end-of-file was reached, then terminate repetition.
   c. encode the character.
   d. write the encoded character to the output file.
   End loop.
9. Close the input and output connections.
10. Display a "successful completion" message.
Encode.java already implements a number of these steps. It should be evident that we can perform
That leaves the file-related operations in steps 0, 2, 3, 4, 5, 6, 7, 8a, 8d and 9 for us to learn how to perform.
Wrapping System.in with a BufferedReader
As mentioned previously, System.in provides only relatively low-level access. We would like to create a BufferedReader based on System.in. The only problem is that both of the constructors for BufferedReader take an argument that is a Reader, so first we need to create a Reader using System.in.
If we look at the documentation for Reader, we see that it has a subclass, InputStreamReader, which is acceptable wherever a Reader is required. InputStreamReader is a bridge class between streams and readers, and it has a constructor that takes an InputStream.
We do
new InputStreamReader(System.in)
to create something which is a Reader. We take that and use it to create the BufferedReader:
BufferedReader theKeyboard = new BufferedReader( new InputStreamReader(System.in));
Using this information, implement step 0 of our algorithm in Encode.java.
One might think that
inFileName = theKeyboard.readLine();
would be sufficient to read in the name of the file. Unfortunately, if we try to compile this code, we will get an error message. This code has the possibility of throwing an IOException. Unlike the exceptions we've seen before which we can choose to ignore and let the code fail, Java forces us to deal with this type of exception. So we need to wrap this code in a try/catch as shown here:
    try {
        inFileName = theKeyboard.readLine();
    } catch (IOException ex) {
        System.err.println("Failed to read file name");
        System.exit(1);
    }
Using this information, implement step 2 of our algorithm in Encode.java.
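The wrapping and the try/catch can be tried out together in a small stand-alone sketch. It is not the code of Encode.java: the class PromptDemo and the method readName() are hypothetical, the method returns null on failure instead of exiting, and a ByteArrayInputStream stands in for System.in so that the example runs without anyone typing.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

class PromptDemo {
    // Wraps a byte stream in a BufferedReader and reads one line from it.
    // Returns null if the read fails or if no input remains.
    static String readName(InputStream source) {
        BufferedReader reader = new BufferedReader(new InputStreamReader(source));
        try {
            return reader.readLine();
        } catch (IOException ex) {
            System.err.println("Failed to read file name");
            return null;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the keyboard; passing System.in instead would read typed input.
        InputStream fakeKeyboard = new ByteArrayInputStream("message.text\n".getBytes());
        System.out.println("Input file: " + readName(fakeKeyboard));  // Input file: message.text
    }
}
```

Because readName() accepts any InputStream, the same method works unchanged with System.in.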
Opening a Connection to a File.
An executing program is unable to interact directly with a file for a very simple reason: an executing program resides in main memory and a file resides on a secondary memory device, such as a hard disk. However, an executing program can interact indirectly with a file, by opening a connection between the program and that file. In Java, such connections are FileReader or FileWriter objects as mentioned before.
Like any other object, a FileReader must be created before it can be used. If inputFileName is a String object containing the name of an input file, then the code
new FileReader(inputFileName);
constructs a FileReader object connected to that file. Again we will want to wrap it with a BufferedReader for efficiency.
new BufferedReader( new FileReader(inputFileName));
Using this information, implement step 3 of our algorithm in Encode.java by declaring a BufferedReader named inStream that serves as a connection between our program and the file whose name is in inFileName.
Step 5 is performed in the same way as step 2. To perform step 6 of our algorithm, we must open a FileWriter for output to the output file. Such an object can be created as follows:
new FileWriter(outputFileName)
Similarly, we will want to use this object to create a BufferedWriter for efficiency.
Using this information, implement steps 5 and 6 of our algorithm: read the name of the output file into outFileName, then declare a BufferedWriter named outStream that serves as a connection between our program and that file.
Checking that a Connection Opened Correctly.
Opening files is an operation that is highly susceptible to user errors. For example, suppose the user has accidentally deleted the input file and our program tries to open a connection to it? In Java, if there is a problem opening a file to be read, a FileNotFoundException will be thrown. Again this exception requires our attention in the form of a try/catch.
If there is a problem opening a file to be written, an IOException will be thrown, which must likewise be handled with a try/catch.
Complete steps 4 and 7 by wrapping the creation of inStream and outStream in try/catch blocks. If there is an exception, print an error message and exit the program.
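One possible shape for these checks is sketched below. The class OpenDemo and the helper methods openForReading() and openForWriting() are hypothetical and are not part of Encode.java; they return null on failure so that the sketch can be tested, whereas the lab program should print its message and exit.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

class OpenDemo {
    // Returns a reader connected to the named file, or null if it cannot be opened.
    static BufferedReader openForReading(String name) {
        try {
            return new BufferedReader(new FileReader(name));
        } catch (FileNotFoundException ex) {
            System.err.println("Unable to open " + name + " for reading");
            return null;
        }
    }

    // Returns a writer connected to the named file, or null if it cannot be opened.
    static BufferedWriter openForWriting(String name) {
        try {
            return new BufferedWriter(new FileWriter(name));
        } catch (IOException ex) {
            System.err.println("Unable to open " + name + " for writing");
            return null;
        }
    }

    public static void main(String[] args) {
        // This path is assumed not to exist, so the open is expected to fail.
        BufferedReader in = openForReading("no-such-directory/no-such-file.text");
        System.out.println(in == null ? "open failed" : "open succeeded");
    }
}
```

Note that the stream variable is produced outside the try block's scope here (as a return value); in Encode.java, a similar effect is obtained by declaring the variable before the try and assigning to it inside.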
When your source program compiles correctly (except possibly for an error indicating that the completion message statement may not be reached) continue on to the next part of the exercise. Do not execute your source program yet or an infinite loop will occur.
Input from a BufferedReader.
We are interested in reading one character at a time. The read() method of the BufferedReader class almost does what we want. It will read a character from the file, but it returns an int value (32 bit). This operation is described as reading from the file, even though we are actually operating on the BufferedReader.
intVariable = inStream.read();
Once again, if there is an error, an IOException will be thrown and we are forced to implement a try/catch to handle this.
Using this information, implement step 8a of our algorithm. Then compile your program, and continue when what you have written is syntactically correct.
Controlling a File-Input Loop.
Files are created by a computer's operating system. When the operating system creates a file, it marks the end of the file with a special end-of-file mark. Input operations are then implemented in such a way as to prevent them from reading beyond the end-of-file mark, since doing so could allow a programmer unauthorized access to the files of another programmer.
This end-of-file mark can be used to control a loop that is reading data from the file. Java indicates the end of the file by having read() return the value -1. Any further calls to read() after that simply return -1 again, so the loop must check for this value explicitly.
The expression
inValue == -1
will allow us to determine if the end of the file has been reached.
In a forever loop like the one in the source program, we can prevent infinite loop behavior by placing an if-break combination:
if ( /* end-of-file has been reached */ ) break;
following the input step. Repetition will then be terminated when all of the data in the file has been processed.
In your source program, place an if-break combination in the appropriate place to perform step 8b of our algorithm. Then compile your source program, to check the syntax of what you have written. At this point, you should not have any syntactical errors left in your program. When it is correct, continue to the next part of the exercise.
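The pattern of a forever loop with an if-break after the input step can be seen in isolation in the following sketch, which simply counts the characters in a stream. The class EofDemo and the method countChars() are hypothetical names, and a StringReader is used in place of a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class EofDemo {
    // Counts the characters available from inStream.
    static int countChars(BufferedReader inStream) throws IOException {
        int count = 0;
        for (;;) {
            int inValue = inStream.read();   // a character code, or -1 at end of input
            if (inValue == -1) break;        // end-of-file has been reached
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new StringReader("One if by land"));
        try {
            System.out.println(countChars(in));   // 14
            in.close();
        } catch (IOException ex) {
            System.err.println("I/O problem: " + ex.getMessage());
            System.exit(1);
        }
    }
}
```

The value returned by read() is kept in an int precisely so that -1 can be distinguished from every legitimate character.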
File Output.
We need to be able to write a character to the output file. If we examine the operations in BufferedWriter, we see that there is a write() method that will write a single character. This is just what we need. If we needed to use print(), we could wrap our FileWriter with a PrintWriter instead of the BufferedWriter.
The general form to use write() is:
outputStreamName.write(charValue);
where outputStreamName is a FileWriter or BufferedWriter, and charValue is a character we wish to store in the file to which outputStreamName is connected. Once again, if there is a problem, an IOException will be thrown, which we must catch.
Use this information to write the encoded character to your output file via outStream, to perform step 8d of our algorithm. Then compile your source program to test the syntax of what you have written, continuing when it is correct.
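The read, transform, and write steps fit together as in the sketch below. To avoid giving away the cipher, a hypothetical transform() method that merely upper-cases each character stands in for caesarEncode(); the class name TransformDemo and the helper run() are also our own, and in-memory streams replace the files.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;

class TransformDemo {
    // Hypothetical stand-in for the lab's caesarEncode(): it only upper-cases.
    static char transform(char ch) {
        return Character.toUpperCase(ch);
    }

    // Reads each character from inStream, transforms it, and writes it to outStream.
    static void copyTransformed(BufferedReader inStream, BufferedWriter outStream)
            throws IOException {
        for (;;) {
            int inValue = inStream.read();
            if (inValue == -1) break;                  // end of input
            char outChar = transform((char) inValue);  // cast the int back to a char
            outStream.write(outChar);                  // write a single character
        }
        outStream.flush();
    }

    // Convenience wrapper that runs the loop over a String.
    static String run(String text) {
        StringWriter result = new StringWriter();
        try {
            BufferedReader in = new BufferedReader(new StringReader(text));
            BufferedWriter out = new BufferedWriter(result);
            copyTransformed(in, out);
            in.close();
            out.close();
        } catch (IOException ex) {
            System.err.println("I/O problem: " + ex.getMessage());
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(run("two if by sea"));   // TWO IF BY SEA
    }
}
```

The cast from int to char is safe only after the -1 check, which is why the if-break comes before the transformation.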
Closing Files.
Once we are done using a stream to read from or write to a file, we should close it, to break the connection between our program and the file. This is accomplished using the method close(), whose statement form is
streamName.close();
When execution reaches this statement, the program severs its connection to the file through streamName. Once again, if there is a problem, an IOException will be thrown that must be dealt with.
In the appropriate place in the source program, place calls to close() to disconnect inStream and outStream from their files (step 9 of our algorithm).
Then compile your source program, and ensure that it is free of syntax errors.
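Closing matters especially for buffered output, since close() also flushes any characters still waiting in the buffer. The following sketch, with the hypothetical class CloseDemo and method writeThenRead(), writes to a temporary file, closes the writer, and reads the text back.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

class CloseDemo {
    // Writes text to a file, closes the stream, then reads the first line back.
    static String writeThenRead(File file, String text) throws IOException {
        BufferedWriter outStream = new BufferedWriter(new FileWriter(file));
        outStream.write(text);
        // The characters may still be sitting in the buffer at this point;
        // close() flushes them to the file before breaking the connection.
        outStream.close();

        BufferedReader inStream = new BufferedReader(new FileReader(file));
        String line = inStream.readLine();
        inStream.close();
        return line;
    }

    public static void main(String[] args) {
        try {
            File temp = File.createTempFile("close-demo", ".text");
            System.out.println(writeThenRead(temp, "Rqh li eb odqg"));  // Rqh li eb odqg
            temp.delete();
        } catch (IOException ex) {
            System.err.println("I/O problem: " + ex.getMessage());
            System.exit(1);
        }
    }
}
```

A program that forgets to close a BufferedWriter may find its output file empty or incomplete.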
When your program's syntax is correct, test it using the provided file named message.text. If what you have written is correct, your program should create an output file (e.g., message.code), containing the output:
Rqh Li Eb Odqg Wzr Li Eb Vhd
If this file is not produced, then your program contains a logical error. Retrace your steps, comparing the statements in your source program to those described in the preceding parts of the exercise, until you find your error. Correct it, retranslate your source program and then retest your program, until it performs correctly.
The last part of today's exercise is for you to apply what you have learned to the problem of decoding a file encoded using the Caesar cipher. Complete the skeleton program Decode.java so that it can be used to decode a message encoded using the Caesar cipher. Do all that is necessary to get this program operational, so that messages encoded with Encode.java can be decoded with Decode.java. Put differently, the two programs should complement one another.
To test your program, you can use the output file created by Encode.java, or alice.code, a selection from Lewis Carroll's Alice in Wonderland.
File, Stream, Reader, Writer, Buffer, Opening A File, File Input, File Output, Closing A File, End of File, IOException.
Hard copies of your final versions of Encode.java and Decode.java, plus an execution record showing their execution.