Throughout this lab manual, we have made extensive use of files---containers on a hard or floppy disk that can be used to store information for long periods of time. Each source program that we have written has been stored in a file, and each binary executable program has also been stored in a file.
Files differ from programs in that a program is a sequence of instructions, and a file is a container in which data (like program code) can be stored. For example, the word processing documents that you work on are saved as files. Those files look quite different from the files that store your programs and from the files that store your executables.
But this raises the question: why can't our programs read data from and write data to a file, as a word processor does? This would be particularly useful for problems where the amount of data to be processed is so large that interactively entering the data each time the program is executed becomes inconvenient. If we could store our data in a file, then we could test our program many times over without having to retype the data each time.
Let's consider a simple problem.
When many of us were younger, we enjoyed writing secret messages, in which messages were encoded in such a way as to prevent others from reading them, unless they were in possession of a secret that enabled them to decode the message. Coded messages of this sort have a long history. For example, the Caesar cipher (invented, it is said, by Julius Caesar himself) is a simple way to encode messages: each letter is replaced by the letter three positions later in the alphabet, wrapping around from the end of the alphabet to the beginning.
For example, consider this message and its encoding:
Message | Encoded |
---|---|
One if by land, two if by sea. | Rqh li eb odqg, wzr li eb vhd. |
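
The encoding in the table is consistent with shifting each letter three places forward in the alphabet. A minimal sketch of such a shift follows, assuming a character set such as ASCII in which the letters are contiguous. The names shiftLetter and shiftMessage are made up for illustration; the lab's caesar library supplies its own caesarEncode(), whose implementation may differ.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not the caesar library's code): shift one letter
// `shift` places forward, wrapping around the end of the alphabet.
// Non-letters are returned unchanged. Assumes contiguous letters (ASCII).
char shiftLetter(char ch, int shift) {
    assert(shift >= 0 && shift < 26);
    unsigned char uch = static_cast<unsigned char>(ch);
    if (std::isupper(uch)) return static_cast<char>('A' + (ch - 'A' + shift) % 26);
    if (std::islower(uch)) return static_cast<char>('a' + (ch - 'a' + shift) % 26);
    return ch;
}

// Hypothetical helper: apply shiftLetter() to every character of a message.
std::string shiftMessage(const std::string& message, int shift) {
    std::string result;
    for (std::string::size_type i = 0; i < message.size(); ++i)
        result += shiftLetter(message[i], shift);
    return result;
}
```

Calling shiftMessage() with the message from the table and a shift of 3 reproduces the encoded column; punctuation and blanks pass through untouched.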
This lab's exercise is to use the Caesar cipher to encode and decode messages stored in files.
Directory: lab9
caesar.h, caesar.cpp, and caesar.doc implement a Caesar encryption function.
encode.cpp and decode.cpp are the two drivers needed for this lab exercise.
message.text and alice.code are two sample input files.
gcc users need a makefile; all others should create a project and add all of the .cpp files to it.
Add your name, date, and purpose to the opening documentation of the code and documentation files; if you're modifying and adding to the code written by someone else, add your data as part of the file's modification history.
The first part of this exercise is to write a program that can be used to encode a message that is stored in a file. The encoded message will then be saved to a second file.
As usual, we will apply object-centered design to solve this problem.
Behavior.
Our program should display a greeting and then prompt for and read the name of the input file. It should then connect an input stream to that file so that we can read from it, and check that the stream opened correctly. It should then prompt for and read the name of the output file. It should then connect an output stream to that file so that we can write to it, and check that the stream opened correctly. For each character in the input file, our program should read the character, encode it using the Caesar cipher, and output the encoded character to the output file. Our program should conclude by disconnecting the streams from the files.
This behavior is a bit verbose since we're just learning about files, but it's perhaps better to err on the verbose side of things rather than on the forgetting side of things. (Programs tend to crash when you forget to do things.)
Objects. Using this behavioral description, we can identify the following objects:
Description | Type | Kind | Name |
---|---|---|---|
a greeting | string | constant | --- |
The name of the input file | string | varying | inFile |
An input stream | ifstream | varying | inStream |
The name of the output file | string | varying | outFile |
An output stream | ofstream | varying | outStream |
a character from the input file | char | varying | inChar |
an encoded character | char | varying | outChar |
Using this list of objects, here's our specification:
Specification:
input (inFile): a sequence of unencoded characters.
output (outFile): a sequence of encoded characters.
This list of objects raises an important question that you should think about:
Question #9.1: What is the difference between a file name and a file stream?
One immediate hint: consider the data types. The data type determines the operations you can perform on an object. You should answer this question after you've read through this section. It's an important distinction, and understanding it will save you much heartache when writing and debugging your programs.
Operations. From our behavioral description, we have these operations:
Description | Predefined? | Name | Library |
---|---|---|---|
Display a string | yes | << | iostream |
Read a string | yes | >> | iostream |
Connect an input stream to a file | yes | ifstream declaration | fstream |
Connect an output stream to a file | yes | ofstream declaration | fstream |
Check... | yes | assert() | cassert |
...that a stream opened properly | yes | is_open() | fstream |
Read a char from an input stream | yes | get() | fstream |
Encode a char using the Caesar cipher | yes | caesarEncode() | caesar |
Write a char to an output stream | yes | << | fstream |
Repeat input, encoding, and output operations | yes | input loop | built-in |
Determine when all chars have been read | yes | eof() | fstream |
Disconnect a stream from a file | yes | close() | fstream |
Algorithm. We can organize these operations into the following algorithm:
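1. Display a greeting.
2. Prompt for and read inFile, the name of the input file.
3. Create an ifstream named inStream connecting our program to inFile.
4. Check that inStream opened correctly.
5. Prompt for and read outFile, the name of the output file.
6. Create an ofstream named outStream connecting our program to outFile.
7. Check that outStream opened correctly.
8. Loop:
   a. Read a character inChar from inStream.
   b. If the end of the file was reached, exit the loop.
   c. Encode inChar using the Caesar cipher, producing outChar.
   d. Write outChar to outStream.
9. Close inStream and outStream.
This algorithm should be encoded in main() in encode.cpp.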
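The greeting and the prompts for the file names involve only strings, so nothing new yet.
The loop is a forever loop (for(;;)...) with an if-break to stop the loop.
The encoding is done by calling caesarEncode() from the caesar library.
Write the code for these steps. The if-break statement can wait, but the other steps listed here are straightforward.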
We only have to figure out the file I/O steps.
Opening a Connection to a File. When we want to get input from a file, we have to tell the compiler that's what we want. It's a fairly expensive operation (since data moves much more slowly to and from a disk than to and from a computer's main memory). We also have to be precise about which file we want. We certainly don't want all files on the machine.
So we need to open a connection between the program and a file. A connection is a thing, and all things in C++ are represented as objects. File connections are known as streams, and there are two types of streams: ifstream for input file streams and ofstream for output file streams.
Like any other object, a stream must be declared before it can be used. If inputFileName is a string object containing the name of an input file, then the declaration
    ifstream inFileStream(inputFileName.data());
constructs a stream object named inFileStream as a connection to the file.
The string method data() extracts the actual characters from a string. The c_str() method does the same and, unlike data() on compilers that predate C++11, is guaranteed to end the characters with the null terminator that the stream expects, so use c_str() if the file name is not recognized. The stream classes are a bit particular about the strings that they'll accept: before C++11, their constructors take a character array rather than a string object.
If the file does not exist, the stream fails to open and every attempt to read from it fails. More on this later.
Using this information, implement the step of our algorithm that creates and opens a connection inStream to the input file named inFile.
An output stream is similar:
    ofstream outFileStream(outputFileName.data());
This declaration constructs an object named outFileStream as a connection to the file whose name is stored in outputFileName.
If the file does not exist, then a file by that name is created in the working directory. If the file does exist, then its contents are erased. An ofstream thus provides a connection to a file so that we can write data to the file.
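
The create-or-erase behavior can be observed with a small sketch like the one below. The helper names writeLine and firstLine are hypothetical, and c_str() is used in place of data() on the assumption that either is acceptable to your compiler.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: open fileName for output and write one line to it.
// Opening an ofstream this way creates the file if it is missing and
// discards whatever the file held before.
void writeLine(const std::string& fileName, const std::string& text) {
    std::ofstream outFileStream(fileName.c_str());
    assert(outFileStream.is_open());
    outFileStream << text << '\n';
}   // the stream is closed automatically when it goes out of scope

// Hypothetical helper: return the first line currently stored in fileName.
std::string firstLine(const std::string& fileName) {
    std::ifstream inFileStream(fileName.c_str());
    assert(inFileStream.is_open());
    std::string line;
    std::getline(inFileStream, line);
    return line;
}
```

Calling writeLine() twice on the same file name and then firstLine() should return only the second text, because the second open erased the first contents.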
Using this information, implement the appropriate step of our algorithm by declaring an ofstream named outStream that serves as a connection between our program and the file whose name is in outFile.
Libraries. Try compiling your code. Oops. You should get complaints about ifstream and ofstream. The answer is in the operations chart above: you haven't included the proper library.
Include the proper library for these identifiers, and then compile your code. Don't run your code yet because your loop doesn't have a termination test.
Checking that a Connection Opened Correctly. Opening files is an operation that is highly susceptible to user errors. Suppose the user has accidentally deleted the input file and our program tries to open a connection to it. Or what if the file never existed in the first place? If an fstream opens as expected, the operation is said to succeed, but if it does not open as expected, the operation is said to fail.
To detect the success of an open operation, fstream objects contain an is_open() method:
    fileStream.is_open()
which returns true if fileStream is open, and it returns false otherwise.
In an assert(), the is_open() method provides a readable way to perform the checking steps of our algorithm.
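
As a rough illustration of the two pieces working together, here is a sketch with hypothetical helper names (opensForInput and requireInputFile are not part of the lab's files). It uses c_str() on the assumption that your compiler accepts it.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: report whether fileName can be opened for input.
bool opensForInput(const std::string& fileName) {
    std::ifstream fileStream(fileName.c_str());
    return fileStream.is_open();
}

// Hypothetical helper: open fileName for input and halt the program with a
// diagnostic if the open failed. Note that assert() is compiled out when
// NDEBUG is defined, so this check only runs in debug-style builds.
void requireInputFile(const std::string& fileName) {
    std::ifstream fileStream(fileName.c_str());
    assert(fileStream.is_open());
}
```

opensForInput() returns false for a file that does not exist, while requireInputFile() stops the program in that situation, which is the behavior the algorithm's checking steps rely on.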
Write the code for these steps. Compile (but again don't execute) your program.
Input from an ifstream. The most important thing about input (and output) is that you already know how to do it:
Helpful hint: File I/O is done the same way as screen and keyboard I/O.
Just as we have used the >> operator to read data from the istream named cin, the >> operator can be used to read data from an ifstream opened for input.
Since the ifstream connects a file to a program, applying >> to it transfers data from the file to the program. For this reason, this operation is described as reading from a file, even though we are actually operating on the ifstream. An expression of the form:
    inputFileStream >> VariableName
thus serves to read values from an ifstream named inputFileStream into the variable VariableName. The type of the value being read must match the type of VariableName, or the operation will fail.
However, while the input operator is the appropriate operator to solve many problems involving file input, it is not the appropriate operator for our problem. The reason is that the >> operator skips leading whitespace characters. That is, if our input were
    One if by land. Two if by sea.
and we were to use the >> operator (in a loop) to read each of these characters:
    inStream >> inChar;
then all whitespace characters (blanks, tabs and newlines) would be skipped, so that only non-whitespace characters would be processed, as if the file contained
    Oneifbyland.Twoifbysea.
To avoid this problem, ifstream objects contain a get() method:
    inputFileStream.get(CharacterVariable);
When execution reaches this statement, the next character, including whitespace characters, is read from inputFileStream and stored in CharacterVariable.
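
The difference between the two ways of reading a char can be seen side by side in the sketch below. To keep it self-contained, an istringstream stands in for the file; it supports the same >> and get() operations as an ifstream. The function names are hypothetical, and the loops stop when a read fails rather than using the loop control described later in this exercise.

```cpp
#include <sstream>
#include <string>

// Collect characters with >>, which skips blanks, tabs, and newlines.
std::string readWithExtraction(const std::string& fileContents) {
    std::istringstream inStream(fileContents);   // stand-in for an ifstream
    std::string result;
    char inChar;
    while (inStream >> inChar) result += inChar;
    return result;
}

// Collect characters with get(), which delivers every character it reads.
std::string readWithGet(const std::string& fileContents) {
    std::istringstream inStream(fileContents);   // stand-in for an ifstream
    std::string result;
    char inChar;
    while (inStream.get(inChar)) result += inChar;
    return result;
}
```

Given the sample sentence, readWithExtraction() returns it with every blank removed, while readWithGet() returns it unchanged.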
Use the get() method of the inStream object to perform the char input in the loop. Then compile your program, and continue when your program compiles without error. Don't run it yet!
Controlling a File-Input Loop. Files are created by a computer's operating system. When the operating system creates a file, it marks the end of the file with a special end-of-file mark. Input operations are then implemented in such a way as to prevent them from reading beyond the end-of-file mark, since doing so could allow a programmer unauthorized access to the files of another programmer. Once a stream has reached the end-of-file mark, every further input operation fails and leaves the stream there until the program notices.
An ifstream object has a method named eof() that can be used to control an input loop:
    inputFileStream.eof()
This expression returns true if the last read from inputFileStream tried to read the end-of-file mark, and it returns false otherwise. We have to read first, then test for end-of-file.
In a forever loop like the one in the source program, the eof() method can be used as our termination test. By placing an if-break combination:
    if ( /* end-of-file has been reached */ ) break;
following the input step, repetition will be terminated when all of the data in the file has been processed.
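
The read-first, test-second ordering is sketched below on a deliberately different task (counting characters), so the names countCharacters and countCharactersIn are hypothetical. An istringstream is used as a stand-in so the loop can be tried without a file on disk.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: count the characters in a stream using a forever
// loop that reads first and tests for end-of-file second.
int countCharacters(std::istream& inputFileStream) {
    int count = 0;
    char ch;
    for (;;) {
        inputFileStream.get(ch);             // read first...
        if (inputFileStream.eof()) break;    // ...then test for end-of-file
        ++count;                             // process only after the test
    }
    return count;
}

// Convenience wrapper: run the loop over a string instead of a file.
int countCharactersIn(const std::string& text) {
    std::istringstream in(text);
    return countCharacters(in);
}
```

Testing before processing keeps the failed final read from being counted. Note that eof() only reports end-of-file: a stream that never opened fails its reads without ever reaching end-of-file, so this loop would not terminate, which is one reason the open check matters.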
In your source program, place an if-break in the appropriate place in our algorithm, using the eof() method of inStream as the condition in the if statement. Then compile your source program, to check the syntax of what you have written. When it is syntactically correct, continue to the next part of the exercise. You probably could run the program now, but it won't do anything interesting because it's not generating any output.
File Output. Just as we have used the << operator to write data to the ostream named cout, the << operator can be used to write data to an ofstream opened for output. Since the ofstream connects a program to a file, applying << to it transfers data from the program to the file. This operation is thus described as writing to the file, even though it is an ofstream operation.
The pattern for output should look pretty familiar:
    outputFileStream << Value;
Here outputFileStream is an ofstream, and Value is the value that should be written to the file.
Use this example as a basis for a statement to finish up the loop, writing the encoded character (not the original!) to the output file. Compile your program to test the syntax of what you have written, and fix all of your compilation errors.
Closing Files. Once we are done using a stream to read from or write to a file, we should close it, to break the connection between our program and the file. This is accomplished using the method close(). Both the ifstream and ofstream classes have this method:
    fileStream.close();
When execution reaches this statement, the program severs the connection between fileStream and its file.
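
To show the whole open, check, loop, write, close sequence in one place without giving away the encoding driver, here is a sketch that performs a different transformation: it copies a file while converting letters to upper case. The function copyUppercase is hypothetical, it reports an open failure by returning false instead of using assert(), and it uses c_str() on the assumption that your compiler accepts it.

```cpp
#include <cctype>
#include <fstream>
#include <string>

// Hypothetical helper: copy inFile to outFile, converting letters to upper
// case. Returns false if either file could not be opened.
bool copyUppercase(const std::string& inFile, const std::string& outFile) {
    std::ifstream inStream(inFile.c_str());
    if (!inStream.is_open()) return false;

    std::ofstream outStream(outFile.c_str());
    if (!outStream.is_open()) return false;

    char inChar;
    for (;;) {
        inStream.get(inChar);                // read one character
        if (inStream.eof()) break;           // stop once the file is exhausted
        char outChar = static_cast<char>(
            std::toupper(static_cast<unsigned char>(inChar)));
        outStream << outChar;                // write the transformed character
    }

    inStream.close();                        // break both connections
    outStream.close();
    return true;
}
```

The input stream is opened and checked before the output stream is created, so a missing input file does not leave an empty output file behind.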
In the appropriate place in the source program, place calls to close() on the input stream and on the output stream. Then compile your program, and ensure that it is free of syntax errors.
When your program's syntax is correct, test it using the provided file named message.text. If what you have written is correct, your program should create an output file, containing the output:
    Rqh Li Eb Odqg Wzr Li Eb Vhd
If this file is not produced, then your program contains a logical error. Retrace your steps, comparing the statements in your source program to those described in the preceding parts of the exercise, until you find your error. Or, pretend you're the computer and walk through your program. Correct your program, recompile it, and retest your program until it performs correctly.
The last part of this exercise is for you to apply what you have learned to the problem of decoding a file encoded using the Caesar cipher. Complete the skeleton program decode.cpp, which can be used to decode a message encoded using the Caesar cipher. Do all that is necessary to get this program operational, so that messages encoded with encode.cpp can be decoded with decode.cpp. Put differently, the two programs should complement one another.
The difficult part has been done for you. The caesar library contains a caesarDecode() function which does all the work of decoding. Your job is to build the driver to handle file I/O.
To test your program, you can use the output file created by encode.cpp, or alice.code, a selection from Lewis Carroll's Alice in Wonderland.
Beware! Watch the names of your output files. Every application you've used has probably warned you when you're about to overwrite an existing file. This is behavior that the program had to implement. You haven't implemented it in this program, so you won't get this warning. So if you encode message.text to be message.code, and then you decode message.code to be message.text, then say goodbye to the old message.text! The old version will disappear, and you'll have to copy it over again. It's better to use message.decode, perhaps, when you decode message.code.
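
If you would like to see roughly what such a safeguard involves, here is one possible sketch. It is not required for the lab, and the function openIfNew is hypothetical. It treats a file as existing when it can be opened for reading, which is a simplification: an unreadable file looks absent, and another program could create the file between the check and the open.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: connect outStream to outFile only if no file by that
// name can already be opened for reading. Returns true when outStream is
// open and ready for writing.
bool openIfNew(std::ofstream& outStream, const std::string& outFile) {
    std::ifstream probe(outFile.c_str());
    if (probe.is_open()) return false;    // file exists: leave it untouched
    outStream.open(outFile.c_str());
    return outStream.is_open();
}
```

A driver could call openIfNew() and, when it returns false, prompt the user for a different output file name instead of erasing the existing file.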
Turn in your code as well as the output from your programs.