Lab 9: Files and Streams


Introduction

Anyone who has used computer disks, e-mail accounts, the world-wide web, and so on has been using files for storing and retrieving information. The source programs that we've written in these lab exercises and projects have been stored in files as have the binary executables created by compiling those programs.

However, files as we use them in computing are more general than programs because they can store information other than program instructions. Files are containers that store data: program code, the HTML representation of a web page, the text of a novel, pictures, and so on. They differ from programs in that a program is a sequence of instructions and a file is a container in which data (which might be program code) can be stored. For example, word processing documents are saved as files. Those files look quite different from the files that store your programs and the files that store binary executables.

But a basic question remains. A word processor can read data from a file and can write data to a file. Graphics software can read data from a file and display it graphically on the screen or produce an output file that can be sent to a printer. Can our programs read and process data from a file in addition to input from the keyboard? In addition to outputting data directly to the screen, can they send data to a file that is saved and can be output or processed in some other way at a later time? This would be especially useful for problems where the amount of data to be processed is so large that entering it from the keyboard each time the program is executed is inconvenient or even impractical. Similarly, if a large amount of output is produced, we might wish to save it to disk and process it later with other software.

The Problem

Because this is our first look at file I/O, we will start with a simple, but interesting, problem: encrypting and decrypting text. When you were younger, perhaps you enjoyed writing "secret" messages that could be read only by someone who knew your secret way of encrypting that message. Today, some of us still anxiously await delivery of the newspaper so we can tackle the daily "cryptoquip." And we all are aware of the importance of encryption and decryption techniques in espionage and other situations where information must be kept secret.

Encrypting and decrypting messages of this sort have a long history. The example we will look at in this lab exercise is one of the simplest and most widely known — the Caesar cipher, named after Julius Caesar who, according to Wikipedia, used it in his private correspondence. Here is an example of a message encrypted using a Caesar cipher:

Message                           Encrypted
One if by land, two if by sea.    Rqh li eb odqg, wzr li eb vhd.

What is the relationship between the letters in the original sentence and those in the encrypted sentence? Hint: compare the "difference" between the corresponding characters in the two sentences.

This lab's exercise is to use the Caesar cipher technique to encrypt and decrypt messages stored in files.
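
To experiment with the idea before turning to files, here is a minimal sketch of a letter-shifting function. It is not the caesar library's implementation: the names shiftLetter() and shiftText() are hypothetical, and the code assumes a character set such as ASCII in which the letters of each case are contiguous.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not part of the caesar library): shift one letter
// forward by key positions, wrapping around from 'z' to 'a'.
// Characters that are not letters are returned unchanged.
// Assumes the letters of each case are contiguous, as in ASCII.
char shiftLetter(char ch, int key)
{
    assert(key >= 0);
    unsigned char u = static_cast<unsigned char>(ch);
    if (std::isupper(u))
        return static_cast<char>('A' + (ch - 'A' + key) % 26);
    if (std::islower(u))
        return static_cast<char>('a' + (ch - 'a' + key) % 26);
    return ch;
}

// Apply shiftLetter() to every character of a string.
std::string shiftText(const std::string& text, int key)
{
    std::string result;
    for (std::string::size_type i = 0; i < text.size(); ++i)
        result += shiftLetter(text[i], key);
    return result;
}
```

Calling shiftText(message, key) with the right key reproduces the encrypted sentence in the table above, and shifting the result by 26 - key restores the original.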

Files

Here are the files you will be using: the caesar library, encrypt.cpp, message.txt, decrypt.cpp, and alice.code.

Remember to add your name and other information required by your instructor to the opening documentation of the files to be handed in.

An Encryption Program

The first part of this exercise is to write a program to encrypt a message that is stored in a file. The encrypted message will be saved in a second file. For this you will need the caesar library, the source program encrypt.cpp that you will complete, and the data file message.txt to be encrypted.

Design

As usual, we will use object-centered design to develop our program.

Behavior.

Our program should display an introductory message and then prompt for and read the name of the input file. It should then connect an input stream to that file so that we can read from it, and check that the stream opened correctly. It should then prompt for and read the name of the output file. It should then connect an output stream to that file so that we can write to it and check that the connection was made. Our program should read each character in the input file one at a time, encrypt it using the Caesar cipher, and output the encrypted character to the output file. It should then disconnect the streams from the files and quit.

Objects. Using this behavioral description, we can identify the following objects:

Description                        Type      Kind      Name
an introductory message            string    constant  ---
the name of the input file         string    varying   inFile
an input stream                    ifstream  varying   inStream
the name of the output file        string    varying   outFile
an output stream                   ofstream  varying   outStream
a character from the input file    char      varying   inChar
an encrypted character             char      varying   outChar

Using this list of objects, we can formulate our specification:

Specification:
        Input  (inFile): a sequence of non-encrypted characters
        Output  (outFile): a sequence of encrypted characters

Note: This list of objects raises an important distinction:

    the difference between a file name and a file stream.

Look at their data types: a file name (inFile, outFile) is a string, whereas a file stream (inStream, outStream) is an ifstream or an ofstream.

Understanding the difference may save you a lot of time debugging to find where you applied a file stream operation to the name of a file (a string) or a string operation to a file stream.
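
The distinction can be sketched in a few lines. The helper names below are hypothetical and serve only as illustration: one function performs a string operation on a file name, the other a stream operation on a file stream built from a name.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: a STRING operation applied to a file NAME.
// Replaces the extension of inFile (if it has one) with ".code".
std::string codeNameFor(const std::string& inFile)
{
    std::string::size_type dot = inFile.rfind('.');
    return inFile.substr(0, dot) + ".code";   // npos keeps the whole name
}

// Hypothetical helper: a STREAM operation applied to a file STREAM.
// Returns true if a file with the given name could be opened for output.
bool createEmptyFile(const std::string& fileName)
{
    std::ofstream outStream(fileName.c_str());  // the stream is built from the name
    assert(!fileName.empty());
    return outStream.is_open();                 // is_open() belongs to the stream
}
```

Swapping the two kinds of operations, for example writing fileName.is_open() or outStream.rfind('.'), produces a compilation error because neither type has that method.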
 
Operations. From our behavioral description, we have these operations:

Description                                       Predefined?  Name                  Library
Display a string                                  yes          <<                    iostream
Read a string                                     yes          >> or getline()       iostream
Connect an input stream to a file                 yes          ifstream declaration  fstream
Connect an output stream to a file                yes          ofstream declaration  fstream
Check...                                          yes          assert()              cassert
... that a stream opened properly                 yes          is_open()             fstream
Read a char from an input stream                  yes          get()                 fstream
Encrypt a char using the Caesar cipher            yes          caesarEncrypt()       caesar
Write a char to an output stream                  yes          <<                    fstream
Repeat input, encrypting, and output operations   yes          input loop            built-in
Determine when all chars have been read           yes          eof()                 fstream
Disconnect a stream from a file                   yes          close()               fstream

Algorithm. We can organize these operations into the following algorithm:

  1. Display a greeting.
  2. Prompt for and read inFile, the name of the input file.
  3. Create an ifstream named inStream connecting our program to inFile.
  4. Check that inStream opened correctly.
  5. Prompt for and read outFile, the name of the output file.
  6. Create an ofstream named outStream connecting our program to outFile.
  7. Check that outStream opened correctly.
  8. Loop through the following steps:
    1. Read a character from the input file.
    2. If the end-of-file was reached, then terminate repetition.
    3. Encrypt the character.
    4. Write the encrypted character to the output file.
    End loop.
  9. Close the input and output connections.
  10. Display a "successful completion" message.

Coding

Steps 1, 2, and 5 are familiar and have already been implemented in main() of encrypt.cpp, as has the encrypting in sub-step 3 of Step 8, which is handled by the caesarEncrypt() function from the caesar library. You will be implementing the file I/O steps in this lab exercise.

Opening a Connection to a File. Before we can input data from a file, our program has to open a connection to that file. We also have to be precise about which file we want and where it is found. We surely don't want the program to have to search through all files on the machine with each transfer of data to or from a file!

As we will see, once this connection is made, input from and output to a file proceeds in very much the same way as input from the keyboard and output to the screen. However, file I/O is fairly "expensive" timewise because data moves much more slowly to and from a disk than to and from a computer's main memory.

So how do we open a connection between the program and a file? File connections are known as streams, and there are two types of streams: ifstream for input file streams and ofstream for output file streams. Like any other object, a stream must be declared before it can be used. If inputFileName is a string object containing the name of an input file, then the declaration

ifstream inFileStream(inputFileName.data());
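constructs a stream object named inFileStream as a connection to the file named by inputFileName. The string method data() extracts the actual characters from a string (the method c_str() does the same, and since C++11 the string itself can be passed). If the file does not exist, the attempt to open it fails. More on this later.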

═► Using this information, implement the step of our algorithm that creates and opens a connection inStream to the input file named inFile.

An output stream is similar; the declaration

ofstream outFileStream(outputFileName.data());
constructs a stream object named outFileStream as a connection to the file named outputFileName. If the file does not exist, then a file by that name is created in the working directory. If the file does exist, then its contents are erased. An ofstream thus provides a connection to a file so that we can output data to the file.
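
The erasing behavior is easy to observe. The following is a minimal sketch, with hypothetical helper names writeText() and readText(), that writes a file twice and then reads it back. The reading loop uses get(), which is described later in this exercise, and tests the stream after each read.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Write text to the named file. If the file already exists,
// opening the ofstream erases its old contents first.
void writeText(const std::string& fileName, const std::string& text)
{
    std::ofstream outStream(fileName.c_str());
    assert(outStream.is_open());
    outStream << text;
}   // outStream is closed automatically when it goes out of scope

// Read the entire file back, one character at a time.
std::string readText(const std::string& fileName)
{
    std::ifstream inStream(fileName.c_str());
    assert(inStream.is_open());
    std::string contents;
    char ch;
    while (inStream.get(ch))    // stops when a read attempt fails
        contents += ch;
    return contents;
}
```

If writeText() is called twice with the same file name, readText() returns only the text from the second call; nothing from the first call survives.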

═► Using this information, implement the appropriate step of our algorithm by declaring an ofstream named outStream that serves as a connection between our program and the file whose name is in outFile.
 

Libraries. Try compiling your code. When you do, you should get error messages about ifstream and ofstream. To see how to fix this, look back in the operations chart above to find the name of the library where these streams are defined that must be #included.

═► Include the proper library for these identifiers, and then compile your code. But don't try to execute it yet because your loop doesn't have a termination test.
 

Checking that a Connection Opened Correctly. If an fstream opens as expected, the operation is said to succeed, but if it does not open as expected, the operation is said to fail. A variety of errors can cause an attempt to open a file to fail. For example, the input file to which our program attempts to open a connection may have been accidentally deleted. Or that file may have never existed in the first place!

To check whether an open operation succeeded, fstream objects contain an is_open() method:

fileStream.is_open()
returns true if fileStream is open, and it returns false otherwise.
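
As an illustration, here are two ways of acting on the result of is_open(). Both helpers are hypothetical names used only for this sketch. The first halts the program with assert(), as this lab does; the second uses the standard open() method of ifstream and reports the failure so that the caller can decide what to do.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>

// The approach used in this lab: halt the program if the stream is not open.
void requireOpen(std::ifstream& inStream)
{
    assert(inStream.is_open());
}

// An alternative sketch: try to open fileName through inStream and
// return false, with a message, if the attempt fails.
bool openForInput(std::ifstream& inStream, const std::string& fileName)
{
    inStream.open(fileName.c_str());
    if (!inStream.is_open())
    {
        std::cerr << "Unable to open " << fileName << " for input\n";
        return false;
    }
    return true;
}
```

The assert() version is shorter, which is why we use it here; the second version lets a program recover, for example by prompting for a different file name.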

═► Use the is_open() method in an assert() to implement the file-open checks in Steps 4 and 7 of our algorithm. Compile (but again don't execute) your program.
 

Input from an ifstream. One of the nicest things about file input and output is that you already know how to do it:

  File I/O is done the same way as keyboard and screen I/O.

In the same way that we have used the >> operator to read data from the istream named cin, we can use the >> operator to read data from an ifstream opened for input. Because an ifstream connects a file to a program, applying >> to it transfers data from the file to the program. For this reason, this operation is described as reading from a file, even though we are actually operating on the ifstream. An expression of the form

inputFileStream >> variableName

reads values from an ifstream named inputFileStream into the variable variableName. The type of the value being read must match the type of variableName, or the operation will fail.

However, although the input operator is the appropriate operator to solve many problems involving file input, it is not for our problem. The reason is that the >> operator skips leading whitespace characters. That is, if our input is

     One if by land.
     Two if by sea.
and we use the >> operator (in a loop) to read each individual character,
     inStream >> inChar;
then all whitespace characters (blanks, tabs and newlines) will be skipped, so that only non-whitespace characters are processed, just as if the file contained
     Oneifbyland.Twoifbysea.

To avoid this problem, ifstream objects contain a get() method:

inputFileStream.get( characterVariable );
When execution reaches this statement, the next character, including whitespace characters, is read from inputFileStream and stored in characterVariable.
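
The difference between the two operations can be seen in a small experiment. This sketch reads from an istringstream rather than a file so that it needs no data file; >> and get() behave the same way on an ifstream. The function names are hypothetical.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Collect every character read with >>, which skips whitespace.
std::string readWithExtraction(std::istream& in)
{
    std::string result;
    char ch;
    while (in >> ch)
        result += ch;
    return result;
}

// Collect every character read with get(), which keeps whitespace.
std::string readWithGet(std::istream& in)
{
    std::string result;
    char ch;
    while (in.get(ch))
        result += ch;
    return result;
}
```

Given the two-line input shown above, readWithExtraction() returns the run-together text with no blanks or newlines, while readWithGet() returns the input exactly as it was.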

═► Use the get() method of the inStream object to perform the char input in the loop. Then compile your program, but don't execute it yet!
 

Controlling a File-Input Loop. Files are managed by a computer's operating system, which keeps track of where each file ends. It is convenient to think of this as a special end-of-file mark that follows the last character in the file. Input operations are implemented so that they cannot read beyond this end-of-file mark, because doing so could allow a programmer unauthorized access to files belonging to someone else. However, unless instructed to stop, a loop will keep trying to read data at this end-of-file forever.

Each ifstream object has a method named eof() that can be used to detect this end-of-file condition,

inputFileStream.eof()
This expression returns true if the last attempt to read from inputFileStream tried to read the end-of-file mark, and it returns false otherwise. Note that we have to attempt to read first and then test for end-of-file.

In a forever loop like the one in the source program, the eof() method can be used as our termination test. By placing an if-break combination

if ( /* end-of-file has been reached */ ) break;
following the input step, repetition will be terminated when all of the data in the file has been processed.
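
To show the read-then-test pattern in a different setting than the lab's program, here is a minimal sketch that counts the characters in a stream. The function countChars() is a hypothetical example, and it reads from any istream so that it can be tried without a data file.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Count the characters in a stream using a forever loop with an if-break.
int countChars(std::istream& in)
{
    assert(in.good());           // a stream that failed to open never reaches eof
    int count = 0;
    char ch;
    for (;;)
    {
        in.get(ch);              // attempt to read first...
        if (in.eof()) break;     // ...then test for end-of-file
        ++count;                 // reached only when a character was read
    }
    return count;
}
```

Note the order of the statements in the loop: if the test came before the read, the loop body would run one extra time after the last character. Note also that if a file stream failed to open, eof() never becomes true and the loop would not terminate, which is one reason the open check comes first.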

═► In your source program, place an if-break in the appropriate place in our algorithm, using the eof() method of inStream as its condition. Then compile your source program to check the syntax of what you have written. When it is correct, continue to the next part of this lab exercise. (Note: You could run the program now, but it won't really do anything interesting because it's not generating any output.)
 

File Output. Just as the << operator was used to write data to the ostream named cout, it can also be used to write data to an ofstream opened for output. Once the ofstream has connected the program to a file, << can be used to transfer data from the program to the file. We call this writing to the file, although it actually is an ofstream operation.

The pattern for output is what we would expect:

outputFileStream << value 
where outputFileStream is an ofstream, and value is to be written to the file.
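
One detail is worth keeping in mind: what << writes depends on the type of the value. The sketch below uses hypothetical function names and writes to any ostream, so it applies to cout and to an ofstream alike. The digits noted in the second function assume the ASCII character set.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// The result is stored in a char, so << writes a single character.
void writeNextChar(std::ostream& out, char ch)
{
    char outChar = static_cast<char>(ch + 1);
    out << outChar;
}

// The expression ch + 1 has type int, so << writes the digits of a number.
// For 'A' in ASCII this is the number 66, not the letter B.
void writeSum(std::ostream& out, char ch)
{
    out << (ch + 1);
}
```

This is why our object list gives the encrypted character its own char variable, outChar, and why that variable is what we send to the output stream.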

═► Now, finish the loop in our program, writing the encrypted character — not the original — to the output file. Compile your program to test its syntax and fix any compilation errors.
 

Closing Files. Once we are done using a stream to read from or write to a file, we should close it to break the connection between our program and the file. This is accomplished using the close() method. Both the ifstream and ofstream classes have this method:

fileStream.close();
When execution reaches this statement, the connection between fileStream and the file is broken.
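
Closing matters most when the same file is used again later in a program. As a minimal sketch, the hypothetical function roundTrip() below writes a line to a file, closes the output stream, and then reads the line back through an input stream.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Write a line to fileName, close the stream, then read the line back.
std::string roundTrip(const std::string& fileName, const std::string& line)
{
    std::ofstream outStream(fileName.c_str());
    assert(outStream.is_open());
    outStream << line << '\n';
    outStream.close();               // flushes pending output and breaks the connection
    assert(!outStream.is_open());    // the stream is no longer connected

    std::ifstream inStream(fileName.c_str());
    assert(inStream.is_open());
    std::string result;
    std::getline(inStream, result);  // reads up to, but not including, the newline
    inStream.close();
    return result;
}
```

Without the call to outStream.close(), some of the output might still be waiting in the stream's buffer when the file is reopened for reading.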

═► In the appropriate place in the source program, place calls to close() for both the input stream and the output stream. Then compile and recompile your program until it is free of syntax errors.
 

Testing and Debugging

When your program's syntax is correct, test it using the provided file named message.txt. If what you have written is correct, your program should create an output file that contains the output

    Rqh Li Eb Odqg
    Wzr Li Eb Vhd
If this file is not produced, then your program contains a logic error. Check and recheck the statements in your source program, comparing them to those described in the earlier parts of this lab exercise until you find the error. Correct your program, recompile and retest it, repeating this until it produces the correct output.
 

Applying What We Have Learned

The last part of this exercise is for you to apply what you have learned to the problem of decoding a file that was encrypted using the Caesar cipher. For this you will need the caesar library, the source program decrypt.cpp, and the data file alice.code.

Complete the skeleton program decrypt.cpp for decoding a message encoded using the Caesar cipher. Do all that is necessary to make it operational, so that messages encrypted with encrypt.cpp can be decrypted with decrypt.cpp. The two programs should, therefore, complement each other.

The difficult part has been done for you. The caesar library contains a caesarDecrypt() function that does all the work of decrypting. Your job is to build the driver to handle file I/O.

Test your program using both the output file created by encrypt.cpp and alice.code, a selection from Lewis Carroll's Alice in Wonderland.

Watch out! Be careful with the names of your output files. Other software applications that you use may warn you when you're about to overwrite an existing file. However, you won't get this warning because we didn't implement this feature in our program. So, if you encrypt message.txt to produce message.code and then you decrypt message.code to be message.txt, then say goodbye to the old message.txt! It has been blown away and you'll have to copy it over again. It's better to use some other suffix such as message.decrypt when decrypting a file message.code.
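
For the curious, such a warning could be built from operations we have already seen. The following is only a sketch with hypothetical helper names, not part of the lab's programs. It treats a file as existing if it can be opened for input, so a file that exists but cannot be read would be reported as missing.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: true if a file with this name can be opened for input.
bool fileExists(const std::string& fileName)
{
    std::ifstream probe(fileName.c_str());
    return probe.is_open();
}

// Hypothetical helper: connect outStream to fileName only if no such
// file exists yet, so that an existing file is never erased.
bool openIfNew(std::ofstream& outStream, const std::string& fileName)
{
    if (fileExists(fileName))
        return false;
    outStream.open(fileName.c_str());
    return outStream.is_open();
}
```

A program could call openIfNew() in place of declaring the ofstream directly and, when it returns false, prompt the user for a different output file name.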

Submit

Turn in your final code for encrypt.cpp and decrypt.cpp as well as the output from these programs.


Report errors to Larry Nyhoff (nyhl@cs.calvin.edu)