Lab 9: Vectors


Introduction

In past exercises, we have dealt with sequences of values by processing the values one at a time. Today, we want to examine a new kind of object called a vector that can store not just one value, but an entire sequence of values. Like a string, a vector is a subscripted object, meaning the values within it can be accessed via a subscript or index:

The components or "spaces" within a vector in which values can be stored are called the vector's elements.

By storing a sequence of values in a vector, we can design operations for the sequence, and then implement such operations in a function that receives the entire sequence through a vector parameter.

To introduce the use of the vector, today's exercise is to write a program that processes the names and scores of students in a course, and assigns letter grades (A, B, C, D or F) based on those scores, using the "curve" method of grading. The format of the input file is a series of lines, each having the form:

   name score
As usual, we will use object-centered design to design our solution.

Analysis

With a little thought, it should be evident that this problem requires that we process the sequence of scores more than once. More precisely, to assign letter grades using the "curve" method, we must first compute the average and standard deviation of the scores. Once we have these values, we can determine the letter grade corresponding to each score. We must thus process the sequence of scores at least twice.

One way to solve this problem would be to use a file to store the sequence, and then reread the file each time we need to process the values. The problem with this approach is that reading from a file is quite slow compared to reading from main memory -- up to 1000 times as slow!

By contrast, the vector provides a convenient way to store a sequence of values in main memory and to directly access any value in the sequence whenever we wish. A vector thus allows us to process a sequence of values multiple times in a more efficient manner than does a file.

Behavior

Our program should display a greeting, and then prompt for and read the name of an input file. It should then read a sequence of names and a sequence of scores from the input file. It should then compute the average and standard deviation of the sequence of scores, and display these values. Using the average and standard deviation, it should compute the sequence of letter grades that correspond to the sequence of scores. It should then display each student's name, score and letter grade.

Objects

If we examine our behavioral description for objects, we find these:

Description Type Kind Name
a greeting string constant none
The name of the input file string varying inFile
a sequence of names vector<string> varying nameVec
a sequence of scores vector<double> varying scoreVec
the average score double varying average
the standard deviation of the scores double varying stdDeviation
a sequence of letter grades vector<char> varying gradeVec

We can thus specify the behavior of our program this way:

   Input(keyboard): the name of an input file.
   Input(input file): a sequence of names and scores.
   Output(screen): the average and standard deviation of the scores,
                    plus the sequences of names and scores,
                    and the sequence of corresponding letter grades.

Operations

From our behavioral description, we have these operations:

Description Defined? Name Library?
1 Display a string yes << iostream
2 Read a string yes >> iostream
3 Read a sequence of names and scores from a file no -- --
4 Compute the average of a sequence scores no -- --
5 Compute the standard deviation of a sequence scores no -- --
6 Compute the sequence of letter grades corresponding to a sequence scores no -- --
7 Display sequences of names, scores and letter grades no -- --

As you can see, we have a few functions to write.

Algorithm

We can organize the preceding objects and operations into the following algorithm:

   0. Display a greeting.
   1. Prompt for and read the name of the input file.
   2. Fill nameVec and scoreVec with values from the input file.
   3. Output the mean and standard deviation of the values in scoreVec.
   4. Compute gradeVec, a vector of the letter grades corresponding
       to the scores in scoreVec.
   5. Display nameVec, scoreVec and gradeVec, value-by-value.
We will thus use three different vector objects, nameVec storing string values, scoreVec storing double values, and gradeVec storing char values.

Getting Started

A skeleton program is provided in grades.cpp. Since some of the operations we will be creating are reuseable, we store them in a library, consisting of the header file DoubleVectorOps.h, the implementation file DoubleVectorOps.cpp, and the documentation file DoubleVectorOps.doc. A Makefile is provided in Makefile, Sample data files are provided in scores1.data, scores2.data, and scores3.data.

Make a new directory for this exercise, and then save copies of these files into that directory. Then begin editing the file grades.cpp, and take a few moments to look it over.

As indicated by its documentation, grades.cpp is a skeleton program that (partially) encodes the algorithm given above. Today's exercise will consist of completing this program.

Defining vector Objects

To define an object named vecObj for storing multiple values of type Type, we can write:

   vector<Type> vecObj;
Such a statement defines vecObj as an object capable of storing multiple values of type Type. Since a vector can store any type of value, a programmer using a vector must specify the type of value they want to store in a given vector.

This is our first look at a new kind of class called a template. Just as a function provides a way to parameterize a set of statements, a template provides a way to parameterize a class declaration. A complete discussion of templates is beyond us at this point, but because they allow a type to be passed to an object definition, they are commonly used to create container classes like vector.

In grade.cpp, use this information to define nameVec and scoreVec (as vector objects capable of storing string and double objects, respectively), in the places indicated by the comments in grades.cpp. To define a vector, you must include the standard header file named vector.

Filling vector Objects from a File

Next, we want to write function FillVectors() which fills our vector objects with values from the file whose name is in inFile.

Design

We can describe how this function should behave as follows:

Behavior. Our function should receive the name of the input file, an empty vector of strings and an empty vector of doubles from its caller. The function should open an ifstream to the input file. It should then read each name and score from the ifstream, appending them to the vector of strings and the vector of doubles, respectively. When no values remain to be read, our function should close the ifstream, and pass back the filled vector objects.

Objects. Using the behavioral description, our function needs the following objects:

Description Type Kind Movement Name
The name of the input file const string & constant received inFile
An empty vector of strings vector<string> & varying received,
passed back
nameVec
An empty vector of doubles vector<double> & varying received,
passed back
scoreVec
an ifstream to inFile ifstream varying local inStream
a name string varying local name
a score double varying local score

Given these objects, we can specify the behavior of our function as follows:

   Receive: inFile, a string;
            nameVec, a vector<string>; and
            scoreVec, a vector<double>.
   Passback: nameVec and scoreVec, filled with the values from inFile.
Since this function seems unlikely to be generally reuseable, we will define it within the same file as our main function. Using the above information, place a prototype for FillVectors() before the main function, and a function stub after the main function. Note that since inFile is a class object that is received but not passed back, it should be defined as a constant reference parameter, not as a value parameter. By contrast, nameVec and scoreVec are passed back, and so should be defined as reference parameters.

Operations. In our behavioral description, we have the following operations:

Description Defined? Name Library?
1 Receive inFile, nameVec and scoreVec yes function call
mechanism
built-in
2 Open an ifstream to inFile yes fstream built-in
3 Read a string from an ifstream yes >> fstream
4 Read a double from an ifstream yes >> fstream
5 Append a string to a vector<string> yes push_back() vector
6 Append a double to a vector<double> yes push_back() vector
7 Repeat 3-6 for each line in the file yes eof-controlled
input loop
built-in
8 Close an ifstream yes close() fstream
9 Pass back vector objects yes reference parameter
mechanism
built-in

Algorithm. We can organize these objects and operations into the following algorithm:

   0. Receive inFile, nameVec and scoreVec.
   1. Open inStream, a stream to inFile, 
       and check that it opened successfully.
   2. Loop
      a. Read name and score from inStream.
      b. If end-of-file was read, terminate repetition.
      c. Append name to nameVec.
      d. Append score to scoreVec.
      End loop.
   3. Close inStream.

Coding

To append values to a vector, use the vector function member named push_back(), whose pattern is

   vectorObject.push_back(newValue);
Such a statement "pushes" a value newValue onto the back of the vector object named vectorObject, effectively appending it.

Using this information, complete the stub of FillVectors(). Then compile what you have written and make sure that it is free of syntax errors before proceeding.

Average

Next, we want to write function Average() which given a vector of double values, returns the average of those values.

Design

We can describe how this function should behave as follows:

Behavior. Our function should receive a vector of double values. It should sum the values in the vector, and if there is at least one value in the vector, our function should return the sum divided by the number of values; otherwise, our function should display an error message and return a default value (e.g., 0.0).

Objects. Using the behavioral description, our function needs the following objects:

Description Type Kind Movement Name
A vector of doubles const vector<double> & varying received numVec
the sum of the values in the vector double varying local sum
the number of values in the vector int varying local numValues
an error message string constant local --
an default return value double constant local 0.0

Given these objects, we can specify the behavior of our function as follows:

   Receive:  numVec, a vector<double>.
   Precondition: numVec is not empty.
   Return: the average of the values in numVec.
Since this function seems likely to be generally reuseable, we will define it within a library named DoubleVectorOps. Using this information, place a prototype for Average() in DoubleVectorOps.h and DoubleVectorOps.doc, and a function stub in DoubleVectorOps.cpp. Note that since numVec is a class object that is received but not passed back, it should be defined as a constant reference parameter.

Operations. In our behavioral description, we have the following operations:

Description Defined? Name Library?
1 Receive numVec yes function call
mechanism
built-in
2 Sum the values in a vector<double> yes accumulate() numeric
3 Determine the number of values in a vector<double> yes size() vector
4 Divide two double values yes / built-in
5 Return a double value yes return built-in
6 Display an error message yes << iostream
7 Select 4 and 5 or 6 and 5, but not both yes if statement built-in

Algorithm. We can organize these objects and operations into the following algorithm:

   0. Receive numVec.
   1. Compute sum, the sum of the values in numVec.
   2. Compute numValues, the number of values in numVec.
   3. If numValues > 0:
         Return sum / numValues.
      Otherwise
         a. Display an error message via cerr.
         b. Return 0.0 as a default value.
      End if.

Coding

To sum the values in a vector, we can use the standard template library algorithm named accumulate() (from STL's numeric algorithm library), whose pattern is

   accumulate(vectorObject.begin(),vectorObject.end(), 0.0)
Such an expression returns the sum of the values in vectorObject. To use accumulate(), you must include the standard header file named numeric.

To determine the number of values in a vector, we can use the vector function member size(), whose pattern is:

   vectorObject.size()
Such an expression returns the number of values in vectorObject.

Using this information, complete the stub for Average(). Then uncomment the call to Average() in the main function, translate and test your program. Don't forget to #include the numeric library! Verify that your program is correctly computing the average of the values in the input file before you proceed.

vector and the Standard Template Library

In writing the Average() function, we made use of one of the vector function members: size(); and one of the C++ standard library algorithms: accumulate(). Being knowledgeable about what function members and algorithms can be applied to an object is important, because it allows you to avoid "reinventing the wheel." That is, we could have written our own functions to find the number of values in a vector, or add up the values in a vector, but why go to all that extra effort when we can use the provided one? Learning about the functions and algorithms that can be applied to an object takes time, but it is time well-spent, since it provides ready-made solutions for many of the problems you will encounter.

To help you get acquainted with some of the vector function members and standard algorithms, we have provided two simplified quick references: one for vector function members and one for standard C++ algorithms. These quick references are not exhaustive, but they do list many of the most commonly used operations. You are encouraged to print hard copies of these references, so that you can refer to them conveniently.

The Standard Template Library. As mentioned earlier, vector is our first look at a template, or parameterized class. The vector template is just one of many different kinds of containers provided by the C++ standard template library (STL). Other containers include the list, the set, the map, the stack, the queue and a variety of others. Each of these containers has its own unique properties that distinquish it from the others. At most universities, such containers are typically the subject of the second computer science course Introduction to Data Structures, and so we will not discuss the further here.

As it happens, the accumulate() algorithm is also from the standard template library, as are all of the algorithms described on our algorithms quick reference. STL thus provides not only containers, but also standardized container-independent algorithms that can be applied to any of the containers.

Iterators. Some of the vector function members and each of the STL algorithms make use of a new kind of object called an iterator. An iterator is an object that can move through a sequence of values, and access each of the values in turn. Each of the containers in STL provide iterators for processing their elements, and each of the STL algorithms take a container's iterators as arguments, which they use to access the elements in that container. Iterators thus provide the linkage by which STL algorithms are able to be applied to any of the STL containers.

A complete discussion of iterators is beyond us at this point, however it is useful to understand that each of the containers (include vector) contains a function member begin() that returns an iterator to its first element, plus a function member end() that returns an iterator pointing beyond its final element.

These two iterators thus mark the beginning and end of a container, so that when we use an STL algorithm like accumulate():

   double sum = accumulate(theVec.begin(), theVec.end(), 0.0);
that algorithm uses the iterators returned by begin() and end() to process the sequence of values stored in theVec. That is all that you need to know about iterators at this point. (See C++ An Introduction to Computing for more information.)

Standard Deviation

Returning to our problem at hand, we next need to write function StandardDev() which given a vector of double values, returns the standard deviation of those values. Unless you have had a course in statistics, you may not know how to compute the standard deviation of a sequence of values, so we will consolidate the normal design steps and provide a specification and algorithm for this task.

Design

The specification for this function is as follows:

   Receive:  numVec, a vector<double>.
   Precondition: numVec is not empty.
   Return: the standard deviation of the values in numVec.
Since this function seems likely to be generally reuseable, it should also be defined in our DoubleVectorOps library. Using this specification, place a prototype for StandardDev() in DoubleVectorOps.h and DoubleVectorOps.doc, and a function stub in DoubleVectorOps.cpp. As before, numVec should be defined as a constant reference parameter.

An algorithm to compute the standard deviation is as follows:

   0. Receive numVec.
   1. Compute numValues, the number of values in numVec.
   2. If numValues > 0:
         a. Define avg, the average of the values in numVec.
         b. Define sumSqrTerms, a double initialized to zero.
         c. Define term, a double.
         d. For each index i of numVec:
               1) Set term to numVeci - avg.
               2) Add term^2 to sumSqrTerms.
            End loop.
         e. Return sqrt(sumSqrTerms / numValues).
      Otherwise
         a. Display an error message via cerr.
         b. Return 0.0 as a default value.
      End if.
Note that we use the notation numVeci to refer to the value in numVec whose index is i.

Coding

To determine the number of values in a vector (step 1), we can use the vector function member size(), as we saw last time.

To compute the average of the values in a vector (step 2a), we can use the Average() function that we just defined.

Steps 2b and 2c are trivial. Step 2d can be performed using a for loop that counts from 0 to the numValues minus one:

   for (int i = 0; i < numValues; i++)
      ...
To access a value v within the vector, the subscript operation can be applied to a vector, in the same manner as a string. The pattern is:
   vectorObject [ index ]
Such an expression returns the value that is stored within vectorObject at index index.

Using this information, complete the stub for StandardDev(). Then uncomment the call to StandardDev() in the main function, and translate and test what you have written. You should get standard deviations of approximately 11.1803, 2.69186, and 0.927025 for scores1.data, scores2.data, and scores3.data, respectively. Make sure that your program is correctly computing the standard deviation of the values in the input file before you proceed.

Define another vector Object

This step is easy. In the appropriate place in the main function, define gradeVec as a vector of char objects. As usual, recompile to check the syntax of what you have written.

Computing Letter Grades

Print a hard copy of each of the three scores files (scores1.data, scores2.data and scores3.data). Take a moment and and look over the distribution of scores in each file. If you were assigning grades, what letter grades would you assign for the scores in scores1.data, scores2.data and scores3.data? Take a moment and write beside each person's name (on the hard copies) what letter grade you would give them, based on their score.

Our task is to write a function that computes the appropriate letter grades for the input scores. Since this function needs a sequence of scores to process, and must return a corersponding sequence of letter grades, we can specify what the function must do as follows:

   Receive: scoreVec, a vector of double values.
   Return: gradeVec, a vector of char values.
Since this function seems pretty tightly tied to this particular grading problem, we will define it locally, within grades.cpp. Using this information, prototype ComputeLetterGrades() before the main function and define a stub following the main function. Since scoreVec is a class object that is received but not passed back, define it as a constant reference parameter.

Students often want grades to be "curved." The "curve" method of grading is based upon the assumption that the scores being graded fall into a normal distribution (i.e., a bell curve). Given a set of scores, the "curve" method determines letter grades as follows:

An algorithm for determining the letter grades corresponding to a vector of scores is thus as follows:
   0. Receive scoreVec, a vector of scores.
   1. Define numValues, the number of values in the vector.
   2. Define gradeVec, a vector of characters
       the same length as scoreVec.
   3. If numValues > 0:
         a. Define avg, the average of the values.
         b. Define standardDev, the standard deviation of the values.
         c. Define F_CUT_OFF as avg - 1.5 * standardDev.
         d. Define D_CUT_OFF as avg - 0.5 * standardDev.
         e. Define C_CUT_OFF as avg + 0.5 * standardDev.
         f. Define B_CUT_OFF as avg + 1.5 * standardDev.
         g. For each index i of scoreVec:
               If scoreVeci < F_CUT_OFF:
                  Set gradeVeci to 'F'.
               Else if scoreVeci  < D_CUT_OFF:
                  Set gradeVeci to 'D'.
               Else if scoreVeci  < C_CUT_OFF:
                  Set gradeVeci to 'C'.
               Else if scoreVeci  < B_CUT_OFF:
                  Set gradeVeci to 'B'.
               Else
                  Set gradeVeci to 'A'.
               End if.
            End loop.
      End if.
   4. Return gradeVec.
As before, we use the notation scoreVeci and gradeVeci to refer to an element of scoreVec or gradeVec whose index is i. The corresponding notation in C++ is
   scoreVec[i]
or
   gradeVec[i]
Using the information above, complete the stub for ComputeLetterGrades(). Then compile what you have written to check for syntax errors. (We'll check for logic errors in the next step.)

Display the Names, Scores and Grades

The final operation is to display the information we have computed. Design and write a function that, given an ostream, a vector of names, a vector of scores, and a vector of letter grades, displays these three vector objects in tabular form. For example, the output produced when scores1.data is processed should appear something like the following:

   Enter the name of the scores file: scores1.data

   Mean score: 75
   Std. Dev: 11.1803

   joan    55      F
   joe     60      D
   jane    60      D
   jim     65      D
   janet   70      C
   john    70      C
   johanna 75      C
   jack    75      C
   joeline 75      C
   jacques 75      C
   josh    80      C
   janna   80      C
   jason   85      B
   jadzia  90      B
   jon     90      B
   jackie  95      A
If you find that you have logic errors in what you have written, use the debugger to find them.

When scores2.data or scores3.data are processed using the "curve" method, how do the "curved" grades compare with the grades you would have assigned?

What have you learned about grading "on the curve?"

Phrases you should now understand:

Vector, Index, Subscript, Element, Template, Algorithm, Iterator, Standard Template Library.


Submit:

Hard copies of your final versions of grades.cpp, DoubleVectorOps.h, DoubleVectorOps.cpp, DoubleVectorOps.doc, and execution records showing the processing of scores1.data, scores2.data and scores3.data, plus the hard copies of these scores that you printed earlier annotated with the grades you assigned.


Back to This Lab's Home Page

Back to the Prelab Questions

Forward to the Homework Projects


Copyright 1998 by Joel C. Adams. All rights reserved.