In past exercises, we have dealt with sequences of values by processing the values one at a time. Today, we want to examine a new kind of object called a vector that can store not just one value, but an entire sequence of values. Like a string, a vector is a subscripted object, meaning the values within it can be accessed via a subscript or index:
By storing a sequence of values in a vector, we can design operations for the sequence, and then implement such operations in a function that receives the entire sequence through a vector parameter.
To introduce the use of the vector, today's exercise is to write a program that processes the names and scores of students in a course, and assigns letter grades (A, B, C, D or F) based on those scores, using the "curve" method of grading. The format of the input file is a series of lines, each having the form:
name scoreAs usual, we will use object-centered design to design our solution.
With a little thought, it should be evident that this problem requires that we process the sequence of scores more than once. More precisely, to assign letter grades using the "curve" method, we must first compute the average and standard deviation of the scores. Once we have these values, we can determine the letter grade corresponding to each score. We must thus process the sequence of scores at least twice.
One way to solve this problem would be to use a file to store the sequence, and then reread the file each time we need to process the values. The problem with this approach is that reading from a file is quite slow compared to reading from main memory -- up to 1000 times as slow!
By contrast, the vector provides a convenient way to store a sequence of values in main memory and to directly access any value in the sequence whenever we wish. A vector thus allows us to process a sequence of values multiple times in a more efficient manner than does a file.
Our program should display a greeting, and then prompt for and read the name of an input file. It should then read a sequence of names and a sequence of scores from the input file. It should then compute the average and standard deviation of the sequence of scores, and display these values. Using the average and standard deviation, it should compute the sequence of letter grades that correspond to the sequence of scores. It should then display each student's name, score and letter grade.
If we examine our behavioral description for objects, we find these:
| Description | Type | Kind | Name |
|---|---|---|---|
| a greeting | string | constant | none |
| The name of the input file | string | varying | inFile |
| a sequence of names | vector<string> | varying | nameVec |
| a sequence of scores | vector<double> | varying | scoreVec |
| the average score | double | varying | average |
| the standard deviation of the scores | double | varying | stdDeviation |
| a sequence of letter grades | vector<char> | varying | gradeVec |
We can thus specify the behavior of our program this way:
Input(keyboard): the name of an input file.
Input(input file): a sequence of names and scores.
Output(screen): the average and standard deviation of the scores,
plus the sequences of names and scores,
and the sequence of corresponding letter grades.
From our behavioral description, we have these operations:
| Description | Defined? | Name | Library? | |
|---|---|---|---|---|
| 1 | Display a string | yes | << | iostream |
| 2 | Read a string | yes | >> | iostream |
| 3 | Read a sequence of names and scores from a file | no | -- | -- |
| 4 | Compute the average of a sequence scores | no | -- | -- |
| 5 | Compute the standard deviation of a sequence scores | no | -- | -- |
| 6 | Compute the sequence of letter grades corresponding to a sequence scores | no | -- | -- |
| 7 | Display sequences of names, scores and letter grades | no | -- | -- |
As you can see, we have a few functions to write.
We can organize the preceding objects and operations into the following algorithm:
0. Display a greeting.
1. Prompt for and read the name of the input file.
2. Fill nameVec and scoreVec with values from the input file.
3. Output the mean and standard deviation of the values in scoreVec.
4. Compute gradeVec, a vector of the letter grades corresponding
to the scores in scoreVec.
5. Display nameVec, scoreVec and gradeVec, value-by-value.
We will thus use three different vector objects,
nameVec storing string values,
scoreVec storing double values,
and gradeVec storing char values.
A skeleton program is provided in grades.cpp. Since some of the operations we will be creating are reuseable, we store them in a library, consisting of the header file DoubleVectorOps.h, the implementation file DoubleVectorOps.cpp, and the documentation file DoubleVectorOps.doc. A Makefile is provided in Makefile, Sample data files are provided in scores1.data, scores2.data, and scores3.data.
Make a new directory for this exercise, and then save copies of these files into that directory. Then begin editing the file grades.cpp, and take a few moments to look it over.
As indicated by its documentation, grades.cpp is a skeleton program that (partially) encodes the algorithm given above. Today's exercise will consist of completing this program.
To define an object named vecObj for storing multiple values of type Type, we can write:
vector<Type> vecObj;Such a statement defines vecObj as an object capable of storing multiple values of type Type. Since a vector can store any type of value, a programmer using a vector must specify the type of value they want to store in a given vector.
This is our first look at a new kind of class called a template. Just as a function provides a way to parameterize a set of statements, a template provides a way to parameterize a class declaration. A complete discussion of templates is beyond us at this point, but because they allow a type to be passed to an object definition, they are commonly used to create container classes like vector.
In grade.cpp, use this information to define nameVec and scoreVec (as vector objects capable of storing string and double objects, respectively), in the places indicated by the comments in grades.cpp. To define a vector, you must include the standard header file named vector.
Next, we want to write function FillVectors() which fills our vector objects with values from the file whose name is in inFile.
We can describe how this function should behave as follows:
Behavior. Our function should receive the name of the input file, an empty vector of strings and an empty vector of doubles from its caller. The function should open an ifstream to the input file. It should then read each name and score from the ifstream, appending them to the vector of strings and the vector of doubles, respectively. When no values remain to be read, our function should close the ifstream, and pass back the filled vector objects.
Objects. Using the behavioral description, our function needs the following objects:
| Description | Type | Kind | Movement | Name |
|---|---|---|---|---|
| The name of the input file | const string & | constant | received | inFile |
| An empty vector of strings | vector<string> & | varying | received, passed back |
nameVec |
| An empty vector of doubles | vector<double> & | varying | received, passed back |
scoreVec |
| an ifstream to inFile | ifstream | varying | local | inStream |
| a name | string | varying | local | name |
| a score | double | varying | local | score |
Given these objects, we can specify the behavior of our function as follows:
Receive: inFile, a string;
nameVec, a vector<string>; and
scoreVec, a vector<double>.
Passback: nameVec and scoreVec, filled with the values from inFile.
Since this function seems unlikely to be generally reuseable,
we will define it within the same file as our main function.
Using the above information, place a prototype for FillVectors()
before the main function, and a function stub after the main function.
Note that since inFile is a class object that is received
but not passed back, it should be defined as a constant reference parameter,
not as a value parameter.
By contrast, nameVec and scoreVec are passed back,
and so should be defined as reference parameters.
Operations. In our behavioral description, we have the following operations:
| Description | Defined? | Name | Library? | |
|---|---|---|---|---|
| 1 | Receive inFile, nameVec and scoreVec | yes | function call mechanism |
built-in |
| 2 | Open an ifstream to inFile | yes | fstream | built-in |
| 3 | Read a string from an ifstream | yes | >> | fstream |
| 4 | Read a double from an ifstream | yes | >> | fstream |
| 5 | Append a string to a vector<string> | yes | push_back() | vector |
| 6 | Append a double to a vector<double> | yes | push_back() | vector |
| 7 | Repeat 3-6 for each line in the file | yes | eof-controlled input loop |
built-in |
| 8 | Close an ifstream | yes | close() | fstream |
| 9 | Pass back vector objects | yes | reference parameter mechanism |
built-in |
Algorithm. We can organize these objects and operations into the following algorithm:
0. Receive inFile, nameVec and scoreVec.
1. Open inStream, a stream to inFile,
and check that it opened successfully.
2. Loop
a. Read name and score from inStream.
b. If end-of-file was read, terminate repetition.
c. Append name to nameVec.
d. Append score to scoreVec.
End loop.
3. Close inStream.
To append values to a vector, use the vector function member named push_back(), whose pattern is
vectorObject.push_back(newValue);Such a statement "pushes" a value newValue onto the back of the vector object named vectorObject, effectively appending it.
Using this information, complete the stub of FillVectors(). Then compile what you have written and make sure that it is free of syntax errors before proceeding.
Next, we want to write function Average() which given a vector of double values, returns the average of those values.
We can describe how this function should behave as follows:
Behavior. Our function should receive a vector of double values. It should sum the values in the vector, and if there is at least one value in the vector, our function should return the sum divided by the number of values; otherwise, our function should display an error message and return a default value (e.g., 0.0).
Objects. Using the behavioral description, our function needs the following objects:
| Description | Type | Kind | Movement | Name |
|---|---|---|---|---|
| A vector of doubles | const vector<double> & | varying | received | numVec |
| the sum of the values in the vector | double | varying | local | sum |
| the number of values in the vector | int | varying | local | numValues |
| an error message | string | constant | local | -- |
| an default return value | double | constant | local | 0.0 |
Given these objects, we can specify the behavior of our function as follows:
Receive: numVec, a vector<double>. Precondition: numVec is not empty. Return: the average of the values in numVec.Since this function seems likely to be generally reuseable, we will define it within a library named DoubleVectorOps. Using this information, place a prototype for Average() in DoubleVectorOps.h and DoubleVectorOps.doc, and a function stub in DoubleVectorOps.cpp. Note that since numVec is a class object that is received but not passed back, it should be defined as a constant reference parameter.
Operations. In our behavioral description, we have the following operations:
| Description | Defined? | Name | Library? | |
|---|---|---|---|---|
| 1 | Receive numVec | yes | function call mechanism |
built-in |
| 2 | Sum the values in a vector<double> | yes | accumulate() | numeric |
| 3 | Determine the number of values in a vector<double> | yes | size() | vector |
| 4 | Divide two double values | yes | / | built-in |
| 5 | Return a double value | yes | return | built-in |
| 6 | Display an error message | yes | << | iostream |
| 7 | Select 4 and 5 or 6 and 5, but not both | yes | if statement | built-in |
Algorithm. We can organize these objects and operations into the following algorithm:
0. Receive numVec.
1. Compute sum, the sum of the values in numVec.
2. Compute numValues, the number of values in numVec.
3. If numValues > 0:
Return sum / numValues.
Otherwise
a. Display an error message via cerr.
b. Return 0.0 as a default value.
End if.
To sum the values in a vector, we can use the standard template library algorithm named accumulate() (from STL's numeric algorithm library), whose pattern is
accumulate(vectorObject.begin(),vectorObject.end(), 0.0)Such an expression returns the sum of the values in vectorObject. To use accumulate(), you must include the standard header file named numeric.
To determine the number of values in a vector, we can use the vector function member size(), whose pattern is:
vectorObject.size()Such an expression returns the number of values in vectorObject.
Using this information, complete the stub for Average(). Then uncomment the call to Average() in the main function, translate and test your program. Don't forget to #include the numeric library! Verify that your program is correctly computing the average of the values in the input file before you proceed.
In writing the Average() function, we made use of one of the vector function members: size(); and one of the C++ standard library algorithms: accumulate(). Being knowledgeable about what function members and algorithms can be applied to an object is important, because it allows you to avoid "reinventing the wheel." That is, we could have written our own functions to find the number of values in a vector, or add up the values in a vector, but why go to all that extra effort when we can use the provided one? Learning about the functions and algorithms that can be applied to an object takes time, but it is time well-spent, since it provides ready-made solutions for many of the problems you will encounter.
To help you get acquainted with some of the vector function members and standard algorithms, we have provided two simplified quick references: one for vector function members and one for standard C++ algorithms. These quick references are not exhaustive, but they do list many of the most commonly used operations. You are encouraged to print hard copies of these references, so that you can refer to them conveniently.
The Standard Template Library. As mentioned earlier, vector is our first look at a template, or parameterized class. The vector template is just one of many different kinds of containers provided by the C++ standard template library (STL). Other containers include the list, the set, the map, the stack, the queue and a variety of others. Each of these containers has its own unique properties that distinquish it from the others. At most universities, such containers are typically the subject of the second computer science course Introduction to Data Structures, and so we will not discuss the further here.
As it happens, the accumulate() algorithm is also from the standard template library, as are all of the algorithms described on our algorithms quick reference. STL thus provides not only containers, but also standardized container-independent algorithms that can be applied to any of the containers.
Iterators. Some of the vector function members and each of the STL algorithms make use of a new kind of object called an iterator. An iterator is an object that can move through a sequence of values, and access each of the values in turn. Each of the containers in STL provide iterators for processing their elements, and each of the STL algorithms take a container's iterators as arguments, which they use to access the elements in that container. Iterators thus provide the linkage by which STL algorithms are able to be applied to any of the STL containers.
A complete discussion of iterators is beyond us at this point, however it is useful to understand that each of the containers (include vector) contains a function member begin() that returns an iterator to its first element, plus a function member end() that returns an iterator pointing beyond its final element.
These two iterators thus mark the beginning and end of a container, so that when we use an STL algorithm like accumulate():
double sum = accumulate(theVec.begin(), theVec.end(), 0.0);that algorithm uses the iterators returned by begin() and end() to process the sequence of values stored in theVec. That is all that you need to know about iterators at this point. (See C++ An Introduction to Computing for more information.)
Returning to our problem at hand, we next need to write function StandardDev() which given a vector of double values, returns the standard deviation of those values. Unless you have had a course in statistics, you may not know how to compute the standard deviation of a sequence of values, so we will consolidate the normal design steps and provide a specification and algorithm for this task.
The specification for this function is as follows:
Receive: numVec, a vector<double>. Precondition: numVec is not empty. Return: the standard deviation of the values in numVec.Since this function seems likely to be generally reuseable, it should also be defined in our DoubleVectorOps library. Using this specification, place a prototype for StandardDev() in DoubleVectorOps.h and DoubleVectorOps.doc, and a function stub in DoubleVectorOps.cpp. As before, numVec should be defined as a constant reference parameter.
An algorithm to compute the standard deviation is as follows:
0. Receive numVec.
1. Compute numValues, the number of values in numVec.
2. If numValues > 0:
a. Define avg, the average of the values in numVec.
b. Define sumSqrTerms, a double initialized to zero.
c. Define term, a double.
d. For each index i of numVec:
1) Set term to numVeci - avg.
2) Add term^2 to sumSqrTerms.
End loop.
e. Return sqrt(sumSqrTerms / numValues).
Otherwise
a. Display an error message via cerr.
b. Return 0.0 as a default value.
End if.
Note that we use the notation numVeci to refer to
the value in numVec whose index is i.
To determine the number of values in a vector (step 1), we can use the vector function member size(), as we saw last time.
To compute the average of the values in a vector (step 2a), we can use the Average() function that we just defined.
Steps 2b and 2c are trivial. Step 2d can be performed using a for loop that counts from 0 to the numValues minus one:
for (int i = 0; i < numValues; i++)
...
To access a value v within the vector,
the subscript operation can be applied to a vector,
in the same manner as a string.
The pattern is:
vectorObject [ index ]Such an expression returns the value that is stored within vectorObject at index index.
Using this information, complete the stub for StandardDev(). Then uncomment the call to StandardDev() in the main function, and translate and test what you have written. You should get standard deviations of approximately 11.1803, 2.69186, and 0.927025 for scores1.data, scores2.data, and scores3.data, respectively. Make sure that your program is correctly computing the standard deviation of the values in the input file before you proceed.
This step is easy. In the appropriate place in the main function, define gradeVec as a vector of char objects. As usual, recompile to check the syntax of what you have written.
Print a hard copy of each of the three scores files (scores1.data, scores2.data and scores3.data). Take a moment and and look over the distribution of scores in each file. If you were assigning grades, what letter grades would you assign for the scores in scores1.data, scores2.data and scores3.data? Take a moment and write beside each person's name (on the hard copies) what letter grade you would give them, based on their score.
Our task is to write a function that computes the appropriate letter grades for the input scores. Since this function needs a sequence of scores to process, and must return a corersponding sequence of letter grades, we can specify what the function must do as follows:
Receive: scoreVec, a vector of double values. Return: gradeVec, a vector of char values.Since this function seems pretty tightly tied to this particular grading problem, we will define it locally, within grades.cpp. Using this information, prototype ComputeLetterGrades() before the main function and define a stub following the main function. Since scoreVec is a class object that is received but not passed back, define it as a constant reference parameter.
Students often want grades to be "curved." The "curve" method of grading is based upon the assumption that the scores being graded fall into a normal distribution (i.e., a bell curve). Given a set of scores, the "curve" method determines letter grades as follows:
0. Receive scoreVec, a vector of scores.
1. Define numValues, the number of values in the vector.
2. Define gradeVec, a vector of characters
the same length as scoreVec.
3. If numValues > 0:
a. Define avg, the average of the values.
b. Define standardDev, the standard deviation of the values.
c. Define F_CUT_OFF as avg - 1.5 * standardDev.
d. Define D_CUT_OFF as avg - 0.5 * standardDev.
e. Define C_CUT_OFF as avg + 0.5 * standardDev.
f. Define B_CUT_OFF as avg + 1.5 * standardDev.
g. For each index i of scoreVec:
If scoreVeci < F_CUT_OFF:
Set gradeVeci to 'F'.
Else if scoreVeci < D_CUT_OFF:
Set gradeVeci to 'D'.
Else if scoreVeci < C_CUT_OFF:
Set gradeVeci to 'C'.
Else if scoreVeci < B_CUT_OFF:
Set gradeVeci to 'B'.
Else
Set gradeVeci to 'A'.
End if.
End loop.
End if.
4. Return gradeVec.
As before,
we use the notation scoreVeci
and gradeVeci to refer to
an element of scoreVec or gradeVec whose index is i.
The corresponding notation in C++ is
scoreVec[i]or
gradeVec[i]Using the information above, complete the stub for ComputeLetterGrades(). Then compile what you have written to check for syntax errors. (We'll check for logic errors in the next step.)
The final operation is to display the information we have computed. Design and write a function that, given an ostream, a vector of names, a vector of scores, and a vector of letter grades, displays these three vector objects in tabular form. For example, the output produced when scores1.data is processed should appear something like the following:
Enter the name of the scores file: scores1.data Mean score: 75 Std. Dev: 11.1803 joan 55 F joe 60 D jane 60 D jim 65 D janet 70 C john 70 C johanna 75 C jack 75 C joeline 75 C jacques 75 C josh 80 C janna 80 C jason 85 B jadzia 90 B jon 90 B jackie 95 AIf you find that you have logic errors in what you have written, use the debugger to find them.
When scores2.data or scores3.data are processed using the "curve" method, how do the "curved" grades compare with the grades you would have assigned?
What have you learned about grading "on the curve?"
Vector, Index, Subscript, Element, Template, Algorithm, Iterator, Standard Template Library.
Hard copies of your final versions of grades.cpp, DoubleVectorOps.h, DoubleVectorOps.cpp, DoubleVectorOps.doc, and execution records showing the processing of scores1.data, scores2.data and scores3.data, plus the hard copies of these scores that you printed earlier annotated with the grades you assigned.
Forward to the Homework Projects