Lab 10: Arrays and Vectors

Introduction

Up to now, we have processed sequences of values by processing them one value at a time. C++, however, provides various kinds of containers for processing collections of values — e.g., arrays, vectors, lists, stacks, queues, and others. In this lab exercise we examine vectors, which store sequences of elements in C-style arrays but, in addition, provide a collection of built-in operations for processing them. As with a string, a vector (as well as an array) can access the values it stores via a subscript or index:

The boxes in the above diagram of a vector (or array) named aVec are called its elements or components; they can store values that are of some specified type. However, unlike a string, which only stores chars, an array and a vector can store any type of data. Also, by storing a sequence of values in an array or vector, we can design operations for the sequence and then implement these operations with functions that receive the entire sequence through a parameter.

Vectors are implemented in C++ by the vector class provided in the vector library, which is part of the C++ Standard Template Library (STL). The STL is a collection of containers and algorithms that are very useful in a variety of situations. This lab exercise will be exploring only a small part of the STL.

The Problem

In this lab exercise, we will develop a program to process the names and scores of students in a course and assign letter grades (A, B, C, D or F) using "grading on the curve." The input file will be organized as a series of lines, each having the form

    name score

where name is a student's name and score is the score received by that student on a test, quiz, or some other course item. As usual, we will use object-centered design to develop our solution.

Files

The following are the files that you will be using:

DoubleVectorOps.h, DoubleVectorOps.cpp, and DoubleVectorOps.txt make up a library that implements various operations for vectors of doubles.
grades.cpp implements a skeleton driver.
scores1.txt, scores2.txt, and scores3.txt are sample input files.

Add your name and other infomation your instructor requires to the opening documentation of the code and documentation files.

You will begin by editing the file grades.cpp. Open it in an editor and take a few minutes to look it over. As its documentation indicates, grades.cpp is a "skeleton program" that (partially) encodes an algorithm to solve the problem. In this lab's exercise you will be completing this program.

Design

Our programs until now have been able to process their inputs as soon as they were entered. This is actually quite rare. Often, as in this lab's problem, we need to keep the information available after the data has been input so that it can be reused later in the program.

To get started, let's look at what the end result will be: letter grades have been assigned to the scores that are input. This seems straightforward — just input a student's score and output a letter grade corresponding to that score. Here, however, it isn't that simple because of how letter grades are to be computed — using the grading-on-the-curve method.

We will explain this method of assigning letter grades near the end of this lab exercise. Suffice it to say for now that it requires first computing the average score and then the standard deviation of the scores. And the only way to know this is to sum all of the scores and then compute the average and the standard deviation. Once these are known, we can determine which scores will get A's, which will get B's, and so on. Thus, the scores have to be processed (at least) twice: once to compute the average and standard deviation and a second time to assign letter grades based on the average and standard deviation.

We could store the list of scores in a file and then reread the file each time we need to process them. The problem with this approach is that reading from a file is very, very, very slow compared to reading from main memory! A much better solution is to input all the scores and store them in an internal container — that is, a container stored in main memory. For this we will use a vector, which is one of the containers in the Standard Template Library. In addition to is being much, much, much faster to access the scores from a vector than from a file, a vector gives us direct — i.e., immediate — access to each score. It takes no longer to access any element of the vector than any other element.

Behavior

Our program should display an introductory message, and then prompt for and read the name of an input file. It should then read a list of names and scores from the input file. It should then compute the average and standard deviation of the scores, and display these values. Using the average and standard deviation, it should compute the sequence of letter grades that correspond to the scores. It should then display each student's name, score, and letter grade.

Objects

When we examine our behavioral description for objects, we find these objects:

Description Type Kind Name

an introductory message string literal —-

the name of the input file string varying inFileName

a sequence of names vector<string> varying nameVec

a sequence of scores vector<double> varying scoreVec

the average score double varying average

the standard deviation of the scores double varying stdDeviation

a sequence of letter grades vector<char> varying gradeVec

Description	Type	Kind	Name
an introductory message	`string`	literal	—-
the name of the input file	`string`	varying	`inFileName`
a sequence of names	`vector<string>`	varying	`nameVec`
a sequence of scores	`vector<double>`	varying	`scoreVec`
the average score	`double`	varying	`average`
the standard deviation of the scores	`double`	varying	`stdDeviation`
a sequence of letter grades	`vector<char>`	varying	`gradeVec`

We can thus specify the behavior of our program as follows:

Specification:
Input (keyboard): the name of an input file.
Input (input file): a sequence of names and scores.
Output (screen): the average and standard deviation of the scores followed by the sequence of names, scores, and letter grades.

Operations

From our behavioral description, we have these operations:

Description Predefined? Name Library

Display a string
yes

<<

iostream

Read a string
yes

>>

iostream

Read a sequence of names and scores from a file
no

—

—

Compute the average of a sequence of scores
no

—

—

Compute the standard deviation of a sequence of scores
no

—

—

Compute the sequence of letter grades corresponding to a sequence of scores
no

—

—

Display sequence of names, scores, and letter grades
no

—

—

Description	Predefined?	Name	Library
Display a `string`	yes	`<<`	`iostream`
Read a `string`	yes	`>>`	`iostream`
Read a sequence of names and scores from a file	no	—	—
Compute the average of a sequence of scores	no	—	—
Compute the standard deviation of a sequence of scores	no	—	—
Compute the sequence of letter grades corresponding to a sequence of scores	no	—	—
Display sequence of names, scores, and letter grades	no	—	—

Note that there are several undefined operations. For each of these we will write a function to implement it.

Algorithm

We can organize the preceding objects and operations into the following algorithm:

Display a greeting.
Prompt for and read the name of the input file.
Fill nameVec and scoreVec with values from the input file.
Compute and output the average and the standard deviation of the values in scoreVec.
Compute gradeVec, a vector of the letter grades corresponding to the scores in scoreVec.
Display the elements in nameVec, scoreVec, and gradeVec.

Note that we will be using three different types of vector objects:

nameVec storing string values,
scoreVec storing double values, and
gradeVec storing char values.

Declaring `vector` Objects

A declaration of a vector whose elements are of type ElementType has the following form:

vector<ElementType> vecObj;

This declaration defines vecObj as a vector that can store values of type ElementType; it is initially empty.

The preceding declaration actually specifies two things:

vector<ElementType> is a class;
vecObj is an object whose type is that class.

vector itself is known as a class template — a pattern for a class. When we attach the name of some type enclosed in < and >, it becomes a class. A complete discussion of templates and class templates is left for later courses in computing. For now, it is enough when, why, and how to use them.

The compiler will check that vecObj is used correctly. For example, only values of type ElementType can be put into vecObj; and only ElementType objects can be retrieved and/or removed from it.

Question #10.1: In grade.cpp, use this information to declare the three vectors nameVec, scoreVec, and gradeVec needed in the main program at the places indicated by comments — the first two near the beginning of main() and gradeVec near the end. Make sure that your declarations specify the correct type of data that these vectors will store. Also be sure to #include the vector library.

Filling `vector` Objects from a File

Now we start working on the operations left undefined in our main algorithm. The first one is to read names and scores from the file whose name is stored in inFileName and put them into the vectors that you just declared in the main program — names in nameVec and scores in scoreVec. This is the task for function fillVectors()

Design

Behavior

We can describe how this function should behave as follows:

This function should receive the name of an input file and two empty vectors — an empty vector of strings and an empty vector of doubles from its caller. It should open an ifstream to the input file and then read each name and score from the ifstream and append them to the vector of strings and the vector of doubles, respectively. When the end of the file is reached, this function should close the ifstream, and pass back the filled vector objects.

Behavior

Using the behavioral description, our function needs the following objects:

Description Type Kind Movement Name

The name of the input file string constant in inFileName

An empty vector of strings vector<string> & varying in & out nameVec

An empty vector of doubles vector<double> & varying in & out scoreVec

a stream to inFileName ifstream varying local inStream

a student's name string varying local name

a student's score double varying local score

Description	Type	Kind	Movement	Name
The name of the input file	`string`	constant	in	`inFileName`
An empty `vector` of strings	`vector<string> &`	varying	in & out	`nameVec`
An empty `vector` of doubles	`vector<double> &`	varying	in & out	`scoreVec`
a stream to `inFileName`	`ifstream`	varying	local	`inStream`
a student's name	`string`	varying	local	`name`
a student's score	`double`	varying	local	`score`

Given these objects, we can specify the behavior of our function as follows:

Specification:
Receive: inFileName, a string; nameVec, a vector<string>; and scoreVec, a vector<double>.
Pass back: nameVec and scoreVec, filled with the values from inFileName.

Note that because nameVec and scoreVec are to be passed back to main(), they will need to be reference parameters.

Since this function seems unlikely to be generally reuseable, we will define it within the same file as our main function.

Operations

In our behavioral description, we have the following operations:

Description Predefined? Name Library

Receive inFileName, nameVec and scoreVec
yes

function call mechanism

built-in

Open an ifstream to inFileName
yes

ifstream declaration

fstream

Read a string from an ifstream
yes

>>

fstream

Read a double from an ifstream
yes

>>

fstream

Append a string to a vector<string>
yes

push_back()

vector

Append a double to a vector<double>
yes

push_back()

vector

Repeat reading and appending for whole file
yes

eof-controlled input loop

built-in

Close an ifstream
yes

close()

fstream

Pass back vector objects
yes

reference parameter

built-in

Description	Predefined?	Name	Library
Receive `inFileName`, `nameVec` and `scoreVec`	yes	function call mechanism	built-in
Open an `ifstream` to `inFileName`	yes	`ifstream` declaration	`fstream`
Read a `string` from an `ifstream`	yes	`>>`	`fstream`
Read a `double` from an `ifstream`	yes	`>>`	`fstream`
Append a `string` to a `vector<string>`	yes	`push_back()`	`vector`
Append a `double` to a `vector<double>`	yes	`push_back()`	`vector`
Repeat reading and appending for whole file	yes	eof-controlled input loop	built-in
Close an `ifstream`	yes	`close()`	`fstream`
Pass back `vector` objects	yes	reference parameter	built-in

Algorithm

We can organize these objects and operations into the following algorithm:

Receive inFileName, nameVec, and scoreVec.
Open a stream inStream to inFileName, and check that it opened successfully.
Loop
1. Read name and score from inStream.
2. If end-of-file was reached, terminate repetition.
3. Append name to nameVec.
4. Append score to scoreVec.
End loop.
Close inStream.

Coding

To append values to a vector, use the vector method named push_back():

vectorObject.push_back(newValue);

Such a statement "pushes" newValue onto the back of the vector named vectorObject.

Question #10.2: Add a prototype for fillVectors() at the designated place in grades.cpp before main(). Remember the comments about reference parameters in the earlier specification for this function.
Then, using the preceding information and the algorithm above, write a definition of fillVectors() after main(). Then compile what you have written and make sure that it is free of syntax errors before proceeding.

Average

Now that we have the student names and scores transferred from a file into two vectors nameVec and scoreVec, we can turn to implementing the operations needed for grading on the curve. The first operation is that of computing the average of the scores. Of course, there's nothing special about their being scores; they could be temperatures, heights, weights, times, or any double values. So the average() function we write will compute the average of some generic double values. And in keeping with this genericity, we will think of the values we're averaging as arbitrary real numbers and use generic names for them. Why? Because after putting a goodly amount of effort into developing functions for calculating the average and standard deviation, we might want to add other statistical measures such as the variance, median, maximum, and minimum and store these in a statistics libary for use in other programs.

It's important to note this now because it will have an effect on the way we name our variables. Yes, we'll continue to think of the scores as scores in the main program (and other functions), but in average() and later in standardDev(), we'll think of them as just a sequence of real numbers.

Design

Behavior

We can describe how this function should behave as follows:

Our function should receive a vector of double values. It should compute the sum of the values in this vector, check that the vector is not empty and then compute and return the average of the values (i.e., the sum divided by the number of values). Otherwise, our function should display an error message indicating an empty set of values and return some default value (e.g., 0).

Objects

Using the behavioral description, our function needs the following objects:

Description Type Kind Movement Name

A vector of doubles
const vector<double> &

varying

in

numVec

The sum of the values in the vector
double

varying

local

sum

The number of values in the vector
int

varying

local

numValues

An error message
string

literal

local

--

A default return value
double

literal

local

0

Description	Type	Kind	Movement	Name
A `vector` of doubles	`const vector<double> &`	varying	in	`numVec`
The sum of the values in the `vector`	`double`	varying	local	`sum`
The number of values in the `vector`	`int`	varying	local	`numValues`
An error message	`string`	literal	local	--
A default return value	`double`	literal	local	0

Given these objects, we can specify the behavior of our function as follows:

Specification:
Receive: numVec, a vector<double>
Precondition: numVec is not empty
Return: the average of the values in numVec

Since we've designed this function to be general and thus reusable, we will define it within a library named DoubleVectorOps.

Question #10.3: Place a prototype for average() in DoubleVectorOps.h and DoubleVectorOps.txt, and a function stub in DoubleVectorOps.cpp.
Since numVec is a class object that is received but not passed back, it should be defined as a constant reference parameter.

Operations

In our behavioral description, we have the following operations:

Description Predefined? Name Library

Receive numVec
yes

function call mechanism

built-in

Sum the values in a vector<double>
yes

accumulate()

numeric

Determine the number of values in a vector<double>
yes

size()

vector

Divide two double values
yes

/

built-in

Return a double value
yes

return

built-in

Display an error message
yes

<<

iostream

Select between normal case and error case
yes

if statement

built-in

Description	Predefined?	Name	Library
Receive `numVec`	yes	function call mechanism	built-in
Sum the values in a `vector<double>`	yes	`accumulate()`	`numeric`
Determine the number of values in a `vector<double>`	yes	`size()`	`vector`
Divide two `double` values	yes	`/`	built-in
Return a `double` value	yes	`return`	built-in
Display an error message	yes	`<<`	`iostream`
Select between normal case and error case	yes	`if` statement	built-in

Algorithm

We can organize these objects and operations into the following algorithm for our function:

Receive numVec.
Compute sum, the sum of the values in numVec.
Compute numValues, the number of values in numVec.
If numValues > 0
Return sum / numValues.
Otherwise
1. Display an error message via cerr.
2. Return 0.0 as a default value.

Coding

As mentioned earlier, the Standard Template Library has not only containers (such as vector), but it also contains a collection of incredibly powerful functions — actually function templates —defined in two libraries: <algorithm> and <numeric>. The function accumulate() is one of these; it is found in the numeric library and can be used to sum the values in certain containers such as vector:

accumulate(vectorObject.begin(), vectorObject.end(), 0.0)

This expression returns the sum of the values in vectorObject. Actually, it computes this sum, adds it to the second argument — 0.0 in our case. To use accumulate(), we must include the numeric library in our program.

Now, to compute the average of the values in a vector, we need to know the number of elements it has. But this is easy, because each vector has a function member named size() that we can use:

vectorObject.size()

This expression returns the number of elements in vectorObject. This size() method is part of the vector class so we must #include the vector library, which we have done because we're using vectors.

Question #10.4: Using this information, complete the definition of average(). Then uncomment the call to average() in the main function, compile, and test your program. Don't forget to #include the numeric library so you can use accumulate(). Verify that your program is computing the average of the values in the input file correctly before you proceed.

`vector` and the Standard Template Library

In our definition of the average() function we used size(), which is one of vector's methods. We also used accumulate(), one of the standard library algorithms. Being knowledgeable about such methods and algorithms that can be applied to an object is worthwhile because it avoids having to "reinvent the wheel." Of course, we could have written our own functions to count the number of values in a vector and to sum them. But why go to all that extra effort when C++ provides them for us? Taking the time to learn about these ready-made functions and algorithms that are available is time well spent!

The vector class template is just one of many different kinds of containers provided by the C++ Standard Template Library. Others include list, set, map, stack, queue, and a variety of others. Each of these containers has its own unique properties that distinquish it from the others and they are usually studied in later computer science courses.

Some of vector's methods and each of the STL algorithms make use of a new kind of object called an iterator that can move through a sequence of values and access each of them in turn. Each of the containers in STL provides iterators for processing its elements and each of the STL algorithms has a container's iterators as arguments that it uses to access its elements. Iterators make it possible to write STL algorithms so they can handle any of the STL containers. Otherwise, for example, there would have to be a different accumulate() function for every container in STL. Also, we could conceivably create a brand new container to which we could apply all of STL's algorithms without having to change the code as long as we wrote iterators for our new container.

Although a complete discussion of iterators isn't practical at this point, it is useful to understand that each of the containers (including vector) contains a method begin() that returns an iterator to its first element and a method end() that returns an iterator pointing just beyond its final element.

These two iterators thus mark the beginning and end of a container. An STL algorithm such as accumulate() can then use these two end points of the container to iterate through the whole container and do its work (e.g., add the elements together).

Standard Deviation

Returning to the problem at hand, we move on to writing a function standardDev() to compute the standard deviation of our scores. It will be given a vector of double values and will return the standard deviation of those values. Because this function is similar to that for the average and this lab exercise is already quite lengthy, we will shorten our design description and go direcly to the function's specification and an algorithm for it.

Design

The specification for this function is as follows:

Specification:
Receive: numVec, a vector of double values.
Precondition: numVec is not empty.
Return: The standard deviation of the values in numVec.

Question #10.5: For this function, like average(), to be reusable, it should also be in our DoubleVectorOps library. Using the preceding specification, place a prototype for standardDev() in DoubleVectorOps.h and DoubleVectorOps.txt and a function stub in DoubleVectorOps.cpp. As before, numVec should be defined as a constant reference parameter.

The following algorithm can be used to compute standard deviation:

Algorithm

Receive numVec.
If numvec is not empty:
1. Find the average of the values in numvec and assign it to some variable such as theAverage.
2. Initialize to 0 some variable such as sumSqrTerms that will be sum of the squares of the deviations of the values in numvec from the average.
3. For each index i of numVec:
  Compute the square of the difference numVec_i − theAverage and add it to sumSqrTerms.
4. Return the square root of sumSqrTerms / (number of values in numVec).
Otherwise
1. Display an error message via cerr.
2. Return 0.0 as a default value.

(Note: The notation numVec_i is used to refer to the value in numVec whose index is i.)

Coding

Vector size. To determine the number of values in a vector, we can use the size() method, as before.

Average. To compute the average of the values in a vector, we can use the average() function that we just defined.

The loop. The loop can be implemented as a counting for loop that counts from 0 to numValues minus one:

for (int i = 0; i < numValues; i++)
  ...

This is perhaps one of the most common ways to process a vector. You should familiarize yourself with this loop because you will see it in any program that uses a vector. It's a little different from the counting loops we've used in the past because vector indices start at 0.

Element access. To access a value from a vector, the subscript operation can be applied to the vector, in the same manner as a string. The pattern is:

vectorObject[index]

The value of this expression is the value that is stored within vectorObject at index index.

Question #10.6: Using this information, complete the stub for standardDev(). Then uncomment the call to standardDev() in the main function and test what you have written.

Computing Letter Grades

Print a hard copy of each of the three scores files. Take a moment and and look over the distribution of scores in each file.

Question #10.7: If there are 100 points possible for the scores in scores1.txt, scores2.txt, and scores3.txt and you were assigning grades, what letter grades would you assign? Write your letter grades on the printouts.

To finish our program, we must write a function that computes the appropriate letter grades for the scores, using the "grading-on-the-curve" approach. This function will receive a sequence of scores to process and must return a corresponding sequence of letter grades.

Specification:
Receive: scoreVec, a vector of double values.
Return: gradeVec, a vector of char values.

Because this function is tied to this particular grading problem, we will make it part of grades.cpp rather than put it in our DoubleVectorOps library.

Question #10.8: Using this information, prototype computeLetterGrades() before the main function and define a stub following the main function. Since scoreVec is a class object that is received but not passed back, define it as a constant reference parameter.

"Grading on the curve" is one common way to curve grades for a collection of scores and is based on the average (ave) and standard deviation (sdev) of the scores:

Grade Cutoffs Score Letter Grade

F cutoff: ave − 1.5 × sdev score < F cutoff
F

D cutoff: ave − 0.5 × sdev F cutoff ≤ score < D cutoff
D

C cutoff: ave + 0.5 × sdev D cutoff ≤ score < C cutoff
C

B cutoff: ave + 1.5 × sdev B cutoff ≤ score < B cutoff
B

A cutoff: ave + 1.5 × sdev A cutoff ≤ score
A

Grade Cutoffs	Score	Letter Grade
F cutoff: ave − 1.5 × sdev	score < F cutoff	F
D cutoff: ave − 0.5 × sdev	F cutoff ≤ score < D cutoff	D
C cutoff: ave + 0.5 × sdev	D cutoff ≤ score < C cutoff	C
B cutoff: ave + 1.5 × sdev	B cutoff ≤ score < B cutoff	B
A cutoff: ave + 1.5 × sdev	A cutoff ≤ score	A

Here is the algorithm:

Receive scoreVec, a vector of scores.
Declare gradeVec, a vector of characters the same length as scoreVec.
If the number of scores is > 0:
1. Find the average and the standard deviation of the scores.
2. Define the cutoff lines for F, D, C, B, and A grades as described above.
3. For each index i of scoreVec:
  If scoreVec_i < the F cut-off, assign 'F' to gradeVec_i
  Else if scoreVec_i < the D cut-off, assign 'D' to gradeVec_i
  . . . etc . . .
4. Return gradeVec.

Question #10.9: Using the information above, complete the definition of computeLetterGrades(). Then compile what you have written to check for syntax errors.

Display the Names, Scores and Grades

The final operation is to display the information we have computed.

Question #10.10 Design and write a function that, given a vector of names, a vector of scores, and a vector of letter grades, displays these three vector objects in tabular form. For example, the output produced when scores1.txt is processed should appear something like the following:
Enter the name of the scores file: scores1.txt Mean score: 75 Std. Dev: 11.1803 joan 55 F joe 60 D jane 60 D jim 65 D janet 70 C john 70 C johanna 75 C jack 75 C joeline 75 C jacques 75 C josh 80 C janna 80 C jason 85 B jadzia 90 B jon 90 B jackie 95 A

Once you are confident that your program is correct, compare the grades computed by your program with those you assigned at the beginning of the lab exercise in Question 10.1. In some cases, there are probably discrepancies between the two, perhaps enough to cause you to hope that none of your professors use "grading on the curve."

Lab Home Page

Report errors to Larry Nyhoff (nyhl@cs.calvin.edu)

Lab 10: Arrays and Vectors

Introduction

The Problem

Files

Design

Behavior

Objects

Operations

Algorithm

Declaring vector Objects

Filling vector Objects from a File

Design

Behavior

Behavior

Operations

Algorithm

Coding

Average

Design

Behavior

Objects

Operations

Algorithm

Coding

vector and the Standard Template Library

Standard Deviation

Design

Coding

Computing Letter Grades

Display the Names, Scores and Grades

Declaring `vector` Objects

Filling `vector` Objects from a File

`vector` and the Standard Template Library