Lab 10: Arrays and Vectors


Introduction

Up to now, we have processed sequences of values by processing them one value at a time. C++, however, provides various kinds of containers for processing collections of values — e.g., arrays, vectors, lists, stacks, queues, and others. In this lab exercise we examine vectors, which store sequences of elements in C-style arrays but, in addition, provide a collection of built-in operations for processing them. As with a string, a vector (as well as an array) can access the values it stores via a subscript or index:

The boxes in the above diagram of a vector (or array) named aVec are called its elements or components; they can store values that are of some specified type. However, unlike a string, which only stores chars, an array and a vector can store any type of data. Also, by storing a sequence of values in an array or vector, we can design operations for the sequence and then implement these operations with functions that receive the entire sequence through a parameter.

Vectors are implemented in C++ by the vector class provided in the vector library, which is part of the C++ Standard Template Library (STL). The STL is a collection of containers and algorithms that are very useful in a variety of situations. This lab exercise will be exploring only a small part of the STL.

The Problem

In this lab exercise, we will develop a program to process the names and scores of students in a course and assign letter grades (A, B, C, D or F) using "grading on the curve." The input file will be organized as a series of lines, each having the form

    name score

where name is a student's name and score is the score received by that student on a test, quiz, or some other course item. As usual, we will use object-centered design to develop our solution.

Files

The following are the files that you will be using:

Add your name and other infomation your instructor requires to the opening documentation of the code and documentation files.

You will begin by editing the file grades.cpp. Open it in an editor and take a few minutes to look it over. As its documentation indicates, grades.cpp is a "skeleton program" that (partially) encodes an algorithm to solve the problem. In this lab's exercise you will be completing this program.

Design

Our programs until now have been able to process their inputs as soon as they were entered. This is actually quite rare. Often, as in this lab's problem, we need to keep the information available after the data has been input so that it can be reused later in the program.

To get started, let's look at what the end result will be: letter grades have been assigned to the scores that are input. This seems straightforward — just input a student's score and output a letter grade corresponding to that score. Here, however, it isn't that simple because of how letter grades are to be computed — using the grading-on-the-curve method.

We will explain this method of assigning letter grades near the end of this lab exercise. Suffice it to say for now that it requires first computing the average score and then the standard deviation of the scores. And the only way to know this is to sum all of the scores and then compute the average and the standard deviation. Once these are known, we can determine which scores will get A's, which will get B's, and so on. Thus, the scores have to be processed (at least) twice: once to compute the average and standard deviation and a second time to assign letter grades based on the average and standard deviation.

We could store the list of scores in a file and then reread the file each time we need to process them. The problem with this approach is that reading from a file is very, very, very slow compared to reading from main memory! A much better solution is to input all the scores and store them in an internal container — that is, a container stored in main memory. For this we will use a vector, which is one of the containers in the Standard Template Library. In addition to is being much, much, much faster to access the scores from a vector than from a file, a vector gives us direct — i.e., immediate — access to each score. It takes no longer to access any element of the vector than any other element.

Behavior

Our program should display an introductory message, and then prompt for and read the name of an input file. It should then read a list of names and scores from the input file. It should then compute the average and standard deviation of the scores, and display these values. Using the average and standard deviation, it should compute the sequence of letter grades that correspond to the scores. It should then display each student's name, score, and letter grade.

Objects

When we examine our behavioral description for objects, we find these objects:

Description Type Kind Name
 an introductory message  string   literal     —-
 the name of the input file  string  varying   inFileName
 a sequence of names  vector<string>   varying  nameVec
 a sequence of scores  vector<double>  varying  scoreVec
 the average score  double  varying  average
 the standard deviation of the scores  double  varying  stdDeviation 
 a sequence of letter grades  vector<char>  varying  gradeVec

We can thus specify the behavior of our program as follows:

Specification:
Input (keyboard): the name of an input file.
Input (input file): a sequence of names and scores.
Output (screen): the average and standard deviation of the scores followed by the sequence of names, scores, and letter grades.

Operations

From our behavioral description, we have these operations:

Description  Predefined?   Name   Library 
 Display a string

yes

<<

 iostream 

 Read a string

yes

>>

iostream

 Read a sequence of names and scores from a file

no

 Compute the average of a sequence of scores

no

 Compute the standard deviation of a sequence of scores

no

 Compute the sequence of letter grades corresponding to a sequence of scores 

no

 Display sequence of names, scores, and letter grades

no

Note that there are several undefined operations. For each of these we will write a function to implement it.

Algorithm

We can organize the preceding objects and operations into the following algorithm:

  1. Display a greeting.
  2. Prompt for and read the name of the input file.
  3. Fill nameVec and scoreVec with values from the input file.
  4. Compute and output the average and the standard deviation of the values in scoreVec.
  5. Compute gradeVec, a vector of the letter grades corresponding to the scores in scoreVec.
  6. Display the elements in nameVec, scoreVec, and gradeVec.
Note that we will be using three different types of vector objects:

    nameVec storing string values,
    scoreVec storing double values, and
    gradeVec storing char values.

Declaring vector Objects

A declaration of a vector whose elements are of type ElementType has the following form:

vector<ElementType> vecObj;

This declaration defines vecObj as a vector that can store values of type ElementType; it is initially empty.

The preceding declaration actually specifies two things:

  1. vector<ElementType> is a class;
  2. vecObj is an object whose type is that class.
vector itself is known as a class template — a pattern for a class. When we attach the name of some type enclosed in < and >, it becomes a class. A complete discussion of templates and class templates is left for later courses in computing. For now, it is enough when, why, and how to use them.

The compiler will check that vecObj is used correctly. For example, only values of type ElementType can be put into vecObj; and only ElementType objects can be retrieved and/or removed from it.

Question #10.1: In grade.cpp, use this information to declare the three vectors nameVec, scoreVec, and gradeVec needed in the main program at the places indicated by comments — the first two near the beginning of main() and gradeVec near the end. Make sure that your declarations specify the correct type of data that these vectors will store. Also be sure to #include the vector library.

Filling vector Objects from a File

Now we start working on the operations left undefined in our main algorithm. The first one is to read names and scores from the file whose name is stored in inFileName and put them into the vectors that you just declared in the main program — names in nameVec and scores in scoreVec. This is the task for function fillVectors()

Design

Behavior

We can describe how this function should behave as follows:

This function should receive the name of an input file and two empty vectors — an empty vector of strings and an empty vector of doubles from its caller. It should open an ifstream to the input file and then read each name and score from the ifstream and append them to the vector of strings and the vector of doubles, respectively. When the end of the file is reached, this function should close the ifstream, and pass back the filled vector objects.

Behavior

Using the behavioral description, our function needs the following objects:

Description Type Kind Movement Name
 The name of the input file  string  constant   in  inFileName 
 An empty vector of strings  vector<string> &   varying  in & out  nameVec
 An empty vector of doubles   vector<double> &  varying  in & out  scoreVec
 a stream to inFileName  ifstream  varying  local  inStream
 a student's name  string  varying  local  name
 a student's score  double  varying  local  score

Given these objects, we can specify the behavior of our function as follows:

Specification:
Receive: inFileName, a string; nameVec, a vector<string>; and scoreVec, a vector<double>.
Pass back: nameVec and scoreVec, filled with the values from inFileName.

Note that because nameVec and scoreVec are to be passed back to main(), they will need to be reference parameters.

Since this function seems unlikely to be generally reuseable, we will define it within the same file as our main function.

Operations

In our behavioral description, we have the following operations:

Description Predefined? Name Library
 Receive inFileName, nameVec and scoreVec

yes

 function call mechanism 

built-in

 Open an ifstream to inFileName

yes

ifstream declaration

fstream

 Read a string from an ifstream

yes

>>

fstream

 Read a double from an ifstream

yes

>>

fstream

 Append a string to a vector<string>

yes

push_back()

vector

 Append a double to a vector<double>

yes

push_back()

vector

 Repeat reading and appending for whole file 

yes

eof-controlled input loop

built-in

 Close an ifstream

yes

close()

fstream

 Pass back vector objects

yes

reference parameter

built-in

Algorithm

We can organize these objects and operations into the following algorithm:

  1. Receive inFileName, nameVec, and scoreVec.
  2. Open a stream inStream to inFileName, and check that it opened successfully.
  3. Loop
    1. Read name and score from inStream.
    2. If end-of-file was reached, terminate repetition.
    3. Append name to nameVec.
    4. Append score to scoreVec.
    End loop.
  4. Close inStream.

Coding

To append values to a vector, use the vector method named push_back():

vectorObject.push_back(newValue);

Such a statement "pushes" newValue onto the back of the vector named vectorObject.

Question #10.2: Add a prototype for fillVectors() at the designated place in grades.cpp before main(). Remember the comments about reference parameters in the earlier specification for this function.

Then, using the preceding information and the algorithm above, write a definition of fillVectors() after main(). Then compile what you have written and make sure that it is free of syntax errors before proceeding.


Average

Now that we have the student names and scores transferred from a file into two vectors nameVec and scoreVec, we can turn to implementing the operations needed for grading on the curve. The first operation is that of computing the average of the scores. Of course, there's nothing special about their being scores; they could be temperatures, heights, weights, times, or any double values. So the average() function we write will compute the average of some generic double values. And in keeping with this genericity, we will think of the values we're averaging as arbitrary real numbers and use generic names for them. Why? Because after putting a goodly amount of effort into developing functions for calculating the average and standard deviation, we might want to add other statistical measures such as the variance, median, maximum, and minimum and store these in a statistics libary for use in other programs.

It's important to note this now because it will have an effect on the way we name our variables. Yes, we'll continue to think of the scores as scores in the main program (and other functions), but in average() and later in standardDev(), we'll think of them as just a sequence of real numbers.

Design

Behavior

We can describe how this function should behave as follows:

Our function should receive a vector of double values. It should compute the sum of the values in this vector, check that the vector is not empty and then compute and return the average of the values (i.e., the sum divided by the number of values). Otherwise, our function should display an error message indicating an empty set of values and return some default value (e.g., 0).

Objects

Using the behavioral description, our function needs the following objects:

Description Type Kind Movement Name
 A vector of doubles

 const vector<double> & 

 varying 

in

 numVec

 The sum of the values in the vector

double

varying

local

sum

 The number of values in the vector 

int

varying

local

numValues

 An error message

string

literal

local

--

 A default return value

double

literal

local

0

Given these objects, we can specify the behavior of our function as follows:

Specification:
Receive: numVec, a vector<double>
Precondition: numVec is not empty
Return: the average of the values in numVec

Since we've designed this function to be general and thus reusable, we will define it within a library named DoubleVectorOps.

Question #10.3: Place a prototype for average() in DoubleVectorOps.h and DoubleVectorOps.txt, and a function stub in DoubleVectorOps.cpp.

Since numVec is a class object that is received but not passed back, it should be defined as a constant reference parameter.

Operations

In our behavioral description, we have the following operations:

Description Predefined? Name Library
 Receive numVec

yes

function call mechanism

built-in

 Sum the values in a vector<double> 

yes

accumulate()

numeric

 Determine the number of values in a vector<double> 

yes

size()

vector

 Divide two double values

yes

/

built-in

 Return a double value

yes

return

built-in

 Display an error message

yes

<<

iostream

 Select between normal case and error case

yes

if statement

built-in

Algorithm

We can organize these objects and operations into the following algorithm for our function:

  1. Receive numVec.
  2. Compute sum, the sum of the values in numVec.
  3. Compute numValues, the number of values in numVec.
  4. If numValues > 0
        Return sum / numValues.
    Otherwise
    1. Display an error message via cerr.
    2. Return 0.0 as a default value.

Coding

As mentioned earlier, the Standard Template Library has not only containers (such as vector), but it also contains a collection of incredibly powerful functions — actually function templates —defined in two libraries: <algorithm> and <numeric>. The function accumulate() is one of these; it is found in the numeric library and can be used to sum the values in certain containers such as vector:

accumulate(vectorObject.begin(), vectorObject.end(), 0.0)

This expression returns the sum of the values in vectorObject. Actually, it computes this sum, adds it to the second argument — 0.0 in our case. To use accumulate(), we must include the numeric library in our program.

Now, to compute the average of the values in a vector, we need to know the number of elements it has. But this is easy, because each vector has a function member named size() that we can use:

vectorObject.size()

This expression returns the number of elements in vectorObject. This size() method is part of the vector class so we must #include the vector library, which we have done because we're using vectors.

Question #10.4: Using this information, complete the definition of average(). Then uncomment the call to average() in the main function, compile, and test your program. Don't forget to #include the numeric library so you can use accumulate(). Verify that your program is computing the average of the values in the input file correctly before you proceed.

vector and the Standard Template Library

In our definition of the average() function we used size(), which is one of vector's methods. We also used accumulate(), one of the standard library algorithms. Being knowledgeable about such methods and algorithms that can be applied to an object is worthwhile because it avoids having to "reinvent the wheel." Of course, we could have written our own functions to count the number of values in a vector and to sum them. But why go to all that extra effort when C++ provides them for us? Taking the time to learn about these ready-made functions and algorithms that are available is time well spent!

The vector class template is just one of many different kinds of containers provided by the C++ Standard Template Library. Others include list, set, map, stack, queue, and a variety of others. Each of these containers has its own unique properties that distinquish it from the others and they are usually studied in later computer science courses.

Some of vector's methods and each of the STL algorithms make use of a new kind of object called an iterator that can move through a sequence of values and access each of them in turn. Each of the containers in STL provides iterators for processing its elements and each of the STL algorithms has a container's iterators as arguments that it uses to access its elements. Iterators make it possible to write STL algorithms so they can handle any of the STL containers. Otherwise, for example, there would have to be a different accumulate() function for every container in STL. Also, we could conceivably create a brand new container to which we could apply all of STL's algorithms without having to change the code as long as we wrote iterators for our new container.

Although a complete discussion of iterators isn't practical at this point, it is useful to understand that each of the containers (including vector) contains a method begin() that returns an iterator to its first element and a method end() that returns an iterator pointing just beyond its final element.

These two iterators thus mark the beginning and end of a container. An STL algorithm such as accumulate() can then use these two end points of the container to iterate through the whole container and do its work (e.g., add the elements together).
 

Standard Deviation

Returning to the problem at hand, we move on to writing a function standardDev() to compute the standard deviation of our scores. It will be given a vector of double values and will return the standard deviation of those values. Because this function is similar to that for the average and this lab exercise is already quite lengthy, we will shorten our design description and go direcly to the function's specification and an algorithm for it.

Design

The specification for this function is as follows:

Specification:
Receive: numVec, a vector of double values.
Precondition: numVec is not empty.
Return: The standard deviation of the values in numVec.

Question #10.5: For this function, like average(), to be reusable, it should also be in our DoubleVectorOps library. Using the preceding specification, place a prototype for standardDev() in DoubleVectorOps.h and DoubleVectorOps.txt and a function stub in DoubleVectorOps.cpp. As before, numVec should be defined as a constant reference parameter.

The following algorithm can be used to compute standard deviation:

     Algorithm

  1. Receive numVec.
  2. If numvec is not empty:
    1. Find the average of the values in numvec and assign it to some variable such as theAverage.
    2. Initialize to 0 some variable such as sumSqrTerms that will be sum of the squares of the deviations of the values in numvec from the average.
    3. For each index i of numVec:
          Compute the square of the difference numVecitheAverage and add it to sumSqrTerms.
    4. Return the square root of sumSqrTerms / (number of values in numVec).
    Otherwise
    1. Display an error message via cerr.
    2. Return 0.0 as a default value.

(Note: The notation numVeci is used to refer to the value in numVec whose index is i.)

Coding

Vector size. To determine the number of values in a vector, we can use the size() method, as before.

Average. To compute the average of the values in a vector, we can use the average() function that we just defined.

The loop. The loop can be implemented as a counting for loop that counts from 0 to numValues minus one:

for (int i = 0; i < numValues; i++)
  ...

This is perhaps one of the most common ways to process a vector. You should familiarize yourself with this loop because you will see it in any program that uses a vector. It's a little different from the counting loops we've used in the past because vector indices start at 0.

Element access. To access a value from a vector, the subscript operation can be applied to the vector, in the same manner as a string. The pattern is:

vectorObject[index]

The value of this expression is the value that is stored within vectorObject at index index.

Question #10.6: Using this information, complete the stub for standardDev(). Then uncomment the call to standardDev() in the main function and test what you have written.

Computing Letter Grades

Print a hard copy of each of the three scores files. Take a moment and and look over the distribution of scores in each file.

Question #10.7: If there are 100 points possible for the scores in scores1.txt, scores2.txt, and scores3.txt and you were assigning grades, what letter grades would you assign? Write your letter grades on the printouts.

To finish our program, we must write a function that computes the appropriate letter grades for the scores, using the "grading-on-the-curve" approach. This function will receive a sequence of scores to process and must return a corresponding sequence of letter grades.

Specification:
Receive: scoreVec, a vector of double values.
Return: gradeVec, a vector of char values.

Because this function is tied to this particular grading problem, we will make it part of grades.cpp rather than put it in our DoubleVectorOps library.

Question #10.8: Using this information, prototype computeLetterGrades() before the main function and define a stub following the main function. Since scoreVec is a class object that is received but not passed back, define it as a constant reference parameter.

"Grading on the curve" is one common way to curve grades for a collection of scores and is based on the average (ave) and standard deviation (sdev) of the scores:

Grade Cutoffs Score Letter Grade
  F cutoff:   ave − 1.5 × sdev   score   <   F cutoff 

F

  D cutoff:  ave − 0.5 × sdev  F cutoff   ≤   score  <  D cutoff  

D

  C cutoff:  ave + 0.5 × sdev  D cutoff   ≤   score  <  C cutoff  

C

  B cutoff:  ave + 1.5 × sdev  B cutoff   ≤   score  <  B cutoff  

B

  A cutoff:  ave + 1.5 × sdev   A cutoff  ≤ score  

A

Here is the algorithm:

  1. Receive scoreVec, a vector of scores.
  2. Declare gradeVec, a vector of characters the same length as scoreVec.
  3. If the number of scores is > 0:
    1. Find the average and the standard deviation of the scores.
    2. Define the cutoff lines for F, D, C, B, and A grades as described above.
    3. For each index i of scoreVec:
          If scoreVeci < the F cut-off, assign 'F' to gradeVeci
          Else if scoreVeci < the D cut-off, assign 'D' to gradeVeci
          . . . etc . . .
    4. Return gradeVec.

Question #10.9: Using the information above, complete the definition of computeLetterGrades(). Then compile what you have written to check for syntax errors.

Display the Names, Scores and Grades

The final operation is to display the information we have computed.

Question #10.10 Design and write a function that, given a vector of names, a vector of scores, and a vector of letter grades, displays these three vector objects in tabular form. For example, the output produced when scores1.txt is processed should appear something like the following:

Enter the name of the scores file: scores1.txt

Mean score: 75
Std. Dev: 11.1803

joan    55      F
joe     60      D
jane    60      D
jim     65      D
janet   70      C
john    70      C
johanna 75      C
jack    75      C
joeline 75      C
jacques 75      C
josh    80      C
janna   80      C
jason   85      B
jadzia  90      B
jon     90      B
jackie  95      A

Once you are confident that your program is correct, compare the grades computed by your program with those you assigned at the beginning of the lab exercise in Question 10.1. In some cases, there are probably discrepancies between the two, perhaps enough to cause you to hope that none of your professors use "grading on the curve."


Lab Home Page


Report errors to Larry Nyhoff (nyhl@cs.calvin.edu)