# Lab 10: Arrays and Vectors

## Introduction

Up to now, we have processed sequences of values by processing them one value at a time. C++, however, provides various kinds of containers for processing collections of values — e.g., arrays, vectors, lists, stacks, queues, and others. In this lab exercise we examine vectors, which store sequences of elements in C-style arrays but, in addition, provide a collection of built-in operations for processing them. As with a `string`, a vector (as well as an array) can access the values it stores via a subscript or index:

The boxes in the above diagram of a vector (or array) named `aVec` are called its elements or components; they can store values that are of some specified type. However, unlike a `string`, which only stores `char`s, an array and a `vector` can store any type of data. Also, by storing a sequence of values in an array or vector, we can design operations for the sequence and then implement these operations with functions that receive the entire sequence through a parameter.

Vectors are implemented in C++ by the `vector` class provided in the `vector` library, which is part of the C++ Standard Template Library (STL). The STL is a collection of containers and algorithms that are very useful in a variety of situations. This lab exercise will be exploring only a small part of the STL.

## The Problem

In this lab exercise, we will develop a program to process the names and scores of students in a course and assign letter grades (A, B, C, D or F) using "grading on the curve." The input file will be organized as a series of lines, each having the form

``    name` `score``

where `name` is a student's name and `score` is the score received by that student on a test, quiz, or some other course item. As usual, we will use object-centered design to develop our solution.

## Files

The following are the files that you will be using:

Add your name and other infomation your instructor requires to the opening documentation of the code and documentation files.

You will begin by editing the file `grades.cpp`. Open it in an editor and take a few minutes to look it over. As its documentation indicates, `grades.cpp` is a "skeleton program" that (partially) encodes an algorithm to solve the problem. In this lab's exercise you will be completing this program.

## Design

Our programs until now have been able to process their inputs as soon as they were entered. This is actually quite rare. Often, as in this lab's problem, we need to keep the information available after the data has been input so that it can be reused later in the program.

To get started, let's look at what the end result will be: letter grades have been assigned to the scores that are input. This seems straightforward — just input a student's score and output a letter grade corresponding to that score. Here, however, it isn't that simple because of how letter grades are to be computed — using the grading-on-the-curve method.

We will explain this method of assigning letter grades near the end of this lab exercise. Suffice it to say for now that it requires first computing the average score and then the standard deviation of the scores. And the only way to know this is to sum all of the scores and then compute the average and the standard deviation. Once these are known, we can determine which scores will get A's, which will get B's, and so on. Thus, the scores have to be processed (at least) twice: once to compute the average and standard deviation and a second time to assign letter grades based on the average and standard deviation.

We could store the list of scores in a file and then reread the file each time we need to process them. The problem with this approach is that reading from a file is very, very, very slow compared to reading from main memory! A much better solution is to input all the scores and store them in an internal container — that is, a container stored in main memory. For this we will use a `vector`, which is one of the containers in the Standard Template Library. In addition to is being much, much, much faster to access the scores from a `vector` than from a file, a `vector` gives us direct — i.e., immediate — access to each score. It takes no longer to access any element of the `vector` than any other element.

### Behavior

Our program should display an introductory message, and then prompt for and read the name of an input file. It should then read a list of names and scores from the input file. It should then compute the average and standard deviation of the scores, and display these values. Using the average and standard deviation, it should compute the sequence of letter grades that correspond to the scores. It should then display each student's name, score, and letter grade.

### Objects

When we examine our behavioral description for objects, we find these objects:

Description Type Kind Name
an introductory message  `string`   literal     —-
the name of the input file  `string`  varying   `inFileName`
a sequence of names  `vector<string>`   varying  `nameVec`
a sequence of scores  `vector<double>`  varying  `scoreVec`
the average score  `double`  varying  `average`
the standard deviation of the scores  `double`  varying  `stdDeviation `
a sequence of letter grades  `vector<char>`  varying  `gradeVec`

We can thus specify the behavior of our program as follows:

Specification:
Input (keyboard): the name of an input file.
Input (input file): a sequence of names and scores.
Output (screen): the average and standard deviation of the scores followed by the sequence of names, scores, and letter grades.

### Operations

From our behavioral description, we have these operations:

Description  Predefined?   Name   Library
Display a `string`

yes

`<<`

` iostream `

Read a `string`

yes

`>>`

`iostream`

Read a sequence of names and scores from a file

no

Compute the average of a sequence of scores

no

Compute the standard deviation of a sequence of scores

no

Compute the sequence of letter grades corresponding to a sequence of scores

no

Display sequence of names, scores, and letter grades

no

Note that there are several undefined operations. For each of these we will write a function to implement it.

### Algorithm

We can organize the preceding objects and operations into the following algorithm:

1. Display a greeting.
2. Prompt for and read the name of the input file.
3. Fill `nameVec` and `scoreVec` with values from the input file.
4. Compute and output the average and the standard deviation of the values in `scoreVec`.
5. Compute `gradeVec`, a vector of the letter grades corresponding to the scores in `scoreVec`.
6. Display the elements in `nameVec`, `scoreVec`, and `gradeVec`.
Note that we will be using three different types of `vector` objects:

`    nameVec` storing `string` values,
`    scoreVec` storing `double` values, and
`    gradeVec` storing `char` values.

## Declaring `vector` Objects

A declaration of a vector whose elements are of type `ElementType` has the following form:

`vector<`ElementType`> `vecObj`;`

This declaration defines `vecObj` as a `vector` that can store values of type `ElementType`; it is initially empty.

The preceding declaration actually specifies two things:

1. `vector<ElementType>` is a class;
2. `vecObj` is an object whose type is that class.
`vector` itself is known as a class template — a pattern for a class. When we attach the name of some type enclosed in < and >, it becomes a class. A complete discussion of templates and class templates is left for later courses in computing. For now, it is enough when, why, and how to use them.

The compiler will check that `vecObj` is used correctly. For example, only values of type `ElementType` can be put into `vecObj`; and only `ElementType` objects can be retrieved and/or removed from it.

Question #10.1: In `grade.cpp`, use this information to declare the three vectors `nameVec`, `scoreVec`, and `gradeVec` needed in the main program at the places indicated by comments — the first two near the beginning of `main()` and `gradeVec` near the end. Make sure that your declarations specify the correct type of data that these vectors will store. Also be sure to #include the `vector` library.

## Filling `vector` Objects from a File

Now we start working on the operations left undefined in our main algorithm. The first one is to read names and scores from the file whose name is stored in `inFileName` and put them into the `vector`s that you just declared in the main program — names in `nameVec` and scores in `scoreVec`. This is the task for function `fillVectors()`

### Design

#### Behavior

We can describe how this function should behave as follows:

This function should receive the name of an input file and two empty vectors — an empty `vector` of strings and an empty `vector` of doubles from its caller. It should open an `ifstream` to the input file and then read each name and score from the `ifstream` and append them to the `vector` of strings and the `vector` of doubles, respectively. When the end of the file is reached, this function should close the `ifstream`, and pass back the filled `vector` objects.

#### Behavior

Using the behavioral description, our function needs the following objects:

Description Type Kind Movement Name
The name of the input file ` string`  constant   in ` inFileName`
An empty `vector` of strings ` vector<string> & `  varying  in & out ` nameVec`
An empty `vector` of doubles  ` vector<double> &`  varying  in & out ` scoreVec`
a stream to `inFileName` ` ifstream`  varying  local ` inStream`
a student's name ` string`  varying  local ` name`
a student's score ` double`  varying  local ` score`

Given these objects, we can specify the behavior of our function as follows:

Specification:
Receive: `inFileName`, a `string`; `nameVec`, a `vector<string>`; and `scoreVec`, a `vector<double>`.
Pass back: `nameVec` and `scoreVec`, filled with the values from `inFileName`.

Note that because `nameVec` and `scoreVec` are to be passed back to `main()`, they will need to be reference parameters.

Since this function seems unlikely to be generally reuseable, we will define it within the same file as our main function.

#### Operations

In our behavioral description, we have the following operations:

Description Predefined? Name Library
Receive `inFileName`, `nameVec` and `scoreVec`

yes

function call mechanism

built-in

Open an `ifstream` to `inFileName`

yes

`ifstream` declaration

`fstream`

Read a `string` from an `ifstream`

yes

`>>`

`fstream`

Read a `double` from an `ifstream`

yes

`>>`

`fstream`

Append a `string` to a ```vector<string> ```

yes

`push_back()`

`vector`

Append a `double` to a ```vector<double> ```

yes

`push_back()`

`vector`

Repeat reading and appending for whole file

yes

eof-controlled input loop

built-in

Close an `ifstream`

yes

`close()`

`fstream`

Pass back `vector` objects

yes

reference parameter

built-in

#### Algorithm

We can organize these objects and operations into the following algorithm:

1. Receive `inFileName`, `nameVec`, and `scoreVec`.
2. Open a stream `inStream` to `inFileName`, and check that it opened successfully.
3. Loop
1. Read `name` and `score` from `inStream`.
2. If end-of-file was reached, terminate repetition.
3. Append `name` to `nameVec`.
4. Append `score` to `scoreVec`.
End loop.
4. Close `inStream`.

### Coding

To append values to a `vector`, use the `vector` method named `push_back()`:

``vectorObject`.push_back(`newValue`);`

Such a statement "pushes" `newValue` onto the back of the `vector` named `vectorObject`.

Question #10.2: Add a prototype for `fillVectors()` at the designated place in `grades.cpp` before `main()`. Remember the comments about reference parameters in the earlier specification for this function.

Then, using the preceding information and the algorithm above, write a definition of `fillVectors()` after `main()`. Then compile what you have written and make sure that it is free of syntax errors before proceeding.

## Average

Now that we have the student names and scores transferred from a file into two vectors `nameVec` and `scoreVec`, we can turn to implementing the operations needed for grading on the curve. The first operation is that of computing the average of the scores. Of course, there's nothing special about their being scores; they could be temperatures, heights, weights, times, or any `double` values. So the `average()` function we write will compute the average of some generic `double` values. And in keeping with this genericity, we will think of the values we're averaging as arbitrary real numbers and use generic names for them. Why? Because after putting a goodly amount of effort into developing functions for calculating the average and standard deviation, we might want to add other statistical measures such as the variance, median, maximum, and minimum and store these in a statistics libary for use in other programs.

It's important to note this now because it will have an effect on the way we name our variables. Yes, we'll continue to think of the scores as scores in the main program (and other functions), but in `average()` and later in `standardDev()`, we'll think of them as just a sequence of real numbers.

### Design

#### Behavior

We can describe how this function should behave as follows:

Our function should receive a `vector` of `double` values. It should compute the sum of the values in this `vector`, check that the `vector` is not empty and then compute and return the average of the values (i.e., the sum divided by the number of values). Otherwise, our function should display an error message indicating an empty set of values and return some default value (e.g., 0).

#### Objects

Using the behavioral description, our function needs the following objects:

Description Type Kind Movement Name
A `vector` of doubles

``` const vector<double> &  ```

varying

in

` numVec`

The sum of the values in the `vector`

`double`

varying

local

`sum`

The number of values in the `vector`

`int`

varying

local

`numValues`

An error message

`string`

literal

local

--

A default return value

`double`

literal

local

0

Given these objects, we can specify the behavior of our function as follows:

Specification:
Receive: `numVec`, a `vector<double>`
Precondition: `numVec` is not empty
Return: the average of the values in `numVec`

Since we've designed this function to be general and thus reusable, we will define it within a library named `DoubleVectorOps`.

Question #10.3: Place a prototype for `average()` in `DoubleVectorOps.h` and `DoubleVectorOps.txt`, and a function stub in `DoubleVectorOps.cpp`.

Since `numVec` is a class object that is received but not passed back, it should be defined as a constant reference parameter.

#### Operations

In our behavioral description, we have the following operations:

Description Predefined? Name Library
Receive `numVec`

yes

function call mechanism

built-in

Sum the values in a `vector<double>`

yes

`accumulate()`

`numeric`

Determine the number of values in a `vector<double>`

yes

`size()`

`vector`

Divide two `double` values

yes

`/`

built-in

Return a `double` value

yes

`return`

built-in

Display an error message

yes

`<<`

`iostream`

Select between normal case and error case

yes

`if` statement

built-in

#### Algorithm

We can organize these objects and operations into the following algorithm for our function:

1. Receive `numVec`.
2. Compute `sum`, the sum of the values in `numVec`.
3. Compute `numValues`, the number of values in `numVec`.
4. If `numValues` > 0
Return `sum` / `numValues`.
Otherwise
1. Display an error message via `cerr`.
2. Return 0.0 as a default value.

### Coding

As mentioned earlier, the Standard Template Library has not only containers (such as `vector`), but it also contains a collection of incredibly powerful functions — actually function templates —defined in two libraries: `<algorithm>` and `<numeric>`. The function `accumulate()` is one of these; it is found in the `numeric` library and can be used to sum the values in certain containers such as `vector`:

`accumulate(`vectorObject`.begin(), `vectorObject`.end(), 0.0)`

This expression returns the sum of the values in `vectorObject`. Actually, it computes this sum, adds it to the second argument — 0.0 in our case. To use `accumulate()`, we must include the `numeric` library in our program.

Now, to compute the average of the values in a `vector`, we need to know the number of elements it has. But this is easy, because each `vector` has a function member named `size()` that we can use:

``vectorObject`.size()`

This expression returns the number of elements in `vectorObject`. This `size()` method is part of the `vector` class so we must #include the `vector` library, which we have done because we're using `vector`s.

Question #10.4: Using this information, complete the definition of `average()`. Then uncomment the call to `average()` in the main function, compile, and test your program. Don't forget to #include the `numeric` library so you can use `accumulate()`. Verify that your program is computing the average of the values in the input file correctly before you proceed.

## `vector` and the Standard Template Library

In our definition of the `average()` function we used `size()`, which is one of `vector`'s methods. We also used `accumulate()`, one of the standard library algorithms. Being knowledgeable about such methods and algorithms that can be applied to an object is worthwhile because it avoids having to "reinvent the wheel." Of course, we could have written our own functions to count the number of values in a `vector` and to sum them. But why go to all that extra effort when C++ provides them for us? Taking the time to learn about these ready-made functions and algorithms that are available is time well spent!

The `vector` class template is just one of many different kinds of containers provided by the C++ Standard Template Library. Others include `list`, `set`, `map`, `stack`, `queue`, and a variety of others. Each of these containers has its own unique properties that distinquish it from the others and they are usually studied in later computer science courses.

Some of `vector`'s methods and each of the STL algorithms make use of a new kind of object called an iterator that can move through a sequence of values and access each of them in turn. Each of the containers in STL provides iterators for processing its elements and each of the STL algorithms has a container's iterators as arguments that it uses to access its elements. Iterators make it possible to write STL algorithms so they can handle any of the STL containers. Otherwise, for example, there would have to be a different `accumulate()` function for every container in STL. Also, we could conceivably create a brand new container to which we could apply all of STL's algorithms without having to change the code as long as we wrote iterators for our new container.

Although a complete discussion of iterators isn't practical at this point, it is useful to understand that each of the containers (including `vector`) contains a method `begin()` that returns an iterator to its first element and a method `end()` that returns an iterator pointing just beyond its final element.

These two iterators thus mark the beginning and end of a container. An STL algorithm such as `accumulate()` can then use these two end points of the container to iterate through the whole container and do its work (e.g., add the elements together).

## Standard Deviation

Returning to the problem at hand, we move on to writing a function `standardDev()` to compute the standard deviation of our scores. It will be given a `vector` of `double` values and will return the standard deviation of those values. Because this function is similar to that for the average and this lab exercise is already quite lengthy, we will shorten our design description and go direcly to the function's specification and an algorithm for it.

### Design

The specification for this function is as follows:

Specification:
Receive: `numVec`, a `vector` of `double` values.
Precondition: `numVec` is not empty.
Return: The standard deviation of the values in `numVec`.

Question #10.5: For this function, like `average()`, to be reusable, it should also be in our `DoubleVectorOps` library. Using the preceding specification, place a prototype for `standardDev()` in `DoubleVectorOps.h` and `DoubleVectorOps.txt` and a function stub in `DoubleVectorOps.cpp`. As before, `numVec` should be defined as a constant reference parameter.

The following algorithm can be used to compute standard deviation:

Algorithm

1. Receive `numVec`.
2. If `numvec` is not empty:
1. Find the average of the values in `numvec` and assign it to some variable such as `theAverage`.
2. Initialize to 0 some variable such as `sumSqrTerms` that will be sum of the squares of the deviations of the values in `numvec` from the average.
3. For each index `i` of `numVec`:
Compute the square of the difference `numVec``i``theAverage` and add it to `sumSqrTerms`.
4. Return the square root of `sumSqrTerms `/ (number of values in `numVec`).
Otherwise
1. Display an error message via `cerr`.
2. Return 0.0 as a default value.

(Note: The notation `numVec``i` is used to refer to the value in `numVec` whose index is `i`.)

### Coding

Vector size. To determine the number of values in a `vector`, we can use the `size()` method, as before.

Average. To compute the average of the values in a `vector`, we can use the `average()` function that we just defined.

The loop. The loop can be implemented as a counting `for` loop that counts from 0 to `numValues` minus one:

```for (int i = 0; i < numValues; i++)
...```

This is perhaps one of the most common ways to process a `vector`. You should familiarize yourself with this loop because you will see it in any program that uses a `vector`. It's a little different from the counting loops we've used in the past because `vector` indices start at 0.

Element access. To access a value from a `vector`, the subscript operation can be applied to the `vector`, in the same manner as a `string`. The pattern is:

``vectorObject`[`index`]`

The value of this expression is the value that is stored within `vectorObject` at index `index`.

Question #10.6: Using this information, complete the stub for `standardDev()`. Then uncomment the call to `standardDev()` in the main function and test what you have written.

Print a hard copy of each of the three scores files. Take a moment and and look over the distribution of scores in each file.

Question #10.7: If there are 100 points possible for the scores in `scores1.txt`, `scores2.txt`, and `scores3.txt` and you were assigning grades, what letter grades would you assign? Write your letter grades on the printouts.

To finish our program, we must write a function that computes the appropriate letter grades for the scores, using the "grading-on-the-curve" approach. This function will receive a sequence of scores to process and must return a corresponding sequence of letter grades.

Specification:
Receive: `scoreVec`, a `vector` of `double` values.
Return: `gradeVec`, a `vector` of `char` values.

Because this function is tied to this particular grading problem, we will make it part of `grades.cpp` rather than put it in our `DoubleVectorOps` library.

Question #10.8: Using this information, prototype `computeLetterGrades()` before the main function and define a stub following the main function. Since `scoreVec` is a class object that is received but not passed back, define it as a constant reference parameter.

"Grading on the curve" is one common way to curve grades for a collection of scores and is based on the average (ave) and standard deviation (sdev) of the scores:

F cutoff:   ave − 1.5 × sdev   score   <   F cutoff

F

D cutoff:  ave − 0.5 × sdev  F cutoff   ≤   score  <  D cutoff

D

C cutoff:  ave + 0.5 × sdev  D cutoff   ≤   score  <  C cutoff

C

B cutoff:  ave + 1.5 × sdev  B cutoff   ≤   score  <  B cutoff

B

A cutoff:  ave + 1.5 × sdev   A cutoff  ≤ score

A

Here is the algorithm:

1. Receive `scoreVec`, a `vector` of scores.
2. Declare `gradeVec`, a `vector` of characters the same length as `scoreVec`.
3. If the number of scores is > 0:
1. Find the average and the standard deviation of the scores.
2. Define the cutoff lines for F, D, C, B, and A grades as described above.
3. For each index `i` of `scoreVec`:
If `scoreVec``i` < the F cut-off, assign 'F' to `gradeVeci`
Else if `scoreVec``i` < the D cut-off, assign 'D' to `gradeVeci`
. . . etc . . .
4. Return `gradeVec`.

Question #10.9: Using the information above, complete the definition of `computeLetterGrades()`. Then compile what you have written to check for syntax errors.

## Display the Names, Scores and Grades

The final operation is to display the information we have computed.

Question #10.10 Design and write a function that, given a `vector` of names, a `vector` of scores, and a `vector` of letter grades, displays these three `vector` objects in tabular form. For example, the output produced when `scores1.txt` is processed should appear something like the following:

```Enter the name of the scores file: scores1.txt

Mean score: 75
Std. Dev: 11.1803

joan    55      F
joe     60      D
jane    60      D
jim     65      D
janet   70      C
john    70      C
johanna 75      C
jack    75      C
joeline 75      C
jacques 75      C
josh    80      C
janna   80      C
jason   85      B