CS 112 Lab 6: Comparing List & Vector Operations

Objectives:

In this exercise, you will:
  1. Perform experiments that compare the performance of similar list<T> and vector<T> operations.
  2. Use iterators to insert items into list<T> and vector<T> containers.
  3. Use empirical analysis to quantify and compare the behavior of list<T> and vector<T> operations.

Introduction

Today, we want to time a few list and vector operations, using the STL list and vector templates. While we have talked in class about "Big Oh" notation, today's lab should give you some hands-on experience with the difference between an O(1) operation and an O(n) operation.

For a given operation and several values of n, our basic methodology will be to build a vector of length n, measure how long that operation takes for the vector, build a list of length n, and measure how long that operation takes for the list. By choosing five different values for n (10, 100, 1000, 10000, and 100000), we can then plot our measurements and see empirically how fast or slow a given operation is.

Today's exercise has four sections:

  1. In the first section -- Experiment 0 -- we will practice timing using the vector subscript operation..
  2. In the second section -- Experiment 1 -- we will conduct a head-to-head comparison of the vector and list push_back() operations.
  3. The vector has no push_front() method, but the Standard Template Library (STL) containers have an insert() method that we can use to insert an item at the beginning of either a vector or a list. In the third section -- Experiment 2 -- we will use this insert() method to conduct a head-to-head comparison of how long it takes to insert an item at the beginning of our two containers.
  4. Finally, the list does have a push_front() method, so in our last section, we will use it to measure how long it takes to insert an item into a list using the push_front() method compared to inserting an item into a list using the insert() method.

Let's get started!

Getting Started

Begin by creating a new project for this week's files. Download main.cpp, NanoTimer.h, Experiment0.h, and Experiment0.cpp, and import them into your project. Open these files and take a moment to look them over, to see what they contain and how they relate to one another.

Experiment 0: Vector Subscript

The purpose of this experiment is to get a feel for how to use the NanoTimer class to time the vector subscript operation. This experiment contains a method named timeAccessAtIndex() that uses NanoTimer to measure how long it takes to access the item at a given index in a vector. The run() method calls timeAccessesForVectorOfSize() for various vector sizes. This method calls the timeAccessesAtIndex() method three times, to see how long it takes to access the first, middle and last value in the vector. The timeAccessesAtIndex() method uses a loop to perform the requested access five times, and then reports the average of those times.

To time the method calls, we are using a home-grown class called the NanoTimer. As its name implies, this class lets you time operations with nanosecond granularity. To achieve this, it uses the POSIX real-time clocking library. Using this library involves two steps:

  1. In order for the GCC compiler to find the declarations of the functions in this library, our NanoTimer class's header file contains the directive #include <ctime>.
  2. In order for the GCC linker to find the definitions of the functions in this library, we must tell it to use this library. You can do this in eclipse by navigating to
       Project > Properties > C/C++ Build > Settings > GCC C++ Linker > Libraries
    
    In the Libraries (-l) pane, click the + button to add a new library. The name of the library we need to add is rt (for real-time), so in the dialog that appears, enter:
       rt
    
    Then click the OK buttons and you should be good to go.

Compile and run Experiment0. Our NanoTimer class has nanosecond granularity, which should be sufficient to provide you with meaningful (i.e., non-zero) results.

To report your results, launch your computer's spreadsheet application. On the worksheet, write Lab 6: Experiment 0 in the first row; on the next row create column-headings labeled size, first, middle, and last. Then add rows for each vector size (i.e., 10, 100, 1000, 10000, 100000, 1000000). For each row size, enter the times your program displayed for accessing the first, middle, and last value in the vector of size size.

When you have entered all of your data, create a well-labeled line chart from your timing results. Your chart should look something like this:

An example graph showing the time to access an item in a vector
If your chart looks markedly different from the one above, try changing the value of REPS from 5 to 1000. Looking at the code for this experiment, what effect does this have, and why?

Analysis. On the chart shown above, the very first access took about twice as long as the others (there seems to be a time-penalty associated with the first subscript call), but after that, the lines are basically flat -- there is no significant change to the access time as the vector size increases by orders of magnitudes. If I use the spreadsheet to compute the averages for the first, middle, and last columns, they are all about the same (0.000000028, 0.000000021, and 0.000000021 respectively). Your chart should be similar.

The relative flatness of these lines and their close proximity of the values to one another -- as the size of the vector increases by orders of magnitude -- indicate that the vector subscript operation is a constant time operation. These results thus indicate empirically that the vector subscript operation is an O(1) (i.e., constant time) operation.

Save your spreadsheet in your project folder (but don't close it), and continue.

Experiment 1: Vector & List Appending

In our next experiment, we will use a similar procedure to measure how long it takes to append an item to a vector and compare that with how long it takes to append an item to a list. Start by downloading Experiment1.h and Experiment1.cpp. Import these files into your project, open them, and take a few minutes to look over the Experiment1 class. Then in main.cpp, comment out the lines referring to Experiment0 and uncomment the lines referring to Experiment1.

Recompile your project and run Experiment1 as it is written. As you can see, Experiment1 contains code to measure how long it takes to append a value to a vector. It does this two ways:

  1. It first invokes timeAppendToVectorOfSize(n) for varying values of n (e.g., 10, 100, 1000, etc.).
  2. It then uses timeRepeatedAppendUntilVectorIsSize(n) to measure the average time an append operation takes as a vector grows from being initially empty to being very large.
Our timeAppendToVectorOfSize(n) method uses an approach similar to that of Experiment0, but timing how long it takes the vector push_back(item) method to append a single value, By contrast, our timeRepeatedAppendUntilVectorIsSize(n) starts with an empty vector and then calls push_back() n times, and computes the average length of time required to append a value.

Your task is to write two methods that do the same things for lists:

  1. timeAppendToListOfSize(n) that can be used to measure the average time to append an item to a list of size n; and
  2. timeRepeatedAppendUntilListIsSize(n) that measures the average time an append operation takes as a list grows very large.
Using the existing methods as models, write these two methods; then invoke them so that Experiment1 displays the append times for both vectors and lists.

When you have good data, return to your spreadsheet. Create a new worksheet, and label the first row Lab 6: Experiment 1. On the next row, place column headings for size, vector, and list. Then add a separate row for each size (10, 100, 1000, 10000, 100000, 1000000). Add your data to your worksheet; then create a chart similar to that of Experiment0.

Study your chart. Using it and the data in your spreadsheet, answer the following questions on your spreadsheet, below your chart:

  1. What does your data indicate about the time required to append an item to a vector as the vector grows larger? Is the timing behavior produced by timeAppendToVectorOfSize(n) consistent with that of timeRepeatedAppendUntilVectorIsSize(n)? Explain why or why not. What is the "Big Oh" notation for each? (Remember that our vector's size is increasing exponentially.)
  2. What does your data indicate about the time required to append an item to a list as the list grows larger? Is the timing behavior produced by timeAppendToListOfSize(n) consistent with that of timeRepeatedAppendUntilListIsSize(n)? Explain why or why not. What is the "Big Oh" notation for each?
  3. On average, what can you conclude about the time required to append an item to a vector compared to a list, as each gets large?
Save your spreadsheet again; then continue.

Experiment 2: Vector & List Prepending

Our final experiment is to compare how long it takes to prepend an item to a vector, compared to a list. Start by downloading Experiment2.h and Experiment2.cpp. Import these files into your project, open them, and take a few minutes to look over Experiment2. Then in main.cpp, comment out the lines referring to Experiment1 and uncomment the lines referring to Experiment2.

Rebuild your project and run Experiment2 as it is written. As you can see, Experiment2 contains code to time how long it takes to prepend a value to a vector. It does this two ways:

  1. It first invokes timePrependToVectorOfSize(n) for varying values of n.
  2. It then uses timeRepeatedPrependUntilVectorIsSize(n) to measure the average time a prepend operation takes as a vector grows very large.
Our timePrependToVectorOfSize(n) method uses an approach similar to that of Experiment1, but the STL vector template does not provide a push_front() method by which we can prepend a value. However, STL containers do provide an insert() method that we can use for this task. More precisely, a statement like this:
   containerName.insert(iterator, value);
will insert value at the position indicated by iterator in the container named containerName.

An iterator is an abstract pointer, defined by a container class to provide access to its items. Every STL container provides at least two methods that return iterators:

Thus, we can use the statement:
   v.insert( v.begin(), i );
to prepend the value of our loop-control variable i to our vector v.

Our timeRepeatedPrependUntilVectorIsSize(n) method also uses the insert() method.

Your task is to write two methods that do the same things for lists:

  1. timePrependToListOfSize(n) that can be used to measure the average time to prepend an item to a list of size n; and
  2. timeRepeatedPrependUntilListIsSize(n) that measures the average time a prepend operation takes as a list grows very large.
Using the existing vector methods as a model, write these two methods; then invoke them so that Experiment2 displays the append times for both vectors and lists.

When you have good data, return to your spreadsheet. Create a new worksheet, and label the first row Lab 6: Experiment 2. On the next row, place column headings for size, vector, and list. Then add a separate row for each size (10, 100, 1000, 10000, and 100000). Add your data to your worksheet; then create a chart similar to that of Experiment1.

Study your chart. Using it and the data in your spreadsheet, answer the following questions on your spreadsheet, below your chart:

  1. What does your data indicate about the time required to prepend an item to a vector as the vector grows larger? Is the timing behavior produced by timePrependToVectorOfSize(n) consistent with that of timeRepeatedPrependUntilVectorIsSize(n)? Explain why or why not. What is the "Big Oh" notation for each?
  2. What does your data indicate about the time required to prepend an item to a list as the list grows larger? Is the timing behavior produced by timePrependToListOfSize(n) consistent with that of timeRepeatedPrependUntilListIsSize(n)? What is the "Big Oh" notation for each?
  3. On average, what can you say about the time required to prepend an item to a vector, compared to a list, as each gets large?

Finally

Unlike vector, the STL list does provide a push_front() method. Where push_back() appends an item to the end of the list, push_front() prepends an item to the beginning of the list.

We might hypothesize that push_front() should behave like l.insert(l.begin(), item). Let's see if we can verify or falsify that hypothesis.

In your timePrependToListOfSize(n) and timeRepeatedPrependUntilListIsSize(n) methods, comment out your calls to insert() and replace them with calls to push_front(). Then rerun your experiment. Add a new column named list (push_front) to your Experiment2 spreadsheet. Under it, add the timing results for this revised version to your spreadsheet and add a line for this new column to your chart. Then answer the following question below your other answers on your Experiment2 spreadsheet:

4. How does the time to prepend using push_front() compare to the time to prepend using insert()? Is the "Big Oh" the same or different for these two different ways of prepending?

Whew! Congratulations, you're done! (For now.) Save your spreadsheet in Excel format (.xls) in your project-folder for this lab. I recommend the 97/2000/XP format for maximum compatibility. Do not save it in .ods format or all of your charts will be lost, negatively affecting your grade on this lab.

Turn In

Copy your project to a new /home/cs/112/current/yourUserName/lab6 folder. A significant part of your score for this lab is based on the spreadsheet, so double-check that a copy of your spreadsheet is saved there for the grader to review.

If time permits, you may begin working on this week's project.


CS > 112 > Labs > 06


This page maintained by Joel Adams.