Experiment 1: Appending n Values to an Empty Container


Save a copy of append1.cpp, in your directory. Using your text editor, open append1.cpp and take a few moments to examine what it does. Its basic algorithm is as follows:

   0. Read n.
   1. Determine the time to append n values to an empty list:
      a. Define an empty list.
      b. Define a Timer.
      c. Start the Timer.
      d. For each value iin the range 0..n-1:
           Append i to the end of the list.
      e. Stop the Timer.
      f. Display the value of the Timer.
   2. Determine the time to append n values to an empty vector:
      a. Define an empty vector.
      b. Reset the Timer.
      c. Start the Timer.
      d. For each value iin the range 0..n-1:
            Append i to the end of the vector.
      e. Stop the Timer.
      f. Display the value of the Timer.
Download and print a piece of graph paper. Then compile and link append1.cpp. We will be plotting the times for the values: 5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000 and 50000 (unless your instructor has alternative values for you to use).

Begin by running the program for the value 50000, to establish the upper bound for your graph. Using the results, establish the scale on your graph's Time axis, writing in the values for each line on that axis. Then plot the points for n = 50000, drawing a box around the time for the list, and a circle around the time for the vector. Use the program to determine the times for the remaining values, and add those points to your graph.

Graph the time for appending to a list by connecting the boxed points, and graph the time for appending to a vector by connect the circled points. Add a legend to your graph in which you associate the line of boxed points with the list, and the line of circled points with the vector. Then label the graph with the title of the experiment (i.e., Appending n Values to an Empty Container).

Computer scientists categorize the time to perform an operation in any of several ways, depending on how long that operation takes as a function of n, the "size" of the input. Informally, an operation occurs in

There are additional categories (logarithmic, cubic, exponential and factorial time) but these three are all that we will see in our exercise today.

On the back of your piece of "graph paper," write a short paragraph in which you characterize each operation as constant, linear, or quadratic time, plus your general conclusions about the time required to append n values to a list compared to the time required to append n values to a vector, based on the results of this experiment.

Analysis of Experiment 1

Let's use the preceding information to see what happened in Experiment 1. We began with an empty list:

   list<int> aList;
The structure of the list is a bit complicated, but each node in the list stores three things: a value, a pointer to the previous node (if any), and a pointer to the following node (if any). As we mentioned earlier, every list has a special first node called the head node whose previous-node pointer points to the last node in the list, and whose next-node pointer points to the first node in the list. The empty list is thus something like the following:

We then used push_back() to append a value to this list:

   aList.push_back(1);
and repeated this operation n times. The push_back() function simplistically:
  1. allocates a new node to store the new value:

  2. stores the value in the node:

  3. attaches the node to the list using the previous-node member of the head node (since this always points to the last node), and updates the size of the list:

Thanks to the head node, push_back() has direct access to the last node in the list. It follows that appending one value always takes these three steps, so that push_back() takes constant time for a list. If appending one value takes constant time, appending n values should take time proportional to n, or linear time, which should be what you observed in Experiment 1.

Note that the number of values in a list (i.e., its size) is always the same as the number of nodes in the list (i.e., its capacity). Since the two quantities are the same, no capacity() function member is needed, and so the list class does not provide one.

Let's compare what happens when the push_back() message is sent to a vector. We began with an empty vector:

   vector<int> aVector;
which we have seen looks like this:

and then used push_back() to append n values to it:

   aVector.push_back(1);
The push_back() function member is designed to append values in an efficient manner with respect to time. It should be evident that the push_back() function must increase the size of aVector to make room for the new value. Since this takes time, the push_back() operation preallocates a block of memory, so that subsequent calls to push_back() will not also have to increase the capacity of aVector. It then writes the new value into the next available location, giving us:

That is, instead of push_back() causing aVector to grow just big enough to hold one more integer, push_back() instead causes aVector to increase in capacity by an amount sufficient to store a large sequence of integers (512 on our machine)! This is the reason that the size() and capacity() of a vector are two different quantities: the size() is the number of values the vector contains, the capacity() is the number of values it can contain!

By preallocating more space than is needed, subsequent calls to push_back() will proceed faster, since the vector will not need to be expanded to accomodate them. When push_back() is called the next trip through the loop, all that push_back() needs to do is write the new value at the next empty spot in the array, and update its size, giving us:

This is the reason appending to aVector was relatively fast in Experiment 1 -- the first call to push_back() is slow, because of the time consumed to allocate the new block of memory, but subsequent calls occur in constant time, until the vector gets filled (i.e., when size() == capacity()).

When a vector's size == its capacity, the next push_back() causes the vector to again expand again, just as it did before. However, this time, it must also copy all of the values from the original vector into the newly expanded vector. The time required to perform this copying contributes a significant amount to the overall time required by this particular call to push_back(). However, thanks to the preallocation of space by push_back(), this only happens on those rare occasions when the vector's size and capacity are the same. As a result, the time to append n values to a vector should be roughly linear in Experiment 1.


Back to the List of Experiments


Copyright 1998 by Joel C. Adams. All rights reserved.