## CS 112 Lab 5: Linked Lists

### Objectives:

In this exercise, you will:
1. Build a list of linked nodes.
2. Implement a variety of list operations.
3. Practice using pointers.

### Introduction

In previous exercises, we have seen how to use run-time memory allocation to build dynamic arrays that effectively grow and shrink as a program executes. Arrays are sometimes called random access structures because the time to access any randomly-chosen item is uniform. (I.e., if v is a Vec<int> of length 8, it takes us no longer to access v[7] or v[3] than it does for us to access v[0].) Arrays are thus a useful data structure for problems in which we need to access arbitrary items from the structure in constant time.

However there are other problems for which the array is less useful. For example, if a problem requires that items be inserted into or deleted from the structure at a position other than the end, the time required to do so is proportional to the number of items in the array, because all of the items following the point of insertion/deletion must be copied, either to create a gap for the new value, or to close up the gap formed by the deleted value.

The array stores its items in adjacent memory locations. This is the source of both its advantages and its disadvantages:

• The time to access a random item is constant: To access v[i], the program just has to:

1. multiply i times the size of an array-item; and
So having all of an array's values stored in adjacent memory locations makes fast random-access possible.
• The time to insert or delete a value at a position other than the end is linear:
• To insert a value at index i, the values at index i and following must be shifted to make room for the new value:
If the array is full, it must be resized... this can be very a time-consuming operation!
• To delete the value at index i, the values at index i+1 and following must be shifted to "close the gap":

Today, we want to examine a different data structure called the List. The List does not store items in adjacent memory locations. Instead, it stores items in structures called Nodes that are linked together (like a chain) using pointers:

By storing items in non-adjacent memory locations, the List makes it possible to insert and delete items without having to shift those that are already in place. However we lose the ability to access a random item in constant time.

### Getting Started

Begin by making a new project named lab5 in which to store the files for this week's exercise. Then download and import ListTester.h, ListTester.cpp, and tester.cpp into your project. Then create a new class named List in your project, and add a typedef statement that defines Item as a synonym for the type double.

### The List Class

The minimalist way to build a List class is to just store a pointer to the first node. This lets a List operation quickly access the first node, from which it can access all the rest of the nodes. If we make sure that the value of the pointer in the last node is NULL, we can distinguish it from the other nodes.

However if we only store the address of the first node, then a frequently needed operation like finding the size of the list must count the number of nodes, requiring time proportional to the size of the list instead of constant time. To avoid this, our List class will keep track of its size in an instance variable.

Another common operation is to append an item to the end of the list. If we just store the address of the first node, then finding the last node requires time proportional to the number of nodes to find the last node. To avoid this delay and append items in constant time, our List class will also store the address of its last node. We might envision the resulting data structure like this:

Using this information, add three instance variables to your List class: one named mySize to store the size of the list, one named myFirst to store the address of the first Node, and one named myLast to store the address of the last Node. (Common mistake: myFirst and myLast should not store the addresses of Item values; they should store the addresses of Node objects.) It is our responsibility to ensure that these variables always contain the right values.

In ListTester.cpp, uncomment the call and definition of testDefaultConstructor(). Save/compile. What happens?

### The Node Struct

Before we can declare pointers to a node, we need a Node type. One way to define this is to declare a Node struct within the private section of our List class:
class List {
...
private:
struct Node {

};
...
};
This approach views Node as an implementation detail of the List class. Since implementation details should be inaccessible to users of the class, we declare Node nested within List, in its private section.

Save/compile. You should see an error that you are able to fix. Read the error message carefully, try to fix it, and save/compile. If you get stuck, here is a hint.

Once Node has now been declared and ListTester can access the private section of a List, everything should compile correctly.

Run the tests. Since we have not yet defined a default constructor, you should see a failed assertion. Using the test as a guide, define the List default constructor. (If the compiler complains about NULL being undefined, use #include<iostream> to define it, as we will need I/O eventually.) Continue when your constructor passes the test.

### The Node Constructors

Before we can do anything very interesting with a List, it needs to be able to build Node objects, so our next step is to write Node constructors.

The Node Default Constructor. As can be seen in the diagram above, a Node needs to store two pieces of information:

• the Item it is to store; and
• the address of the next Node in the "chain".
Inside the Node struct, declare an instance variable named myItem, and an instance variable named myNext to store these two pieces of information.

Uncomment the testDefaultNodeConstructor() method and when everything compiles correctly, run the tests. Because we have not defined a default Node constructor, the initialization required by the test is not being performed. Here is the stub for the Node default constructor:

List::Node::Node() {
}

Note the name of this constructor: because Node is a struct declared within the List class, its fully qualified name is List::Node::Node(). Note also that we must use the qualified type List::Node to declare a Node variable within testNodeDefaultConstructor().

Add this stub to List.cpp, and place a prototype within the Node struct. Then fill in the body of the stub with the statements needed to pass the test. Save/compile, and run the tests. Continue when all tests are passed.

The Node Explicit Constructor. It is also useful to be able to build a Node with specified values. Uncomment the call and definition of testNodeExplicitConstructor(). Save/compile, and you should see an error indicating that there is no explicit constructor for a Node.

Using the Node default constructor as a guide, define a Node explicit constructor that, given an Item named it and a Node pointer named next, initializes myItem to it, and initializes myNext to next. Finally, add its prototype to the Node struct.

Save/compile, and run the tests. Continue when you've passed all the tests.

### The List::append() Method

Our List will require a copy constructor and a destructor, but it doesn't make much sense to build these until we can create non-empty lists. So our next task is to build an operation that adds an item to a List. To do this, we will now define an append(it) operation that appends an item it to the List.

Uncomment the call and definition of testAppend(). Save/compile, and verify that you get errors indicating the missing append(), getSize(), getFirst(), and getLast() methods. We will now define these methods.

The List::getSize() Method. Since we have defined an instance variable to store a List's size, defining the getSize() operation is trivial, so let's start there. Define and prototype the getSize() method. (Don't forget to make it a const method!) Save/compile, and verify that the getSize() errors disappear.

The List::getFirst() Method. The operation aList.getFirst() should return the first Item in aList. Since myFirst stores the address of our first Node, defining the getFirst() operation is fairly easy. Define and prototype getFirst(); save/compile, and verify that the getFirst() errors disappear. If you get stuck, here is a hint, but only use it after you've tried it on your own.

The List::getLast() Method. The operation aList.getLast() should return the last Item in aList. Since myLast stores the address of our last Node, defining the getLast() operation is similar to the getFirst() operation. Define and prototype getLast(); save/compile, and verify that the getLast() errors disappear. If you get stuck, here is a hint, but if you understand how getFirst() works, you really shouldn't need it.

The List::append() Method. The operation aList.append(33) should append 33 to the end of the List and increment its size:

However there is a special case: if the List is empty, the operation must also update myFirst:

Looking at these two cases: if I (the List) am empty, then myFirst must be changed; otherwise myNext in the Node pointed to by myLast must be changed. In both cases, mySize and myLast have to be changed.

Using this information, define and prototype append(); save/compile, and verify that the append() errors disappear. When they do, run the test-method, and debug any errors. If you get stuck, here is a hint. Continue when your method passes all tests.

Now that we are able to put items into a list, we are ready to write the other methods any pointer-based class requires. Let's begin with the destructor.

### The List Destructor

Since our List contains pointer instance variables, we need to define a destructor for it, to reclaim the dynamic memory to which those variables point (i.e., our "chain" of Node objects).

Uncomment the call and definition of testDestructor(). Everything should save/compile without error, but running the tests should trigger a failed assertion, as our destructor's stub is empty.

Since our List consists of a chain of dynamically allocated nodes, it is our List destructor's responsibility to make sure that these dynamically allocated nodes are returned to storage; otherwise, a program that uses our List will leak memory. There are several ways our List destructor could do this, including:

• Use a loop to move a pair of pointers from Node to Node:
List::~List() {
Node* first = myFirst;     // initialize both to point at the first node
Node* next = first;
while (first != NULL) {    // while there are nodes to be processed:
next = first->myNext;   //   save the address of the next node
delete first;           //   delete the first node
first = next;           //   make first point to the next node
}
myFirst = myLast = NULL;
mySize = 0;
}
This approach works quite well. Its only drawback is that it is a bit complicated, and doesn't take advantage of the capabilities of a Node, the way our next approach does.
• An alternative approach is to (i) use delete to deallocate the first Node:
List::~List() {
delete myFirst;          // delete first node, invoking ~Node() (does nothing if myFirst == NULL)
myFirst = myLast = NULL; // clear myFirst and myLast (optional)
mySize = 0;              // clear mySize (optional)
}
and (ii) leave it to the Node destructor to deallocate the remaining nodes. In C++, invoking delete on a dynamically allocated object invokes that object's destructor. So if we define a Node destructor and then delete the first Node as shown above, the Node destructor in the first Node will be invoked. We can define the Node destructor as follows:
List::Node::~Node() {
delete myNext;       // delete the next node, invoking ~Node() in it
// (does nothing if myNext == NULL)
}
When the first Node performs these statements, the delete myNext; statement will invoke ~Node() in the second node and reclaim its memory. The second node's destructor will perform these statements, invoking ~Node() in the third node and reclaiming its memory. And so on down the line of nodes. This simpler version of the node destructor thus triggers a recursive chain reaction that reclaims all of the nodes in the list.

Note that the order of these statements matters: if we were to mistakenly write (in ~Node()):

...
myNext = NULL;
delete myNext;
...
or (in ~List()):
...
myFirst = myLast = NULL;
delete myFirst;
...
then we would be leaking memory. The delete operation deallocates the dynamically allocated memory at the address stored in its pointer operand. If we overwrite that address with NULL before calling delete, we have lost our handle to that dynamic memory, leaving us with no way to deallocate it.
• The only disadvantages of this 'chain reaction' approach are:

• It is slower than the loop-based approach, because we are (indirectly) calling ~Node() to delete each node, and each of those calls takes time.
• Each (indirect) call to ~Node() requires some memory, and these calls stack up, so if a list is really long (i.e., thousands of items), this approach can consume all of our program's memory, causing it to fail. If that happens, we can always switch to the first approach.

The loop-based approach is thus faster, since it avoids all of the calls to ~Node(), and it will never run out of memory the way the second approach may. These are compelling advantages.

However the second approach is simpler, more elegant, and makes a Node responsible for its own cleanup. It also lets us see something interesting, so let's use the second approach for today. Save/compile; run your tests, and continue when you pass all your tests.

Now, to see something interesting, put a statement like this at the beginning of your Node destructor:

cout << "~Node() is deallocating the node containing item "
<< myItem << endl;
Then save/compile, run your tests, and talk to your neighbor about what you are seeing. What is happening? When you understand what you're seeing, comment out that statement in ~Node() and continue.

### The List Copy Constructor

Since our List class contains pointer instance variables, we need to define a copy constructor, to avoid the "shallow copy" problem. Uncomment the testCopyConstructor() call and definition. Save/compile, and run the tests to verify that a copy constructor is needed.

As with the destructor, there are multiple ways to build a copy constructor. However since we always want copying to occur as fast as possible, we will use the "fast" way, using a pointer variable and a loop to go through the List being copied, appending each Item in that List to ourselves. But what if the List being copied is empty? Below is one solution:

List::List(const List& original) {
myFirst = myLast = NULL;       //  set pointers
mySize = 0;                    //   and size to 'empty' values
Node* oPtr = original.myFirst; //  start at the first node
while (oPtr != NULL) {         //  while there are nodes to copy:
append(oPtr->myItem);       //   append the item in that node
oPtr = oPtr->myNext;        //   advance to next node
}
}
Define and prototype this operation, then save/compile, and run your tests. Continue when you've passed all the tests.

### The Assignment Operation

Our final operation is the assignment operation, which is similar to the copy constructor. Both make copies of the List they receive via their parameter. However there are some subtle differences:

• The copy constructor is building a pristine object -- initializing it for the first time -- so it cannot have any dynamic memory to clean up. By contrast, an object receiving the assignment message may have allocated dynamic memory; if so, that memory must be deallocated or a memory leak will occur.
• The copy constructor returns nothing, and thus has no return type. However a List object receiving the assignment message must return itself, to permit assignment chaining (e.g., list3 = list2 = list1 = list0;).
• Users can do silly things like self-assignment (e.g., list2 = list2;).
Here is the stub for the List assignment operation:
List& List::operator=(const List& original) {
}
Declare a prototype for this method, and then take a crack at completing its definition. Use the copy constructor's logic to make the copy, but take into account the differences between the copy constructor and the assignment operation. Do your best, here is a hint in case you get desperate.

Save/compile and run your tests. When your operation passes all the tests, congratulations! You now have a functional List class, to which we can add List-specific operations! Adding some of those operations will be this week's project.