Objectives
- Run experiments to determine the effects of hyperparameters on classifier performance
- Describe the structure of image batches
- Write code to access and modify image data
Getting Started
Get the template. Save it in your cs344 folder.
Instructions
We’re going to start with the cat-vs-dog classifier that we built in chapter 1 and try changing a few things.
Classifier 1
Here is an example of a task that a research advisor might give you:
Train a classifier to distinguish between images of cats and dogs.
- Use the Oxford-IIIT Pet Dataset.
- Fine-tune a 34-layer ResNet model for 1 epoch.
- Report the error rate on a held-out validation set of 20% of the data.
The first code block from chapter 1 (or lab 1) accomplishes this task. It has been included in your lab template with a few blanks; fill them in. (You may reference Lab 1.)
Note: If you experience an “out of memory” error:
- Check that you don’t have another notebook already running (if you’re not sure, log off and log back in).
- Try “Restart and Run All” from the Kernel menu.
- If this still doesn’t work, add bs=8 as a keyword parameter to ImageDataLoaders.from_name_func.
Classifier 2
One aspect that affects how a model performs is how many layers of processing it does. Generally, a deeper model (one with more layers) can be more accurate, but it takes more computation time and often needs more training to reach that level of accuracy. Let’s try a classifier with a different number of layers:
Comparisons
Optional: come back to this when you’re done with the rest of the lab.
One of our goals in this class is to get in the habit of treating numbers critically, especially understanding the variability behind them.
So, instead of just comparing the single error-rate numbers we get from the two classifiers, let’s compare the range of results we’d get by repeating the experiment several times.
Setting a random seed makes the sequence of random numbers (and hence random decisions) reproducible: the same seed always gives the same sequence, which really helps for reproducibility. So to repeat our experiment with fresh randomness each time, all we need to do is change the seed before each run.
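To see what a seed does, here is a small illustration using Python’s built-in random module (fastai’s set_seed function plays the same role before a training run):

```python
import random

# Same seed: the sequence of "random" numbers is identical every time.
random.seed(42)
first_run = [random.random() for _ in range(3)]
random.seed(42)
second_run = [random.random() for _ in range(3)]
assert first_run == second_run

# Different seed: a different (but still reproducible) sequence.
random.seed(7)
third_run = [random.random() for _ in range(3)]
assert first_run != third_run
```

In the lab, changing the seed before each run (e.g., the seed passed to ImageDataLoaders.from_name_func) gives you several independent repetitions of the same experiment.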
So far we changed only the architecture, and nothing else. We could also change:
- The data: give it different images (or more or fewer of the same images)
- The task: have it try to predict something different
- The hyperparameters: train it a different way, e.g., let it train longer (more epochs) or adjust itself faster (higher learning rate)
In a future week we will try varying more aspects.
Image Batch Structure
Now, let’s see what’s going into the classifier. The images are fed in batches of dataloaders.train.bs images at a time (64 by default). Each batch gets packed into a single PyTorch Tensor; that’s possible because Tensors can have multiple axes.
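You can check this layout with a stand-in batch. The shape below assumes 224×224 images, as in the chapter 1 setup; a real batch from the dataloaders has the same four-axis layout.

```python
import torch

# A stand-in for one training batch: 64 images, 3 color channels,
# 224 rows and 224 columns of pixels.
images = torch.rand(64, 3, 224, 224)

print(images.shape)  # torch.Size([64, 3, 224, 224])
print(images.ndim)   # 4 axes: batch, channel, height, width
```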
Get one image
The first axis indexes the images in the batch, so its length is the batch size. To show the first image in the batch, run
show_image(images[0]).
Get one channel
The second axis is the color channel (red, green, and blue).
You can provide multiple indices at the same time, e.g., images[0, 0] is the red channel of the first image.
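Indexing peels off the leading axes one at a time. With a stand-in batch like before:

```python
import torch

# Stand-in batch with axes (batch, channel, height, width).
images = torch.rand(64, 3, 224, 224)

one_image = images[0]       # first image: shape (3, 224, 224)
red_channel = images[0, 0]  # red channel of the first image: shape (224, 224)

assert one_image.shape == (3, 224, 224)
assert red_channel.shape == (224, 224)

# images[0, 0] is the same as indexing twice: images[0][0].
assert torch.equal(images[0, 0], images[0][0])
```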