Train a simple image classifier¶

Task: Train a cat-vs-dog classifier.

Outline:

  1. Load the data
    1. Download the dataset.
    2. Get a list of filenames.
    3. Get a list of ground-truth labels.
    4. Set up the dataloaders (which handle the train-test split, batching, and resizing).
  2. Train a model
    1. Get a foundation model (resnet18 in our case)
    2. Fine-tune it.
  3. Get the model's predictions on an image.

This notebook includes tasks (marked with "Task") and blank code cells (labeled # your code here) to fill in your answers.

Setup¶

Run this code. (You do not need to read or modify the code in this section to successfully complete this assignment.)

In [1]:
# fastai-specific stuff.
# Import fastai code.
from fastai.vision.all import *

# Set a seed so that the results are the same every time this is run.
set_seed(0, reproducible=True)
In [2]:
# Show what GPU we have.
if torch.cuda.is_available():
    print("Found a GPU:", torch.cuda.get_device_properties(0))
else:
    print("No CUDA.")
Found a GPU: _CudaDeviceProperties(name='NVIDIA GeForce GTX 960', major=5, minor=2, total_memory=2000MB, multi_processor_count=8)

Load the data¶

Although the original fastai classifier-training example was famously short, it was inhospitably jam-packed. So I've taken the liberty of spacing things out a bit and splitting it into multiple cells.

Download the dataset¶

In [3]:
# fastai-specific
dataset_path = untar_data(URLs.PETS) / 'images'

Get a list of filenames¶

In [4]:
# fastai-specific
image_files = get_image_files(dataset_path).sorted()

Task: Write a concise description of what image_files is in the Markdown cell below. Note: you'll need to look at the value of image_files; here's one way:

In [5]:
image_files
Out[5]:
(#7390) [Path('/scratch/cs344/data/oxford-iiit-pet/images/Abyssinian_1.jpg'),Path('/scratch/cs344/data/oxford-iiit-pet/images/Abyssinian_10.jpg'),Path('/scratch/cs344/data/oxford-iiit-pet/images/Abyssinian_100.jpg'),Path('/scratch/cs344/data/oxford-iiit-pet/images/Abyssinian_101.jpg'),Path('/scratch/cs344/data/oxford-iiit-pet/images/Abyssinian_102.jpg'),Path('/scratch/cs344/data/oxford-iiit-pet/images/Abyssinian_103.jpg'),Path('/scratch/cs344/data/oxford-iiit-pet/images/Abyssinian_104.jpg'),Path('/scratch/cs344/data/oxford-iiit-pet/images/Abyssinian_105.jpg'),Path('/scratch/cs344/data/oxford-iiit-pet/images/Abyssinian_106.jpg'),Path('/scratch/cs344/data/oxford-iiit-pet/images/Abyssinian_107.jpg')...]

your answer here

Task: How many images are in the image_files list?

In [6]:
# your code here
7390 images
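If you're not sure where to start, here's a minimal sketch of one possible approach (image_files behaves like a Python list, so the built-in len works on it):

# One way to count the images:
print(len(image_files), "images")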

Task: Assign the first element of the list to first_img and the last element to last_img. What is the file name of the first image? The last image?

Note: The list contains Path objects; each one holds the full path, which tells Python where on your computer to find the file. In the code cell below, I print out first_img.name (after making that variable), which gives just the filename.

In [7]:
# your code here
First image: Abyssinian_1.jpg
Last image: yorkshire_terrier_99.jpg
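For reference, a minimal sketch of one way to do this (indexing with 0 and -1 works because image_files is list-like):

# Grab the first and last elements of the list.
first_img = image_files[0]
last_img = image_files[-1]

# .name gives just the filename, without the directory part.
print("First image:", first_img.name)
print("Last image:", last_img.name)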

Task: Is the first image a cat or a dog? Is the last image a cat or a dog? Notice how the first letter of the filename tells you the answer (this is an unusual quirk of this dataset). Write your answer in a Markdown cell below.

your answer here

In [8]:
# Try this: load_image(first_img)

Get a list of ground-truth labels¶

We'll need a function that takes a filename and tells us whether that image should be labeled as a cat or a dog. For now we'll provide it for you:

In [9]:
# Cat images have filenames that start with a capital letter.
def cat_or_dog(filename):
    return 'cat' if filename[0].isupper() else 'dog'

Task: Check that the output of cat_or_dog is correct for first_img.name and for last_img.name.

In [10]:
# your code here
Abyssinian_1.jpg is a cat
yorkshire_terrier_99.jpg is a dog
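A sketch of one possible check (assuming first_img and last_img from the earlier task):

# cat_or_dog expects a filename string, so pass the .name attribute rather than the Path itself.
print(first_img.name, "is a", cat_or_dog(first_img.name))
print(last_img.name, "is a", cat_or_dog(last_img.name))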

Task: Make a list called labels containing the result of calling cat_or_dog(path.name) for every path in the image_files list.

Remember that if labels = [], then you can call labels.append(whatever) inside a loop... or use a list comprehension, one of my favorite Python features.

In [11]:
# your code here
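For reference, a minimal sketch using the list-comprehension approach mentioned above:

# One label ('cat' or 'dog') per file, in the same order as image_files.
labels = [cat_or_dog(path.name) for path in image_files]
print(labels[:5])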

Set up the dataloaders¶

This is a kinda fastai-specific thing. We'll unpack it more in the next two weeks, but for now just take it as-is.

In [12]:
dataloaders = ImageDataLoaders.from_lists(
    # What images to use:
    path=dataset_path, fnames=image_files, labels=labels,

    # train-test split parameters:
    # - amount to hold out for validation:
    valid_pct=0.2,
    # - set the seed used for the train-test split (not the training)
    seed=42, 

    # Set batch size
    bs=4,

    # Make all the images the same size.
    item_tfms=Resize(224)
)

# Show what it did:
dataloaders.train.show_batch()
In [13]:
print(dataloaders.train.n, "training images")
print(dataloaders.valid.n, "validation images")
5912 training images
1478 validation images
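As a quick sanity check (optional, not part of the assignment), the two counts should add up to the full dataset, and the validation count should match valid_pct:

# 20% of the 7390 images are held out for validation.
print(dataloaders.train.n + dataloaders.valid.n)  # 7390, the full dataset
print(round(0.2 * len(image_files)))              # 1478, matching dataloaders.valid.n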

Train a model¶

Now we train the model. This is again fastai-specific.

Note: we'll get two tables, each of which has one or more rows. To see how well the model works, we'll look at the very last row of the last table. Notice that we get both the accuracy and the error rate; what is the relationship between these two numbers?

In [14]:
# fastai-specific
learn = vision_learner(
    dls=dataloaders,
    arch=resnet18,
    metrics=[accuracy, error_rate]
)
learn.fine_tune(epochs=1)
learn.recorder.plot_loss()
epoch train_loss valid_loss accuracy error_rate time
0 0.426024 0.075974 0.971583 0.028417 01:13
epoch train_loss valid_loss accuracy error_rate time
0 0.190319 0.064173 0.984438 0.015562 01:55
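If you want to check the relationship hinted at above, the two metrics in any row should sum to 1 (error_rate = 1 - accuracy). A quick check using the values copied from the final row:

# Values copied from the last row of the table above.
final_accuracy, final_error_rate = 0.984438, 0.015562
print(final_accuracy + final_error_rate)  # 1.0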

Make some predictions¶

In [15]:
with learn.no_bar(): # <- this just avoids showing an empty progress bar. fastai-specific.
    prediction, _, probs = learn.predict(PILImage.create(first_img))
print(f"This is a: {prediction}.")
print(f"Probabilities: {probs}")   
This is a: cat.
Probabilities: TensorBase([0.9133, 0.0867])

Task: Compute the model's prediction for the last image (which should be a dog).

In [16]:
# your code here
This is a: dog.
Probabilities: TensorBase([8.6988e-04, 9.9913e-01])
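If you get stuck, here's a sketch mirroring the first-image prediction above (assuming last_img from the earlier task):

with learn.no_bar():  # avoid showing an empty progress bar, as above
    prediction, _, probs = learn.predict(PILImage.create(last_img))
print(f"This is a: {prediction}.")
print(f"Probabilities: {probs}")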

Task: Compute the accuracy by hand.

Update the following loop to compute the accuracy of the classifier on the given sample of images. That is, what fraction of the images does the classifier get correct?

Note: For this exercise, you won't need the predicted probabilities, so the starter code below just assigns them to the unused placeholder (_).

In [17]:
sample_images = random.sample(image_files, k=50)
In [19]:
num_correct = 0
for path in sample_images:
    with learn.no_bar():
        prediction, _, _ = learn.predict(PILImage.create(path))
  
    # your code here
Accuracy: 48/50 = 96.00%
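One possible way to complete that loop, sketched for reference (other approaches work just as well):

num_correct = 0
for path in sample_images:
    with learn.no_bar():
        prediction, _, _ = learn.predict(PILImage.create(path))

    # Compare the predicted label against the ground-truth label from the filename.
    if prediction == cat_or_dog(path.name):
        num_correct += 1

accuracy = num_correct / len(sample_images)
print(f"Accuracy: {num_correct}/{len(sample_images)} = {accuracy:.2%}")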

Optional: uploader widget to try out your own image

In [20]:
from ipywidgets import widgets
uploader = widgets.FileUpload()
uploader
FileUpload(value={}, description='Upload')
In [21]:
if len(uploader.data) > 0:
    img = PILImage.create(uploader.data[0])
    # predict returns the label ('cat' or 'dog'), its index, and the class probabilities.
    prediction, _, probs = learn.predict(img)
    print(f"This is a: {prediction}.")