Optional Mini-Project: Classifier from Scratch

Task

Build a simple image classifier from scratch, using only basic numerical computing primitives (like matrix multiplication).
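
As a rough illustration of what "basic numerical computing primitives" can look like, the sketch below scores a batch of flattened images with a single matrix multiplication. It uses NumPy and random stand-in data, both of which are assumptions made for illustration. The names `images`, `weights`, and `bias` are hypothetical, and nothing here is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 8 flattened 28x28 grayscale images and their labels.
num_images, num_pixels, num_classes = 8, 28 * 28, 10
images = rng.random((num_images, num_pixels))
labels = rng.integers(0, num_classes, size=num_images)

# Parameters of a linear classifier (illustrative initialization, untrained).
weights = rng.normal(0.0, 0.01, size=(num_pixels, num_classes))
bias = np.zeros(num_classes)

# Forward pass: one matrix multiplication gives one score per class per image.
logits = images @ weights + bias

# The predicted class is the one with the highest score.
predictions = logits.argmax(axis=1)
accuracy = (predictions == labels).mean()
print(f"logits shape: {logits.shape}, accuracy on random data: {accuracy:.2f}")
```

With untrained weights and random labels, the accuracy should hover around chance. Training means adjusting `weights` and `bias` to reduce a loss.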

Specifically:

Constraints:

Why?

Instructions

Create a new notebook. You may start with this code, which sets up the data loading.

# Import fastai, but only for the DataBlock part.
from fastai.vision.all import *

path = untar_data(URLs.MNIST)

# Create a subset of the images so we train faster: up to 500 random images of each digit from each split (training and testing).
set_seed(0)
num_imgs_per_digit = 500
items = [
    p
    for split in ['training', 'testing']
    for digit in range(10)
    for p in (path/split/str(digit)).ls().shuffle()[:num_imgs_per_digit]
]

# Create the `dataloaders`. We need a slightly special `ImageBlock` because we want grayscale images.

block = DataBlock(
    blocks=(ImageBlock(PILImageBW), CategoryBlock),
    get_y=parent_label,
    splitter=GrandparentSplitter(train_name='training', valid_name='testing'),
)
dataloaders = block.dataloaders(items, bs=16)
print(f"{dataloaders.train.n} training images, {dataloaders.valid.n} validation images")

Beyond that, you’re on your own! See “constraints” above.

Optional Constraint: No Autograd

I have a suggested structure for implementing backpropagation yourself in a modular way. If you’re interested, please ask me.
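
For readers who want a feel for what a modular approach might look like, here is one possible sketch. It is not the suggested structure mentioned above. It assumes NumPy, and the class names `Linear` and `SoftmaxCrossEntropy` are hypothetical. Each piece caches what it needs in `forward` and returns the gradient with respect to its input in `backward`. A finite-difference check at the end compares one analytic gradient against a numerical estimate.

```python
import numpy as np


class Linear:
    """Fully connected layer: out = x @ weight + bias."""

    def __init__(self, n_in, n_out, rng):
        self.weight = rng.normal(0.0, 0.1, size=(n_in, n_out))
        self.bias = np.zeros(n_out)

    def forward(self, x):
        self.x = x  # cache the input; backward needs it
        return x @ self.weight + self.bias

    def backward(self, grad_out):
        # grad_out is dLoss/dOut, with the same shape as the forward output.
        self.grad_weight = self.x.T @ grad_out
        self.grad_bias = grad_out.sum(axis=0)
        return grad_out @ self.weight.T  # dLoss/dInput, for the layer before


class SoftmaxCrossEntropy:
    """Mean cross-entropy loss computed from raw logits."""

    def forward(self, logits, targets):
        # Log-softmax, shifted by the row max for numerical stability.
        shifted = logits - logits.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        self.probs = np.exp(log_probs)
        self.targets = targets
        return -log_probs[np.arange(len(targets)), targets].mean()

    def backward(self):
        # Gradient with respect to the logits: (softmax - one_hot) / batch_size.
        grad = self.probs.copy()
        grad[np.arange(len(self.targets)), self.targets] -= 1.0
        return grad / len(self.targets)


rng = np.random.default_rng(0)
x = rng.random((4, 6))        # hypothetical batch: 4 examples, 6 features
y = np.array([0, 2, 1, 2])    # hypothetical labels for 3 classes

layer, loss_fn = Linear(6, 3, rng), SoftmaxCrossEntropy()
loss = loss_fn.forward(layer.forward(x), y)
grad_input = layer.backward(loss_fn.backward())

# Finite-difference check on a single weight entry.
eps = 1e-6
layer.weight[0, 0] += eps
loss_plus = loss_fn.forward(layer.forward(x), y)
layer.weight[0, 0] -= 2 * eps
loss_minus = loss_fn.forward(layer.forward(x), y)
layer.weight[0, 0] += eps  # restore the original value
numeric = (loss_plus - loss_minus) / (2 * eps)
print(f"analytic {layer.grad_weight[0, 0]:.6f} vs numeric {numeric:.6f}")
```

Pieces written this way can be chained: call each `forward` in order, then each `backward` in reverse, passing the returned gradient to the piece before it.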

Note: the PyTorch cross_entropy function does some interesting things under the hood. See this notebook (preview, Colab).
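
One documented behavior worth knowing is that `cross_entropy` expects raw logits rather than probabilities and applies a log-softmax internally. Whether the linked notebook covers this point is not stated here. The NumPy sketch below (the function names are hypothetical) suggests why working in log space helps: a naive softmax-then-log overflows for large logits, while the shifted log-sum-exp form gives the same answer without overflow.

```python
import numpy as np


def naive_cross_entropy(logits, targets):
    """Softmax first, then log. Can overflow when logits are large."""
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.log(probs[np.arange(len(targets)), targets]).mean()


def stable_cross_entropy(logits, targets):
    """Log-softmax via the log-sum-exp trick, then pick out the target entries."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()


small = np.array([[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]])
targets = np.array([0, 1])
large = small + 1000.0  # same gaps between logits, much larger magnitude

# Silence the overflow warnings that the naive version triggers on purpose.
with np.errstate(over="ignore", invalid="ignore"):
    naive_small = naive_cross_entropy(small, targets)
    naive_large = naive_cross_entropy(large, targets)

stable_small = stable_cross_entropy(small, targets)
stable_large = stable_cross_entropy(large, targets)

print("naive: ", naive_small, naive_large)
print("stable:", stable_small, stable_large)
```

Adding a constant to every logit in a row does not change the softmax, so the two stable results should agree, while the naive version breaks down on the large inputs.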
