Unit 2: Supervised Learning

Students who complete this unit will demonstrate that they can:

Contents

Preparation 2 (draft!)
This content may not have been revised for this year.
Homework 1: Train and evaluate a classifier on your own images

Goal

In this assignment, you will train and evaluate your own image classifier to distinguish the handwritten letters A, B, and C.

Completing this homework will give you practice

A famous image classification example is recognizing handwritten digits (the MNIST dataset). For fun, we’ll remix that idea and classify handwritten letters. To keep it manageable, we’ll just work with the first three letters (A through C).

Try to make the best model you can, under the following constraints:

  1. No more than 100 training images. (Note: This is a maximum, not a minimum.)
  2. No more than 5 minutes of compute time (on a Kaggle, Colab, or lab machine GPU) to train a model.
  3. Only use models that are already built into Keras.
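
One way to keep an eye on the compute-time constraint is to time the training call yourself. The following is a minimal sketch using only the Python standard library; the names timed, within_limit, and TIME_LIMIT_SECONDS are made up for illustration and are not part of the assignment or of Keras.

```python
import time
from contextlib import contextmanager

# The 5-minute training limit from the constraints above, in seconds.
TIME_LIMIT_SECONDS = 5 * 60


@contextmanager
def timed(label, results):
    """Record the wall-clock time of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def within_limit(seconds, limit=TIME_LIMIT_SECONDS):
    """Return True if the measured time does not exceed the limit."""
    return seconds <= limit
```

In a notebook you might wrap the training call, for example with timed("fit", timings): followed by your model.fit(...) line, then report timings["fit"] and within_limit(timings["fit"]) in your write-up. This measures wall-clock time, so it includes any data loading that happens during training.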

Instructions

Let’s make this a friendly competition: which team (of up to 5) can make the best classifier?

  1. Collect your own set of images of handwritten letters, one letter per image. (Do this yourself; don’t get them from the Internet.)
    • Please do share images amongst your team. You might use a OneDrive shared folder or similar.
  2. Organize your dataset into a folder structure like images/c/c01.png.
    • Make an images/README.txt describing your dataset. (See below for details.)
  3. Train a classifier to indicate which letter is contained in the image.
  4. Evaluate the accuracy of the classifier on the validation set. (See below for details.)
  5. Submit your Jupyter Notebook and dataset ZIP file to Moodle.
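
Before training, it can help to check that your folder matches the layout in step 2. The sketch below uses only the Python standard library and assumes lowercase class folders a, b, and c under a root folder such as images/, plus a set of image file extensions that you should adjust to your own files. The function names are made up for illustration.

```python
from collections import Counter
from pathlib import Path

# Assumptions: class folder names follow the images/c/c01.png example,
# and these are the image file types you collected.
CLASS_NAMES = ["a", "b", "c"]
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images(root):
    """Count image files in each class subfolder of root."""
    counts = Counter()
    for class_name in CLASS_NAMES:
        class_dir = Path(root) / class_name
        if not class_dir.is_dir():
            counts[class_name] = 0
            continue
        counts[class_name] = sum(
            1
            for path in class_dir.iterdir()
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def check_dataset(root):
    """Return a list of problems found; an empty list means the layout looks fine."""
    problems = []
    for class_name, n in count_images(root).items():
        if n == 0:
            problems.append(f"no images found for class '{class_name}'")
    if not (Path(root) / "README.txt").is_file():
        problems.append("missing README.txt")
    return problems
```

Calling count_images("images") gives per-class totals that you can report in your README.txt. Note that these totals cover the whole folder; the 100-image limit applies only to the images you use for training, so compare against it after you split off your validation set.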

Report Expectations

Your report should be a professionally crafted Jupyter Notebook, suitable to use in a portfolio. So your notebook should be:

We highly recommend the following structure:

  1. A compelling opening vision statement, with appropriate citations of any code or notebooks on which you are basing this work (e.g., for this assignment that would be the Lab 1 notebook);
  2. A clear explanation of the source and nature of the data, including links that would allow others to access the same data (e.g., how you built your dataset and where it can be found);
  3. A complete discussion/demonstration of the analysis, with explanations and code required to build and evaluate the models;
  4. Strong conclusions.

The notebook shouldn’t include anything that doesn’t apply to these goals (e.g., no inapplicable text retained from an original notebook).

For this assignment:

Notes

Tips

To get the confusion matrix:

  1. Use val_predicted_probs = model.predict(val_dataset) to get the model’s probabilities. (Look at val_predicted_probs.shape and make sure you understand why its second dimension is 3.)
  2. Use val_predictions = np.argmax(val_predicted_probs, axis=1) to get the model’s top prediction for each image.
  3. Use val_labels = [int(label) for img, label in val_dataset.unbatch()] to get the true labels out of the dataset.

This assumes that val_dataset yields its images in the same order every time it is iterated (for example, because it was built without shuffling); otherwise the labels and predictions will not line up. Then to show a confusion matrix, use:

from sklearn.metrics import ConfusionMatrixDisplay
ConfusionMatrixDisplay.from_predictions(val_labels, val_predictions, display_labels=class_names)

(assuming that class_names is the same list you used when constructing the data loader).
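
If you want to see what those calls compute, here is a dependency-free sketch of the same steps on made-up numbers. It is only an illustration of the idea, not how NumPy or scikit-learn implement it; the function names and the example probabilities and labels are invented.

```python
def argmax_rows(probs):
    """Index of the largest value in each row, like np.argmax(probs, axis=1)."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


def confusion_counts(true_labels, predicted_labels, num_classes):
    """counts[i][j] = number of images whose true class is i and predicted class is j."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    return counts


def accuracy(true_labels, predicted_labels):
    """Fraction of images whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Made-up probabilities for four validation images and three classes (a, b, c).
example_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.5, 0.3],
    [0.1, 0.1, 0.8],
]
example_labels = [0, 1, 2, 2]  # made-up true labels

example_predictions = argmax_rows(example_probs)
example_matrix = confusion_counts(example_labels, example_predictions, num_classes=3)
example_accuracy = accuracy(example_labels, example_predictions)
```

Each row of example_probs has three entries because there are three classes. The diagonal of example_matrix counts correct predictions, and each off-diagonal entry shows which letter was mistaken for which.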

Pretrained models

The docs page doesn’t format the list of available presets well, so here goes:

  • resnet50_imagenet
  • resnet50_v2_imagenet
  • mobilenet_v3_large_imagenet
  • mobilenet_v3_small_imagenet
  • csp_darknet_tiny_imagenet
  • csp_darknet_l_imagenet
  • efficientnetv2_s_imagenet
  • efficientnetv2_b0_imagenet
  • efficientnetv2_b1_imagenet
  • efficientnetv2_b2_imagenet
  • densenet121_imagenet
  • densenet169_imagenet
  • densenet201_imagenet
  • yolo_v8_xs_backbone_coco
  • yolo_v8_s_backbone_coco
  • yolo_v8_m_backbone_coco
  • yolo_v8_l_backbone_coco
  • yolo_v8_xl_backbone_coco
  • vitdet_base_sa1b
  • vitdet_large_sa1b
  • vitdet_huge_sa1b
  • resnet50_v2_imagenet_classifier
  • efficientnetv2_s_imagenet_classifier
  • efficientnetv2_b0_imagenet_classifier
  • efficientnetv2_b1_imagenet_classifier
  • efficientnetv2_b2_imagenet_classifier
  • mobilenet_v3_large_imagenet_classifier

Note that “imagenet”, “coco”, and “sa1b” are three different datasets, so they might lead to models with different performance on this task.