In this unit, after reviewing where we’ve been, we push toward state-of-the-art models (still focusing on computer vision). We’ll first show how our work over the last two weeks connects to the pre-trained models we used in the opening weeks. Then, we’ll introduce or revisit tools that help our models achieve high performance, such as data augmentation and regularization. Finally, we’ll get more practice with how neural networks work from the ground up as we implement our own simple neural net image classifier from scratch (definitely something to mention in an interview!). Students who complete this unit will demonstrate that they can:
The fastai course videos are still a bit disorganized; sorry about that.
A nice intuition about why layers matter: Why depth matters in a neural network (Deep Learning / AI) - YouTube
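One way to see the video’s point for yourself: without a nonlinearity between them, stacked linear layers collapse into a single linear layer, so depth buys nothing. A minimal sketch in plain NumPy (the shapes and seed are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))        # a batch of 5 inputs, 4 features each
W1 = rng.normal(size=(4, 8))       # first "layer"
W2 = rng.normal(size=(8, 3))       # second "layer"

# Two linear layers with no activation in between...
deep = x @ W1 @ W2
# ...are exactly equivalent to one linear layer with weights W1 @ W2.
shallow = x @ (W1 @ W2)
print(np.allclose(deep, shallow))  # True

# Insert a ReLU and the equivalence breaks: depth now adds expressive power.
relu = lambda z: np.maximum(z, 0.0)
nonlinear = relu(x @ W1) @ W2
print(np.allclose(nonlinear, shallow))  # False (in general)
```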
Strategies for getting state-of-the-art performance:
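As a small concrete example of two such strategies mentioned above, data augmentation and regularization, here’s a sketch using PyTorch and torchvision (the specific transforms, layer sizes, and hyperparameters are illustrative choices, not the course’s settings):

```python
import torch
from torch import nn
from torchvision import transforms

# Data augmentation: randomly perturb training images so the model
# can't just memorize exact pixels. (These particular transforms are
# illustrative; pick ones that make sense for your data.)
train_tfms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])

# Regularization inside the model: dropout randomly zeroes activations
# during training, discouraging reliance on any single feature.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(224 * 224 * 3, 512),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(512, 10),
)

# Regularization in the optimizer: weight decay (an L2 penalty)
# discourages large weights.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)
```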
We’ll be doing some automatic differentiation this week:
autograd-for-dummies: A minimal autograd engine and neural network library for machine learning students. (A tiny sketch of the core idea appears below.)

Finally, I sometimes remark that “machine learning is lazy” (in that it tends to focus on superficial, easy features). Here’s a more precise statement of a related claim: What do deep networks learn and when do they learn it. A recent paper describes what to do about it: Learning an Invertible Output Mapping Can Mitigate Simplicity Bias in Neural Networks.
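As promised, here’s a toy illustration of reverse-mode automatic differentiation on scalars, in the spirit of autograd-for-dummies (an illustrative sketch, not that library’s actual API):

```python
class Value:
    """A scalar that records how it was computed so gradients can flow back."""
    def __init__(self, data, parents=(), backward_fn=lambda: None):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = backward_fn

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward_fn():
            self.grad += out.grad           # d(a+b)/da = 1
            other.grad += out.grad          # d(a+b)/db = 1
        out._backward = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward_fn():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._backward = backward_fn
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

# d(x*y + x)/dx = y + 1 = 4;  d(x*y + x)/dy = x = 2
x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```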
If anyone uncovers a pit or digs one and fails to cover it and an ox or a donkey falls into it, the one who opened the pit must pay the owner for the loss and take the dead animal in exchange.
If anyone’s bull injures someone else’s bull and it dies, the two parties are to sell the live one and divide both the money and the dead animal equally. However, if it was known that the bull had the habit of goring, yet the owner did not keep it penned up, the owner must pay, animal for animal, and take the dead animal in exchange.
Exodus 21:33-36 (NIV)
See Slides for the chart I was trying to draw in class. (I almost had it right: the axes are True Positive rate vs. False Positive rate, which makes it a ROC curve rather than a precision-recall curve.)
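For reference, here’s how to compute both charts with scikit-learn (the labels and scores below are toy values made up for illustration):

```python
from sklearn.metrics import roc_curve, precision_recall_curve

y_true = [0, 0, 1, 1, 1, 0, 1, 0]                     # ground-truth labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.6]   # model's scores

# ROC curve: true-positive rate vs. false-positive rate across thresholds.
fpr, tpr, _ = roc_curve(y_true, y_score)

# Precision-recall curve: precision vs. recall across thresholds.
precision, recall, _ = precision_recall_curve(y_true, y_score)

print(list(zip(fpr, tpr)))
print(list(zip(recall, precision)))
```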
Linear layers
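A linear layer computes y = xW + b. Here’s a minimal from-scratch sketch of the kind we’ll build in this unit, with a hand-written backward pass (the names, shapes, and initialization are illustrative assumptions):

```python
import numpy as np

class Linear:
    """y = x @ W + b, with a manual backward pass (chain rule by hand)."""
    def __init__(self, n_in, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.01, size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def forward(self, x):
        self.x = x                      # cache the input for the backward pass
        return x @ self.W + self.b

    def backward(self, grad_out):
        # grad_out is dLoss/dy; store parameter grads and return dLoss/dx.
        self.dW = self.x.T @ grad_out
        self.db = grad_out.sum(axis=0)
        return grad_out @ self.W.T

layer = Linear(4, 3)
y = layer.forward(np.ones((2, 4)))      # batch of 2, 4 features -> 3 outputs
dx = layer.backward(np.ones_like(y))
print(y.shape, dx.shape)                # (2, 3) (2, 4)
```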