Students who complete this unit will demonstrate that they can:
Describe the basic structure of a machine learning model.
Describe the overall approach of Stochastic Gradient Descent: how does it use information from a batch of data to improve the model's performance on that batch and on other data?
Describe the parameters of a linear layer and how they are used to compute its output.
Identify the following loss functions: Mean Squared Error and Mean Absolute Difference.
Define what a Multi-Layer Perceptron (MLP) is and identify the terms “input features”, “hidden features”, “activation function”, and “output features” (a small sketch appears after this list).
Trace the execution of a basic image classifier model using a fully-connected network.
Apply automatic differentiation (as implemented in PyTorch) to compute the gradients of programs (see the autodiff example after this list).
(Note that we’re focusing on regression models this week; next week we’ll add classification.)
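To make the linear-layer, MLP, and loss-function objectives concrete, here is one way these pieces might fit together in PyTorch. The layer sizes (10 input features, 32 hidden features, 1 output feature) and the batch of random data are arbitrary choices for illustration, not values from the course:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A minimal MLP: input features -> hidden features -> output features.
model = nn.Sequential(
    nn.Linear(10, 32),   # parameters: weight of shape (32, 10), bias of shape (32,)
    nn.ReLU(),           # activation function between the layers
    nn.Linear(32, 1),
)
print(model[0].weight.shape)  # torch.Size([32, 10])
print(model[0].bias.shape)    # torch.Size([32])

x = torch.randn(4, 10)        # a batch of 4 examples (random, for illustration)
y_true = torch.randn(4, 1)
y_pred = model(x)             # output shape (4, 1)

mse = F.mse_loss(y_pred, y_true)  # Mean Squared Error
mae = F.l1_loss(y_pred, y_true)   # Mean Absolute Difference (L1 loss)
```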
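And a minimal autodiff example: PyTorch records the operations applied to a tensor created with `requires_grad=True`, and `backward()` then computes gradients with respect to it. The function y = x² + 2x here is just a toy example:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)  # track operations on x
y = x**2 + 2*x                             # y = x^2 + 2x

y.backward()   # backpropagate: compute dy/dx
print(x.grad)  # tensor(8.) since dy/dx = 2x + 2 = 8 at x = 3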
Preparation
Check your prior knowledge:
can you define the Mean Squared Error (MSE) of a linear regression (i.e., y = m*x + b)?
can you write an algorithm that, given some data and a starting m and b, returns a new m and b that give a lower MSE? (One possible sketch appears below.)
For this week, focus on how things are used rather than the underlying math, especially for tensors (which have several different definitions) and derivatives (which we’ll get to shortly).
The book uses “rank” to refer to the number of axes of a tensor, but “rank” means something different in linear algebra. To avoid confusion, let’s call it “number of axes”, or perhaps “number of dimensions” (abbreviated “ndim” in PyTorch).
For example, a length-5 column vector times a length-4 row vector would give a matrix (tensor) with two axes (2-dimensional), with shape (5, 4) and rank 1 in the linear algebra sense. See this notebook.
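Here is the same example as a small PyTorch sketch (the tensor values are random; only the shapes matter):

```python
import torch

col = torch.randn(5, 1)   # length-5 column vector
row = torch.randn(1, 4)   # length-4 row vector
outer = col @ row         # matrix product -> a matrix with two axes

print(outer.ndim)   # 2 ("number of axes"; the book would call this rank 2)
print(outer.shape)  # torch.Size([5, 4])
print(torch.linalg.matrix_rank(outer))  # tensor(1): rank 1 in the linear algebra sense
```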