Unit 4: Neural Models

In this unit we extend our modeling skills to encompass classification models, and start to build the tools that will let us represent complex functions by using hidden layers. Both of these objectives require us to learn about nonlinear operations. We’ll focus on the two most commonly used ones: the softmax operator (which converts scores to probabilities that sum to 1) and the rectifier (“ReLU”, which clips negative values to zero).
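As a minimal sketch of these two operations, here they are in plain Python. This is not the course's implementation; the function names `softmax` and `relu` and the example inputs are chosen here purely for illustration.

```python
import math

def softmax(scores):
    """Convert a list of scores (logits) to probabilities that sum to 1."""
    # Subtracting the max leaves the result unchanged but avoids overflow in exp.
    biggest = max(scores)
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def relu(values):
    """Rectifier: replace each negative value with 0 and keep the rest."""
    return [max(0.0, v) for v in values]

probs = softmax([2.0, 0.5, -1.0])   # illustrative scores
clipped = relu([-3.0, 0.0, 2.5])    # illustrative values
print(probs, sum(probs))
print(clipped)  # [0.0, 0.0, 2.5]
```

Both functions are nonlinear: doubling the inputs does not simply double the outputs.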

Neural Models

Students who complete this unit will demonstrate that they can:

Automatic Differentiation

We’ll be doing some automatic differentiation this week:

Contents

Preparation 4 (draft!)
The content may not have been revised for this year. If you really want to see it, click the link above.
Softmax

Background

Jargon:

Warm-Up Activity

Open the softmax and cross-entropy interactive demo that Prof. Arnold created.

Try adjusting the logits (the inputs to softmax) and get a sense for how the outputs change. Describe the outputs when:

  1. All of the inputs are the same value. (Does it matter what the value is?)
  2. One input is much bigger than the others.
  3. One input is much smaller than the others.

Finally, describe the input that gives the largest possible value for output 1.

Notebooks

Softmax, part 1 (name: u04n2-softmax.ipynb)

PyTorch and Logistic Regression

Logistic Regression

[Diagram: input features → Model → scores (logits): a vector of length n_classes → softmax → probs: a vector of length n_classes → cross-entropy with the correct answer → loss (1 number)]
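The logistic regression pipeline (features in, one score per class out, softmax to turn scores into probabilities, cross-entropy against the correct answer to get a single loss number) can be sketched in PyTorch roughly as follows. The sizes, the random features, and the labels are made up for illustration, and the variable names are assumptions rather than the course's own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

n_samples, n_features, n_classes = 4, 2, 3   # illustrative sizes
x = torch.randn(n_samples, n_features)        # made-up features
y_true = torch.tensor([0, 2, 1, 1])           # made-up correct class indices

model = nn.Linear(n_features, n_classes)
scores = model(x)                             # logits, shape (4, 3)
probs = F.softmax(scores, dim=1)              # each row sums to 1

# Cross-entropy: negative log of the probability assigned to the correct
# class, averaged over the batch.
prob_of_correct = probs[torch.arange(n_samples), y_true]
loss_manual = -torch.log(prob_of_correct).mean()

# F.cross_entropy takes the logits directly (it applies log-softmax internally).
loss_builtin = F.cross_entropy(scores, y_true)
print(scores.shape, probs.sum(dim=1), loss_manual.item(), loss_builtin.item())
```

The two loss values should agree up to floating-point error. In practice the built-in version is usually preferred because working from the logits is more numerically stable.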

Jargon:

PyTorch

Imports:

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

Building a model object with the desired architecture (structure)

model = nn.Linear(in_features=2, out_features=3, bias=True)

# or

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(in_features=2, out_features=3, bias=True)
    
    def forward(self, x):
        return self.linear(x)
model = Model()

# or

n_hidden = 100
model = nn.Sequential(
    nn.Linear(in_features=2, out_features=n_hidden, bias=True),
    nn.ReLU(),
    nn.Linear(in_features=n_hidden, out_features=3, bias=True)
)
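
To check that an architecture has the intended structure, one option is to pass a small batch through it and inspect the shapes. The sketch below rebuilds the hidden-layer model; the batch size of 5 is arbitrary and the inputs are random placeholders.

```python
import torch
import torch.nn as nn

n_hidden = 100
model = nn.Sequential(
    nn.Linear(in_features=2, out_features=n_hidden, bias=True),
    nn.ReLU(),
    nn.Linear(in_features=n_hidden, out_features=3, bias=True),
)

x = torch.randn(5, 2)   # 5 samples, 2 features each (random placeholder data)
out = model(x)          # one row of 3 scores per sample
print(out.shape)        # torch.Size([5, 3])

# Each nn.Linear stores a weight of shape (out_features, in_features)
# and a bias of shape (out_features,).
for name, param in model.named_parameters():
    print(name, tuple(param.shape))

n_params = sum(p.numel() for p in model.parameters())
print(n_params)         # 2*100 + 100 + 100*3 + 3 = 603
```

Note that `nn.Linear` stores its weight transposed relative to the `X @ W + b` convention, so the weight shapes printed here are `(out_features, in_features)`.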

Training a model:

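loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # lr is an example value
# in a training loop ...
optimizer.zero_grad()            # reset gradients left over from the previous step
y_pred = model(x)                # forward pass
loss = loss_fn(y_pred, y_true)   # compare predictions with targets
loss.backward()                  # compute gradients
optimizer.step()                 # update parameters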
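
Put together as a runnable loop, the training steps might look like the sketch below. The synthetic data, the learning rate, and the number of steps are assumptions chosen for illustration, not values from the course. The call to `optimizer.zero_grad()` is there because PyTorch accumulates gradients across `backward()` calls unless they are cleared.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data (an assumption for this sketch): 3 targets that are an
# exact linear function of 2 features.
x = torch.randn(100, 2)
true_w = torch.tensor([[1.0, -2.0, 0.5],
                       [0.0,  1.0, 3.0]])
y_true = x @ true_w + 0.1

model = nn.Linear(in_features=2, out_features=3, bias=True)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # lr chosen arbitrarily

losses = []
for step in range(200):
    optimizer.zero_grad()             # clear gradients from the previous step
    y_pred = model(x)                 # forward pass
    loss = loss_fn(y_pred, y_true)    # mean squared error over the batch
    loss.backward()                   # gradients of the loss w.r.t. the parameters
    optimizer.step()                  # update the parameters using those gradients
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Because the targets here are an exact linear function of the inputs, the loss should shrink toward zero as training proceeds. For a classification problem, `nn.CrossEntropyLoss` with integer class labels would take the place of `nn.MSELoss`.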

Warm-Up Questions

  1. We’re classifying houses as low/medium/high price based on longitude and latitude using logistic regression. The model outputs 3 scores, one for each class. For 100 houses (processed all at once in a “batch” of samples):

    a. What shape is X? X.shape =

    b. What shape should W (the array of weights) be? W.shape =

    c. What shape should b (the array of biases) be? b.shape =

    d. What shape will the output have? (X @ W + b).shape =

  2. For one house, if our model outputs scores [1.0, 2.0, -1.0] for low/med/high prices:

    Write the steps to convert these scores to probabilities that sum to 1. (You can use words or math notation.)

  3. If the true label for this house is “medium”, what’s the model’s accuracy and loss for this house? (You can use words or math notation.)

Notebooks

From Linear Regression in NumPy to Logistic Regression in PyTorch (name: u04n3-logreg-pytorch.ipynb)