Trace Simple Image Classifier¶

Task: trace and explain the dimensionality of each tensor in a simple image classifier.

Setup¶

In [ ]:
from fastai.vision.all import *
from fastbook import *

matplotlib.rc('image', cmap='Greys')

Get some example digits from the MNIST dataset.

In [ ]:
path = untar_data(URLs.MNIST_SAMPLE)
In [ ]:
threes = (path/'train'/'3').ls().sorted()
sevens = (path/'train'/'7').ls().sorted()
len(threes), len(sevens)
Out[ ]:
(6131, 6265)

Here is one image:

In [ ]:
example_3 = Image.open(threes[1])
example_3
Out[ ]:
[Output: a handwritten digit 3, rendered as a 28×28 grayscale image]

To prepare it for use as input to a neural net, we first convert its pixel values from integers between 0 and 255 into floating-point numbers between 0 and 1.

In [ ]:
example_3_tensor = tensor(example_3).float() / 255
example_3_tensor.shape
Out[ ]:
torch.Size([28, 28])
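
As a quick sanity check (a sketch, not part of the original exercise): after dividing by 255, every pixel value should now lie between 0 and 1.

In [ ]:
# The scaled pixel values should span the range [0, 1].
example_3_tensor.min(), example_3_tensor.max()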
In [ ]:
height, width = example_3_tensor.shape

Our particular network will ignore the spatial relationships between the features; later we'll learn about network architectures that do pay attention to spatial neighbors. So we'll flatten the image tensor into a single vector of 28 * 28 = 784 values.

In [ ]:
example_3_flat = example_3_tensor.view(width * height)
example_3_flat.shape
Out[ ]:
torch.Size([784])
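
Here view(width * height) reshapes the tensor without copying its data; for a 2-D tensor like this, flatten() would do the same thing. A quick check (a sketch) that the two agree:

In [ ]:
# flatten() collapses all dimensions into one vector,
# equivalent to view(width * height) for this 2-D tensor.
(example_3_tensor.flatten() == example_3_flat).all()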

Task¶

We'll define a simple neural network (the 3-vs-7 classifier from chapter 4 of the book) as the sequential combination of three layers.

Terminology note: This is a Multi-Layer Perceptron (MLP) with one hidden layer of 30 features. It has one output feature (which we would train to generate the log-odds of 3 vs. 7).

First we define each layer:

In [ ]:
# Define the layers. This is where you'll try changing constants.
linear_1 = nn.Linear(in_features=784, out_features=30, bias=True)
relu_layer = nn.ReLU()
linear_2 = nn.Linear(in_features=30, out_features=1, bias=True)

# Then we put them together in sequence.
simple_net = nn.Sequential(
    linear_1,
    relu_layer,
    linear_2
)
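
Printing the Sequential module is a quick way to confirm the structure we just assembled; it lists the child modules in the order they are applied:

In [ ]:
# Shows the three layers in call order.
print(simple_net)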

Each of nn.Linear, nn.ReLU, and nn.Sequential is a PyTorch module. We can call a module with some input data to get the output data:

In [ ]:
simple_net(example_3_flat)
Out[ ]:
tensor([-0.1385], grad_fn=<AddBackward0>)
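
Since the single output is trained as the log-odds of "3" (see the terminology note above), passing it through a sigmoid would turn it into a probability. A minimal sketch using torch.sigmoid:

In [ ]:
# Map the raw log-odds output to a probability in (0, 1).
# An untrained net gives a value near 0.5, i.e. near-indifferent.
torch.sigmoid(simple_net(example_3_flat))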

Your turn:

  1. Obtain the same result as the line above by applying each layer in sequence.

The outputs of each layer are called activations, so we can name the variables act1 for the activations of layer 1, and so forth. Each act will be a function of the previous act (or of the input, for the first layer).

In [ ]:
inp = example_3_flat
act1 = ...
In [ ]:
act2 = ...
In [ ]:
act3 = ...
  2. Evaluate act1, act2, and act3. (Code already provided; look at the results.)
In [ ]:
act1
Out[ ]:
tensor([-0.1971, -0.2886,  0.2023, -0.0984,  0.1338, -0.1604,  0.2701, -0.3103,  0.2313,  0.1280, -0.3245,  0.1302, -0.1761, -0.1394,  0.0234, -0.1384,  0.3531,  0.5236, -0.1388,  0.1109,  0.0033,
         0.1793, -0.3673, -0.0706, -0.1324, -0.4853,  0.3566,  0.1476, -0.2868, -0.0929], grad_fn=<AddBackward0>)
In [ ]:
act2
Out[ ]:
tensor([0.0000, 0.0000, 0.2023, 0.0000, 0.1338, 0.0000, 0.2701, 0.0000, 0.2313, 0.1280, 0.0000, 0.1302, 0.0000, 0.0000, 0.0234, 0.0000, 0.3531, 0.5236, 0.0000, 0.1109, 0.0033, 0.1793, 0.0000, 0.0000,
        0.0000, 0.0000, 0.3566, 0.1476, 0.0000, 0.0000], grad_fn=<ReluBackward0>)
In [ ]:
act3
Out[ ]:
tensor([-0.1385], grad_fn=<AddBackward0>)
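
If you'd like to double-check your act1, act2, and act3 against what the network actually computes, PyTorch forward hooks can capture each layer's output during a call. A sketch (register_forward_hook is standard PyTorch; iterating over an nn.Sequential yields its layers):

In [ ]:
# Capture each layer's output during a single forward pass.
captured = []
hooks = [layer.register_forward_hook(lambda mod, args, out: captured.append(out))
         for layer in simple_net]
simple_net(example_3_flat)
for h in hooks:
    h.remove()  # always remove hooks when done
[c.shape for c in captured]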

How would you describe the relationship between act1 and act2? Specifically,

  • Are they the same shape?
  • What happens to negative numbers?
  • What happens to positive numbers?
  3. Evaluate the shape of act1, act2, and act3.
In [ ]:
# your code here
Out[ ]:
(torch.Size([30]), torch.Size([30]), torch.Size([1]))
  4. Write expressions for the shapes of each activation in terms of linear_1.in_features, linear_2.out_features, etc. (ignore the torch.Size(...) wrapper).
In [ ]:
linear_1.in_features
Out[ ]:
784
In [ ]:
act1_shape = [linear_1.out_features]
act2_shape = [...]
act3_shape = [...]

assert list(act1_shape) == list(act1.shape)
assert list(act2_shape) == list(act2.shape)
assert list(act3_shape) == list(act3.shape)
  5. Evaluate the shapes of linear_1.weight and linear_1.bias, and the same for linear_2. Write expressions that give each shape in terms of in_features and the other layer parameters.
In [ ]:
print(f"Linear 1: Weight shape is {list(linear_1.weight.shape)}, bias shape is {list(linear_1.bias.shape)}")
print(f"Linear 2: Weight shape is {list(linear_2.weight.shape)}, bias shape is {list(linear_2.bias.shape)}")
Linear 1: Weight shape is [30, 784], bias shape is [30]
Linear 2: Weight shape is [1, 30], bias shape is [1]
In [ ]:
linear_1_weight_shape = [...]
linear_1_bias_shape = [...]
linear_2_weight_shape = [...]
linear_2_bias_shape = [...]
In [ ]:
assert list(linear_1_weight_shape) == list(linear_1.weight.shape)
assert list(linear_1_bias_shape) == list(linear_1.bias.shape)
assert list(linear_2_weight_shape) == list(linear_2.weight.shape)
assert list(linear_2_bias_shape) == list(linear_2.bias.shape)
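
For intuition about why the weight shape is [out_features, in_features]: per the PyTorch documentation, nn.Linear computes y = x @ weight.T + bias. A minimal sketch reproducing linear_1 by hand (using inp from above):

In [ ]:
# Reproduce linear_1's output manually: x W^T + b.
manual_act1 = inp @ linear_1.weight.T + linear_1.bias
torch.allclose(manual_act1, linear_1(inp))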

Analysis¶

  1. Try changing each of the constants provided to the nn.Linear modules. Identify an example of:
    1. A constant that can be freely changed in the neural net definition.
    2. A constant that cannot be changed because it depends on the input.
    3. A pair of constants that must be changed together.

your answer here
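
One way to experiment safely is to rebuild the net with a changed constant inside try/except and see whether the forward pass still works; a RuntimeError signals a shape mismatch. The changed value below is just a hypothetical example, not an answer:

In [ ]:
# Hypothetical experiment: change one constant and test the forward pass.
try:
    test_net = nn.Sequential(
        nn.Linear(in_features=784, out_features=50, bias=True),  # hypothetical change: 30 -> 50
        nn.ReLU(),
        nn.Linear(in_features=30, out_features=1, bias=True),    # deliberately left at 30
    )
    test_net(example_3_flat)
    print("forward pass OK")
except RuntimeError as e:
    print("shape mismatch:", e)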

  2. Describe the relationship between the values in act1 and act2.

your answer here

  3. In a concise but complete sentence, describe the shapes of the parameters of the Linear layer (weight and bias).

your answer here