Trace Simple Image Classifier¶

Task: trace and explain the dimensionality of each tensor in a simple image classifier.

Setup¶

from fastai.vision.all import *
from fastbook import *

matplotlib.rc('image', cmap='Greys')

Get some example digits from the MNIST dataset.

path = untar_data(URLs.MNIST_SAMPLE)

threes = (path/'train'/'3').ls().sorted()
sevens = (path/'train'/'7').ls().sorted()
len(threes), len(sevens)

(6131, 6265)

Here is one image:

example_3 = Image.open(threes[1])
example_3

To prepare to use it as input to a neural net, we first convert integers from 0 to 255 into floating point numbers between 0 and 1.

example_3_tensor = tensor(example_3).float() / 255
example_3_tensor.shape

torch.Size([28, 28])

height, width = example_3_tensor.shape

Our particular network will ignore the spatial relationship between the features; later we'll learn about network architectures that do pay attention to spatial neighbors. So we'll flatten the image tensor into 28*28 values.

example_3_flat = example_3_tensor.view(width * height)
example_3_flat.shape

torch.Size([784])

Task¶

We'll define a simple neural network (in the book, a 3-vs-7 classifier) as the sequential combination of 3 layers. First we define each layer:

# Define the layers. This is where you'll try changing constants.
linear_1 = nn.Linear(in_features=784, out_features=30)
relu_layer = nn.ReLU()
linear_2 = nn.Linear(in_features=30, out_features=1)

Then we put them together in sequence.

simple_net = nn.Sequential(
    linear_1,
    relu_layer,
    linear_2
)

Each of nn.Linear, nn.ReLU, and nn.Squential are PyTorch modules. We can call a module with some input data to get the output data:

simple_net(example_3_flat)

tensor([-0.1385], grad_fn=<AddBackward0>)

Your turn:

Obtain the same result as the line above by applying each layer in sequence.

The outputs of each layer are called activations, so we can name the variables act1 for the activations of layer 1, and so forth. Each act will be a function of the previous act (or the input, for the first layer.)

inp = example_3_flat
act1 = ...

act2 = ...

act3 = ...

Evaluate act1, act2, and act3. (Code already provided; look at the results.)

act1

tensor([-0.1971, -0.2886,  0.2023, -0.0984,  0.1338, -0.1604,  0.2701, -0.3103,  0.2313,  0.1280, -0.3245,  0.1302, -0.1761, -0.1394,  0.0234, -0.1384,  0.3531,  0.5236, -0.1388,  0.1109,  0.0033,
         0.1793, -0.3673, -0.0706, -0.1324, -0.4853,  0.3566,  0.1476, -0.2868, -0.0929], grad_fn=<AddBackward0>)

act2

tensor([0.0000, 0.0000, 0.2023, 0.0000, 0.1338, 0.0000, 0.2701, 0.0000, 0.2313, 0.1280, 0.0000, 0.1302, 0.0000, 0.0000, 0.0234, 0.0000, 0.3531, 0.5236, 0.0000, 0.1109, 0.0033, 0.1793, 0.0000, 0.0000,
        0.0000, 0.0000, 0.3566, 0.1476, 0.0000, 0.0000], grad_fn=<ReluBackward0>)

act3

tensor([-0.1385], grad_fn=<AddBackward0>)

Evaluate the shape of act1, act2, and act3.

# your code here

(torch.Size([30]), torch.Size([30]), torch.Size([1]))

Write expressions for the shapes of each activation in terms of linear_1.in_features, linear_2.out_features, etc. (ignore the torch.Size( part)

linear_1.in_features

784

act1_shape = [...]
act2_shape = [...]
act3_shape = [...]

assert list(act1_shape) == list(act1.shape)
assert list(act2_shape) == list(act2.shape)
assert list(act3_shape) == list(act3.shape)

Evaluate the shape of linear_1.weight, linear_1.bias, and the same for linear_2. Write expressions that give the value of each shape in terms of the in_features and other parameters.

print(f"Linear 1: Weight shape is {list(linear_1.weight.shape)}, bias shape is {list(linear_1.bias.shape)}")
print(f"Linear 2: Weight shape is {list(linear_2.weight.shape)}, bias shape is {list(linear_2.bias.shape)}")

Linear 1: Weight shape is [30, 784], bias shape is [30]
Linear 2: Weight shape is [1, 30], bias shape is [1]

linear_1_weight_shape = [...]
linear_1_bias_shape = [...]
linear_2_weight_shape = [...]
linear_2_bias_shape = [...]

assert list(linear_1_weight_shape) == list(linear_1.weight.shape)
assert list(linear_1_bias_shape) == list(linear_1.bias.shape)
assert list(linear_2_weight_shape) == list(linear_2.weight.shape)
assert list(linear_2_bias_shape) == list(linear_2.bias.shape)

Analysis¶

Try changing each of the constants provided to the nn.Linear modules. Identify an example of:
1. A constant that can be freely changed in the neural net definition.
2. A constant that cannot be changed because it depends on the input.
3. A pair of constants that must be changed together.

your answer here

Describe the relationship between the values in act1 and act2.

your answer here

In a concise but complete sentence, describe the shapes of the parameters of the Linear layer (weight and bias).

your answer here