Task: perform broadcast and reduction operations on a tensor representing a batch of color images
Goal: Get used to thinking about the shapes of multidimensional structures. A surprisingly large amount of the thinking that goes into implementing neural net code is getting the shapes right. I didn’t really believe that until I had to figure it out myself a couple of times, and that convinced me that everyone could use some guided practice.
from fastai.vision.all import *
# Make one-channel images display in greyscale.
# See https://forums.fast.ai/t/show-image-displays-color-image-for-mnist-sample-dataset/78932/4
# But "Grays" is inverted, so we use "gray" instead.
matplotlib.rc('image', cmap='gray')
Download dataset.
path = untar_data(URLs.PETS) / "images"
Make a stable order for the images: first sort, then randomize using a known seed.
set_seed(333)
image_files = get_image_files(path).sorted().shuffle()
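Why sort before shuffling? A directory listing can come back in a different order on different machines, so seeding alone does not guarantee the same result. Here is a minimal sketch of the idea using only Python's standard `random` module rather than fastai's `set_seed` and `shuffle`; the file names and the helper `stable_shuffle` are hypothetical.

```python
import random

def stable_shuffle(names, seed):
    """Sort first (to remove dependence on listing order), then shuffle with a fixed seed."""
    ordered = sorted(names)
    rng = random.Random(seed)  # private generator, so global random state is untouched
    rng.shuffle(ordered)
    return ordered

# Two hypothetical listings of the same files, returned in different orders.
listing_a = ["c.jpg", "a.jpg", "d.jpg", "b.jpg"]
listing_b = ["b.jpg", "d.jpg", "c.jpg", "a.jpg"]

result_a = stable_shuffle(listing_a, seed=333)
result_b = stable_shuffle(listing_b, seed=333)
print(result_a == result_b)
```

Without the `sorted` call, the two listings would generally end up in different orders even with the same seed.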
Define how we're going to split the data into a training and validation set.
splitter = RandomSplitter(valid_pct=0.2, seed=42)
In this dataset, cat breeds start with a capital letter, so we can get the label from the filename.
def cat_or_dog(x):
    return 'cat' if x[0].isupper() else 'dog'

def get_y(file_path):
    return cat_or_dog(file_path.name)
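As a quick sanity check, the labeling functions can be tried on a couple of made-up paths. The paths below are illustrative only (they follow the capitalization convention described above but are not read from disk).

```python
from pathlib import Path

def cat_or_dog(x):
    # Capitalized first letter means a cat breed; lowercase means a dog breed.
    return 'cat' if x[0].isupper() else 'dog'

def get_y(file_path):
    # Only the file name matters, not the directories leading to it.
    return cat_or_dog(file_path.name)

# Hypothetical example paths.
example_paths = [Path("images/Abyssinian_1.jpg"), Path("images/beagle_10.jpg")]
example_labels = [get_y(p) for p in example_paths]
print(example_labels)  # ['cat', 'dog']
```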
Define a standard image-classification DataBlock.
dblock = DataBlock(blocks = (ImageBlock, CategoryBlock),
                   get_y = get_y,
                   splitter = splitter,
                   item_tfms = Resize(224))
Override shuffle_fn so that the images never actually get shuffled (batch order is consistent).
dataloaders = dblock.dataloaders(image_files, batch_size=9, shuffle_fn=lambda idxs: idxs)
Since we set the shuffle_fn to the identity above, the images will always get loaded in the same order, so the first batch will always be the same:
batch = dataloaders.train.one_batch()
images_orig, labels = batch
images = images_orig.clone()
show_image_batch((images, labels))
Evaluate images.shape. What does each number represent?
images.shape
your answer here
Evaluate labels. Explain those numbers, with the help of dataloaders.train.vocab.
labels
dataloaders.train.vocab
your answer here
(Use show_image.)
# your code here
Hint: use .mean(axis=___); think about what the blank is.
# your code here
Part B: Show the average of the middle 3 images.
You'll need to use slicing to compute this.
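If slicing and reducing feel unfamiliar, it may help to try them first on a tiny tensor whose values you can check by hand. The sketch below uses a toy tensor that is deliberately not image-shaped; the variable names are made up for illustration. The point to notice is that the axis you reduce over is the one that disappears from the shape.

```python
import torch

# Toy tensor: 4 "samples", each a 2x3 grid, filled with 0..23.
t = torch.arange(24, dtype=torch.float32).reshape(4, 2, 3)

# Reducing over an axis removes that axis from the shape.
mean_over_samples = t.mean(axis=0)   # shape (2, 3)
mean_over_rows = t.mean(axis=1)      # shape (4, 3)

# Slicing the first axis keeps the other axes intact.
middle = t[1:3]                      # samples 1 and 2, shape (2, 2, 3)
middle_mean = middle.mean(axis=0)    # shape (2, 3)

print(mean_over_samples.shape, mean_over_rows.shape)
print(middle.shape, middle_mean.shape)
```

Checking `.shape` after each step, as done here, is a cheap way to confirm that an operation did what you intended.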
# your code here
Use show_images to show all of the images.
# your code here
# your code here
The next exercise will require you to assign to slices. It will also require you to "skip" dimensions in slicing. To prepare, study what this does:
images.shape
images[:, 0].shape
images[:, :, 0, 0] = 0.0
print(images[0, 0, 0, 0], images[5, 1, 0, 0], images[0, 0, 5, 1])
# restore the original images
images = images_orig.clone()
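The same pattern can be explored on a small tensor where the effect is easy to count. This is a sketch under assumed names: `toy` and `edited` are hypothetical, and the shape (2, 3, 4, 5) is arbitrary. A bare `:` keeps a whole dimension, an integer index drops it, and assigning to a slice writes into every position the slice covers.

```python
import torch

# Arbitrary 4-dimensional toy tensor of ones: 2 * 3 * 4 * 5 = 120 elements.
toy = torch.ones(2, 3, 4, 5)

# ":" keeps the first dimension; the integer 0 drops the second.
first_of_second_dim = toy[:, 0]      # shape (2, 4, 5)

# Work on a clone so the original stays available, as in the cells above.
edited = toy.clone()

# "Skip" the first two dimensions with ":" and fix the last two at index 0.
# This zeroes one element for each of the 2 * 3 = 6 leading index pairs.
edited[:, :, 0, 0] = 0.0

print(first_of_second_dim.shape)
print(edited.sum().item(), toy.sum().item())
```

Because the assignment went to the clone, `toy` still sums to 120 while `edited` has lost exactly six ones.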
# your code here
# restore the original images for the next step
images = images_orig.clone()
# your code here
What does images.shape represent?
your answer here
labels.
your answer here