Wednesday Class: Data

Welcome Survey

Logistics

Review

Unit 1 objectives

Unit 2 objectives

Lab 1 Walkthrough with Q&A

(done live)

Q&A

Overfitting as a problem?

The figure in the book is misleading.

[Figure from the book, captioned "overfitting, supposedly"]

There’s also underfitting. Underfitting means that the model isn’t capturing the patterns even in the training data. It usually means that your model is too small (so the range of functions it can approximate isn’t rich enough) or that your training is insufficient (the learning rate is too low, you’re not giving it enough time to train, something is broken in the training process, etc.).
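A minimal sketch of both failure modes, using only numpy (the sine data and polynomial degrees are made up for illustration): a degree-1 fit underfits, a degree-15 fit overfits, and the gap between training and validation error tells you which is happening.

```python
# Fit polynomials of increasing degree to noisy samples of a sine wave,
# then compare training vs. held-out error. Degree 1 underfits (high
# error everywhere); degree 15 overfits (low train error, higher val error).
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

val_mask = np.arange(len(x)) % 4 == 0        # hold out every 4th point
x_tr, y_tr = x[~val_mask], y[~val_mask]
x_va, y_va = x[val_mask], y[val_mask]

for degree in (1, 3, 15):
    coeffs = np.polyfit(x_tr, y_tr, degree)
    tr_err = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
    va_err = np.mean((np.polyval(coeffs, x_va) - y_va) ** 2)
    print(f"degree {degree:2d}: train MSE {tr_err:.3f}, val MSE {va_err:.3f}")
```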

Why Python?

Lots of jargon!

The more often something shows up in class, the more important it is to know.

Do we need math?

Yes! But not all at once. Some highlights:

Can we explore the validation set, or should we leave it totally hidden?

To get some assurance about how our model will work once deployed, we need some data that we intentionally don’t look at until the very end. That’s the test set. But we often need to estimate how well the model is going to work before then (e.g., because we’re adjusting a parameter that might affect how well it generalizes). The validation set (or, sometimes, multiple validation sets) helps us estimate that.

In general, it’s a good idea to look at the validation set to understand how and why the model worked or didn’t, e.g., to get an overview of what kinds of images an image classifier tends to misclassify (see the sketch below). But it’s probably not a good idea to study it in too much depth, or it will stop being a good proxy for the test set.
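One light way to do that kind of inspection, sketched with made-up prediction and label arrays (in practice these would come from running your classifier on the validation set):

```python
# Summarize which classes the model confuses on the validation set.
# `preds` and `labels` are hypothetical integer class arrays.
import numpy as np
from collections import Counter

preds = np.array([0, 2, 1, 1, 0, 2, 2, 1])   # model's predicted classes
labels = np.array([0, 1, 1, 2, 0, 2, 1, 1])  # ground-truth classes

wrong = preds != labels
print(f"validation accuracy: {np.mean(~wrong):.2f}")

# Count (true, predicted) pairs among the mistakes: a tiny confusion summary.
mistakes = Counter(zip(labels[wrong].tolist(), preds[wrong].tolist()))
for (true_c, pred_c), n in mistakes.most_common():
    print(f"true class {true_c} predicted as {pred_c}: {n}x")
```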

What if the randomly selected held-out part happens to be an unrepresentative, unhelpful slice of the data?

(Note: a 20% validation split means the training set is the remaining 80%.)
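A minimal numpy sketch of that 80/20 split. Shuffling with a fixed seed makes the split reproducible, and re-running with a different seed is one quick sanity check against an unlucky held-out slice:

```python
# Randomly split example indices 80/20 into train and validation
# (the test set is assumed to have been set aside earlier).
import numpy as np

n = 1000
rng = np.random.default_rng(42)   # fixed seed -> reproducible split
idx = rng.permutation(n)

n_val = int(0.2 * n)
val_idx, train_idx = idx[:n_val], idx[n_val:]
print(len(train_idx), len(val_idx))  # 800 200
```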

What are layers? What does each one do?

Each layer gradually integrates information from a wider area of the image. Lower layers = really zoomed in.
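A minimal PyTorch sketch (hypothetical channel counts) of why stacking layers widens the view: each 3x3 convolution mixes in a slightly larger patch of the input.

```python
# Three stacked 3x3 convolutions: the effective patch of input each
# output pixel "sees" grows from 3x3 to 5x5 to 7x7.
import torch
import torch.nn as nn

layers = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # sees 3x3 patches
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # effectively 5x5 patches
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),  # effectively 7x7 patches
)

x = torch.randn(1, 3, 32, 32)  # one fake 32x32 RGB image
print(layers(x).shape)         # torch.Size([1, 64, 32, 32])
```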

Converting sound to image is a cool idea.
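A hedged sketch of the sound-to-image idea, assuming the librosa library and a hypothetical audio file clip.wav: a mel spectrogram turns a waveform into a 2D, image-like array that an ordinary image classifier can be trained on.

```python
# Convert audio to an image-like array via a mel spectrogram.
import librosa
import numpy as np

y, sr = librosa.load("clip.wav", sr=None)         # waveform + sample rate
mel = librosa.feature.melspectrogram(y=y, sr=sr)  # (n_mels, time) array
mel_db = librosa.power_to_db(mel, ref=np.max)     # log scale, image-like
print(mel_db.shape)
```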

Recently this approach has been replaced by a different one: turn everything into a sequence and pretend it’s language.

Pretrained models are useful.

But they can introduce bias, and they may not actually be as helpful as commonly thought (more on this later).
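A minimal sketch of the usual reuse pattern with torchvision (the 10-class head is a made-up example): load pretrained weights, freeze the body, and train only a new final layer. The weights carry whatever biases the pretraining data had, hence the caveat above.

```python
# Load an ImageNet-pretrained ResNet-18, freeze its features, and
# replace the classification head for a hypothetical 10-class problem.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                     # freeze pretrained features
model.fc = nn.Linear(model.fc.in_features, 10)  # new trainable head
```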

How might we prove the universal approximation theorem?
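One common proof idea (Cybenko-style; a sketch, not a full proof): a pair of steep, shifted sigmoids makes an approximate "bump" that is roughly 1 on an interval and roughly 0 elsewhere, and a sum of enough scaled bumps can match any continuous function on a closed interval as closely as you like. A numpy illustration of a single bump:

```python
# The difference of two steep sigmoids is nearly an indicator of [a, b]:
# ~1 well inside the interval, ~0 well outside, ~0.5 right at the edges.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.linspace(-1, 2, 7)
a, b, k = 0.0, 1.0, 50.0          # interval endpoints, steepness
bump = sigmoid(k * (x - a)) - sigmoid(k * (x - b))
print(np.round(bump, 3))
```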

How to collect data?

AI libraries?

Epoch?

One full pass through the training data. Not uncommon to see tens or hundreds of epochs, depending on training set size.
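A minimal PyTorch sketch of what that looks like in code (toy data and model, made up for scale): the outer loop counts epochs, and each epoch is one full pass over the DataLoader.

```python
# Each iteration of the outer loop is one epoch: every training batch
# is seen exactly once per epoch.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

X, y = torch.randn(256, 10), torch.randn(256, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)
model = nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

for epoch in range(5):                 # 5 epochs = 5 full passes
    for xb, yb in loader:              # each batch seen once per epoch
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```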

SGD?
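SGD = stochastic gradient descent: nudge the parameters opposite the gradient of the loss, with the gradient estimated from a random mini-batch rather than the whole dataset. A plain-numpy sketch of a single update on a made-up linear-regression problem:

```python
# One SGD step: theta <- theta - lr * gradient, with the gradient
# computed on a random mini-batch (that's the "stochastic" part).
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 3)), rng.normal(size=100)
theta, lr = np.zeros(3), 0.1

batch = rng.choice(100, size=16, replace=False)   # random mini-batch
Xb, yb = X[batch], y[batch]
grad = 2 * Xb.T @ (Xb @ theta - yb) / len(batch)  # gradient of mean squared error
theta -= lr * grad                                # the SGD update
print(theta)
```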

Lab 2: Comparing classifiers, image batch structure
Homework 2: Train and evaluate a classifier on your own images