Week 3 Q&A

What aspects of the reading should we focus on?

Use the prep instructions, prep quizzes, and learning objectives as a guide. (Unfortunately, when faced with a potential digression, the book almost always takes it.)

Important concepts will come back again in lab, homework, and future readings, so don’t fret over missing something.

Why did we learn the dot product in Lab 1?

Because it’s a core building block to neural net modeling. You’ll see in Lab 4 and beyond how we build on it!
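To see why, here is a minimal sketch (names and numbers are illustrative) of a single artificial neuron: a dot product of inputs and weights, plus a bias, passed through a nonlinearity.

```python
def neuron(inputs, weights, bias):
    # Dot product: multiply pairwise and sum
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ReLU nonlinearity: negative activations become zero
    return max(0.0, activation)

out = neuron([1.0, 2.0], [0.5, -0.25], 0.1)  # 0.5 - 0.5 + 0.1 = 0.1
```

A layer is just many of these dot products in parallel, which is why the operation shows up everywhere in neural net code.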

How much math is needed?

It depends on how deep you want to go!

What’s a test set?

In the hierarchy of data purity:

- Training set: the model sees it on every pass and fits its parameters to it.
- Validation set: never used for fitting, but we peek at it repeatedly to choose models and tune hyperparameters.
- Test set: held out entirely until the very end, so it gives one honest estimate of real-world performance.
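A minimal sketch of splitting a dataset into train, validation, and test sets (the 70/15/15 proportions are illustrative, not a rule):

```python
import random

data = list(range(100))   # stand-in for 100 labeled examples
random.seed(0)            # fixed seed so the split is reproducible
random.shuffle(data)      # shuffle before splitting to avoid ordering bias

# 70% train, 15% validation, 15% test
train, valid, test = data[:70], data[70:85], data[85:]
```

The key property is that the three slices are disjoint: no example leaks from the test set into anything used for fitting or tuning.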

How much augmentation?

See chapter 5. You can get pretty outlandish.
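For a flavor of what augmentation means mechanically, here is a minimal NumPy sketch on a toy array (real augmentation pipelines work on full H x W x C image tensors and add crops, warps, brightness changes, and more):

```python
import numpy as np

# A toy 4x4 "image"; values are just placeholders
img = np.arange(16).reshape(4, 4)

flipped = np.fliplr(img)   # horizontal flip: a new training example for free
rotated = np.rot90(img)    # 90-degree rotation: another one
```

Each transformed copy is a new training example with the same label, which is why augmentation effectively multiplies your dataset.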

Are there laws that safeguard our information?

Yes, see the EU’s GDPR and subsequent legislation like California’s CCPA.

What’s gradient boosting?

The book mentions gradient-boosted trees in Chapter 2 as an alternative approach to tabular data. File it away (along with the names of two popular implementations: XGBoost and LightGBM), but we won’t be discussing it in this class.

Do we really need labels?

No: a big area of growth now is in self-supervised learning, where the model learns overall patterns in the data independent of labels, then only needs a small amount of labeled data (or none at all, in some cases) to make conclusions. For example, consider learning to predict what text ends up near an image on a website… or blanking out part of an image and learning to fill it in.
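The fill-in-the-blank idea can be sketched in a few lines of NumPy (names and the patch location are illustrative): hide part of an "image" and use the hidden part as the prediction target, so the labels come from the data itself.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 8))          # toy grayscale image

masked = img.copy()
masked[2:5, 2:5] = 0.0            # the "blanked out" region the model sees
target = img[2:5, 2:5]            # what the model learns to reconstruct
```

No human labeling was needed: the supervision signal (the hidden patch) is manufactured from the raw data.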

Imagine Amazon’s if-you-liked-this product recommender. It’s probably seeded by trying to predict what product people actually bought. But you can only buy things that you’re already aware of. So they probably tweaked it to recommend products that they think people will be interested in and wouldn’t already know about.

Why does fast.ai import *?

I wish they didn’t.
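The core problem with wildcard imports is that they silently rebind names you were already using. A small standard-library demonstration:

```python
builtin_result = pow(2, 3)   # built-in pow: returns the int 8

from math import *           # quietly rebinds pow (among many other names)

math_result = pow(2, 3)      # now math.pow, which returns the float 8.0
```

With explicit imports (`from fastai.vision.all import ImageDataLoaders`, etc.) you can always tell where a name came from; with `import *`, you can’t, and a later import can change behavior without any error.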

Any real-life examples of overfitting?

See the example in the slides. Also, have a look at “adversarial examples”.

Could we still “overfit” if we had infinite data?

Yes, if the data were from a different distribution. For example, if you trained on an infinite set of black cats and white dogs, your model would struggle with white cats or black dogs.

Friday class: Evaluation