Q&A Week 2

Tech

Why do we get two different error rates from fine_tune?

The training happens in two stages: fine_tune first trains only the newly added head while the pretrained layers stay frozen, then unfreezes everything and keeps training, and each stage reports its own error rate. You’ll learn more about this process in later chapters. For now, use the last error rate.
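The two-stage control flow can be sketched in plain Python (this is an illustration of the structure, not fastai’s actual code; the function name and return value are made up):

```python
def fine_tune_sketch(epochs, freeze_epochs=1):
    """Mimic the two-stage structure of fastai's fine_tune (illustrative only)."""
    stages = []
    # Stage 1: train only the new head; the pretrained body is frozen
    for _ in range(freeze_epochs):
        stages.append("frozen")
    # Stage 2: unfreeze the whole model and keep training
    for _ in range(epochs):
        stages.append("unfrozen")
    return stages

print(fine_tune_sketch(2))  # → ['frozen', 'unfrozen', 'unfrozen']
```

Because each loop iteration prints its own metrics, you see one error rate per frozen epoch and one per unfrozen epoch — hence the two numbers.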

Why do I get different results when training the same model multiple times, even though I set the seed?

Nobody actually asked this, but it bugged me (based on my expectations from sklearn), so I looked into it.

From looking at ImageDataLoaders.from_path_func??, the seed parameter only controls the RandomSplitter (i.e., the split between training set and validation set). So passing a seed ensures that the same images land in the training set vs. the validation set each run, which is a really good idea.
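What the seed buys you can be sketched with a pure-Python stand-in for RandomSplitter (fastai’s real version uses torch.randperm; the function name here is made up):

```python
import random

def random_splitter_sketch(n_items, valid_pct=0.2, seed=None):
    # Shuffle the item indices with a seeded RNG, then split off valid_pct
    rng = random.Random(seed)
    idxs = list(range(n_items))
    rng.shuffle(idxs)
    cut = int(valid_pct * n_items)
    return idxs[cut:], idxs[:cut]  # (train indices, validation indices)

# The same seed always yields the same train/valid split:
assert random_splitter_sketch(100, seed=42) == random_splitter_sketch(100, seed=42)
```

So the split is pinned down, but nothing else about training is.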

To make a fastai training run reproducible, call set_seed(12345, reproducible=True) before creating the dataloader. That function seeds Python’s standard library random, numpy.random, and PyTorch’s RNG.
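Under the hood, set_seed does roughly the following (a sketch, not fastai’s exact source; the torch calls are commented out so the snippet runs without PyTorch installed):

```python
import random
import numpy as np

def set_seed_sketch(s, reproducible=False):
    # fastai's set_seed seeds all three RNG sources; torch is sketched in comments
    random.seed(s)               # Python's standard library RNG
    np.random.seed(s % (2**32))  # numpy's global RNG (seed must fit in 32 bits)
    # torch.manual_seed(s)       # PyTorch's RNG
    # if reproducible:           # also force deterministic cuDNN behavior
    #     torch.backends.cudnn.deterministic = True
    #     torch.backends.cudnn.benchmark = False
```

Calling it with the same seed restores the same random streams, which is what makes the rest of training repeatable.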

(I eventually found this discussed in the fastai issue tracker. But before I did, I poked around at the code. DataLoaders are iterators, so dls.train.__iter__?? shows the code that runs when you iterate through one. Notice that it starts with self.randomize(), which creates a fresh self.rng from the previous RNG. And if you look at the definition of DataLoader?? (github link), self.rng is created by calling random.Random.)
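That RNG-chaining pattern can be reproduced in a few lines (a simplified stand-in, assuming fastai seeds the loader’s RNG from the global random module; class and method names are illustrative):

```python
import random

class TinyLoader:
    """Sketch of the RNG pattern in fastai's DataLoader (names simplified)."""
    def __init__(self):
        # The loader's own RNG is derived from the *global* `random` state,
        # which is why set_seed (which seeds `random`) makes it reproducible.
        self.rng = random.Random(random.randint(0, 2**32 - 1))

    def randomize(self):
        # Called at the start of each iteration: replace self.rng with a
        # fresh RNG seeded from the previous one, so batch order drifts
        # from epoch to epoch unless the chain was seeded at the start.
        self.rng = random.Random(self.rng.randint(0, 2**32 - 1))

# Seeding the global RNG first makes the whole chain reproducible:
random.seed(0)
a = TinyLoader()
random.seed(0)
b = TinyLoader()
assert a.rng.random() == b.rng.random()
```

Without that global seed, each TinyLoader starts from an unpredictable state, which is exactly the behavior that prompted the question.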

Can we get 100% accurate AI?

Depends on what you mean. Keywords to search for if you want to look more into this:

  • Verified AI
  • robust machine learning
  • robust reinforcement learning

Context

Will unbiased data prevent biased decisions?

Unfortunately, no. See this thread for a survey.

How can I make sure that my AI project is beneficial?

Hard question. Here’s one paper that suggests a set of questions to ask.

Ken Arnold
Assistant Professor of Computer Science