I showed an example of an embedding space from a version of the Lab 5 notebook to which I added some code; see the output immediately above the “Softmax and Cross-Entropy” section of that notebook. The x’s are the prototypes; the dots are the image feature vectors.
Thinking through it more now: I could have used linear discriminant analysis (LDA) to project the features down to 2D in a way that best preserves the class separation; see the sketch below. The factor analysis was unaware of the classification task, so it produced a projection where many classes overlapped.
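Here is a minimal sketch of that idea using scikit-learn’s `LinearDiscriminantAnalysis`. The `features` and `labels` arrays are hypothetical stand-ins (the Lab 5 notebook’s actual variable names may differ), and the toy data is generated just so the sketch runs on its own:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Toy stand-in for the image feature vectors and class labels.
features = rng.normal(size=(300, 64)) + np.repeat(np.eye(3, 64) * 4, 100, axis=0)
labels = np.repeat(np.arange(3), 100)

# LDA is supervised: it chooses the 2D projection that maximizes between-class
# scatter relative to within-class scatter, unlike task-unaware factor analysis.
lda = LinearDiscriminantAnalysis(n_components=2)
features_2d = lda.fit_transform(features, labels)

# Prototypes as per-class means in the projected space (plotted as x's,
# matching the embedding-space figure described above).
prototypes_2d = np.vstack(
    [features_2d[labels == c].mean(axis=0) for c in np.unique(labels)]
)

plt.scatter(features_2d[:, 0], features_2d[:, 1], c=labels, s=10)
plt.scatter(prototypes_2d[:, 0], prototypes_2d[:, 1], marker="x", c="black", s=80)
plt.title("LDA projection: dots = feature vectors, x's = class prototypes")
plt.show()
```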
From review question 3, we distinguished “metric” from “loss function”: the loss is what the optimizer minimizes during training (so it should be differentiable, e.g., cross-entropy), while a metric (e.g., accuracy) evaluates the model and need not be differentiable. A small sketch of the distinction is below.
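A minimal illustration, assuming a 3-class classifier that outputs probabilities (the function names here are illustrative, not from the course materials):

```python
import numpy as np

def cross_entropy_loss(probs, labels):
    """Loss: differentiable in the probabilities, so gradients can drive training."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

def accuracy(probs, labels):
    """Metric: uses argmax, which is not differentiable, so it can only evaluate."""
    return np.mean(probs.argmax(axis=1) == labels)

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
labels = np.array([0, 1, 2])
print(cross_entropy_loss(probs, labels))  # ~0.595
print(accuracy(probs, labels))            # ~0.667
```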
In section B, we demonstrated bias vs. variance with this example: Bias-Variance Decomposition (u06s01-bias-variance.ipynb); a sketch of the decomposition follows below.
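A minimal sketch of a bias-variance decomposition, assuming a setup of fitting polynomial models to noisy samples of a known true function (the exact setup of u06s01-bias-variance.ipynb may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
true_fn = lambda x: np.sin(2 * np.pi * x)
x_test = np.linspace(0, 1, 50)
degree, n_trials, noise = 3, 200, 0.3

# Fit the same model class to many independently resampled training sets.
preds = np.empty((n_trials, len(x_test)))
for t in range(n_trials):
    x_train = rng.uniform(0, 1, 30)
    y_train = true_fn(x_train) + rng.normal(0, noise, 30)
    coeffs = np.polyfit(x_train, y_train, degree)
    preds[t] = np.polyval(coeffs, x_test)

mean_pred = preds.mean(axis=0)
bias_sq = np.mean((mean_pred - true_fn(x_test)) ** 2)  # squared bias
variance = np.mean(preds.var(axis=0))                  # variance across trials
# Expected squared error decomposes as bias^2 + variance + noise^2.
print(f"bias^2 = {bias_sq:.4f}, variance = {variance:.4f}, noise^2 = {noise**2:.4f}")
```

Raising `degree` shrinks the bias term but inflates the variance term, which is the tradeoff the notebook demonstrates.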