class: center, middle, inverse, title-slide

# Learning to Classify

### Ken Arnold

### 2022-02-09

---

## Logistics

- Reflection and Homework due tomorrow

---

## Interesting Points from Discussion?

- Share with neighbors.
- Then we'll share with everyone.

---

## How to classify?

- How do we measure how good a classifier is?
- How do we turn scores into probabilities?

---

## Elo

A measure of relative skill:

- Higher Elo -> more likely to win
- Greater point spread -> more confidence in win
- Uses: [chess](https://en.wikipedia.org/wiki/Elo_rating_system), [NFL](https://projects.fivethirtyeight.com/complete-history-of-the-nfl/)
- 538's [Super Bowl prediction](https://projects.fivethirtyeight.com/2021-nfl-predictions/games/) ([discussion](https://fivethirtyeight.com/features/rams-or-bengals-our-guide-to-super-bowl-lvi/))

---

## How do we measure how good a classifier is?

Cross-entropy: a good classifier should give high probability to the correct result. (A short code sketch is at the end of the deck.)

---

## How do we turn scores into probabilities?

---

## Where do the "right" scores come from?

- In linear regression we were given the right scores.
- In classification, we have to learn the scores from data.

---

# Friday

---

## Discussion Intro (Caleb)

---

## Review

Which of the following two classifiers has a better *accuracy*? Which has a better *cross-entropy loss*?

---

## Review

Which of the following is a good *loss function* for classification?

1. Mean squared error
2. Softmax (generalization of sigmoid to multiple categories)
3. Error rate (number of answers that were incorrect)
4. Average of the probability the classifier assigned to the wrong answer
5. Average of the negative of the log of the probability the classifier assigned to the right answer

Why?

---

## From scores to probabilities

Suppose we have 3 chess players:

| player | Elo  |
|--------|------|
| A      | 1000 |
| B      | 2200 |
| C      | 1010 |

A and B play. Who wins?

--

A and C play. Who wins?

---

## Softmax

This year the Super Bowl is between the LA Rams and the Cincinnati Bengals. FiveThirtyEight rates their team Elo scores at 1661 for the Rams and 1593 for the Bengals.

Continue [here](https://colab.research.google.com/drive/1CdTEcZP2bOx7zbPltAdTE7xBse6JI9z5)

---

## Softmax

1. Start with scores (use the variable name `logits`), which can be any numbers (positive, negative, whatever).
2. Make them all positive by exponentiating: `xx = exp(logits)` or `10 ** logits`.
3. Make them sum to 1: `probs = xx / xx.sum()`

(A worked sketch with the Super Bowl Elo scores is at the end of the deck.)

---

## Some properties of softmax

- Sums to 1 (by construction)
- Largest logit in -> largest probability out
- `logits + constant` doesn't change the output.
- `logits * constant` *does* change the output.

---

## Sigmoid

Special case of softmax when you have just one `score` (binary classification): use `logits = [score, 0.0]`

---

## Building a Neural Net

Where do Linear, Cross-Entropy, and Softmax or Sigmoid go? (One possible arrangement is sketched at the end of the deck.)

---

## Going deeper

Open our Trace Simple Classifier notebook in [Colab](https://colab.research.google.com/github/kcarnold/cs344/blob/main/static/fundamentals/u4n2-trace-mnist.ipynb).

---

```python
# Collapse two stacked Linear layers (no nonlinearity between them)
# into a single equivalent Linear layer.
new_linear = nn.Linear(in_features=784, out_features=1)
new_linear.weight.data = (linear_2.weight.data @ linear_1.weight.data)
new_linear.bias.data = linear_2(linear_1.bias).data
new_linear(example_3_flat)
```

---

## ReLU Intuition

Piecewise linear. Ramps.

See [this notebook](https://colab.research.google.com/github/kcarnold/cs344/blob/main/static/fundamentals/u5n3-linreg-manual.ipynb).
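---

## Cross-entropy in code (sketch)

A minimal sketch of the loss from the review slide: the average of the negative log of the probability the classifier assigned to the right answer. The probabilities and labels below are made up for illustration.

```python
import torch

# Made-up predicted probabilities for 3 examples, 2 classes each
probs = torch.tensor([
    [0.9, 0.1],   # correct class: 0
    [0.2, 0.8],   # correct class: 1
    [0.6, 0.4],   # correct class: 1 (the classifier gets this one wrong)
])
labels = torch.tensor([0, 1, 1])

# Probability assigned to the correct answer for each example
prob_of_correct = probs[torch.arange(len(labels)), labels]

# Cross-entropy: average of the negative log of those probabilities
cross_entropy = -prob_of_correct.log().mean()
print(cross_entropy)  # lower is better
```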
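---

## Softmax on the Super Bowl Elo scores (sketch)

Following the three softmax steps with the Elo scores from the Super Bowl slide. Raw Elo numbers are too large to exponentiate directly, so this sketch rescales by the conventional 400-point Elo factor; that scaling is an assumption here, not necessarily what the linked notebook or FiveThirtyEight does.

```python
import torch

# Elo scores from the Super Bowl slide: Rams, Bengals
logits = torch.tensor([1661.0, 1593.0])

# Step 2: make the scores positive by exponentiating.
# Base 10 with a 400-point scale is the usual Elo convention (an assumption here).
xx = 10 ** (logits / 400)

# Step 3: make them sum to 1
probs = xx / xx.sum()
print(probs)  # roughly [0.60, 0.40], in the Rams' favor
```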
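---

## Checking the softmax properties (sketch)

A quick numerical check of the properties listed on the "Some properties of softmax" slide, using `torch.softmax` with made-up logits.

```python
import torch

logits = torch.tensor([2.0, 1.0, 0.1])

print(torch.softmax(logits, dim=0))        # sums to 1; largest logit gets largest prob
print(torch.softmax(logits + 5.0, dim=0))  # adding a constant: identical output
print(torch.softmax(logits * 5.0, dim=0))  # multiplying by a constant: different, sharper output
```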
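---

## Sigmoid as a special case of softmax (sketch)

The sigmoid slide's claim, checked numerically: softmax over `[score, 0.0]` and `torch.sigmoid(score)` agree. The score value is made up.

```python
import torch

score = torch.tensor(1.5)  # made-up binary-classification score

via_softmax = torch.softmax(torch.stack([score, torch.tensor(0.0)]), dim=0)[0]
via_sigmoid = torch.sigmoid(score)
print(via_softmax, via_sigmoid)  # both about 0.8176
```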
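---

## Putting the pieces together (sketch)

One common way to arrange the pieces from the "Building a Neural Net" slide in PyTorch, sketched with assumed sizes (784 inputs, 10 classes, one hidden layer). Note that `nn.CrossEntropyLoss` applies the softmax internally, so the model itself outputs logits.

```python
import torch
from torch import nn

# Assumed sizes: 784 inputs (a flattened 28x28 image), 10 classes
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),  # outputs logits, not probabilities
)
loss_fn = nn.CrossEntropyLoss()  # softmax + cross-entropy in one step

x = torch.randn(32, 784)          # made-up batch of inputs
y = torch.randint(0, 10, (32,))   # made-up correct labels

loss = loss_fn(model(x), y)              # training objective
probs = torch.softmax(model(x), dim=1)   # apply softmax explicitly to see probabilities
```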