
class: center, middle, inverse, title-slide

# What makes a good prediction?
### K Arnold

---

## Objectives

* Compare and contrast regression tasks and classification tasks, and give examples of each
* Identify two different ways of measuring accuracy for regression and for classification
* Identify several reasons why a model may predict better on some subsets of data than others

---

## Types of Tasks

* **regression**: predict a *number* ("continuous")
  * number should be "close" in some sense to the correct number
* **classification**: predict a *category*
  * which one of these two groups? three groups? 500,000 groups?
  * could also ask: "how likely is it to be in group *i*?"

---

## Are these tasks *regression* or *classification*?

1. Is this a picture of the inside or outside of the restaurant?
1. How much will it rain in GR next year?
1. Is this person having a seizure?
1. How much will this home sell for?
1. How much time will this person spend watching this video?
1. How big a fruit will this plant produce?
1. Which word did this person mean to type?
1. Will this person "Like" this post?

---

## Today's examples

**Regression**: housing prices in Ames, Iowa. Details:

* [Paper](http://jse.amstat.org/v19n3/decock.pdf)
* [Data Dictionary](http://jse.amstat.org/v19n3/decock/DataDocumentation.txt)

**Classification**: *seizure classification*.

An FDA-cleared, AI-powered wearable for seizure monitoring: Empatica [Embrace2](https://www.empatica.com/embrace2/),
from a company co-founded by MIT professor Rosalind Picard

---

## What makes a good prediction? *Regression*

We predicted the home would sell for $250k. It sold for $200k. Is that good?

* **residual**: actual minus predicted
  * If home sold for $200k but we predicted $250k, residual is _______
* **absolute error**
* **squared error**

Across the entire dataset:

* **average error**: do we tend to predict too high? too low? "*bias*"
* **max** absolute error
* **mean** absolute error
* **mean squared error** (MSE)
* normalized squared error: MSE / variance of the actual values
  * The confusingly named "R²" = 1 - normalized squared error
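
As a rough sketch, the dataset-level measures above can be computed in a few lines of Python. The prices are made up for illustration (they are not from the Ames data), the function name `regression_metrics` is hypothetical, and the population variance is assumed for the normalization so that it matches the divide-by-n in MSE.

```python
from statistics import mean, pvariance

def regression_metrics(actual, predicted):
    """Summarize prediction quality with the measures listed above."""
    # residual = actual minus predicted
    residuals = [a - p for a, p in zip(actual, predicted)]
    mse = mean(r ** 2 for r in residuals)
    # Assumption: normalize by the population variance of the actual values
    normalized = mse / pvariance(actual)
    return {
        "average_error": mean(residuals),  # positive: we tend to predict too low
        "max_abs_error": max(abs(r) for r in residuals),
        "mean_abs_error": mean(abs(r) for r in residuals),
        "mse": mse,
        "r_squared": 1 - normalized,
    }

# Made-up sale prices in thousands of dollars (illustration only)
actual = [210, 150, 300, 260]
predicted = [200, 160, 280, 260]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the average error can be small even when individual errors are large, because positive and negative residuals cancel; the absolute and squared versions do not cancel.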

---

## What makes a good prediction? *Classification*

Suppose: every minute, the armband decides whether a seizure is occurring

<br>

The child was perfectly fine but our armband flagged a seizure. Is that good?

<br>

The child was having a seizure but our armband didn't flag it. Is that good?

---

## What makes a good prediction? *Classification*

|                      | Seizure happened              | No seizure happened           |
|----------------------|-------------------------------|-------------------------------|
| Seizure predicted    | True positive                 | False positive (Type 1 error) |
| No seizure predicted | False negative (Type 2 error) | True negative                 |

--
- **Accuracy** (% correct) = (TP + TN) / (# episodes)
- **False negative** ("miss") **rate** = FN / (# actual seizures)
- **False positive** ("false alarm") **rate** = FP / (# true non-seizures)

--
- **Sensitivity** ("true positive rate") = TP / (# actual seizures)
  - Sensitivity = 1 − False negative rate
- **Specificity** ("true negative rate") = TN / (# true non-seizures)
  - Specificity = 1 − False positive rate
- [Wikipedia article](https://en.wikipedia.org/wiki/Sensitivity_and_specificity)
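
A minimal sketch of how these rates follow from the four cells of the table. The counts are hypothetical (1,000 one-minute windows, 10 of them with a real seizure), and `classification_metrics` is an illustrative name, not part of any library.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the rates above from confusion-table counts."""
    actual_seizures = tp + fn      # column: seizure happened
    actual_non_seizures = fp + tn  # column: no seizure happened
    return {
        "accuracy": (tp + tn) / (actual_seizures + actual_non_seizures),
        "sensitivity": tp / actual_seizures,
        "specificity": tn / actual_non_seizures,
        "false_negative_rate": fn / actual_seizures,
        "false_positive_rate": fp / actual_non_seizures,
    }

# Hypothetical counts, for illustration only
rates = classification_metrics(tp=8, fp=30, fn=2, tn=960)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With counts like these, accuracy looks high mostly because seizures are rare: an armband that never flagged anything would still be right in 990 of the 1,000 windows.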

---

.question[
If you were designing a seizure alert system, would you want sensitivity and specificity to be high or low? What are the trade-offs associated with each decision? 
]

---

class: middle, center

## Validation

.large[**Key point**: you *must* evaluate predictions on *unseen* data]

---

Hey look! I can exactly predict how much a home will sell for!

.small[
<table>
 <thead>
  <tr>
   <th style="text-align:right;"> Lot_Area </th>
   <th style="text-align:right;"> Sale_Price </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 31770 </td>
   <td style="text-align:right;"> 215000 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 11622 </td>
   <td style="text-align:right;"> 105000 </td>
  </tr>
</tbody>
</table>
]

sale price = 41548.54 + 5.459599 * lot area
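
Where those numbers come from: any two points determine a line, so a two-parameter model can always hit two homes exactly. A small sketch, using the two homes in the table (the helper `line_through` is a hypothetical name):

```python
def line_through(p1, p2):
    """Return (intercept, slope) of the line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return intercept, slope

# (Lot_Area, Sale_Price) for the two homes shown above
homes = [(31770, 215000), (11622, 105000)]

intercept, slope = line_through(*homes)
print(f"sale price = {intercept:.2f} + {slope:.6f} * lot area")

for lot_area, sale_price in homes:
    predicted = intercept + slope * lot_area
    print(lot_area, sale_price, round(predicted, 2))
```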

---

## Validation: *unseen* data

.pull-left[
.small[
<table>
 <thead>
  <tr>
   <th style="text-align:right;"> Lot_Area </th>
   <th style="text-align:right;"> Sale_Price </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 31770 </td>
   <td style="text-align:right;"> 215000 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 11622 </td>
   <td style="text-align:right;"> 105000 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 14267 </td>
   <td style="text-align:right;"> 172000 </td>
  </tr>
</tbody>
</table>
]]

.pull-right[
<img src="w6d2-accuracy_files/figure-html/perfectly-wrong-1.png" width="100%" style="display: block; margin: auto;" />
]
--

<table>
 <thead>
  <tr>
   <th style="text-align:right;"> Lot_Area </th>
   <th style="text-align:right;"> Sale_Price </th>
   <th style="text-align:right;"> predicted </th>
   <th style="text-align:right;"> residual </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 31770 </td>
   <td style="text-align:right;"> 215000 </td>
   <td style="text-align:right;"> 215000.0 </td>
   <td style="text-align:right;"> 0.00 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 11622 </td>
   <td style="text-align:right;"> 105000 </td>
   <td style="text-align:right;"> 105000.0 </td>
   <td style="text-align:right;"> 0.00 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 14267 </td>
   <td style="text-align:right;"> 172000 </td>
   <td style="text-align:right;"> 119440.6 </td>
   <td style="text-align:right;"> 52559.36 </td>
  </tr>
</tbody>
</table>
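
The table can be reproduced with a short sketch that applies the line fitted to the first two homes (coefficients rounded as on the earlier slide, so results match only up to rounding) to all three homes:

```python
# Coefficients of the line fitted to the first two homes (rounded)
INTERCEPT = 41548.54
SLOPE = 5.459599

def predict(lot_area):
    """Predicted sale price from lot area alone."""
    return INTERCEPT + SLOPE * lot_area

# (Lot_Area, Sale_Price); the third home was not used to fit the line
homes = [(31770, 215000), (11622, 105000), (14267, 172000)]

residuals = []
for lot_area, sale_price in homes:
    predicted = predict(lot_area)
    residual = sale_price - predicted  # actual minus predicted
    residuals.append(residual)
    print(f"{lot_area:>6} {sale_price:>7} {predicted:>10.1f} {residual:>10.2f}")
```

The residuals on the two seen homes are zero up to rounding; the only honest estimate of the error comes from the unseen third home.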

---

## Oh ok, I'll just fix that one...

.small[
<table>
 <thead>
  <tr>
   <th style="text-align:right;"> Lot_Area </th>
   <th style="text-align:right;"> Bsmt_Unf_SF </th>
   <th style="text-align:right;"> Sale_Price </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 31770 </td>
   <td style="text-align:right;"> 441 </td>
   <td style="text-align:right;"> 215000 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 11622 </td>
   <td style="text-align:right;"> 270 </td>
   <td style="text-align:right;"> 105000 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 14267 </td>
   <td style="text-align:right;"> 406 </td>
   <td style="text-align:right;"> 172000 </td>
  </tr>
</tbody>
</table>
]

sale price = -37769.46 + 1.5311432 \* lot area + **462.8685748 \* basement sq ft**

### and look, it works!

<table>
 <thead>
  <tr>
   <th style="text-align:right;"> Lot_Area </th>
   <th style="text-align:right;"> Bsmt_Unf_SF </th>
   <th style="text-align:right;"> Sale_Price </th>
   <th style="text-align:right;"> predicted </th>
   <th style="text-align:right;"> residual </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 31770 </td>
   <td style="text-align:right;"> 441 </td>
   <td style="text-align:right;"> 215000 </td>
   <td style="text-align:right;"> 215000 </td>
   <td style="text-align:right;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 11622 </td>
   <td style="text-align:right;"> 270 </td>
   <td style="text-align:right;"> 105000 </td>
   <td style="text-align:right;"> 105000 </td>
   <td style="text-align:right;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 14267 </td>
   <td style="text-align:right;"> 406 </td>
   <td style="text-align:right;"> 172000 </td>
   <td style="text-align:right;"> 172000 </td>
   <td style="text-align:right;"> 0 </td>
  </tr>
</tbody>
</table>

*Do you really think so?*
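
One way to see why this "works": three parameters (an intercept and two slopes) give three unknowns, and three homes give three equations, so there is generally an exact solution no matter what the prices are. A sketch using Cramer's rule, chosen only to keep the example dependency-free; `det3` and `solve3` are hypothetical helper names.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    solution = []
    for col in range(3):
        # Replace one column with the right-hand side
        replaced = [row[:col] + [value] + row[col + 1:]
                    for row, value in zip(matrix, rhs)]
        solution.append(det3(replaced) / d)
    return solution

# (Lot_Area, Bsmt_Unf_SF, Sale_Price) for the three homes above
homes = [(31770, 441, 215000), (11622, 270, 105000), (14267, 406, 172000)]

design = [[1, lot, bsmt] for lot, bsmt, _ in homes]  # leading 1 = intercept
prices = [price for _, _, price in homes]

intercept, b_lot, b_bsmt = solve3(design, prices)
print(f"sale price = {intercept:.2f} + {b_lot:.7f} * lot area "
      f"+ {b_bsmt:.7f} * basement sq ft")

residuals = [price - (intercept + b_lot * lot + b_bsmt * bsmt)
             for lot, bsmt, price in homes]
print(residuals)
```

Zero residuals here say nothing about the next home: the fit is exact by construction, not because the model has learned how prices work.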

---

## Failure to generalize

Predictive models almost always do better on the data they were trained on than on anything else.

Why?

* model uses a pattern that held only by chance in the training data
* model uses a pattern that holds for only some of the data
* model uses a pattern that's real, but it got only a fuzzy picture of it

General name: **Overfitting**
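
A small illustration of the gap between seen and unseen data, under stated assumptions: the data are synthetic (y = 2x plus noise, not the Ames homes), and the "model" is a deliberately extreme memorizer that copies the answer from the closest training example.

```python
import random

def nearest_neighbor_predict(train, x):
    """Predict by copying the y of the closest training x (pure memorization)."""
    _, nearest_y = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_y

def mean_abs_error(train, data):
    """Mean absolute error of the memorizer on `data`."""
    errors = [abs(y - nearest_neighbor_predict(train, x)) for x, y in data]
    return sum(errors) / len(errors)

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_data(n):
    """Synthetic data: y = 2x plus Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 2 * x + rng.gauss(0, 1)))
    return data

train = make_data(50)
test = make_data(50)

train_mae = mean_abs_error(train, train)  # evaluated on data it has seen
test_mae = mean_abs_error(train, test)    # evaluated on unseen data
print(f"train MAE: {train_mae:.3f}, test MAE: {test_mae:.3f}")
```

The memorizer is perfect on its own training data because it has stored the noise along with the pattern; only the test error reflects how it would do on new cases.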