CS 375 Week 2

Ken Arnold

Overview

In this class we’re studying how Tuneable Machines can play Optimization Games.

In Lab 1:

  • The tuneable machine: output = model(input)
    • We’ll “spiral” into how this machine works and how to tune it.
  • The optimization game: supervised learning
    • What are the rules of the game? What’s the score?

Think about that for a moment.

Supervised Learning

  • Given lots of examples of (item, label) pairs
  • Learn a function that maps items to labels

```mermaid
flowchart LR
    A1[("Training Data (X and y)")] --> B{{fit}}
    A2[Model Object] --> B
    B --> FM[Fitted Model]
    FM --> C{{Predict}}
    B2[(New data X)] --> C
    C --> D[("predicted y's")]
```
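In code, that pipeline might look like the following minimal sketch, using scikit-learn's fit/predict interface (the tiny dataset is made up for illustration):

```python
# A minimal sketch of the diagram above.
from sklearn.linear_model import LinearRegression

X_train = [[1.0], [2.0], [3.0]]   # training data: X
y_train = [2.0, 4.0, 6.0]         # training data: y

model = LinearRegression()            # the Model Object
fitted = model.fit(X_train, y_train)  # fit: search for a good function

X_new = [[4.0]]                       # new data the model hasn't seen
print(fitted.predict(X_new))          # predicted y's -> approximately [8.]
```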

Rules and Assumptions of Supervised Learning

  • We don’t get to peek at new data when training. (We want to make a model that works well on data we don’t yet have.)
  • The world is going to give us a bunch of samples.
  • There’s one “correct” label we’re supposed to give each sample.
  • Getting it wrong on one sample isn’t fatal.
  • We can put a number on how wrong we are.
  • The “score” is how good our labels are, on average.

Regression

Labels are continuous numbers. Measure error by averaging the differences between predictions and true values (typically squared, giving mean squared error).
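
A minimal sketch of mean squared error in PyTorch (the numbers are made up):

```python
import torch

# Mean squared error: average the squared differences between
# predictions and true values.
y_true = torch.tensor([3.0, -0.5, 2.0, 7.0])
y_pred = torch.tensor([2.5,  0.0, 2.0, 8.0])

mse = ((y_pred - y_true) ** 2).mean()
print(mse)  # tensor(0.3750)
```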

Classification

Labels are discrete categories (so outputs are probabilities). Measure error by accuracy, or by partial credit for the probability the model assigns to the correct label.
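
A minimal sketch of accuracy in PyTorch (made-up labels):

```python
import torch

# Accuracy: the fraction of samples where the predicted class
# exactly matches the true class.
y_true = torch.tensor([0, 1, 2, 1])
y_pred = torch.tensor([0, 2, 2, 1])

accuracy = (y_pred == y_true).float().mean()
print(accuracy)  # tensor(0.7500)
```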

A brief caution about classification

  • Note: sometimes classes are represented using numbers.
  • e.g., classifying digits: 0, 1, 2, …, 8, 9
  • But that doesn’t mean it’s regression!
  • Why might measuring error by differences be misleading? (See the sketch below.)
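
A tiny worked example of the problem: if the true digit is 0, treating labels as numbers makes guessing 9 look nine times worse than guessing 1, even though both are simply wrong.

```python
# Numeric "difference" error on digit labels:
y_true = 0
print(abs(1 - y_true))  # 1 -> looks "almost right"
print(abs(9 - y_true))  # 9 -> looks "way off"
# As classifications, both guesses are equally incorrect.
```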

Supervised Learning Model

  • Model is a function from inputs to outputs (see the sketch after this list)
  • “Fitting” the model means searching for a “good” function (training)
  • “Predicting” means applying the function to new inputs (inference)
  • Think: why does training take longer than inference?
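
A minimal sketch of the model-as-function idea, with made-up parameters standing in for what fitting would find:

```python
# A model is just a function from inputs to outputs.
# Pretend fitting has already found these (made-up) parameters:
w, b = 2.0, 1.0

def model(x):
    return w * x + b  # predicting = one cheap function call

print(model(3.0))  # 7.0
```

Prediction is a single function application; training has to search over many candidate values of w and b, which is one reason it takes longer.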

Types of Labels

Unsupervised Learning

No explicit labels

  • Clustering
  • Finding outliers
  • Filling in missing data
  • Learning good representations
  • Generating new things (e.g., conversations, images, …)

Generative Models

A type of unsupervised learning

e.g., This Person Does Not Exist

Latent Variables

Conditional Synthesis

Reinforcement Learning

  • Input: perceptions of the world
  • Output: actions to take
  • Feedback: reward or punishment (potentially delayed)
  • Goal: take actions that get the most reward (see the loop sketch below)
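
A minimal sketch of that perceive-act-reward loop; `env` and `agent` here are hypothetical stand-ins, not any real library's API:

```python
def run_episode(env, agent):
    observation = env.reset()      # initial perception of the world
    total_reward = 0.0
    done = False
    while not done:
        action = agent.act(observation)               # output: an action
        observation, reward, done = env.step(action)  # feedback, possibly delayed
        agent.learn(observation, reward)              # adjust from experience
        total_reward += reward
    return total_reward            # goal: make this as large as possible
```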

Example: chess

  • Actions: possible moves
  • Reward: win or lose
  • Challenge: delayed feedback

Why is it hard?

  • We don’t have all the data up front: we’ve got to try things out to get experience.
  • Credit assignment: which of my actions were good or bad?
  • Exploration vs. exploitation: should I stick with what I know or try something new? (One classic answer is sketched below.)
  • Stochasticity: the world responds differently each time
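
One classic approach to the exploration-vs-exploitation question is epsilon-greedy: exploit the best-known action most of the time, but explore at random a small fraction of the time. A minimal sketch (the action values here are made up):

```python
import random

# `action_values` maps each action to its estimated reward.
def choose_action(action_values, epsilon=0.1):
    if random.random() < epsilon:
        return random.choice(list(action_values))      # explore
    return max(action_values, key=action_values.get)   # exploit

print(choose_action({"left": 0.2, "right": 0.8}))  # usually "right"
```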

Other Examples

  • Self-driving cars
  • Robotics
  • Healthcare (e.g., treatment plans)
  • Games (e.g., AlphaGo)

Computing

Learning Path

“I trained a neural net classifier from scratch.”

  1. Basic array/“tensor” operations in PyTorch
    • Code: array operations (see the sketch after this list)
    • Concepts: dot product, mean squared error
  2. Linear Regression “the hard way” (but with a black-box optimizer)
    • Code: representing data as arrays
    • Concepts: loss function, forward pass, optimizer
  3. Logistic Regression “the hard way”
    • Concepts: softmax, cross-entropy loss
  4. Multi-layer Perceptron
    • Concepts: nonlinearity (ReLU), initialization
  5. Gradient Descent
    • Concepts: backpropagation, training loop
  6. Data Loaders
    • Concepts: batching, shuffling
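
A minimal sketch of the step-1 building blocks, basic tensor operations, a dot product, and mean squared error, in PyTorch (values made up):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
w = torch.tensor([0.5, -1.0, 2.0])

dot = x @ w   # dot product: 1*0.5 + 2*(-1.0) + 3*2.0 = 4.5
print(dot)    # tensor(4.5000)

y_true = torch.tensor([1.0, 0.0])
y_pred = torch.tensor([0.5, 0.5])
mse = ((y_pred - y_true) ** 2).mean()  # mean squared error
print(mse)    # tensor(0.2500)
```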