Unit 12: Review, Neural Architectures
Objectives
Compare and contrast the main types of deep neural network models (Transformers, Convolutional Networks, and Recurrent Networks) in terms of how information flows through them
Preparation
Recommended but not essential:
Watch MIT 6.S191 Lecture 5: Deep Reinforcement Learning: [Slides], [Video]
Supplemental Material
Contextual
Technical
Using Sequence Models for RL
Spinning Up in Deep RL - a hands-on introduction to reinforcement learning in PyTorch by OpenAI
Creativity and Exploration
Class Meetings
Monday
Neural network architectures (slides)
Fixed wiring: Feed-forward (MLP)
Current sample wired to previous samples: Recurrent Networks (RNN)
Current sample wired to surrounding samples: Convolutional Networks (CNN)
Wiring computed dynamically based on “self-attention”: Transformer
Tricks
Residual Connections
Dropout
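The wiring patterns and tricks above can be made concrete in a few lines of PyTorch. The sketch below is illustrative only: the tensor sizes, layer widths, and hyperparameters are made up, and each layer is applied to the same toy sequence so you can compare which input positions each output position depends on.

```python
# Minimal sketch (assumed PyTorch, made-up sizes) of the four wiring patterns
# applied to the same toy sequence, plus the two "tricks".
import torch
import torch.nn as nn

B, T, D = 2, 16, 32               # batch, sequence length, feature size (arbitrary)
x = torch.randn(B, T, D)

# Fixed wiring, one position at a time: feed-forward (MLP)
mlp = nn.Sequential(nn.Linear(D, D), nn.ReLU(), nn.Linear(D, D))
y_mlp = mlp(x)                    # each output position depends only on the same input position

# Current sample wired to previous samples: recurrent network
rnn = nn.LSTM(input_size=D, hidden_size=D, batch_first=True)
y_rnn, _ = rnn(x)                 # position t depends on positions 0..t via the hidden state

# Current sample wired to surrounding samples: convolution
conv = nn.Conv1d(D, D, kernel_size=3, padding=1)
y_cnn = conv(x.transpose(1, 2)).transpose(1, 2)   # position t depends on t-1, t, t+1

# Wiring computed dynamically from the data: self-attention
attn = nn.MultiheadAttention(embed_dim=D, num_heads=4, batch_first=True)
y_attn, weights = attn(x, x, x)   # weights: (B, T, T) input-dependent "wiring"

# Tricks: residual connection + dropout around any of the blocks above
drop = nn.Dropout(p=0.1)
y = x + drop(y_attn)              # residual: add the input back; dropout regularizes

print(y_mlp.shape, y_rnn.shape, y_cnn.shape, y_attn.shape, y.shape)
```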
Wednesday in lab
Finish Monday lecture
Review: Self-Attention = conditional information flow
Software: describe the wiring, then what flows through the wires.
Hardware: compute queries, keys, and values, then compute the attention matrix, then compute the output.
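As a concrete companion to the three-step "hardware" description, here is a minimal single-head self-attention sketch in plain PyTorch (no masking, no multiple heads; sizes and weight names are made up for illustration):

```python
# Minimal sketch: queries, keys, values -> attention matrix -> output.
import math
import torch
import torch.nn as nn

B, T, D = 2, 16, 32                   # batch, sequence length, model width (arbitrary)
x = torch.randn(B, T, D)

# Step 1: compute queries, keys, and values (three learned linear maps)
W_q, W_k, W_v = nn.Linear(D, D), nn.Linear(D, D), nn.Linear(D, D)
Q, K, V = W_q(x), W_k(x), W_v(x)      # each (B, T, D)

# Step 2: compute the attention matrix -- the dynamically computed "wiring"
scores = Q @ K.transpose(-2, -1) / math.sqrt(D)   # (B, T, T)
A = scores.softmax(dim=-1)            # row t: how much position t reads from each position

# Step 3: compute the output -- what flows through the wires
out = A @ V                           # (B, T, D)
```

The attention matrix A is the conditional wiring: row t says how strongly position t reads from every other position, and A @ V is what actually flows through those wires.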
Bumped to next week: RL
Reinforcement Learning (learning from feedback)
Reward Discounting, quantifying the good life, and value alignment
Jesus’s discount factor: he endured the cross for the joy set before him. Infinite time horizon, no convergence problems.
Types of learning: Supervised, Self-Supervised, Reinforcement
Challenges of RL
Exploration
Credit assignment
RL formalism: Markov Decision Process
What functions can we learn: value, Q, policy (see lab)
(Didn’t get to) How does MuZero work?
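Since the RL material was bumped, a tiny numerical sketch may help as a preview of discounting and the value/Q/policy functions listed above. Everything here is made up for illustration: a four-step reward sequence and a toy tabular MDP with 3 states and 2 actions.

```python
# Minimal sketch (toy numbers, no real environment) of reward discounting and
# the value, Q, and policy functions, here as lookup tables for a tiny MDP.
import numpy as np

# Discounted return: G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ...
gamma = 0.9
rewards = [1.0, 0.0, 0.0, 10.0]                       # made-up episode
G = sum(gamma**k * r for k, r in enumerate(rewards))  # 1 + 0 + 0 + 0.9**3 * 10 = 8.29

# Tabular value, Q, and policy for a toy MDP with 3 states and 2 actions
n_states, n_actions = 3, 2
Q = np.zeros((n_states, n_actions))   # Q(s, a): expected return from taking a in s, then following the policy
V = Q.max(axis=1)                     # V(s): expected return from s under the greedy policy
policy = Q.argmax(axis=1)             # policy(s): greedy action in s

# One Q-learning update after observing (s, a, r, s')
s, a, r, s_next, alpha = 0, 1, 1.0, 2, 0.1
Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
```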
Friday
No class (Good Friday)