Unit 12: Review, Neural Architectures
Objectives
Compare and contrast the main types of deep neural network models (Transformers, Convolutional Networks, and Recurrent Networks) in terms of how information flows through them
Preparation
Recommended but not essential:
Watch MIT 6.S191 Lecture 5: Deep Reinforcement Learning: [Slides], [Video]
Supplemental Material
Contextual
Technical
Using Sequence Models for RL
Spinning Up in Deep RL - a hands-on introduction to reinforcement learning in PyTorch by OpenAI
Creativity and Exploration
Class Meetings
Monday
Neural network architectures (slides)
Fixed wiring: Feed-forward (MLP)
Current sample wired to previous samples: Recurrent Networks (RNN)
Current sample wired to surrounding samples: Convolutional Networks (CNN)
Wiring computed dynamically based on “self-attention”: Transformer
Tricks
Residual Connections
Dropout
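The wiring patterns and tricks above can be made concrete in a few lines of PyTorch. The sketch below is illustrative only: the tensor sizes, layer widths, and hyperparameters are made up, and each layer is applied to the same toy sequence so you can compare which input positions each output position depends on.

```python
# Minimal sketch (assumed PyTorch, made-up sizes) of the four wiring patterns
# applied to the same toy sequence, plus the two "tricks".
import torch
import torch.nn as nn

B, T, D = 2, 16, 32               # batch, sequence length, feature size (arbitrary)
x = torch.randn(B, T, D)

# Fixed wiring, one position at a time: feed-forward (MLP)
mlp = nn.Sequential(nn.Linear(D, D), nn.ReLU(), nn.Linear(D, D))
y_mlp = mlp(x)                    # each output position depends only on the same input position

# Current sample wired to previous samples: recurrent network
rnn = nn.LSTM(input_size=D, hidden_size=D, batch_first=True)
y_rnn, _ = rnn(x)                 # position t depends on positions 0..t via the hidden state

# Current sample wired to surrounding samples: convolution
conv = nn.Conv1d(D, D, kernel_size=3, padding=1)
y_cnn = conv(x.transpose(1, 2)).transpose(1, 2)   # position t depends on t-1, t, t+1

# Wiring computed dynamically from the data: self-attention
attn = nn.MultiheadAttention(embed_dim=D, num_heads=4, batch_first=True)
y_attn, weights = attn(x, x, x)   # weights: (B, T, T) input-dependent "wiring"

# Tricks: residual connection + dropout around any of the blocks above
drop = nn.Dropout(p=0.1)
y = x + drop(y_attn)              # residual: add the input back; dropout regularizes

print(y_mlp.shape, y_rnn.shape, y_cnn.shape, y_attn.shape, y.shape)
```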
Wednesday in lab
Finish Monday lecture
Review: Self-Attention = conditional information flow
Software: describe the wiring, then what flows through the wires.
Hardware: compute queries, keys, and values, then compute the attention matrix, then compute the output.
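As a concrete companion to the three-step "hardware" description, here is a minimal single-head self-attention sketch in plain PyTorch (no masking, no multiple heads; sizes and weight names are made up for illustration):

```python
# Minimal sketch: queries, keys, values -> attention matrix -> output.
import math
import torch
import torch.nn as nn

B, T, D = 2, 16, 32                   # batch, sequence length, model width (arbitrary)
x = torch.randn(B, T, D)

# Step 1: compute queries, keys, and values (three learned linear maps)
W_q, W_k, W_v = nn.Linear(D, D), nn.Linear(D, D), nn.Linear(D, D)
Q, K, V = W_q(x), W_k(x), W_v(x)      # each (B, T, D)

# Step 2: compute the attention matrix -- the dynamically computed "wiring"
scores = Q @ K.transpose(-2, -1) / math.sqrt(D)   # (B, T, T)
A = scores.softmax(dim=-1)            # row t: how much position t reads from each position

# Step 3: compute the output -- what flows through the wires
out = A @ V                           # (B, T, D)
```

The attention matrix A is the conditional wiring: row t says how strongly position t reads from every other position, and A @ V is what actually flows through those wires.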
Bumped to next week: RL
Reinforcement Learning (learning from feedback)
Reward Discounting, quantifying the good life, and value alignment
Jesus’s discount factor: he endured the cross for the joy set before him. Infinite time horizon, no convergence problems.
Types of learning: Supervised, Self-Supervised, Reinforcement
Challenges of RL
Exploration
Credit assignment
RL formalism: Markov Decision Process
What functions can we learn: value, Q, policy (see lab)
(Didn’t get to) How does MuZero work?
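Since the RL material was bumped, a tiny numerical sketch may help as a preview of discounting and the value/Q/policy functions listed above. Everything here is made up for illustration: a four-step reward sequence and a toy tabular MDP with 3 states and 2 actions.

```python
# Minimal sketch (toy numbers, no real environment) of reward discounting and
# the value, Q, and policy functions, here as lookup tables for a tiny MDP.
import numpy as np

# Discounted return: G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ...
gamma = 0.9
rewards = [1.0, 0.0, 0.0, 10.0]                       # made-up episode
G = sum(gamma**k * r for k, r in enumerate(rewards))  # 1 + 0 + 0 + 0.9**3 * 10 = 8.29

# Tabular value, Q, and policy for a toy MDP with 3 states and 2 actions
n_states, n_actions = 3, 2
Q = np.zeros((n_states, n_actions))   # Q(s, a): expected return from taking a in s, then following the policy
V = Q.max(axis=1)                     # V(s): expected return from s under the greedy policy
policy = Q.argmax(axis=1)             # policy(s): greedy action in s

# One Q-learning update after observing (s, a, r, s')
s, a, r, s_next, alpha = 0, 1, 1.0, 2, 0.1
Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
```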
Friday
No class (Good Friday)