Goal
These notebooks give you a chance to demonstrate proficiency in basic machine learning concepts and skills.
To complete a notebook, follow the instructions and fill in the blanks. Most blanks are labeled `# your code here`, an ellipsis (`...`), or *your answer here* (for narrative answers written in Markdown). Remove the placeholder comments as you fill in each blank.
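For example, a blank and its completed version might look like the following (a hypothetical cell, not taken from any specific notebook):

```python
import torch

predictions = torch.tensor([2.5, 0.0, 2.0])
targets = torch.tensor([3.0, -0.5, 2.0])

# Before: the notebook leaves a blank for you to fill in.
# mse = ...  # your code here

# After: fill in the blank and remove the placeholder comment.
mse = ((predictions - targets) ** 2).mean()
print(mse)  # tensor(0.1667)
```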
Successful solutions will:
- Include code that successfully accomplishes the task.
  - It should generate its results when run fresh (“Restart and Run All”).
  - It should have no extraneous code.
- Format code clearly (consistent spacing, one idea per line, no overly long lines, etc.).
- Document each major step succinctly but clearly.
  - Use Markdown cells (with appropriate formatting and links) to describe the overall steps taken.
    - See the Setup section of various notebooks for an example of code explanations.
    - Note that you are not required to understand the code in the “Setup” section.
  - Use clear variable names, keyword arguments, and code comments to make the code easy to follow (see the sketch after this list).
- Include responses to each of the analysis questions.
  - Add a Markdown cell for each question.
  - Add code cells as necessary to run any computations a question may need.
- Note that any activities marked “Extension” are optional but encouraged.
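As a hypothetical sketch of those style points (the dataset and model here are illustrative, not from any course notebook):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Illustrative synthetic data, just to make the sketch runnable.
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Keyword arguments and descriptive names make each step easy to follow.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)               # one idea per line
test_score = forest.score(X_test, y_test)  # R^2 on held-out data
print(f"Test R^2: {test_score:.3f}")
```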
We aim for each notebook to:
- Demonstrate a single concept
- Take less than 15 minutes to complete once that concept is understood (if it’s taking longer than 15 minutes, please let the instructor know so it can be simplified in the future)
- Take less than 5 minutes to run to completion
- Be a useful reference for how to perform that operation in the future
We also strive for the sequence of notebooks to make sense.
The Notebooks
Note: Notebooks beyond the current week may not yet be updated for the current year.
Week 1
- Jupyter Notebook Warmup (u01n0-notebook-warmup.ipynb)
  - Jupyter Notebooks
- Train a simple image classifier (u01n1-train-clf.ipynb)
  - Setup
  - Configure our experiments
  - Load the data
  - Train a model
  - Make some predictions
  - Experimentation
  - Optional extension: try out your own image
Week 2
- PyTorch Warmup (u02n1-pytorch.ipynb)
  - Dot Products
    - `for` loop approach
  - Torch Elementwise Operations
  - Torch Reduction Ops
  - Building a dot product out of Torch ops
  - Linear Layer
  - Linear layer, Module-style
  - Mean Squared Error
  - Multidimensional arrays
  - Appendix
    - Dot Products
- Regression in scikit-learn (u02n2-sklearn-regression.ipynb)
  - Setup
  - Task
  - Part A: Linear regression
  - Part B: Decision tree regression
  - Part C: Random Forest regression
  - Analysis
  - Extension
Week 3
- Linear Regression the Hard Way (u03n1-linreg-manual.ipynb)
  - Objectives
  - Setup
  - Task
  - Step 0: Initialize the model
  - Step 1: Single prediction
  - Step 2: Prediction for all inputs
  - Visualizing the predictions
  - Step 3: Compute loss
  - Step 4: Compute loss given parameters
  - Check in
  - Guided Extension
- Classification in scikit-learn (u03n2-sklearn-classification.ipynb)
  - Setup
  - Task
  - Part A: Logistic Regression
  - Part B: Decision tree classifier
  - Part C: Random Forest
  - Analysis
  - Extension
Week 4
- Multiple Linear Regression, the Hard Way (u04n1-multi-linreg-manual.ipynb)
  - Setup
  - Task
  - Part A: Linear regression
  - Analysis
- Softmax, part 1 (u04n2-softmax.ipynb)
  - Setup
  - Task
  - Analysis
  - Optional Extension: Numerical Issues
    - Task for Numerical Issues
    - Analysis of Numerical Issues
  - Extension (optional)
- From Linear Regression in NumPy to Logistic Regression in PyTorch (u04n3-logreg-pytorch.ipynb)
  - Setup
  - Basic EDA
  - Part 1: Classification the wrong way (using linear regression)
    - 1.A: Using NumPy
    - 1.B: Using PyTorch
  - Part 2: Converting to Classification
    - 2.A: Using NumPy (we’ll do this together)
    - 2.B: Using PyTorch
      - Setting up the linear layer
      - Softmax
      - Cross-entropy loss
  - Full PyTorch Implementation
  - Looking ahead: a multi-layer network
  - Analysis
Week 5
- ReLU Regression Interactive (u05n00-relu.ipynb)
  - Task
- Image Classification: Losses and Feature Extraction (u05n1-img-classifier-feature-extractor.ipynb)
  - Setup
  - Configure our experiments
  - Load the data
  - Train a model
  - Top Losses
  - Model as Feature Extractor
  - Importance of Feature Extractors
  - Check-In
- Logistic Regression and MLP (u05n2-logreg-mlp.ipynb)
  - Setup
  - Data Loading
  - Train and Evaluate Model
  - Analysis
- Train Simple Image Classifier (u05n3-mnist-clf.ipynb)
  - Setup
  - Task
  - Analysis
  - Extension
- Supplemental
  - Softmax and Sigmoid (u05s2-softmax-2.ipynb)
    - Setup
    - Task
    - Analysis
  - Diagnose and Probe an Image Classifier (u05s3-clf-prototypes.ipynb)
    - Setup
    - Configure our experiments
    - Load the data
    - Train a model
    - Top Losses
    - Manual Last Layer
    - Softmax and Cross-Entropy
Week 6
- MNIST with PyTorch (u06n1-mnist-torch.ipynb)
  - Load and Understand the Data
  - Understanding Flattening
  - Setting up data loaders
  - Train an MLP to classify MNIST
  - Data Augmentation
- Trace Simple Image Classifier (u06n1-trace-mnist.ipynb)
  - Setup
  - Task
  - Analysis
- Compute gradients using PyTorch (u06n2-compute-grad-pytorch.ipynb)
  - Setup
  - Task 1
  - Task 2
  - Analysis
- Linear Regression using the Fast.ai Learner (u06n2-linreg-learner.ipynb)
  - Setup
  - Task
  - Solution
  - Analysis
- Linear Regression the Hard Way (u06n3-linreg-manual.ipynb)
  - Setup
  - Task
- Nonlinear Regression (u06n3-nn-regression.ipynb)
  - Setup
  - Task
  - Solution
    - Step 1: Fit a Line
    - Step 2: Add a Layer
    - Step 3: Add a nonlinearity
  - Analysis
  - Extension (optional)
- Supplemental
  - Bias-Variance Decomposition (u06s01-bias-variance.ipynb)
  - MNIST with PyTorch (u06s2-mnist-torch-augmentation.ipynb)
    - Load and Understand the Data
    - Understanding Flattening
    - Setting up data loaders
    - Train an MLP to classify MNIST
    - Data Augmentation
Week 7
- Probe an Image Classifier (u07n1-image-embeddings.ipynb)
  - Setup
  - Configure our experiments
  - Load the data
  - Train a model
  - Top Losses
  - Manual Last Layer
  - Softmax and Cross-Entropy
- Image Operations (u07n1-image-ops.ipynb)
  - Setup
  - Configure
  - Load the data
  - Task
  - Convolution layers
  - A real conv layer
- A Reinforcement Learning Example (u07n2-rl.ipynb)
Week 8
- Tokenization (u08n1-tokenization.ipynb)
  - Setup
  - Download and load the model
  - Demo
  - Task
  - Getting familiar with tokens
  - Applying what you learned
  - Analysis
- Supplemental
  - Sentence Embeddings (u08s1-sentence-embeddings.ipynb)
    - Install and Import
    - Load Model and Data
    - Compute Sentence Vectors
    - Visualize Sentence Vectors
    - Find Clusters
    - How does it work?
    - Looking for Similar Vectors
Week 9
- Demo of Logits and Embeddings from a Language Model (u09n0-logits-demo.ipynb)
  - Tokenization
  - Embeddings
    - Example of mapping
    - Vector Analogies
  - What the model does
- Logits in Causal Language Models (u09n1-lm-logits.ipynb)
  - Setup
  - Task
  - Analysis
- An exercise on bias in word embeddings (u09n1-word-embeddings.ipynb)
  - Directions are meaningful
- Translation as Language Modeling (u09n2-decoding.ipynb)
  - Setup
  - Warm-up
  - Scoring a candidate translation
  - Dig In!
  - The guts of the model
  - Visualize attentions
  - Similarity
Week 10
- Implementing self-attention (u10n1-implement-transformer.ipynb)
  - Setup
  - Dataset
  - Tokenization
  - Multi-Layer Perceptron
  - A language model with a single MLP
  - Trace the Simple Model
    - Step 1: Embeddings
    - Step 2: MLP
    - Step 3: LM Head
  - Generating text
  - Self-Attention
  - Train the Transformer
  - Tracing the Transformer
    - Finish the trace
  - Other things you could try
Week 11
- Prompt Engineering (u11n1-prompt-engineering.ipynb)
  - Warm-Up
  - Chat Templating
  - Retrieval-Augmented Generation
Week 12
- Stable Diffusion Deep Dive (u12n1-stable-diffusion.ipynb)
  - Setup & Imports
  - Loading the models
  - A diffusion loop
  - The Autoencoder (AE)
  - The Scheduler
  - Loop starting from noised version of input (AKA image2image)
  - Exploring the text -> embedding pipeline
    - Token embeddings
    - Positional Embeddings
    - Combining token and position embeddings
    - Feeding these through the transformer model
  - Textual Inversion
  - Messing with Embeddings
  - The UNET and CFG
    - Classifier Free Guidance
  - Sampling
  - Guidance
  - Conclusions
- Calvin Course Advisor Bot (u12n2-agent-rag.ipynb)
  - Option 1: Run locally
  - Warm-Up: Structured Outputs
  - Step 2: Make a simple retrieval system
  - A Complete Bot
Week 13
- Why so big? Counting parameters in sequence models (u13n1-count-params.ipynb)
  - Setup
  - Embeddings
  - Complete but vacuous model
  - Multi-Layer Perceptron
  - Complete Language Model with MLP
  - Transformer
  - Analysis
- Models for Sequence Data (u13n2-seq-models.ipynb)
  - Setup
  - Getting started
  - Feed-Forward Network
    - Create the model
    - Check its output shape
    - Check its speed
    - Check how gradients flow
  - GRU
    - Your turn
  - Convolution
  - Transformer
  - Analysis
- Programming with Self-Attention (u13n3-self-attention.ipynb)
  - Where We’re Going
  - Section 1: Feed-Forward Network
    - Exercise
  - Section 2: Keys and Queries
    - Exercise
    - Exercise
    - Exercise: histogram
  - Section 3: Values (other than 1)
    - Exercise: pattern detect