We’ll view these systems through the same lens we started developing in CS 375: as tuneable machines that learn to play optimization games. But now we’re scaling up and out.
Discuss with your tables:
What do you hope that a technical understanding of generative AI will let you do? What difference do you want to make in the world as a result of taking this course?
Tuneable Machines:
ML Systems:
Learning Machines
Context and Implications
Some ideas are up on the course website.
On Perusall (graded by participation: watch/read it all, write a few good comments)
On this Calvin Day of Prayer, gather with one or two others and spend a few minutes asking God for…
pray for…
for those who…
The one who has knowledge uses words with restraint,
and whoever has understanding is even-tempered.
Even fools are thought wise if they keep silent,
and discerning if they hold their tongues.
Tell me a joke.
Q: Why don’t scientists trust atoms?
A: They ___
What is the first word that goes in the blank?
neighs and rhymes with course
A language model produces text one token at a time, predicting a probability distribution over what comes next.
Today you’ll see exactly what that looks like — and what it tells us about how these models work.
Open the LM Internals tool and grab the handout.
In the activity, you saw:
Let’s now put formal names on these observations.
Write the joint probability as a product of conditional probabilities:
P(tell, me, a, joke) = P(tell) * P(me | tell) * P(a | tell, me) * P(joke | tell, me, a)
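As a toy numeric illustration of the chain rule (these probabilities are invented, not from a real model):

```python
# Toy conditional probabilities (invented numbers, for illustration only):
p_tell = 0.01                  # P(tell)
p_me_given_tell = 0.20         # P(me | tell)
p_a_given_tell_me = 0.30       # P(a | tell, me)
p_joke_given_tell_me_a = 0.05  # P(joke | tell, me, a)

# Chain rule: the joint probability is the product of the conditionals.
p_joint = p_tell * p_me_given_tell * p_a_given_tell_me * p_joke_given_tell_me_a
```

Note how quickly the joint probability shrinks: even a four-word sentence built from plausible conditionals ends up tiny, which is why we work with one conditional at a time rather than with joints directly.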
A causal language model gives P(word | prior words)
Task: given what came so far, predict the next thing
P(word | context) = softmax(logits(context))[word]
(the model scores every word in the vocabulary at once; softmax turns those scores into a probability distribution, and we read off the entry for our word)
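A minimal sketch of that softmax step, with a 4-word toy vocabulary and invented logits:

```python
import math

def softmax(logits):
    """Convert a vector of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary and raw model scores (both invented):
vocab = ["joke", "story", "poem", "horse"]
logits = [3.0, 1.5, 1.0, -2.0]

probs = softmax(logits)
# P(word | context) is the softmax entry for that word:
p_joke = probs[vocab.index("joke")]
```

The key point: the logits are one score per vocabulary word, and softmax normalizes them so they are all positive and sum to 1.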
Do you recognize this structure?
Talk to your neighbors:
What shall I return to the Lord
for all his goodness to me?
I will lift up the cup of salvation
and call on the name of the Lord.
See also James 1:16-18
How can we generate complex data (text, images, audio)?
| Approach | Core idea | Example |
|---|---|---|
| Autoregressive | One piece at a time, left to right | ChatGPT, Claude |
| Latent variable | Sample a code, decode it | StyleGAN interpolation |
| Diffusion | Start from noise, iteratively denoise | Diffusion Explainer |
We’ll focus on autoregressive models — they power most modern LLMs.
See the notes page for details on all three.
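To make the autoregressive row concrete, here is a minimal generation loop. A hand-built bigram table (next-token probabilities given only the previous token) stands in for a real model; the table and tokens are invented:

```python
import random

# Stand-in for a language model: P(next token | previous token).
# Real LLMs condition on the whole context, not just one token.
bigram = {
    "<s>":   {"tell": 0.7, "write": 0.3},
    "tell":  {"me": 1.0},
    "write": {"a": 1.0},
    "me":    {"a": 1.0},
    "a":     {"joke": 0.6, "story": 0.4},
    "joke":  {"</s>": 1.0},
    "story": {"</s>": 1.0},
}

def generate(seed=0):
    """Sample one token at a time, left to right, until end-of-sequence."""
    rng = random.Random(seed)
    token, out = "<s>", []
    while token != "</s>":
        dist = bigram[token]
        token = rng.choices(list(dist), weights=list(dist.values()))[0]
        if token != "</s>":
            out.append(token)
    return " ".join(out)
```

Swapping the bigram table for a neural net that maps the full context to a distribution over the vocabulary gives you exactly the structure of ChatGPT-style generation.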
Neural nets work with numbers. How do we convert text to numbers that we can feed into our models?
Neural nets give us numbers as output. How do we go back from numbers into text?
Two parts:
An example: https://platform.openai.com/tokenizer
(The “Ġ” marks a leading space in GPT-2’s byte-level tokenizer; ignore it for now.)
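A toy sketch of the text-to-numbers round trip. The vocabulary here is invented and word-level; real tokenizers like GPT-2’s use byte-pair encoding over roughly 50k subword pieces:

```python
# Toy word-level tokenizer: text -> token ids -> text.
vocab = ["tell", "me", "a", "joke"]
token_to_id = {tok: i for i, tok in enumerate(vocab)}
id_to_token = {i: tok for tok, i in token_to_id.items()}

def encode(text):
    """Split on whitespace and map each word to its integer id."""
    return [token_to_id[tok] for tok in text.split()]

def decode(ids):
    """Map ids back to words and rejoin them."""
    return " ".join(id_to_token[i] for i in ids)

ids = encode("tell me a joke")
assert decode(ids) == "tell me a joke"
```

Encoding answers the first question (text in), decoding the second (text out); everything between them is just arithmetic on the ids.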
Key abstraction: the conversation — a structured “document” with system instructions, user messages, assistant responses, “tool” calls/responses, reasoning traces
API: you ask for the next message given the conversation so far (no training)
Stateless: Each conversation is independent — the model itself doesn’t remember past conversations (but system can prepend them to future conversations)
Agent extension: the model can output requests to run code (e.g., search, calculate, edit file); the system runs the code and includes the output in the conversation
When appropriate: text tasks, prototyping, when training data is scarce
When not: latency-critical, cost-sensitive, tasks requiring precise numeric output
| Approach | Cost | Models |
|---|---|---|
| Commercial API (OpenAI, Google, Anthropic) | Pay per token | Largest, most capable |
| Free tier (Google Gemini) | Free (rate-limited) | Mid-size |
| Run locally (Ollama) | Free (your hardware) | Smaller models |
API key = how the provider identifies you. Don’t share it or commit it to git.
For Exercise 376.1: free Google Gemini API key or Ollama.
# Assumes the OpenAI Python SDK; Gemini and Ollama both offer
# OpenAI-compatible endpoints you can point base_url at.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
model = "gpt-4o-mini"  # substitute your provider's model name

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"},
]
response = client.chat.completions.create(model=model, messages=messages)
assistant_msg = response.choices[0].message.content
# To continue: append the assistant's reply, then your next message
messages.append({"role": "assistant", "content": assistant_msg})
messages.append({"role": "user", "content": "Are you sure?"})
response2 = client.chat.completions.create(model=model, messages=messages)

Some figures from Prince, Understanding Deep Learning, 2023