Logits in Causal Language Models¶

Task: Ask a language model how likely each token in its vocabulary is to be the next one.

This notebook explores how causal language models predict the next token in a sequence by examining the raw logit values they produce. You'll learn to:

  • Extract and interpret logits from a language model
  • Convert logits to probabilities using softmax
  • Find the most likely next tokens for a given prompt
  • Create reusable functions to analyze token prediction probabilities

Understanding token prediction logits is fundamental to working with language models and provides insights into how these models make decisions during text generation.

Course Objectives Addressed¶

This notebook addresses the following CS376 course objectives:

  • MS-LLM-Generation: "I can extract and interpret model outputs (token logits) and use them to generate text."
  • MS-LLM-Tokenization: "I can explain the purpose, inputs, and outputs of tokenization."
  • MS-LLM-API: "I can apply industry-standard APIs to work with pretrained language models (LLMs) and generative AI systems."

It will also help set you up to make progress towards the following objectives in the next lab:

  • NC-Embeddings: "I can identify various types of embeddings (tokens, hidden states, output, key, and query) in a language model and explain their purpose."
  • NC-TransformerDataFlow: "I can identify the shapes of data flowing through a Transformer-style language model."

The exercises in this notebook will give you hands-on experience with the internals of language model prediction, which is essential for understanding how these models work and how to use them effectively for text generation tasks.

Setup¶

We start in the same way as the tokenization notebook:

In [ ]:
# If the import fails, uncomment the following line:
# !pip install transformers
import torch
from torch import tensor
from transformers import AutoTokenizer, AutoModelForCausalLM
import pandas as pd
# Avoid a warning message
import os; os.environ["TOKENIZERS_PARALLELISM"] = "false"

One step in this notebook will ask you to write a function. The most common error when turning notebook code into a function is accidentally using a global variable instead of a value computed inside the function. Below is a quick-and-dirty utility that checks for that mistake. (For a more polished version, check out localscope.)

In [ ]:
def check_global_vars(func, allowed_globals):
    import inspect
    used_globals = set(inspect.getclosurevars(func).globals.keys())
    disallowed_globals = used_globals - set(allowed_globals)
    if len(disallowed_globals) > 0:
        raise AssertionError(f"The function {func.__name__} used unexpected global variables: {list(disallowed_globals)}")

Download and load the model.

In [ ]:
from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed
model_name = "openai-community/gpt2"
# Here are a few larger models you could try:
# model_name = "EleutherAI/pythia-1.4b-deduped"
# model_name = "google/gemma-3-4b"
# model_name = "google/gemma-3-4b-it"
# Note: you'll need to accept the license agreement on https://huggingface.co/google/gemma-7b to use Gemma models
tokenizer = AutoTokenizer.from_pretrained(model_name, add_prefix_space=True)

model = AutoModelForCausalLM.from_pretrained(model_name)
# Use the EOS token as the PAD token to avoid warnings during generation.
if model.generation_config.pad_token_id is None:
    model.generation_config.pad_token_id = model.generation_config.eos_token_id
# Silence a warning.
tokenizer.decode([tokenizer.eos_token_id]);
In [ ]:
print(f"The tokenizer has {len(tokenizer.get_vocab())} strings in its vocabulary.")
print(f"The model has {model.num_parameters():,d} parameters.")
The tokenizer has 50257 strings in its vocabulary.
The model has 124,439,808 parameters.

Task¶

In the tokenization notebook, we simply used the generate method to have the model generate some text. Now we'll do it ourselves.

Consider the following phrase:

In [ ]:
phrase = "This weekend I plan to"
# Another one to try later. This was a famous early example of the GPT-2 model:
# phrase = "In a shocking finding, scientists discovered a herd of unicorns living in"

1: Call the tokenizer on the phrase to get a batch. After taking a look at what's in the batch, extract the input_ids.

In [ ]:
batch = tokenizer(ph..., return_tensors='pt')
input_ids = batch['in...']
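
One easy way to have that look, once you've filled in the cell above, is simply to print the batch. The exact keys depend on the tokenizer, but for GPT-2 you should see input_ids and attention_mask.

In [ ]:
# Run this after filling in the cell above.
print(batch.keys())  # what's in the batch
print(batch)         # the actual tensors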

2: Call the model on the input_ids. Examine the shape of the logits; what does each number mean?

Note: The model returns an object that has multiple values. The logits are in model_output.logits.

In [ ]:
with torch.no_grad(): # This tells PyTorch we don't need it to compute gradients for us.
    model_output = model(...)
print(f"logits shape: {list(model_output.logits.shape)}")
logits shape: [1, 5, 50257]
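
The logits aren't the only thing the model returns. If you're curious, the output object supports dict-style inspection; run this after the cell above (the exact fields depend on the model and its configuration):

In [ ]:
print(model_output.keys())  # for GPT-2 this typically includes 'logits' and 'past_key_values'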

3: Pull out the logits corresponding to the last token in the input phrase. Hint: Think about what each number in the shape means. Remember that in Python, arr[-1] is shorthand for arr[len(arr) - 1].
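
If negative indexing is new to you, here's a quick standalone refresher on a plain Python list (toy numbers, unrelated to the model):

In [ ]:
nums = [10, 20, 30]
print(nums[-1])  # 30: the last element
print(nums[-3])  # 10: three from the end, which here is the first element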

In [ ]:
last_token_logits = model_output.logits[...]
assert last_token_logits.shape == (len(tokenizer.get_vocab()),)

4: Identify the token id and corresponding string of the most likely next token.

To find the most likely token, we need the index of the largest value in last_token_logits. The tensor method that does this is called argmax. (It's a common enough operation that it's built into PyTorch.)

Note: The tokenizer has a decode method that takes a token id, or a list of token ids, and returns the corresponding string.
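
Here's what argmax looks like on a small toy tensor (made-up numbers, unrelated to the model):

In [ ]:
toy_scores = tensor([0.1, 2.5, 0.3])
print(toy_scores.argmax())              # tensor(1): the index of the largest value
print(toy_scores[toy_scores.argmax()])  # tensor(2.5000): the largest value itself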

In [ ]:
# compute the probability distribution over the next token
last_token_probabilities = last_token_logits.sof...(dim=-1)
# dim=-1 means to compute the softmax over the last dimension
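
In case it helps to see what softmax does in isolation, here's a tiny standalone example (toy numbers, unrelated to the model). Softmax exponentiates each value and normalizes, so larger logits get larger probabilities and the results sum to 1.

In [ ]:
toy_logits = tensor([1.0, 2.0, 3.0])
toy_probs = toy_logits.softmax(dim=-1)
print(toy_probs)        # approximately tensor([0.0900, 0.2447, 0.6652])
print(toy_probs.sum())  # approximately 1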
In [ ]:
most_likely_token_id = ...
decoded_token = tokenizer.decode(most_likely_token_id)
probability_of_most_likely_token = last_token_probabilities[...]

print("For the phrase:", phrase)
print(f"Most likely next token: {most_likely_token_id}, which corresponds to {repr(decoded_token)}, with probability {probability_of_most_likely_token:.2%}")
For the phrase: This weekend I plan to
Most likely next token: 467, which corresponds to ' go', with probability 5.79%

5: Use the topk method to find the top-10 most likely choices for the next token.

See the documentation for torch.topk. Calling topk on a tensor returns a named tuple with two tensors: values and indices. The values are the top-k values, and the indices are the indices of those values in the original tensor. (In this case, the indices are the token ids.)
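
Here's topk on a small toy tensor (made-up numbers, unrelated to the model):

In [ ]:
toy = tensor([1.0, 5.0, 3.0, 4.0])
top2 = toy.topk(2)
print(top2.values)   # tensor([5., 4.]): the two largest values, in descending order
print(top2.indices)  # tensor([1, 3]): where those values sit in the original tensor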

Note: This uses Pandas to make a nicely displayed table, and a list comprehension to decode the tokens. You don't need to understand how this all works, but I encourage you to think about what's going on.

In [ ]:
most_likely_tokens = last_token_logits.topk(...)
print(f"most likely token index from topk is {most_likely_tokens.indices[0]}") # this should be the same as argmax
decoded_tokens = [tokenizer.decode(...) for ... in most_likely_tokens.indices]
probabilities_of_most_likely_tokens = last_token_probabilities[most_likely_tokens.indices]

# Make a nice table to show the results
most_likely_tokens_df = pd.DataFrame({
    'tokens': decoded_tokens,
    'probabilities': probabilities_of_most_likely_tokens,
})
# Show the table, in a nice formatted way (see https://pandas.pydata.org/pandas-docs/stable/user_guide/style.html#Builtin-Styles)
# Caution: this "gradient" has *nothing* to do with gradient descent! (It's a color gradient.)
most_likely_tokens_df.style.hide(axis='index').background_gradient()
most likely token index from topk is 467
Out[ ]:
tokens probabilities
go 0.057938
take 0.053048
attend 0.038624
visit 0.036411
be 0.027352
do 0.024956
make 0.023817
spend 0.021302
play 0.019172
travel 0.017760
6: Write a function that is given a phrase and a k and returns a DataFrame like most_likely_tokens_df above, containing the top k most likely next tokens. (Don't include the style line.)

Build this function using only code that you've already filled in above. Clean up the code so that it doesn't do or display anything extraneous. Add comments about what each step does.

In [ ]:
def predict_next_tokens(...):
    # your code here

def show_tokens_df(tokens_df):
    return tokens_df.style.hide(axis='index').background_gradient()

check_global_vars(predict_next_tokens, allowed_globals=["torch", "tokenizer", "pd", "model"])
In [ ]:
show_tokens_df(predict_next_tokens("This weekend I plan to", 5))
Out[ ]:
tokens probabilities
go 0.057938
take 0.053048
attend 0.038624
visit 0.036411
be 0.027352
In [ ]:
show_tokens_df(predict_next_tokens("To be or not to", 5))
Out[ ]:
tokens probabilities
be 0.963997
become 0.004372
have 0.004315
Be 0.001392
get 0.000955
In [ ]:
show_tokens_df(predict_next_tokens("For God so loved the", 5))

Analysis¶

Q1: Give a specific example of the shape of model_output.logits and explain what each number means.

your answer here

Q2: Change the -1 in the definition of last_token_logits to -3. What does the variable represent now (what would be a better name for it)? What does its argmax represent?

your answer here

Q3: Let's think. The method in this notebook only gets the scores for one next token at a time. What if we wanted to generate a whole sentence? We'd have to generate tokens one at a time, appending each new token to the input before predicting the next one. What are two ways we could decide which token to generate next?

Write pseudocode (not Python) for two approaches: (1) greedy generation and (2) sampling.

your answer here