Warning: This content has not yet been fully revised for this year.
Find answers to the following questions; the articles below should be helpful.
- How can a causal language model be used to power a dialog agent? (What does a “document” look like?)
- What is “few-shot learning”, aka “in-context learning”, and how is it helpful for getting a LM to do what you want?
- How does “chain of thought” prompting help a model reason better? (What does that have to do with autoregressive generation?)
- How does “tool use” work in LMs?
- How could you get a model to give an output in a specific structure that you could use in a program?
- In general, how can you use a “chat” API to do useful things?
- Are LM outputs always accurate? How can you tell?
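To make a couple of these questions concrete (few-shot prompting and structured output), here is a minimal sketch in plain Python. No real API is called; the model reply is simulated, and the prompt format is just one illustrative choice, not any particular provider's convention.

```python
import json

# Few-shot ("in-context") learning: the prompt itself contains worked
# examples, so the model can infer the task without any fine-tuning.
few_shot_examples = [
    {"review": "Great battery life!", "sentiment": "positive"},
    {"review": "Broke after two days.", "sentiment": "negative"},
]

def build_prompt(new_review: str) -> str:
    """Assemble a completions-style prompt from the examples above."""
    lines = ["Classify the sentiment of each review. Answer in JSON."]
    for ex in few_shot_examples:
        lines.append(f"Review: {ex['review']}")
        lines.append(f"Answer: {json.dumps({'sentiment': ex['sentiment']})}")
    lines.append(f"Review: {new_review}")
    lines.append("Answer:")  # the LM completes from here
    return "\n".join(lines)

prompt = build_prompt("Exactly what I wanted.")

# A real call would send `prompt` to an LM API; here we just show the
# reply shape we asked for and how a program could consume it.
simulated_reply = '{"sentiment": "positive"}'
result = json.loads(simulated_reply)
print(result["sentiment"])
```

Note that asking for JSON in the prompt doesn't guarantee JSON back, which connects to the last question above: you still need to validate (and possibly retry) before trusting the output in a program.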
Some resources:
- Azure Prompt Engineering Tips: Chat-Style and Completions-Style
- What Makes a Dialog Agent Useful?
- You may also want to skim Illustrating Reinforcement Learning from Human Feedback (RLHF)
- Chat Templates: An End to the Silent Performance Killer
- Prompt Engineering Guide | Learn Prompting: Your Guide to Communicating with AI
- How Chain-of-Thought Reasoning Helps Neural Networks Compute | Quanta Magazine
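The "Chat Templates" article above is about how a list of chat messages gets flattened into the single text "document" that a causal LM actually completes. Here's a toy version of that idea; the `<|...|>` markers are purely illustrative, not any real model's template (real templates ship with each model's tokenizer and differ between models).

```python
# Toy chat template: flatten a message list into one prompt string.
# The special markers below are made up for illustration only.
def apply_chat_template(messages):
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}\n<|end|>")
    parts.append("<|assistant|>\n")  # cue the model to generate the reply
    return "\n".join(parts)

chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2 + 2?"},
]
print(apply_chat_template(chat))
```

Using the wrong template for a model is the "silent performance killer" in the article's title: the prompt still looks reasonable to a human, but the model never saw that format during training.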
Supplemental Material
- Read this basic introduction to decoding: How to generate text: using different decoding methods for language generation with Transformers
- Watch Lecture 4 of MIT 6.S191 (skim Lecture 3 if needed)
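As a tiny preview of the decoding post above, here is a sketch contrasting greedy decoding with temperature sampling over a made-up next-token distribution (the tokens and logit values are invented for illustration).

```python
import math
import random

# Toy next-token logits; a real model would produce one score per
# vocabulary entry at every generation step.
logits = {"cat": 2.0, "dog": 1.5, "rocket": 0.1}

def softmax(scores, temperature=1.0):
    """Turn logits into probabilities; temperature reshapes the distribution."""
    exps = {tok: math.exp(s / temperature) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def greedy(scores):
    # Always pick the single most likely token: deterministic, but repetitive.
    return max(scores, key=scores.get)

def sample(scores, temperature=1.0, rng=random):
    # Draw from the softmax distribution; higher temperature = more diverse.
    probs = softmax(scores, temperature)
    return rng.choices(list(probs), weights=list(probs.values()))[0]

print(greedy(logits))                    # deterministic: always "cat"
print(sample(logits, temperature=0.7))   # usually "cat", sometimes not
```

Autoregressive generation just repeats one of these choices token by token, feeding each chosen token back in as context, which is why the decoding method shapes the whole output.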
We probably won’t get to this until next week, but:
- Some slides
- An extensive collection of notebooks on generative models: Hitchhiker’s Guide To The Latent Space: Community Notebook Document - Google Docs
- Here’s a good intro to text-guided image generation and manipulation: StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery (paper)
I found the Foreword to this book on Deep Generative Modeling (available through the Calvin library) to be reasonably accessible, but you may prefer the author's blog posts (github).
Now, how do you control what gets generated?
- Controlling generated text
- Captioning images
- Generating images
- GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
- Probably the best starting point for a project like this: Autoregressive Image Generation using Residual Quantization | Papers With Code. Pretrained models are available in the official implementation, and there's a nice clean implementation in lucidrains/vector-quantize-pytorch: Vector Quantization, in Pytorch.