Warning: This content has not yet been fully revised for this year.
Find answers to the following questions; the articles below should be helpful.
- How can a causal language model be used to power a dialog agent? (What does a “document” look like?)
- What is “few-shot learning”, aka “in-context learning”, and how is it helpful for getting a LM to do what you want?
- How does “chain of thought” prompting help a model reason better? (What does that have to do with autoregressive generation?)
- How does “tool use” work in LMs?
- How could you get a model to give an output in a specific structure that you could use in a program?
- In general, how can you use a “chat” API to do useful things?
- Are LM outputs always accurate? How can you tell?
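To make a couple of these questions concrete (few-shot prompting and structured output), here is a minimal sketch in plain Python. No real API is called; the model reply is simulated, and the prompt format is just one illustrative choice, not any particular provider's convention.

```python
import json

# Few-shot ("in-context") learning: the prompt itself contains worked
# examples, so the model can infer the task without any fine-tuning.
few_shot_examples = [
    {"review": "Great battery life!", "sentiment": "positive"},
    {"review": "Broke after two days.", "sentiment": "negative"},
]

def build_prompt(new_review: str) -> str:
    """Assemble a completions-style prompt from the examples above."""
    lines = ["Classify the sentiment of each review. Answer in JSON."]
    for ex in few_shot_examples:
        lines.append(f"Review: {ex['review']}")
        lines.append(f"Answer: {json.dumps({'sentiment': ex['sentiment']})}")
    lines.append(f"Review: {new_review}")
    lines.append("Answer:")  # the LM completes from here
    return "\n".join(lines)

prompt = build_prompt("Exactly what I wanted.")

# A real call would send `prompt` to an LM API; here we just show the
# reply shape we asked for and how a program could consume it.
simulated_reply = '{"sentiment": "positive"}'
result = json.loads(simulated_reply)
print(result["sentiment"])
```

Note that asking for JSON in the prompt doesn't guarantee JSON back, which connects to the last question above: you still need to validate (and possibly retry) before trusting the output in a program.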
Some resources:
- Azure Prompt Engineering Tips: Chat-Style and Completions-Style
- What Makes a Dialog Agent Useful?
- You may also want to skim Illustrating Reinforcement Learning from Human Feedback (RLHF)
- Chat Templates: An End to the Silent Performance Killer
- Prompt Engineering Guide | Learn Prompting: Your Guide to Communicating with AI
- How Chain-of-Thought Reasoning Helps Neural Networks Compute | Quanta Magazine
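The "Chat Templates" article above is about how a list of chat messages gets flattened into the single text "document" that a causal LM actually completes. Here's a toy version of that idea; the `<|...|>` markers are purely illustrative, not any real model's template (real templates ship with each model's tokenizer and differ between models).

```python
# Toy chat template: flatten a message list into one prompt string.
# The special markers below are made up for illustration only.
def apply_chat_template(messages):
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}\n<|end|>")
    parts.append("<|assistant|>\n")  # cue the model to generate the reply
    return "\n".join(parts)

chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2 + 2?"},
]
print(apply_chat_template(chat))
```

Using the wrong template for a model is the "silent performance killer" in the article's title: the prompt still looks reasonable to a human, but the model never saw that format during training.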
Supplemental Material
- Read this basic introduction to decoding: How to generate text: using different decoding methods for language generation with Transformers
- Watch Lecture 4 of MIT 6.S191 (skim Lecture 3 if needed)
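As a tiny preview of the decoding post above, here is a sketch contrasting greedy decoding with temperature sampling over a made-up next-token distribution (the tokens and logit values are invented for illustration).

```python
import math
import random

# Toy next-token logits; a real model would produce one score per
# vocabulary entry at every generation step.
logits = {"cat": 2.0, "dog": 1.5, "rocket": 0.1}

def softmax(scores, temperature=1.0):
    """Turn logits into probabilities; temperature reshapes the distribution."""
    exps = {tok: math.exp(s / temperature) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def greedy(scores):
    # Always pick the single most likely token: deterministic, but repetitive.
    return max(scores, key=scores.get)

def sample(scores, temperature=1.0, rng=random):
    # Draw from the softmax distribution; higher temperature = more diverse.
    probs = softmax(scores, temperature)
    return rng.choices(list(probs), weights=list(probs.values()))[0]

print(greedy(logits))                    # deterministic: always "cat"
print(sample(logits, temperature=0.7))   # usually "cat", sometimes not
```

Autoregressive generation just repeats one of these choices token by token, feeding each chosen token back in as context, which is why the decoding method shapes the whole output.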
We probably won’t get to this until next week, but:
- Some slides
- An extensive collection of notebooks on generative models: Hitchhiker’s Guide To The Latent Space: Community Notebook Document - Google Docs
- Here’s a good intro to text-guided image generation and manipulation: StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery (paper)
I found the Foreword to this book on Deep Generative Modeling (available through the Calvin library) to be reasonably accessible, but you may prefer the author's blog posts (github).
Now, how do you control what gets generated?
- Controlling generated text
- Captioning images
- Generating images
- GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
- Probably the best starting point for a project like this: Autoregressive Image Generation using Residual Quantization | Papers With Code. Pretrained models are available in the official implementation, and there's a nice clean implementation in lucidrains/vector-quantize-pytorch: Vector Quantization, in Pytorch.