CS 375: Wrap-Up

Core Concepts: The Four Pillars

Neural Computation

  • Tensors as universal data structure
  • Linear transformations + nonlinearities
  • Learning via gradient descent

ML Systems

  • Data pipelines: preprocessing → model → evaluation
  • Abstractions: fit/predict APIs
  • Systematic evaluation & generalization

Learning Machines

  • Supervised learning: learning from examples
  • Unsupervised learning: finding patterns
  • Training, testing, and the generalization gap

Context & Implications

  • What AI can vs. should solve
  • Limitations: correlation ≠ causation
  • AI in service of human flourishing

Neural Computation: The Core

  • Traditional vs. neural computing
    • Traditional: Explicit instructions → Outputs
    • Neural: Data + Parameters + Architecture → Learned mapping
  • Building blocks
    • Tensors (arrays) as fundamental data structure
    • Linear layers transform data
    • Nonlinearities (ReLU) add conditional logic
    • Gradient descent to adjust parameters
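The building blocks above can be sketched in a few lines of NumPy (an illustrative toy, not code from the course): a tensor is just an array, a linear layer is a matrix product, ReLU is an elementwise max, and gradient descent repeatedly steps against the gradient. The data, parameters, and learning rate here are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tensors: plain arrays holding the data.
X = rng.normal(size=(100, 3))            # 100 examples, 3 features each
true_w = np.array([1.0, -2.0, 0.5])      # hypothetical "true" parameters
y = X @ true_w + 0.1 * rng.normal(size=100)

# Nonlinearity: ReLU clips negatives to zero (conditional logic).
def relu(z):
    return np.maximum(z, 0.0)

# A linear layer is just a matrix product; train it with gradient descent
# on mean-squared error, using the analytic gradient.
w = np.zeros(3)                          # learned parameters, start at zero
lr = 0.1                                 # learning rate (a hyperparameter)
for _ in range(200):
    pred = X @ w                         # linear transformation
    grad = 2 * X.T @ (pred - y) / len(y) # gradient of MSE w.r.t. w
    w -= lr * grad                       # step downhill

print(np.round(w, 2))                    # lands near true_w
```

Stacking such linear layers with ReLU between them gives a multilayer perceptron; the update is the same idea, with gradients computed by backpropagation.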

ML Systems: Connecting to the World

  • From raw data to predictions
    • Input transformation to structured tensors
    • Task-appropriate outputs and metrics
    • Systematic evaluation (train/val/test)
  • Key abstractions
    • Data pipelines with clear stages
    • Common API patterns
    • Hyperparameters vs. learned parameters
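The fit/predict pattern, held-out evaluation, and the hyperparameter-vs.-learned-parameter distinction can be sketched with a deliberately trivial estimator (the class name, `shrinkage` knob, and data are all invented for illustration):

```python
import numpy as np

class MeanRegressor:
    """Toy estimator following the common fit/predict API pattern."""
    def __init__(self, shrinkage=0.0):
        self.shrinkage = shrinkage          # hyperparameter: chosen up front
    def fit(self, X, y):
        # Learned parameter: estimated from the training data only.
        self.mean_ = (1.0 - self.shrinkage) * y.mean()
        return self
    def predict(self, X):
        return np.full(len(X), self.mean_)

# Systematic evaluation: fit on train, report error on held-out test data.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))
y = rng.normal(loc=3.0, size=60)
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]

model = MeanRegressor(shrinkage=0.0).fit(X_train, y_train)
test_mse = np.mean((model.predict(X_test) - y_test) ** 2)
```

In practice a third, validation split is used to tune hyperparameters like `shrinkage`, keeping the test set untouched until the final report.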

Learning Machines: Improving from Experience

  • Learning paradigms
    • Supervised: Mimicry from examples
    • Unsupervised: Pattern discovery without labels
    • Reinforcement: Learning from interaction and reward signals
  • Error sources
    • Underfitting: Can’t represent training data well
    • Overfitting: Can’t generalize beyond training
    • Data issues: Biased or shifting distributions
    • Task misspecification: Optimizing the wrong thing

Context & Implications: The Bigger Picture

  • Possibilities and limitations
    • What problems can AI solve? Desk tasks with clear metrics
    • What should we use AI for? Love and service, not just efficiency
  • Current limitations
    • Correlation vs. causation
    • Limited real-world interaction
    • Fixation on numeric metrics

Going Deeper: Neural Computation

  • What we’ve seen
    • Basic building blocks: vectors, matrices, tensors
    • Linear layers and activation functions
    • Simple architectures (MLPs)
    • Gradient descent as learning mechanism
  • What we haven’t seen
    • Deep networks with many layers
    • CNNs, RNNs, Transformers
    • Backpropagation internals
    • Advanced optimizers (Adam, etc.)

Going Deeper: ML Systems

  • What we’ve seen
    • Classification and regression tasks
    • Input transformations and batching
    • Performance metrics and evaluation
    • Hyperparameter tuning
  • What we haven’t seen
    • Scaling models and abstractions to LLMs
    • Commercial APIs for embeddings
    • Converting real-world problems to ML tasks

Going Deeper: Learning Machines

  • What we’ve seen
    • Basic supervised and unsupervised approaches
    • Generalization concepts
    • Error analysis and debugging
  • What we haven’t seen
    • Self-supervised learning
    • Training at massive scale
    • RLHF for generative models
    • Scale as regularization

Going Deeper: Context & Implications

  • Questions we’ve explored
    • AI capabilities vs. appropriate uses
    • Evaluation beyond metrics
    • Ethical considerations and impacts
  • Questions we’ll continue exploring
    • How AI systems align with human values
    • Navigating benefits and risks in deployment
    • Cultivating wisdom in technological development

Connections: Modern AI Systems

  • Image classifiers
    • Same basic structure as our simple networks
    • Addition of convolutional layers for pattern extraction
    • Hierarchical feature learning
  • Large Language Models
    • At their core: a fancy classifier over the next token
    • Feature extractors + linear/softmax layers
    • Addition of attention mechanisms for context
    • Self-supervised learning at massive scale
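The "classifier over the next token" view fits in a few lines: a feature extractor turns the context into a vector, then a linear layer plus softmax turns that vector into a probability distribution over the vocabulary (the vocabulary, features, and weights below are all made up):

```python
import numpy as np

def softmax(z):
    z = z - z.max()                 # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

vocab = ["the", "cat", "sat", "mat"]            # toy vocabulary
context_features = np.array([0.2, -1.0, 0.7])   # stand-in for extractor output

# Linear layer: one row of weights per vocabulary token.
W = np.array([[ 0.1,  0.4, -0.2],
              [ 0.8, -0.3,  0.5],
              [-0.1,  0.2,  0.9],
              [ 0.3,  0.1, -0.4]])

probs = softmax(W @ context_features)           # distribution over next token
next_token = vocab[int(np.argmax(probs))]
```

A real LLM differs mainly in scale and in how the features are computed (attention over the whole context), not in this final classification step.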

Looking Forward to CS 376

  • Models for structured objects
    • Images, text, multimodal inputs
  • Advanced architectures
    • CNNs, RNNs, Transformers
  • Agent-based approaches
    • Advanced use of LLMs
    • Reinforcement learning for language/agents
    • Tool use and planning

This Course: Reflecting Together

  • We’ve prototyped education in an AI-pervasive world
  • What worked? What didn’t?
  • What did you appreciate?
  • How might we think about the value of learning communities?