Unit 6: Miscellaneous Topics

Reading

Background

Review my blog post on Mapping to Mimicry. I wrote it in one short sprint; feedback welcome!

Robotics

Rather than study theory, let’s look at two recent advances written up in blog posts:

Explainable and Human-Centered AI

ACM Selects: Trustworthy AI in Healthcare #02

Supplemental

What happens when AI meets people? How can we ensure that AI results are:

  1. correct and transparent (explainable)?
  2. just (fair and unbiased)?
  3. usable?

The first two are the subject of a subfield called Fairness, Accountability, and Transparency; the last is the subject of much research in human-computer interaction (HCI) and computer-supported cooperative work (CSCW). We’ll explore all three in these last two weeks of class.

Correctness and Transparency / Explainability

Read one or more of these:

Watch:

Supplemental Material

Justice (Fairness, Bias)

Supplemental: The Effects of Regularization and Data Augmentation are Class Dependent (abstract)

Usability

Read or watch something from Human-Centered Artificial Intelligence.

Reinforcement Learning

Recommended but not essential:

Supplemental Material

Reinforcement Learning (learning from feedback)

Lab: RL, Transformers, or other topics

Choose one of the following notebooks, or do the Reinforcement Learning activities at the bottom of this page.

Neural Net Architecture

Reinforcement Learning

Policy, Value, and Q functions
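
As a quick refresher before the playground, here are the standard definitions for a discounted MDP (nothing playground-specific; gamma is the discount factor and r_t is the reward at step t):

```latex
\begin{align*}
\pi(a \mid s) &= \text{probability that the policy chooses action } a \text{ in state } s,\\
V^{\pi}(s)    &= \mathbb{E}_{\pi}\!\Big[\textstyle\sum_{t \ge 0} \gamma^{t} r_{t} \;\Big|\; s_{0}=s\Big]
                 \quad \text{(expected discounted return starting from } s\text{)},\\
Q^{\pi}(s,a)  &= \mathbb{E}_{\pi}\!\Big[\textstyle\sum_{t \ge 0} \gamma^{t} r_{t} \;\Big|\; s_{0}=s,\ a_{0}=a\Big]
                 \quad \text{(the same, but committing to } a \text{ as the first action)}.
\end{align*}
```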

Open up the Observable RL Playground.

  1. Read through “Strategically Making Mistakes”.
  2. What does a low epsilon do? What does a high epsilon do? (See the epsilon-greedy sketch after this list.)
  3. Try editing the “maze =” definition to change the environment. What does it take to get the agent to tolerate a short-term negative reward in order to achieve a higher long-term reward?
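
If it helps to pin down what the epsilon slider controls, here is a minimal epsilon-greedy sketch in Python (a sketch only; names like `q_values` are my own, not the playground’s code):

```python
import random

def epsilon_greedy_action(q_values, state, actions, epsilon):
    """Pick an action for `state` using epsilon-greedy selection.

    q_values: dict mapping (state, action) pairs to estimated returns.
    epsilon:  probability of exploring (taking a random action).
    """
    if random.random() < epsilon:
        # Explore: a high epsilon means the agent often "strategically makes
        # mistakes," trying actions whose outcomes it hasn't learned yet.
        return random.choice(actions)
    # Exploit: a low epsilon means the agent almost always repeats the
    # best-known action for this state.
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))
```
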
Older activity that doesn’t work anymore

Go to the “Playground” at the bottom of this article.

  1. Change the Algorithm to Q-Learning. We won’t look at the others at this time.
  2. Try each of the “Visualization” options. What does each one show? Each one corresponds to a different function.
  3. Add one agent. How does completing an episode affect each of the functions that the agent is learning? (A sketch of the Q-Learning update appears after this list.)
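
For intuition about question 3, here is roughly what one step of tabular Q-Learning does (a minimal sketch with made-up names, not the playground’s actual implementation):

```python
def q_learning_update(q_values, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """One tabular Q-Learning step: nudge Q(state, action) toward the observed
    reward plus the discounted value of the best action in the next state."""
    best_next = max(q_values.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next                        # one-step estimate of the return
    old = q_values.get((state, action), 0.0)
    q_values[(state, action)] = old + alpha * (target - old)   # learning-rate-sized correction
```

An episode is just many of these updates strung together, which is why the learned functions keep sharpening as episodes accumulate; the greedy policy is then simply the action with the highest Q value in each state.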

Exploration

  1. Set the Explore-Exploit slider all the way to Explore. What do you notice about the agent’s behavior?
  2. Set it all the way to Exploit. What do you notice now? (The sketch after this list connects these extremes back to epsilon.)
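
In terms of the epsilon-greedy sketch above, the slider’s two extremes correspond roughly to calling that function with epsilon near 1 (pure exploration) or near 0 (pure exploitation). For example, with a hypothetical Q-table, reusing `epsilon_greedy_action` from the earlier sketch:

```python
# Hypothetical two-action Q-table for a single state, just to show the extremes.
q_values = {((0, 0), "up"): 0.2, ((0, 0), "right"): 0.7}
actions = ["up", "down", "left", "right"]

explore_choice = epsilon_greedy_action(q_values, (0, 0), actions, epsilon=1.0)  # uniformly random
exploit_choice = epsilon_greedy_action(q_values, (0, 0), actions, epsilon=0.0)  # always "right"
```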

This environment isn’t rich enough for exploration to help much. So: go to a different playground, where we can actually edit the environment and see what the agent learns.