Discussion 376.3: When Agents Go Wrong

LLM-powered agents can now browse the web, write and execute code, send emails, book appointments, and manage files. When they work, they’re impressive. When they fail, the consequences land on real people — and it’s not always clear who’s responsible.

This Discussion addresses the course objectives Overall-Impact and Overall-LLM-Failures, and connects to OG-LLM-Advanced.

Initial Post

Find a specific, documented case where an AI agent or AI-powered automation caused harm or failed in a consequential way.

If you can’t find a documented real case, you may construct a realistic hypothetical based on a system you’ve used or built — but label it as hypothetical and explain why you think it’s plausible.

In your post (~150-250 words):

  1. Describe what happened (or could happen). Be specific: what was the agent, what tools did it have access to, what went wrong, and who was affected?
  2. Identify the failure point. Was this a problem with the model (hallucination, misunderstanding instructions), the system design (insufficient guardrails, too much autonomy), the deployment context (wrong use case, missing human-in-the-loop), or something else?
  3. Who should be responsible? The developer? The deploying company? The user who trusted it? Make an argument.

Cite your source.

Replies

Reply to at least two classmates (~75-150 words each). Your replies should:

  1. Propose a concrete fix or mitigation for the failure they described. Be specific: not "add more testing," but what you'd test, what guardrail you'd add, or where you'd require human approval (a minimal code sketch after this list shows one such guardrail pattern).
  2. Engage with their responsibility argument. Do you agree with who they held responsible? Would a different framing (e.g., product liability, professional ethics, or a concept like stewardship or the common good) change the answer?
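
A concrete guardrail is easier to argue about if you can picture the code path it would take. Below is a minimal, hypothetical Python sketch of one common pattern for reply point 1: a human-approval gate in front of high-risk tool calls. The tool names (send_email, search_docs) and the approval logic are illustrative assumptions, not drawn from any particular agent framework.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Tool:
        name: str
        func: Callable[..., str]
        high_risk: bool  # destructive or externally visible actions

    def send_email(to: str, body: str) -> str:   # hypothetical tool
        return f"email sent to {to}"

    def search_docs(query: str) -> str:          # hypothetical tool
        return f"results for {query!r}"

    TOOLS = {
        "send_email": Tool("send_email", send_email, high_risk=True),
        "search_docs": Tool("search_docs", search_docs, high_risk=False),
    }

    def execute_tool_call(name: str, kwargs: dict) -> str:
        """Run a tool the agent requested, pausing for human approval on risky ones."""
        tool = TOOLS[name]
        if tool.high_risk:
            print(f"Agent wants to call {name} with {kwargs}")
            if input("Approve? [y/N] ").strip().lower() != "y":
                return "REFUSED: a human reviewer declined this action."
        return tool.func(**kwargs)

    if __name__ == "__main__":
        # The low-risk search runs immediately; the email only goes out if approved.
        print(execute_tool_call("search_docs", {"query": "refund policy"}))
        print(execute_tool_call("send_email", {"to": "customer@example.com",
                                               "body": "Your refund is approved."}))

The design choice worth debating in your reply is where the approval boundary sits: marking a whole tool as high-risk is blunt, and a stronger mitigation might inspect the arguments (amounts, recipients, file paths) before deciding whether a human needs to sign off.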

Rubric
