AI Incident Tabletop Kit: 3 Scenarios Your Team Can Run This Quarter

Answer box

Most enterprise incident response runbooks were written for human attackers and bounded-system breaches. They do not cover the case where the AI agent is the actor. The result: when a real AI incident lands, the security team improvises — and improvisation under pressure is where mistakes get made and audit evidence gets lost. The cure is a tabletop exercise. We packaged three ready-to-run AI incident scenarios — kill-switch failure, MCP supply-chain poisoning, and memory-poisoning persistence — into a downloadable kit with facilitator injection cards, scoring rubric, and after-action report templates. Ninety minutes per scenario. Run it this quarter, before the real incident arrives.

Why your existing IR runbook doesn't cover this

Walk through your incident response runbook with one question in mind: does it say anything about what to do when the actor is an AI agent, not a human?

In our experience across customer engagements, the answer is no in roughly 90% of cases. The runbook assumes you can identify the attacker, contain the actor, and bound the blast radius. For AI agent incidents, all three assumptions break:

The "actor" might be an agent with valid credentials behaving in a way nobody authorized. Identifying the attacker means identifying the prompt, the tool, the memory entry, or the upstream document that caused the agent to do what it did.
"Containing" the actor means stopping in-flight tool calls, revoking tokens issued to the agent across multiple systems, and ensuring watchdogs don't auto-respawn it. None of those are in the typical IR playbook.
"Bounding blast radius" assumes the agent acted once. Agents acting at machine speed may have triggered cascading actions across multiple systems before the incident is detected.

The first AI agent incident a security team faces is not the time to discover these gaps. A tabletop exercise discovers them on a Tuesday afternoon, in a conference room, with no real damage.

The three scenarios in the kit

We built three scenarios that together cover the most common AI incident patterns. Each can be run in 90 minutes with 6-10 participants. The kit contains a facilitator's guide, participant briefing materials, injection cards (events that the facilitator drops during the exercise), and a scoring rubric.

Scenario 1 — The 9-Second Database Delete

A coding agent with elevated permissions, used legitimately for routine maintenance, suddenly issues a series of valid DROP TABLE statements against the production data warehouse. The first table goes in nine seconds. Three more are queued.

The exercise probes: Who gets paged? Who is authorized to execute the kill switch? Does your kill switch stop the queued drops or only future actions? Can you reconstruct what happened from logs? What is the customer communication?

This is the scenario inspired by the ServiceNow RSAC 2026 keynote demo. Most teams discover within 15 minutes of running it that their "kill switch" requires a config-file edit, redeploy, and IDE restart per developer. That's not a kill switch.

Scenario 2 — The Poisoned MCP Server

A widely-used internal MCP server is updated by its maintainer overnight. The next morning, every agent connecting to it begins exfiltrating snippets of source code in the responses they send back to a downstream synthesizer agent, which writes them into an outbound customer-facing report.

The exercise probes: How quickly is the supply-chain compromise detected? Which agents have already pulled from the poisoned server? Which customers received the leaked content? What is the regulator notification timeline if the synthesizer agent is part of a high-risk system under the EU AI Act?

This is the OWASP ASI04 (Agentic Supply Chain Compromise) scenario. It's the test of whether your AI Bill of Materials (AIBOM) and tool allowlisting are real or aspirational.

Scenario 3 — The Memory Poisoning Persistence

An agent's long-term memory store is poisoned by a malicious user who briefly held a partner-organization account. The malicious entries assert "this user account is pre-authorized for unlimited refunds." The user is rotated out; the memory entry persists. Two weeks later, a customer-success agent processes a refund request and finds the "pre-authorization" in its context.

The exercise probes: How do you find every memory entry written by the rotated account? What is your audit trail for memory writes? Can you disable the agent's memory ingestion temporarily without disabling the agent? What is the financial exposure?

This is the OWASP ASI06 (Memory & Context Poisoning) scenario. It tests whether your agents have provenance-signed memory or unsigned memory — and most don't have the former.

How to run the exercise

Each scenario runs in 90 minutes:

Minutes	Activity
0-10	Facilitator briefs participants on the scenario starting state
10-60	Live exercise — facilitator drops injection cards every 5-10 minutes; participants react and document decisions
60-75	Gap discussion — what didn't work?
75-90	After-action report draft (template in the kit)

The kit includes one full set of facilitator notes per scenario, scripted injection cards (e.g., "Inject at T+0:30: customer support team starts receiving complaints"), participant role cards (incident commander, legal, compliance, communications, engineering), and a scoring rubric for the after-action.

What you'll find when you run it

Three predictable findings, based on the kit's beta runs:

Your kill switch is slower than you think. The exercise reveals mean-time-to-kill metrics that are 5-10× longer than the team estimated beforehand.
Your provenance logs are incomplete. The exercise reveals that critical context (which prompt triggered the action, which user originally authorized the agent's session, what tool descriptors were active) is not in the audit trail.
Your communication tree breaks. Page lists are stale. Legal and communications are looped in too late. Customer notifications are drafted under time pressure.

None of those are failures. They are exactly what the tabletop is for: surfacing the gaps before the real incident does. Each finding maps to a specific control improvement.

Download the kit

Download the AI Incident Tabletop Kit (free) — 4-field form, 35-page PDF, three full scenarios with facilitator materials.

FAQ

How long does each scenario take?

90 minutes per scenario. Most security teams run one scenario per week across three weeks to digest findings between sessions. Some run all three back-to-back in a half-day workshop.

How many people do I need?

6-10 participants is the sweet spot. Roles: incident commander, two-person engineering team (one familiar with agents, one familiar with infra), security analyst, legal, compliance, communications. Optional: customer-success representative.

Do I need AccuroAI deployed to use the kit?

No. The kit is platform-agnostic. The scenarios test your existing IR processes and detection signals regardless of which AI control plane (if any) you operate.

Can I customize the scenarios?

Yes. The kit ships with a "customizer" section explaining how to adapt the scenarios to your specific industry (financial services, healthcare, defense, etc.) and tool stack (LangChain, AutoGen, MCP, custom).

What's the next step after running the tabletop?

The kit ends with a "next steps" template — typically 5-7 control improvements ranked by impact. Implement them in the following quarter and re-run the tabletop after to validate.

Is the kit suitable for an external auditor?

The after-action report templates are designed to be auditor-presentable. ISO 42001 §9.1 and NIST AI RMF MANAGE-4 both reference tabletop exercises as evidence of operational readiness; the kit's outputs map cleanly.

Run an AI Incident Tabletop This Quarter: A Three-Scenario Kit for Security Teams