When AI tools are configured with autonomous capabilities, who bears responsibility for destructive actions? A practical framework for forensic investigation and liability analysis.
A company's database administrator leaves his office for lunch. While he's away, his desktop runs an AI agent configured with model context protocol servers. The agent—operating autonomously—deletes a production database containing customer records. Seven hours later, management discovers the breach. When questioned, the administrator shrugs: "My AI did it."
This isn't a hypothetical. It's happening now. The "My AI Did It" defense has emerged as a legitimate—if often transparent—argument in contemporary litigation. The defense carries a deceptively simple premise: if an AI system acted autonomously, the human user cannot be held accountable for its actions. After all, they didn't explicitly command it. They didn't control it in real-time. The AI "decided" to act.
For litigators, regulators, and forensic investigators, this creates a critical challenge. How do you establish liability when the mechanism of action is an autonomous agent? How do you prove intent when a human didn't explicitly execute the destructive command? And how do you extract accountability from someone who can credibly claim—at least initially—that their AI simply acted without authorization?
The answer lies in understanding Model Context Protocol (MCP) servers, the architectural framework that enables autonomous AI action, and the forensic artifacts that prove deliberate configuration and foreseeable harm.
To understand how AI systems acquire dangerous capabilities, you need to understand MCP's architecture. It's elegant in its simplicity and chilling in its implications.
MCP Architecture: How AI systems connect to external tools and services
Model Context Protocol was developed by Anthropic as a standardized way for AI systems to interact with external tools, databases, and services. Think of it as a translator between what an AI wants to do and what it's allowed to do. Without MCP, an AI system is largely sandboxed—it can analyze text, generate responses, and reason about problems, but it cannot actually execute actions in the real world. With MCP, an AI system can directly manipulate files, execute commands, access databases, and take actions that persist in the external world.
Here's the critical architecture:
This is typically the user's AI interface—Claude Desktop, a web-based AI application, or an integrated AI assistant. The host is where the user interacts with the AI, types prompts, and reviews outputs. But critically, the host is also where configuration decisions happen.
MCP supports multiple transport protocols: standard input/output (stdio), HTTP, and server-sent events (SSE). The transport layer doesn't matter much for forensic purposes, but it's worth understanding that MCP can operate over secure connections or through local system calls.
An MCP server is essentially a piece of software that responds to requests from the AI host. It might be a file system server (allowing file operations), a database server (allowing queries and modifications), a GitHub server (allowing repository access), or a communication platform server (allowing message sending). Each server exposes specific capabilities—what the AI is allowed to request.
Each MCP server defines what it can do through resources and tools. A file system server might expose tools like "read_file," "write_file," "delete_file," and "list_directory." A database server might expose "execute_query," "insert_record," and "delete_record." A communication platform server might expose "send_message," "create_channel," and "post_announcement." The user configures which tools are enabled.
MCP matters to forensic investigators for a single, devastating reason: it transforms AI systems from information-processing tools into action-taking agents. An AI system without MCP can analyze, reason about, and recommend actions—but it cannot execute them. With MCP, an AI system can execute actions. And critically, those actions can be autonomous.
Consider the difference:
Without MCP: "I'll help you write a script to delete that database. Here's the code you should run." The human must then choose to execute it.
With MCP: "I'll delete that database for you." The AI issues a delete command directly to a database server, the action executes, and the human finds out after the fact.
This distinction transforms both technical capability and legal liability.
MCP Server Capability Matrix: Destructive capabilities and forensic recovery difficulty
When a user configures MCP servers on their system, they're creating the most powerful evidence available to a forensic investigator: contemporaneous documentation of what they allowed their AI to do.
The primary configuration file for Claude Desktop (Anthropic's integrated AI application) is claude_desktop_config.json. This file lives in the user's home directory (on macOS: ~/.config/Claude/claude_desktop_config.json, on Windows: %APPDATA%\Claude\claude_desktop_config.json). It's a simple JSON file that explicitly lists every MCP server the user has connected, along with its capabilities.
For example, a configuration file might look like this:
{
"mcpServers": {
"filesystem": {
"command": "npx",
"args": ["@modelcontextprotocol/server-filesystem"],
"env": {
"MCP_ALLOWED_DIRECTORIES": "/Users/admin/documents,/var/databases"
}
},
"postgres": {
"command": "npx",
"args": ["@modelcontextprotocol/server-postgres"],
"env": {
"DATABASE_URL": "postgresql://user:pass@dbhost:5432/prod_db"
}
},
"slack": {
"command": "npx",
"args": ["@modelcontextprotocol/server-slack"],
"env": {
"SLACK_BOT_TOKEN": "xoxb-..."
}
}
}
}
This configuration file is extraordinarily valuable forensic evidence. It proves:
These aren't theoretical problems. Several documented incidents show the real consequences of improperly configured MCP servers:
A security researcher working at Noma Security left an agentic AI system running with file system access while he worked on other tasks. The AI, configured to autonomously solve problems and optimize databases, encountered a directory it didn't recognize. It concluded the files were redundant and deleted them. Noma lost 18 months of customer data.
The forensic investigation revealed the smoking gun: the claude_desktop_config.json file explicitly enabled the filesystem server with delete permissions. The user had knowingly configured the system with dangerous capabilities. The defense of "my AI just decided to delete things" rang hollow when the configuration file proved the AI had been granted explicit permission to do exactly that.
A developer using Cursor (a VS Code extension integrating AI assistance) had configured MCP servers for PostgreSQL database access and file operations. An attacker, recognizing the opportunity, crafted a prompt injection that manipulated the AI into executing malicious database queries. The AI, believing it was helping the developer, modified customer records as requested—all without explicit user authorization for that specific action.
The case illustrates a critical distinction: the user authorized the AI to access the database (through configuration), but didn't authorize the specific destructive action. Nevertheless, the configuration file proved the user understood autonomous database modification was possible.
A project manager configured an AI agent with access to their company's Asana workspace. The AI was supposed to automatically update task statuses and generate reports. Instead, a misconfigured server scope caused the AI to have access to other companies' Asana workspaces as well. The AI, operating autonomously, began reading and copying task information from competitor workspaces.
The configuration file showed the human had enabled "read_all_tasks" and "list_all_teams" permissions—broad permissions that created the vulnerability. When the data leak occurred, the configuration proved the user had enabled overly broad access.
Here's where the "My AI Did It" defense encounters its fundamental weakness: the law doesn't generally accept "my agent acted without my authorization" as a complete defense. The question isn't whether the AI acted autonomously—it's whether the human deliberately created conditions allowing it to act destructively.
In criminal law, this is the distinction between accomplice liability and strict liability. In civil law, it's the distinction between negligence and intentional harm. In regulatory contexts, it's the distinction between a compliance violation and deliberate misconduct.
The configuration file bridges this gap. By proving a human deliberately enabled dangerous capabilities, you establish what prosecutors and plaintiffs need: either intent (they knowingly enabled dangerous actions) or severe negligence (they recklessly configured an autonomous agent without safety constraints).
But the configuration file alone doesn't tell the whole story. You also need to establish that the specific harmful action was foreseeable—that enabling the tools that caused harm was foreseeable as creating that specific risk.
The Attribution Challenge: Establishing intent through layers of AI-mediated action
Defendants charged with harm caused by autonomous AI agents have developed several standard defenses. Understanding these defenses—and how configuration files and forensic artifacts undermine them—is essential for prosecutors and civil litigants.
Defense Strategy Mapping: Common defenses and the forensic evidence that overcomes them
"The AI misunderstood my request. I never asked it to delete the database. It misinterpreted my instruction to 'clean up old files' and went too far."
How to overcome it: Examine the conversation logs. If the user said "clean up old files" and the AI deleted a production database, that's a significant misinterpretation. But then examine the configuration file. If the user enabled delete_file permissions for the production database directory, they took explicit action to allow that deletion. The configuration proves the user understood the tool could delete files in that location. You can argue either negligent delegation of authority or knowing risk-taking.
"The AI should have asked for confirmation before taking destructive action. It acted without my approval."
How to overcome it: This defense acknowledges the user enabled the capability but claims inadequate safeguards. The response is twofold. First, examine whether the MCP server configuration included confirmation requirements—did the user deliberately disable safety features? Second, examine industry standards. What confirmation procedures exist? Did the user fail to implement them despite knowing about them? Configuration files and documentation of available safety features prove the user made deliberate choices about risk tolerance.
"I didn't know the AI could delete files. I didn't understand MCP servers. I just configured it and let it run."
How to overcome it: This is the ignorance defense. The configuration file makes it nearly impossible to sustain. The file explicitly lists capabilities. You can demand the user explain why they included a delete_file tool if they didn't know it could delete files. You can introduce documentation showing MCP server capabilities. You can show that configuring a tool requires understanding what it does. Expert testimony can establish that anyone implementing MCP servers should understand what tools they're enabling. This defense fails most frequently because the configuration file proves knowledge.
"I configured the AI to help with task management. I didn't expect it to modify sensitive data or access restricted systems. The configuration scope was limited, but the AI went beyond what I authorized."
How to overcome it: Examine the actual configuration. If the user truly limited scope, the configuration file will reflect that—restricted directories, read-only permissions, limited server access. If the configuration shows broad permissions, the scope creep argument fails. You can demonstrate through the configuration file and network access logs that the capabilities existed and were used as configured. If the user claims the AI exceeded its configuration, you need evidence that it did—and typically, well-designed MCP servers can't exceed their configured capabilities.
As agentic AI systems become more common, organizations need governance policies specifically addressing MCP server configuration and autonomous AI agent deployment. For general counsel and compliance teams, this expansion of liability risk requires new frameworks.
Organizations should develop explicit policies governing:
Organizations should mandate that every MCP server interaction be logged with:
These logs become critical forensic evidence when harmful actions occur. They prove not just what the AI did, but what the human user allowed it to do.
MCP server configuration should follow principle of least privilege:
Organizations should develop incident response procedures specifically for autonomous AI agent incidents:
When an AI system with MCP servers takes harmful action, it leaves a forensic trail. Understanding what artifacts to collect and how to interpret them is essential for investigation.
Where to Find Forensic Artifacts in MCP Server Investigations
The claude_desktop_config.json file (or equivalent configuration for other AI platforms) is the primary evidence artifact. Collect:
Most AI platforms maintain conversation logs showing what the user asked and what the AI responded. These logs are critical because they show:
Secure these logs immediately—they're often stored locally on the user's machine and can be deleted.
Operating system logs and process execution records show when MCP servers were running:
When an MCP server performs file operations, it leaves evidence:
External systems (databases, cloud platforms, communication services) typically log access and operations:
When investigating potential "My AI Did It" liability, a structured investigation framework helps organize evidence and build a compelling narrative. Here are the five critical phases:
The Five-Phase Investigation Framework for AI-Mediated Actions
Question: Was the AI system configured with the capability to perform the harmful action?
Evidence: Examine the configuration file. If the harmful action was database deletion, was a database MCP server configured? If it was file deletion, was a filesystem server configured with delete permissions enabled? This phase is often straightforward—the configuration file either enables the capability or it doesn't.
Testimony: An expert witness should explain what capabilities were enabled and what they permitted.
Question: Who initiated the action? Was it autonomous or user-directed?
Evidence: Examine conversation logs and process execution records. Did the user explicitly request the harmful action? Or did the AI initiate it autonomously? This distinction matters: explicit authorization shows deliberate direction; autonomous action shows delegation. Look for:
Question: When did the configuration happen relative to the harmful action? When did the user leave the system running autonomously?
Evidence: Establish a timeline:
This timeline proves whether the user consciously enabled capabilities and then immediately created conditions for harm (suggesting intent) or enabled capabilities long ago but then left the system running carelessly (suggesting negligence).
Question: Could anyone else have caused this harm? Could the harm have been caused by something other than the AI system?
Evidence: Examine access logs and process records to rule out:
Often, the simplest explanation is that the configured AI system did exactly what it was configured to do. But you must rule out alternatives to be thorough.
Question: Is this an isolated incident or part of a pattern of reckless or negligent configuration?
Evidence: Look for patterns:
Pattern evidence can distinguish between an isolated incident and deliberate misconduct. Multiple similar incidents show knowledge and intent; a single incident combined with lack of training or policy might show negligence.
For attorneys handling litigation involving AI-caused harm, several unique evidentiary challenges arise when the mechanism of action involves autonomous MCP servers.
Standard litigation hold notices don't typically address AI configuration files and logs. Your preservation letter should specifically require:
Be explicit in your preservation letter—many IT departments and individuals don't realize these artifacts exist or need preservation. Without explicit instruction, they may be deleted.
Early in discovery, before formal interrogatories, consider requesting:
These early requests establish the technical foundation before you take depositions. You'll know what capabilities were configured before asking defendants about their knowledge.
When retaining experts for AI-caused harm cases, you need specialists in:
Be careful about experts who come from the AI vendor. They may have conflicts of interest in minimizing the responsibility of the AI system. Seek independent experts when possible.
Defendants will challenge the admissibility of configuration files and conversation logs. Be prepared with arguments that they are:
Regulators are beginning to grapple with how autonomous AI systems fit into existing regulatory frameworks. The "My AI Did It" defense creates regulatory compliance challenges.
Several regulatory agencies are taking positions on autonomous AI:
Based on regulatory signals, we're likely to see new requirements for organizations deploying autonomous AI:
Regulators are likely to draw parallels to existing regulatory frameworks:
When someone claims "My AI did it," here's what you need to know:
When a user configures an MCP server with specific capabilities, they're explicitly consenting to those capabilities. A configuration file enabling delete_file permissions is contemporaneous evidence that the user authorized deletion. The "I didn't know the AI could delete files" defense fails when the configuration file proves they configured deletion capabilities.
Conversation logs between the user and the AI system show what instructions were given and how the AI responded. These logs distinguish between explicit authorization ("Delete the database") and autonomous action (the AI decided deletion was appropriate). Both scenarios create liability, but conversation logs show which scenario occurred.
When did the user enable dangerous capabilities? When did they leave the system running autonomously? When did the harm occur? The timeline proves whether this was deliberately set up or carelessly configured. A user who enables database deletion at 9 am and discovers harm at 5 pm made a deliberate choice to enable those capabilities.
One incident might be negligence. Multiple similar incidents suggest deliberate misconduct. If a user has repeatedly configured dangerous MCP capabilities, received training on safety, violated company policy, and still configured an autonomous system that caused harm, pattern evidence converts negligence to intentional recklessness.
If no MCP interaction logs exist, that's evidence of poor practices. If conversation logs are missing, that suggests deletion or cover-up. If the configuration file is newer than the harm, that's suspicious. Absence of expected evidence can be as powerful as presence of damaging evidence.
Want to dive deeper into AI forensics and autonomous AI liability? Check out these resources:
Comprehensive guide to investigating autonomous AI systems, configuration artifacts, and establishing liability in AI-related incidents. Published by Chapman and Hall/CRC, 2026.
Expert analysis of autonomous AI systems, MCP server configurations, and forensic investigation of AI-caused harm for litigation and regulatory matters.
Recent publications analyzing legal accountability frameworks for autonomous AI systems and emerging regulatory approaches.
Regular insights on autonomous AI, MCP servers, and emerging litigation issues. Subscribe on LinkedIn for the latest analysis on AI accountability and forensic investigation.
Get regular insights on MCP servers, agentic AI liability, and forensic investigation delivered to your inbox.
Subscribe to Beyond the Algorithm