iSteps™: The Building Blocks of Autonomous Investigations

Summary
Investigation is the real bottleneck in today’s SOC. Playbooks are too rigid, and raw LLM agents are too unreliable. At Embed, we built a third path: Investigation Steps (iSteps), which are structured investigative building blocks that:
- gather evidence
- break questions into sub-questions
- produce reasoned conclusions with a full chain of evidence
The result is AI-driven investigation that analysts can trust because every conclusion is transparent and auditable. This is Part 1 of a two-part series. Here, we cover why the current approaches to security automation (SOAR playbooks and unconstrained LLM agents) fall short of what investigation actually demands. In Part 2, we go under the hood of iSteps to show how they deliver the reliability and transparency that neither playbooks nor raw LLMs can.
The Investigation Gap
SOCs have a lot of problems. A shortage of alerts isn’t typically one of them. What they lack is the capacity to investigate the ones that matter.
The space between “an alert fired” and “here’s what actually happened” is where the real work lives. It’s cognitive, analytical, and context-heavy. An analyst has to pull data from multiple systems, ask follow-up questions, test hypotheses, and synthesize findings into a judgment call. That process is what we mean by investigation, and it’s the bottleneck that determines whether a SOC is effective.
The tools on either side of this gap haven’t solved it. Raw alerting generates the work. SOAR playbooks tried to automate it, but couldn’t make the complexity manageable. And the latest wave of LLM-based agents promises flexibility but introduces new failure modes.
At Embed Security, we’ve built an agentic AI system that autonomously investigates security alerts. The system has multiple components that work together to plan, execute, and synthesize investigations. A critical piece of what makes it reliable, transparent, and accurate is a concept we call Investigation Steps, or iSteps. This two-part series explains what they are, how they work, and why the design choices behind them matter.
Where Playbooks Break Down
SOAR playbooks encode investigation logic as if/else decision trees. For a given alert type, someone defines: look up this IP, check this user, query this log source, branch based on the result. You define the logic once and let the system run it forever.
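To make the rigidity concrete, here is a minimal sketch of what that decision tree looks like as code. The alert fields, helper functions, and thresholds are hypothetical stand-ins for whatever a real SOAR platform would wire up to vendor APIs; this is an illustration, not any particular product’s playbook.

```python
# Hypothetical playbook for a "suspicious login" alert. The stubs below stand in
# for the vendor API calls a real SOAR platform would make.

def lookup_ip_reputation(ip: str) -> int:
    """Stub for a threat-intel lookup; returns a 0-100 reputation score."""
    return 0

def recent_login_countries(user: str, hours: int = 24) -> set[str]:
    """Stub for a SIEM query returning countries seen in this user's recent logins."""
    return set()

def suspicious_login_playbook(alert: dict) -> str:
    # Branch 1: the source IP is known-bad
    if lookup_ip_reputation(alert["source_ip"]) > 80:
        return "malicious"
    # Branch 2: service accounts are expected to log in from automation hosts
    if alert["user"].endswith(".svc"):
        return "benign_expected"
    # Branch 3: logins from multiple countries within 24 hours suggests impossible travel
    if len(recent_login_countries(alert["user"])) > 1:
        return "suspicious_travel"
    # Nothing matched: punt to a human
    return "needs_human_review"

print(suspicious_login_playbook({"source_ip": "203.0.113.7", "user": "jdoe"}))
```

Every branch is a choice someone made in advance. Anything the author didn’t anticipate falls through to the default or, worse, down the wrong branch.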
In practice, this approach has the same fundamental limitation that expert systems hit in the 1980s. That parallel is worth taking seriously. Expert systems worked well enough when the domain knowledge was crisp and stable; clear-cut rules suit well-defined situations. But most real-world knowledge isn’t like that. It’s fuzzy, contextual, and constantly shifting. Expert systems struggled with that, and the maintenance burden made it worse: the cost of extracting, codifying, and updating expert reasoning eventually overwhelmed the value the systems produced. SOAR playbooks are replaying the same failure mode today. Investigation logic is full of gray areas and judgment calls that don’t reduce cleanly to if/else branches, and the threat landscape shifts fast enough that even the rules that do work need constant upkeep. (I wrote about this parallel in more detail in a previous blog post, The Evolution of Security Automation.)
Every edge case, every vendor API change, every new attacker technique requires someone to go back and update the tree. The result is that playbooks are expensive to build and maintain. And they still can’t reason. When the real world deviates from what was codified, playbooks either break or produce misleading results. Security teams have lived this firsthand.
Why “Just Use an LLM” Isn’t the Answer Either
At the other end of the spectrum: give an AI agent access to your security tools and tell it to investigate.
It’s appealing in theory, but in practice it amounts to handing a junior analyst every tool in your stack with no runbook and no supervision. They’ll do something, and it’ll look like an investigation, but you won’t be confident in the result. The underlying reason is something the AI community has long understood: abstraction and hierarchy matter for agent performance. If you give an agent only low-level primitives (make this API call, parse this field, check this list), it has to reliably sequence dozens of small operations into a coherent investigation. Each additional tool or operation multiplies the decision complexity at every step, so the space of possible action sequences grows combinatorially. As I discussed in Why AI SOC Agents Fail, this makes plans fragile: reasoning compounds errors, and outputs become less predictable.
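To put a rough number on “combinatorially,” here is a back-of-the-envelope sketch. The tool names and the 20-step figure are illustrative assumptions, not measurements of any particular agent.

```python
# A rough illustration of the sequencing burden, not a measurement of any real agent.
# Suppose an agent is handed only low-level primitives like these (hypothetical names):
primitive_tools = [
    "get_alert_fields", "lookup_ip", "lookup_domain", "lookup_hash",
    "query_siem", "get_user_details", "list_user_logins", "check_allowlist",
    "parse_email_headers", "extract_urls", "detonate_file", "get_asset_owner",
]

# If a thorough investigation needs roughly 20 tool calls and the agent chooses
# among 12 primitives at every step, the space of possible action sequences is:
steps = 20
sequences = len(primitive_tools) ** steps
print(f"{len(primitive_tools)} tools, {steps} steps: {sequences:.2e} possible sequences")
# prints roughly 3.83e+21; every extra tool or step multiplies it further
```

The agent never enumerates that space explicitly, of course, but it does have to choose well at every one of those steps, which is exactly where plans turn fragile and errors compound.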
The failure mode here is subtle. The investigation looks reasonable: the language is fluent and the structure seems logical. But critical steps can get skipped, and because there’s no structure to audit, you can’t easily tell when things have gone wrong. A confident-sounding paragraph about why an alert is benign isn’t the same as a traceable chain of evidence supporting that conclusion.
What’s needed is something between rigid playbooks and unconstrained LLM agents: structured enough to be reliable and transparent, but intelligent enough to reason dynamically. That’s what iSteps are. Get the in-depth details about iSteps in Part 2 of this series.
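In the meantime, to make “structured” concrete, here is a rough, hypothetical sketch of the kind of record an investigation step could produce. The field names and schema are illustrative assumptions, not Embed’s actual design.

```python
# Hypothetical shape of a structured, auditable investigation step.
# Field names are illustrative only; Part 2 covers the real design.
from dataclasses import dataclass, field

@dataclass
class InvestigationStep:
    question: str                                        # what this step set out to answer
    evidence: list[dict] = field(default_factory=list)   # data gathered, with its source
    sub_questions: list["InvestigationStep"] = field(default_factory=list)  # follow-ups spawned
    conclusion: str = ""                                  # the reasoned answer
    reasoning: str = ""                                   # why the evidence supports it

step = InvestigationStep(
    question="Is the source IP tied to known malicious infrastructure?",
    evidence=[{"source": "threat_intel", "ip": "203.0.113.7", "reputation_score": 12}],
    conclusion="No: no blocklist hits and a low reputation score.",
    reasoning="Reputation score 12/100 from threat intel and zero blocklist matches.",
)
print(step.conclusion)
```

The point of the structure is the audit trail: the question, the evidence, and the reasoning live together, so a conclusion can be checked rather than taken on faith.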
Want to see how Embed works as the decision layer for your SOC? Request a demo.


