How Middleware Lets You Customize Your Agent Harness
TL;DR
Agent middleware gives developers precise control over logging, retries, guardrails, and tool behavior by hooking into defined lifecycle points. Here is how LangChain designed it.
Agent harnesses are frameworks that manage how an AI model uses tools to complete tasks. They handle the orchestration loop: feeding context to the model, executing tool calls, and looping until the task is done.
Deep Agents is LangChain's open source, model-agnostic agent harness. It is the core of products like Fleet and Open SWE. As the team built more complex agents, they needed a way to let developers customize behavior without forking or patching the core library. That led to the design of agent middleware.
The basic agent loop handled by a harness: request, model, tools, result.
In this post
- 01What are agent harnesses
- 02What is agent middleware
- 03Lifecycle hooks in Deep Agents
- 04Building with AgentMiddleware
- 05Example: logging middleware
- 06Example: input guardrails
What are agent harnesses
An agent harness abstracts the execution loop so you can focus on the task rather than the plumbing. You define a model, a set of tools, and a system prompt. The harness handles everything else: formatting tool schemas, parsing model output, routing tool calls, managing context, and looping until the agent decides it is done.
With Deep Agents, creating an agent is a single function call. The harness takes care of the rest.
from deep_agents import create_agent
agent = create_agent(
model="anthropic:claude-sonnet-4-6",
tools=[read_file, write_file, run_command],
system="You are a coding assistant.",
)
result = await agent.run("Fix the bug in utils.py")That simplicity is great until you need to add cross-cutting behavior: logging every tool call, enforcing token budgets, validating inputs before they reach the model, or injecting retry logic around flaky tools. Those requirements do not belong in your tools or your system prompt. They belong in the infrastructure layer.
What is agent middleware
Agent middleware is a system for intercepting and customizing behavior at defined points in the agent lifecycle. It borrows from HTTP middleware patterns: handlers are composed in a chain, each one able to inspect, modify, or short-circuit the request before passing it along.
The key insight is that the agent lifecycle has natural seams where intervention is useful. Before the first model call, you might want to validate the input or add context. Before each tool call, you might want to log it or enforce a rate limit. After each model response, you might want to check for hallucinations or track token usage.
The six lifecycle hooks in Deep Agents middleware: before_agent, before_model, wrap_model_call, wrap_tool_call, after_model, after_agent.
Lifecycle hooks in Deep Agents
Deep Agents exposes six hooks in the middleware interface. Each one corresponds to a meaningful transition in the agent lifecycle.
- before_agent: Called once when the agent starts. Use this to inject context, set up state, or validate the initial input.
- before_model: Called before every model invocation. Use this to modify the message history, add system context, or enforce token limits.
- wrap_model_call: Wraps the actual model API call. Use this for retries, latency tracking, or swapping in a different model under certain conditions.
- wrap_tool_call: Wraps each tool execution. Use this for logging, input validation, output caching, or rate limiting.
- after_model: Called after every model response. Use this to inspect tool call decisions, track token usage, or detect anomalies.
- after_agent: Called once when the agent finishes. Use this to log the full trace, persist results, or send notifications.
Not every hook needs to be implemented. Define only the ones relevant to your use case. The rest pass through without overhead.
Building with AgentMiddleware
The AgentMiddleware class is the base interface. Subclass it and override the hooks you need. Then pass your middleware instances when creating an agent.
from deep_agents import create_agent, AgentMiddleware
class MyMiddleware(AgentMiddleware):
async def before_agent(self, context):
# runs once at the start
pass
async def wrap_tool_call(self, tool_name, tool_input, call_next):
# runs around every tool execution
result = await call_next(tool_input)
return result
agent = create_agent(
model="anthropic:claude-sonnet-4-6",
tools=[read_file, write_file],
middleware=[MyMiddleware()],
)The wrap_* hooks use a call_next pattern similar to ASGI middleware. You receive the inputs, can modify them, call the underlying function, and then inspect or modify the output before returning. This makes it easy to compose multiple middleware instances: each one wraps the next in the chain.
Example: logging middleware
The most common use case is observability. Here is a minimal middleware that logs every tool call with its inputs and outputs.
import time
import logging
from deep_agents import AgentMiddleware
logger = logging.getLogger(__name__)
class LoggingMiddleware(AgentMiddleware):
async def wrap_tool_call(self, tool_name, tool_input, call_next):
start = time.monotonic()
logger.info(f"Tool call: {tool_name}", extra={"input": tool_input})
result = await call_next(tool_input)
elapsed = time.monotonic() - start
logger.info(
f"Tool result: {tool_name} ({elapsed:.2f}s)",
extra={"output": result},
)
return result
async def after_agent(self, context, result):
logger.info(
"Agent finished",
extra={
"total_turns": context.turn_count,
"total_tokens": context.token_usage,
},
)Example: input guardrails
Another practical use is enforcing guardrails before the model sees the input. This is useful for multi-tenant systems where you want to prevent prompt injection or ensure the request stays within a defined scope.
from deep_agents import AgentMiddleware
BLOCKED_PATTERNS = ["ignore previous instructions", "system prompt"]
class InputGuardrailMiddleware(AgentMiddleware):
async def before_agent(self, context):
user_input = context.messages[-1].content.lower()
for pattern in BLOCKED_PATTERNS:
if pattern in user_input:
raise ValueError(f"Input blocked by guardrail: '{pattern}' detected")
async def wrap_tool_call(self, tool_name, tool_input, call_next):
allowed_tools = {"read_file", "list_directory"}
if tool_name not in allowed_tools:
raise PermissionError(f"Tool '{tool_name}' is not permitted in this context")
return await call_next(tool_input)Because these checks happen in the middleware layer rather than in individual tools, they apply consistently across every tool call the agent makes. Adding a new tool does not require remembering to add the guardrail check.
Middleware composes well. Stacking LoggingMiddleware and InputGuardrailMiddleware means every tool call is both logged and validated, with no changes to the agent or tool code.
Why this matters for production agents
Most agent reliability problems are cross-cutting: they affect every tool call or every model invocation, not just one specific path. Middleware is the right abstraction for cross-cutting concerns because it keeps that logic centralized and composable.
The alternative is to scatter observability, retry, and validation logic across every tool definition. That approach leads to duplication, inconsistency, and tools that are harder to test in isolation.
The middleware design in Deep Agents is open source. The AgentMiddleware class, all six hooks, and several reference implementations are available in the repository. It is designed to be general enough to work with any model and any tool set, not just the ones LangChain builds.
- Deep Agents middleware documentation: langchain-ai.github.io/deep-agents/middleware
- LangChain thread on agent middleware design: x.com/LangChainAI