Agent EngineeringMarch 202610 min read

How Middleware Lets You Customize Your Agent Harness

TL;DR

Agent middleware gives developers precise control over logging, retries, guardrails, and tool behavior by hooking into defined lifecycle points. Here is how LangChain designed it.

Agent harnesses are frameworks that manage how an AI model uses tools to complete tasks. They handle the orchestration loop: feeding context to the model, executing tool calls, and looping until the task is done.

Deep Agents is LangChain's open source, model-agnostic agent harness. It is the core of products like Fleet and Open SWE. As the team built more complex agents, they needed a way to let developers customize behavior without forking or patching the core library. That led to the design of agent middleware.

Diagram showing a simple agent loop: request goes to model, model calls tools, tools return observations, result is returned

The basic agent loop handled by a harness: request, model, tools, result.

In this post

01What are agent harnesses
02What is agent middleware
03Lifecycle hooks in Deep Agents
04Building with AgentMiddleware
05Example: logging middleware
06Example: input guardrails

What are agent harnesses

An agent harness abstracts the execution loop so you can focus on the task rather than the plumbing. You define a model, a set of tools, and a system prompt. The harness handles everything else: formatting tool schemas, parsing model output, routing tool calls, managing context, and looping until the agent decides it is done.

With Deep Agents, creating an agent is a single function call. The harness takes care of the rest.

python

from deep_agents import create_agent

agent = create_agent(
    model="anthropic:claude-sonnet-4-6",
    tools=[read_file, write_file, run_command],
    system="You are a coding assistant.",
)

result = await agent.run("Fix the bug in utils.py")

That simplicity is great until you need to add cross-cutting behavior: logging every tool call, enforcing token budgets, validating inputs before they reach the model, or injecting retry logic around flaky tools. Those requirements do not belong in your tools or your system prompt. They belong in the infrastructure layer.

What is agent middleware

Agent middleware is a system for intercepting and customizing behavior at defined points in the agent lifecycle. It borrows from HTTP middleware patterns: handlers are composed in a chain, each one able to inspect, modify, or short-circuit the request before passing it along.

The key insight is that the agent lifecycle has natural seams where intervention is useful. Before the first model call, you might want to validate the input or add context. Before each tool call, you might want to log it or enforce a rate limit. After each model response, you might want to check for hallucinations or track token usage.

Agent middleware lifecycle diagram showing before_agent, before_model, wrap_model_call, wrap_tool_call, after_model, and after_agent hooks around the agent loop

The six lifecycle hooks in Deep Agents middleware: before_agent, before_model, wrap_model_call, wrap_tool_call, after_model, after_agent.

Lifecycle hooks in Deep Agents

Deep Agents exposes six hooks in the middleware interface. Each one corresponds to a meaningful transition in the agent lifecycle.

before_agent: Called once when the agent starts. Use this to inject context, set up state, or validate the initial input.
before_model: Called before every model invocation. Use this to modify the message history, add system context, or enforce token limits.
wrap_model_call: Wraps the actual model API call. Use this for retries, latency tracking, or swapping in a different model under certain conditions.
wrap_tool_call: Wraps each tool execution. Use this for logging, input validation, output caching, or rate limiting.
after_model: Called after every model response. Use this to inspect tool call decisions, track token usage, or detect anomalies.
after_agent: Called once when the agent finishes. Use this to log the full trace, persist results, or send notifications.

Not every hook needs to be implemented. Define only the ones relevant to your use case. The rest pass through without overhead.

Building with AgentMiddleware

The AgentMiddleware class is the base interface. Subclass it and override the hooks you need. Then pass your middleware instances when creating an agent.

python

from deep_agents import create_agent, AgentMiddleware

class MyMiddleware(AgentMiddleware):
    async def before_agent(self, context):
        # runs once at the start
        pass

    async def wrap_tool_call(self, tool_name, tool_input, call_next):
        # runs around every tool execution
        result = await call_next(tool_input)
        return result

agent = create_agent(
    model="anthropic:claude-sonnet-4-6",
    tools=[read_file, write_file],
    middleware=[MyMiddleware()],
)

The wrap_* hooks use a call_next pattern similar to ASGI middleware. You receive the inputs, can modify them, call the underlying function, and then inspect or modify the output before returning. This makes it easy to compose multiple middleware instances: each one wraps the next in the chain.

Example: logging middleware

The most common use case is observability. Here is a minimal middleware that logs every tool call with its inputs and outputs.

python

import time
import logging
from deep_agents import AgentMiddleware

logger = logging.getLogger(__name__)

class LoggingMiddleware(AgentMiddleware):
    async def wrap_tool_call(self, tool_name, tool_input, call_next):
        start = time.monotonic()
        logger.info(f"Tool call: {tool_name}", extra={"input": tool_input})

        result = await call_next(tool_input)

        elapsed = time.monotonic() - start
        logger.info(
            f"Tool result: {tool_name} ({elapsed:.2f}s)",
            extra={"output": result},
        )
        return result

    async def after_agent(self, context, result):
        logger.info(
            "Agent finished",
            extra={
                "total_turns": context.turn_count,
                "total_tokens": context.token_usage,
            },
        )

Example: input guardrails

Another practical use is enforcing guardrails before the model sees the input. This is useful for multi-tenant systems where you want to prevent prompt injection or ensure the request stays within a defined scope.

python

from deep_agents import AgentMiddleware

BLOCKED_PATTERNS = ["ignore previous instructions", "system prompt"]

class InputGuardrailMiddleware(AgentMiddleware):
    async def before_agent(self, context):
        user_input = context.messages[-1].content.lower()

        for pattern in BLOCKED_PATTERNS:
            if pattern in user_input:
                raise ValueError(f"Input blocked by guardrail: '{pattern}' detected")

    async def wrap_tool_call(self, tool_name, tool_input, call_next):
        allowed_tools = {"read_file", "list_directory"}

        if tool_name not in allowed_tools:
            raise PermissionError(f"Tool '{tool_name}' is not permitted in this context")

        return await call_next(tool_input)

Because these checks happen in the middleware layer rather than in individual tools, they apply consistently across every tool call the agent makes. Adding a new tool does not require remembering to add the guardrail check.

Middleware composes well. Stacking LoggingMiddleware and InputGuardrailMiddleware means every tool call is both logged and validated, with no changes to the agent or tool code.

Why this matters for production agents

Most agent reliability problems are cross-cutting: they affect every tool call or every model invocation, not just one specific path. Middleware is the right abstraction for cross-cutting concerns because it keeps that logic centralized and composable.

The alternative is to scatter observability, retry, and validation logic across every tool definition. That approach leads to duplication, inconsistency, and tools that are harder to test in isolation.

The middleware design in Deep Agents is open source. The AgentMiddleware class, all six hooks, and several reference implementations are available in the repository. It is designed to be general enough to work with any model and any tool set, not just the ones LangChain builds.

Deep Agents middleware documentation: langchain-ai.github.io/deep-agents/middleware
LangChain thread on agent middleware design: x.com/LangChainAI

How Middleware Lets You Customize Your Agent Harness

What are agent harnesses

What is agent middleware

Lifecycle hooks in Deep Agents

Building with AgentMiddleware

Example: logging middleware

Example: input guardrails

Why this matters for production agents

How LangChain Builds Evals for Deep Agents

Designing Multi-Agent Systems That Stay Coherent