The Agentic Journal

Writing

What I learn building agents, and why it matters

Agent Evaluation · March 2026 · 12 min read

How LangChain Builds Evals for Deep Agents

The best agent evals directly measure agent behavior that matters. Here is how the LangChain team sources data, creates metrics, and runs well-scoped, targeted experiments over time to make agents more accurate and reliable.

Read Post

Agent Engineering · March 2026 · 10 min read

How Middleware Lets You Customize Your Agent Harness

LangChain built a middleware system for Deep Agents that lets developers intercept and customize any part of the agent lifecycle, from before the first model call to after the final result, without modifying core agent logic.

Read Post

TL;DR

Agent middleware gives developers precise control over logging, retries, guardrails, and tool behavior by hooking into defined lifecycle points. Here is how LangChain designed it.

Agent Architecture · 9 min read

Designing Multi-Agent Systems That Stay Coherent

When agents delegate to other agents, state management becomes the hardest problem to solve. Here is how I think about orchestration, trajectory coherence, and keeping multi-agent systems from unraveling.

Coming Soon

Agent Engineering · 7 min read

Context Window Management: The Underrated Skill

Most production agent failures trace back to context management: truncated history, lost tool results, or prompts that grow until the model loses the thread. Here is the approach I use.

Coming Soon

Production AI · 10 min read

From Prototype to Production: What Actually Breaks

Demos are easy. Production is where agent systems face real users, edge cases, and failure modes no benchmark prepared you for. Here is what I have learned shipping agents to production.

Coming Soon