The Agentic Journal
Writing
What I learn building agents, and why it matters
How LangChain Builds Evals for Deep Agents
The best agent evals directly measure agent behavior that matters. Here is how the LangChain team sources data, creates metrics, and runs well-scoped, targeted experiments over time to make agents more accurate and reliable.
How Middleware Lets You Customize Your Agent Harness
LangChain built a middleware system for Deep Agents that lets developers intercept and customize any part of the agent lifecycle, from before the first model call to after the final result, without modifying core agent logic.
TL;DR
Agent middleware gives developers precise control over logging, retries, guardrails, and tool behavior by hooking into defined lifecycle points. Here is how LangChain designed it.
03
Designing Multi-Agent Systems That Stay Coherent
When agents delegate to other agents, state management becomes the hardest problem to solve. Here is how I think about orchestration, trajectory coherence, and keeping multi-agent systems from unraveling.
04
Context Window Management: The Underrated Skill
Most production agent failures trace back to context management: truncated history, lost tool results, or prompts that grow until the model loses the thread. Here is the approach I use.
05
From Prototype to Production: What Actually Breaks
Demos are easy. Production is where agent systems face real users, edge cases, and failure modes no benchmark prepared you for. Here is what I have learned shipping agents to production.