Essays & notes
Notes on AI, systems, and software that works in the real world.
Writing on decision frontiers, LLM agents in production, and accidental financial software.
Latest writing
All posts
-
Four boundaries for an agent with access to your books
An agent that reconciles invoices and posts to your ledger holds the keys to your bank. The four boundaries that decide whether a prompt injection can move money.
-
Your accounting agent isn't incapable, it's unreliable
Aggregate pass@1 falls from 76% to 52% as tasks get longer. In an agent that posts to your ledger, that gap is mis-stated entries. Capability isn't reliability.
-
Three parts, one agent: Phil Schmid's 2026 stack
In 22 days, Phil Schmid published three deep dives (skills, MCP, subagents) that, read together, form the unwritten manual for building a serious agent in 2026.
-
Three proofs the harness matters as much as the model
Firefox found 13x more bugs, Zenith won 5 of 8 tasks at 43% cost, and Eugene Yan shared his workflow. The signal: invest in your harness, not the model.