[ infer ] inference.

Field notes.

Working notes from the problems we’re in the middle of. Agentic AI for corporate use cases, written by the people doing the build, not the people selling the deck.

No vendor pitches. No conference keynote framing. Every citation a real link you can click. Every claim something we’ve seen, shipped, or broken ourselves.

n. 01 14 min

The Jammed-In Agent

OpenClaw, Paperclip, Hermes Agent and the 2026 wave have shifted the conversation. Most of them still treat context, memory and tool permissions as things to bolt on at the end.

A new generation of agents has pushed the field forward on planning, durable execution and procedural memory. The context layer underneath most of them is still held together with glue. This is a walk through what the 2026 agents got right, where their architecture still falls short, and what properly engineered agents look like.

17 apr 2026

n. 02 14 min

The Vibe-Coded Attack Surface

The cost of shipping a web app has collapsed. The cost of attacking one is collapsing next. 2026 is the year those curves cross.

Autonomous agents are already topping HackerOne leaderboards. AI app builders are already shipping the same authentication bug across hundreds of live apps at once. When the attacker cost curve crosses the builder cost curve, the middle of the market gets reshaped in a quarter.

17 apr 2026

n. 03 13 min

The Forty-Dollar Invoice

What AP automation actually costs you before the agents arrive, and why the ROI story is usually three layers deeper than the deck suggests.

Most CFOs count the labour line on their AP stack. That is the smallest number. The real cost is rework, late-payment penalties, missed early-payment discounts, duplicate payments, invoice fraud, and Month 13 cleanup. Agentic AP does not just reduce labour. It collapses the whole stack. But only if deployed with a real threat model.

16 apr 2026

n. 04 13 min

The RAG Demo Tax

Why your proof-of-concept works, your production system doesn't, and the gap between them has a name.

RAG demos work because they sit on small, clean corpora with one user and a generous latency budget. Production RAG is a different engineering problem. Teams discover this around month four, when accuracy quietly collapses. Here is why, and what the teams who get it right actually build.

15 apr 2026

n. 05 13 min

Prompt Injection Is Not a Sidebar

It is the threat model. For any LLM system with tool use or untrusted input, the filter-based mental model is already broken.

Most enterprise LLM deployments treat prompt injection as something to put a filter in front of. That framing is backwards. For any LLM that reads untrusted content or holds a tool, prompt injection is the core threat model. There is no filter solution. The durable defences are architectural.

14 apr 2026