Insights

Thoughts on building AI systems that actually work.

January 15, 2024

Agents need evals, not vibes

Most agent demos succeed because the human silently compensates for the agent. In production, nobody is there to nudge the model. Here's how to build agents that survive contact with reality.

agents evaluation production

January 10, 2024

LLM cost control is a product feature

If your unit economics are powered by tokens, you're running a software business and a commodities desk at the same time. Here's how to govern LLM spend without breaking quality.

cost optimization llm

January 5, 2024

Chaos-proof delivery: shipping AI with TDD + CI

AI moves too fast for discipline? Reality: AI moves too fast without discipline. Here's how TDD and CI actually work for AI systems.

tdd ci-cd engineering

December 28, 2023

Fine-tune, RAG, or prompt? A ruthless decision framework

Start cheap. Most teams fine-tune too early. Here's how to decide when to use prompting, RAG, or fine-tuning.

fine-tuning rag prompting architecture

December 20, 2023

A production blueprint for one agentic workflow

Take one workflow and ship it as a reliable service. Here's the minimal architecture that actually works in production.

agents architecture production