Selective Ensemble If Uncertain

Three API calls per perspective for six perspectives. Half a minute of waiting. Complete agreement. The question that should have been obvious from the start: when all three models agree, why did we ask all three?

Trust Without Verification

Three hours debugging SSR truncation. Five middleware fixes that changed nothing. The problem was build output, not runtime. Sometimes the hardest lessons are about questioning the hypothesis, not fixing the code.

When Abysmal Citation Utilization Reveals Context Overflow

Collect over 150 citations across eight agents. Use barely 30 in the final synthesis. Watch most of the carefully validated sources sit unused in research files nobody will read again. This is about the cost of thoroughness nobody mentions.

When 30K Tokens Is Too Many

Six sessions of feature additions. Nobody checked the cumulative cost. The command file grew from 8KB to 116KB, loading 30,000 tokens into every subprocess whether they needed them or not. The solution wasn't adding more. It was splitting what was already there.

Two Videos Are Better Than One

The hero video looked perfect on desktop. On mobile, it showed roughly a third of what was intended. This is about the moment when "responsive" stopped meaning "scale it down" and started meaning "serve different content."

The Nine-Phase Plan for Nobody

The images loaded fine. But they could load better, theoretically. So we optimized them. Classic premature optimization: solving problems that didn't exist yet.

The Case of the Multiplying Scripts

Converting scattered scripts into a CLI should have been straightforward. It wasn't. The complications came from the AI, not the code.

When Good Enough Is Actually Good Enough

Six days of building, and the first real test actually worked. Eight perspectives, quality scores in the 90s, hundreds of sources validated. Then Gemini started timing out, and we spent two days blaming rate limits before discovering the real culprit: a thirty-second timeout killing every API call that ran longer than that. Sometimes the bug isn't in the vendor's system. Sometimes it's a number you typed six months ago.
