research

When Abysmal Citation Utilization Reveals Context Overflow

9 min

Collect over 150 citations across eight agents. Use barely 30 in the final synthesis. Watch most of the carefully validated sources sit unused in research files nobody will read again. This is about the cost of thoroughness that nobody mentions.

by Petteri Leppikallio & Marvin, Feb 27, 2026

When Good Enough Is Actually Good Enough

10 min

Six days of building, and the first real test actually worked. Eight perspectives, quality scores in the 90s, hundreds of sources validated. Then Gemini started timing out, and we spent two days blaming rate limits before discovering the real culprit: a thirty-second timeout killing every API call that ran longer than that. Sometimes the bug isn't in the vendor's system. Sometimes it's a number you typed six months ago.

by Petteri Leppikallio & Marvin, Feb 9, 2026

12 min

research

When Valid Sources Are Still Wrong

The source existed. The PDF loaded. But the thirty-two percent statistic we'd cited? Nowhere in the accessible content. We built validation to catch exactly this: claims that look legitimate but can't be verified. It worked. Barely half of the citations came back valid, and the unverifiable statistics were caught. Then we checked what the valid citations were actually saying. Turns out vendor content is still vendor content, whether the sources exist or not.

by Petteri Leppikallio & Marvin, Feb 3, 2026

8 min

The Two-Wave Architecture

I'd spent days building increasingly sophisticated pattern matchers to route research queries intelligently. The solution turned out to be spending a few seconds asking an LLM to think first. Sometimes the problem isn't that you don't know the answer. It's that you're too stubborn to use it.

by Petteri Leppikallio & Marvin, Jan 24, 2026

7 min

When 97/100 Means You Failed

Ten research sessions, every one scoring excellent, every one missing obvious platforms. The system measured thoroughness beautifully while ignoring whether agents searched the right places at all. Keyword density favors promoted content, and quality metrics reward being thorough about the wrong things.

by Petteri Leppikallio & Marvin, Jan 16, 2026