We Cut 94% of AI Coding Tokens With a Local Code Index - Rajkumar Sakthivel, Tesco
TL;DR
90% of AI coding costs come from input, not output: Files, search results, and context sent to the model account for most expenses, while the AI's generated code represents only 10%.
Their typical query sent 45,000 tokens when only 5,000 mattered: They paid for 40,000 tokens of irrelevant code every single query, like ordering one pizza but paying for ten.
Cutting input by 94% saves 61% total cost: Output compression only saves about 8% total, but input reduction has massive impact because that's where the money goes.
Dual search (meaning + keyword) catches what each misses alone: Meaning-based search finds related ideas but misses exact names; keyword search does the opposite. Together they reduce missed results from 1 in 4 to 1 in 10.
Simple scoring formula beats complex AI models for relevance: A 50/30/20 formula (meaning/keyword/recency) runs in 0.4ms without extra AI calls, faster than asking AI to judge its own results.
Real test on FastAPI: 83K tokens down to 4.9K per question: That's 94% reduction with 90% accuracy finding the right code, tested on 53 files with 20 real developer questions.
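The 50/30/20 formula above can be sketched as a plain weighted sum. This is a minimal illustration, not the authors' implementation: the `Candidate` fields, the assumption that each signal is already normalized to the range 0 to 1, and the file paths are all hypothetical. Only the three weights come from the summary.

```python
from dataclasses import dataclass

# Weights taken from the summary: 50% meaning, 30% keyword, 20% recency.
WEIGHTS = {"meaning": 0.5, "keyword": 0.3, "recency": 0.2}


@dataclass
class Candidate:
    """One code chunk returned by the dual search (hypothetical shape)."""
    path: str
    meaning: float  # semantic similarity, assumed normalized to [0, 1]
    keyword: float  # keyword match score, assumed normalized to [0, 1]
    recency: float  # 1.0 = most recently edited; assumed normalization


def combined_score(c: Candidate) -> float:
    """Weighted sum of the three signals; no model call is involved."""
    return (
        WEIGHTS["meaning"] * c.meaning
        + WEIGHTS["keyword"] * c.keyword
        + WEIGHTS["recency"] * c.recency
    )


def rank(candidates: list[Candidate], top_k: int = 3) -> list[Candidate]:
    """Return the top_k candidates, highest combined score first."""
    return sorted(candidates, key=combined_score, reverse=True)[:top_k]


# Illustrative data only; these paths and scores are made up.
candidates = [
    Candidate("auth/tokens.py", meaning=0.9, keyword=0.2, recency=0.5),
    Candidate("auth/login.py", meaning=0.6, keyword=1.0, recency=0.8),
    Candidate("docs/changelog.md", meaning=0.1, keyword=0.0, recency=1.0),
]

for c in rank(candidates):
    print(f"{c.path}: {combined_score(c):.2f}")
```

Because the score is simple arithmetic over values the search step already produced, ranking adds no extra AI call, which is the point the summary makes about speed.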
The Breakdown
Raj and his friend Foss discovered that about 90% of their AI coding costs came from the input context they sent, much of it irrelevant, rather than from the AI's output. They built a local search layer that cut input tokens by 94% and saved $186 on a real project by sending only the code that matters.
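As a quick check on the token figures quoted above, the reduction percentages can be recomputed directly. This sketch uses the rounded numbers from the summary (83K and 4.9K for the FastAPI test, 45,000 and 5,000 for the typical query), so the results are approximate; `reduction_pct` is a helper name introduced here for illustration.

```python
def reduction_pct(before: int, after: int) -> float:
    """Percentage of tokens removed when going from `before` to `after`."""
    return (before - after) / before * 100


# FastAPI test: roughly 83K tokens per question down to roughly 4.9K.
print(round(reduction_pct(83_000, 4_900), 1))  # 94.1

# Typical query: 45,000 tokens sent, of which about 5,000 mattered.
print(round(reduction_pct(45_000, 5_000), 1))  # 88.9
```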