The cost math the wiki posts skip, run at three scales, with assumptions you can challenge.
In Part 2, we laid out the three layers of token waste and established that the semantic layer is 60–80% of the enterprise bill. Here we run the numbers (at personal, mid-market, and large enterprise scale) to show exactly why the wiki cost curve and the semantic layer cost curve diverge.
When Andrej Karpathy’s post about LLM-compiled knowledge wikis went around, the headline number was 70–90% token savings on repeated queries. The replies were mostly enthusiasm. Buried in the same thread was the comment that matters most:
"Simply ignorant. Do you know how much that wiki will cost to maintain? Figure that out first."
That reply gets less attention than it deserves. It’s also exactly the right question. So we figured it out, at three scales, with assumptions you can challenge.
The short version: at personal scale, the critic is wrong. At enterprise scale, the critic is right, and right by roughly two orders of magnitude.
Use Karpathy’s own setup as the baseline. About 100 articles, ~400K words in wiki/, call it 5M tokens of source material. At Claude Sonnet pricing ($3 per million input tokens, $15 per million output):
raw/ grows 20% monthly, $10–20/month.
Total: roughly $200–250/month for a heavy personal user. For a researcher who lives in that corpus 40 hours a week, that’s excellent value. The "wiki maintenance will kill the savings" objection is wrong at this scale. Karpathy is right.
Now move the same architecture into an enterprise and watch the curve.
A typical €100M ARR B2B SaaS. Hudl-shape, or Viessmann-shape. Conservative assumptions: operational data spanning 10+ TB across SAP, Salesforce, MongoDB, and a data lake. Roughly 2–3 trillion tokens in raw form. Sample at 1% and aggressively deduplicate: the wiki compile corpus is ~20–30 billion tokens. Query volume: 10,000+ AI queries per day across knowledge workers. Schema change rate: dozens of fields per week.
The bill:
Wiki at mid-market scale: ~$45,000–120,000/month of new spend, plus ~$90,000 setup, plus the team needed to run the pipeline.
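As a rough check on the setup figure, the arithmetic behind the assumptions above can be reproduced in a few lines. This is a minimal sketch, assuming the setup cost is dominated by a single input pass over the compile corpus at the $3 per million input-token price quoted earlier. Output tokens, retries, and engineering time are deliberately left out, and the function names are illustrative.

```python
# Input price quoted in the article (USD per million input tokens).
INPUT_PRICE_PER_M = 3.0


def sampled_tokens(raw_tokens: int, sample_percent: int) -> int:
    """Tokens left after sampling the raw corpus at `sample_percent` percent."""
    return raw_tokens * sample_percent // 100


def input_cost_usd(tokens: int, price_per_m: float = INPUT_PRICE_PER_M) -> float:
    """Cost of reading `tokens` input tokens exactly once."""
    return tokens / 1_000_000 * price_per_m


# Personal scale: ~5M tokens of source material.
personal = input_cost_usd(5_000_000)

# Mid-market scale: 2-3 trillion raw tokens, sampled at 1%.
corpus_low = sampled_tokens(2_000_000_000_000, 1)
corpus_high = sampled_tokens(3_000_000_000_000, 1)
mid_low = input_cost_usd(corpus_low)
mid_high = input_cost_usd(corpus_high)

print(f"personal, one full read:   ${personal:,.0f}")
print(f"mid-market compile corpus: {corpus_low:,} to {corpus_high:,} tokens")
print(f"mid-market, one full read: ${mid_low:,.0f} to ${mid_high:,.0f}")
```

Under these assumptions a single read of the compile corpus costs $60,000–90,000, the same order as the ~$90,000 setup figure, against $15 for the 5M-token personal corpus.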
And here’s the part the cost calculator never catches: none of that resolves cross-system conflicts. The wiki dutifully writes five articles called "Revenue" because that’s what it found across five systems. The CFO still gets five answers. The token bill is paid; the original problem is unsolved.
Apply the same architecture to a Mercedes-Benz or Commerzbank shape: multi-system, multi-region, regulated, post-M&A. Setup runs $1M+ one-time. Monthly: $500K+ to keep fresh on a corpus that legitimately needs daily refresh in places. Add a multi-person team to operate the pipeline, plus the security and compliance review costs of pushing regulated data through an external LLM at all.
This is not a hypothetical. It’s the cost curve any honest financial model produces from the public component costs. It’s also why most large enterprises that have piloted a wiki-style approach quietly stop talking about it after the first quarter.
A wiki costs more than the recompile bill because three things scale with the wiki, not with the model, and none of them appear on the OpenAI or Anthropic invoice.
Drift cost. Karpathy’s raw/ is articles and papers, slow-changing. Enterprise schemas change daily. A wiki compiled on Monday is silently wrong by Friday in ways the model can’t detect without re-reading the source. Catching that drift means either continuous recompile (expensive) or an event-driven detector (complex).
Conflict cost. When two articles in the wiki contradict each other, somebody has to adjudicate. In Karpathy’s setup that’s him, taking ten minutes. At enterprise scale it’s three meetings, four stakeholders, a steering committee, and three weeks before the next AI deployment can ship. That’s not a token cost; it’s the larger one.
Compliance cost. Compiling a wiki of regulated data requires pushing that data through an LLM. For an Allianz or DWS or any publicly listed entity, that’s a regulator conversation, a security review, possibly a "no" from legal. It is not a "buy more inference credits" problem.
At 36% annual growth (the current industry rate), a typical mid-market AI deployment compounds into a nine-figure annual problem within a single planning horizon.
The wiki cost curve and the semantic layer cost curve diverge by roughly two orders of magnitude at enterprise scale, because they’re compressing different objects.
| Scale | Wiki setup (one-time) | Wiki monthly | LazyFox setup (one-time) | LazyFox monthly |
|---|---|---|---|---|
| Personal (Karpathy) | ~$50 | ~$200 | n/a | n/a |
| Mid-market enterprise | ~$90K | $45K–120K | ~$1–2K | $300–700 |
| Large enterprise (Mercedes-tier) | ~$1M+ | $500K+ | ~$10–50K | $2–10K |
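The "two orders of magnitude" claim can be checked directly against the table. The sketch below copies the table's figures as (low, high) ranges in USD and computes the smallest and largest wiki-to-LazyFox ratio per row. Treating "$1M+" and "$500K+" as exactly $1M and $500K is an assumption that, if anything, understates the wiki side.

```python
import math

# (low, high) USD figures copied from the comparison table.
# Open-ended entries such as "$1M+" are entered at their stated floor (assumption).
TABLE = {
    "mid-market setup": {"wiki": (90_000, 90_000), "lazyfox": (1_000, 2_000)},
    "mid-market monthly": {"wiki": (45_000, 120_000), "lazyfox": (300, 700)},
    "large setup": {"wiki": (1_000_000, 1_000_000), "lazyfox": (10_000, 50_000)},
    "large monthly": {"wiki": (500_000, 500_000), "lazyfox": (2_000, 10_000)},
}


def ratio_range(wiki, lazyfox):
    """Smallest and largest wiki/LazyFox cost ratio implied by two (low, high) ranges."""
    smallest = wiki[0] / lazyfox[1]  # cheapest wiki vs. most expensive LazyFox
    largest = wiki[1] / lazyfox[0]   # most expensive wiki vs. cheapest LazyFox
    return smallest, largest


for row, costs in TABLE.items():
    low, high = ratio_range(costs["wiki"], costs["lazyfox"])
    print(
        f"{row:<20} {low:6.0f}x to {high:4.0f}x "
        f"({math.log10(low):.1f} to {math.log10(high):.1f} orders of magnitude)"
    )
```

With these inputs the ratios run from about 20x to 400x, roughly 1.3 to 2.6 orders of magnitude depending on the row and on which end of each range is used.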
LazyFox compiles the definition graph, not the content corpus. It operates at the field-and-metric level, not the document level.
At the same mid-market enterprise: setup enrichment across ~10,000 distinct fields (schema plus sample values mapped to semantic type, language mapping, and candidate definition) runs about 30–50M tokens total, roughly $1,000–2,000 one-time. Per-query cost is roughly $0: queries are generated deterministically from the governed graph and executed against the source systems, with no LLM in the hot path. Drift handling (schema changes triggering targeted re-enrichment of only the affected slices) runs $50–200/month.
Mid-market on LazyFox: ~$1–2K setup, ~$300–700/month ongoing. And the semantic problem is actually solved, because we targeted the level the conflict lives at.
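The idea of re-enriching only the slices a schema change touches can be illustrated with a plain schema diff. This is a minimal sketch of the general technique, not LazyFox's implementation: the schema shape (field name mapped to declared type), the function names, and the example fields are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical schema snapshot: field name -> declared type.
Schema = Dict[str, str]


def fields_to_reenrich(old: Schema, new: Schema) -> Set[str]:
    """Fields that are new or whose declared type changed between snapshots."""
    added = set(new) - set(old)
    retyped = {name for name in set(new) & set(old) if new[name] != old[name]}
    return added | retyped


def fields_to_retire(old: Schema, new: Schema) -> Set[str]:
    """Fields present in the old snapshot that no longer exist."""
    return set(old) - set(new)


monday = {
    "order.amount": "decimal",
    "order.currency": "string",
    "customer.region": "string",
}
friday = {
    "order.amount": "decimal",        # unchanged: no work needed
    "order.currency": "enum",         # type changed: re-enrich
    "order.net_revenue": "decimal",   # new field: enrich
}

print(sorted(fields_to_reenrich(monday, friday)))
print(sorted(fields_to_retire(monday, friday)))
```

Here only two of the fields would go back through enrichment, and one definition would be retired. The work is proportional to the number of changed fields, not to the size of the underlying data.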
The wiki’s cost is governed by corpus size (grows with operational data), query volume (grows with adoption), and refresh frequency (grows with schema change rate). None of those is bounded. All three grow as the company grows or as AI adoption grows.
The semantic layer’s cost is governed by the definition surface, which is bounded because your enterprise has thousands of fields, not billions. Content scales unboundedly with the enterprise. Meaning doesn’t: there are only so many ways to define "revenue" in a company before someone has to make a decision.
The wiki is excellent architecture compressing the wrong object for the enterprise workload. The right move is not to argue with Karpathy. It’s to ask which surface your bill is actually being burnt against, and to compile that surface.
For most enterprises, the dominant surface is semantic. That’s why we built LazyFox to compress meaning instead of content, and why its monthly cost does not scale with query volume the way a wiki’s does.
That’s the layer we’re at.
"From nothing two years ago, this is the largest single external spend that every software company is making."
Jason Lemkin, SaaStr: 20VC Podcast, 2026