The Karpathy Series · Part 3 of 3

What Your Wiki Will
Actually Cost
at Enterprise Scale

The cost math the wiki posts skip, run at three scales, with assumptions you can challenge.

By Alexander Braun · April 2026 · 7 min read
01 Karpathy & the CFO layer problem 02 Three layers of token waste 03 Wiki costs at enterprise scale
$300M — Salesforce’s entire 2026 Anthropic spend·
5–7% of every knowledge worker’s salary — Rory O’Driscoll, 20VC·
71% of companies exceeded their AI budget in 2025·
60–80% of enterprise queries never need to hit the model at all·
36% annual growth in AI token costs — CloudZero 2026·
10 margin points — what AI governance is worth by 2029 (Gartner)·
$300M — Salesforce’s entire 2026 Anthropic spend·
5–7% of every knowledge worker’s salary — Rory O’Driscoll, 20VC·
71% of companies exceeded their AI budget in 2025·
60–80% of enterprise queries never need to hit the model at all·
36% annual growth in AI token costs — CloudZero 2026·
10 margin points — what AI governance is worth by 2029 (Gartner)·

In Part 2, we laid out the three layers of token waste and established that the semantic layer is 60–80% of the enterprise bill. Here we run the numbers (at personal, mid-market, and large enterprise scale) to show exactly why the wiki cost curve and the semantic layer cost curve diverge.

When Andrej Karpathy’s post about LLM-compiled knowledge wikis went around, the headline number was 70–90% token savings on repeated queries. The replies were mostly enthusiasm. Buried in the same thread was the comment that matters most:

"Simply ignorant. Do you know how much that wiki will cost to maintain? Figure that out first."

That reply gets less attention than it deserves. It’s also exactly the right question. So we figured it out, at three scales, with assumptions you can challenge.

The short version: at personal scale, the critic is wrong. At enterprise scale, the critic is right, and right by roughly two orders of magnitude.

Personal scale, where Karpathy is right

Use his own setup as the baseline. About 100 articles, ~400K words in wiki/, call it 5M tokens of source material. At Claude Sonnet pricing ($3 per million input tokens, $15 per million output):

  • Initial compile. 5M input + ~500K output ≈ $22.50. A few re-passes for cross-linking, call it ~$50 one-time.
  • Per-query cost. The agent pulls ~50K relevant tokens of wiki context per question, about $0.15. At 30 queries/day, ~$135/month.
  • Weekly lint pass (full wiki read, suggestions output) , $30–60/month.
  • Refresh as raw/ grows 20% monthly , $10–20/month.

Total: roughly $200–250/month for a heavy personal user. For a researcher who lives in that corpus 40 hours a week, that’s excellent value. The "wiki maintenance will kill the savings" objection is wrong at this scale. Karpathy is right.

Now move the same architecture into an enterprise and watch the curve.

Mid-market enterprise, where the curve breaks

A typical €100M ARR B2B SaaS. Hudl-shape, or Viessmann-shape. Conservative assumptions: operational data spanning 10+ TB across SAP, Salesforce, MongoDB, and a data lake. Roughly 2–3 trillion tokens in raw form. Sample at 1% and aggressively deduplicate: the wiki compile corpus is ~20–30 billion tokens. Query volume: 10,000+ AI queries per day across knowledge workers. Schema change rate: dozens of fields per week.

The bill:

  • Initial compile. 20B input × $3/M + 2B output × $15/M ≈ $90,000 one-time.
  • Monthly recompile to stay fresh. Operational data changes daily. Full recompile is ~$90,000/month. Incremental lint passes land at $15,000–30,000/month.
  • Per-query cost on the wiki. At ~20K tokens/query × 10K queries/day × ~$5/M blended ≈ $30,000/month.

Wiki at mid-market scale: ~$45,000–120,000/month of new spend, plus ~$90,000 setup, plus the team needed to run the pipeline.

And here’s the part the cost calculator never catches: none of that resolves cross-system conflicts. The wiki dutifully writes five articles called "Revenue" because that’s what it found across five systems. The CFO still gets five answers. The token bill is paid; the original problem is unsolved.

Large enterprise, where the curve is unrecognizable

Apply the same architecture to a Mercedes-Benz or Commerzbank shape, multi-system, multi-region, regulated, post-M&A. Setup runs $1M+ one-time. Monthly: $500K+ to keep fresh on a corpus that legitimately needs daily refresh in places. Plus a multi-person team to operate the pipeline, plus the security and compliance review costs for pushing regulated data through an external LLM at all.

This is not a hypothetical. It’s the cost curve any honest financial model produces from the public component costs. It’s also why most large enterprises that have piloted a wiki-style approach quietly stop talking about it after the first quarter.

The hidden costs the inference invoice doesn’t show

A wiki costs more than the recompile bill because three things scale with the wiki, not with the model, and none of them appear on the OpenAI or Anthropic invoice.

Drift cost. Karpathy’s raw/ is articles and papers, slow-changing. Enterprise schemas change daily. A wiki compiled on Monday is silently wrong by Friday in ways the model can’t detect without re-reading the source. Catching that drift means either continuous recompile (expensive) or an event-driven detector (complex).

Conflict cost. When two articles in the wiki contradict each other, somebody has to adjudicate. In Karpathy’s setup that’s him, taking ten minutes. At enterprise scale it’s three meetings, four stakeholders, a steering committee, and three weeks before the next AI deployment can ship. That’s not a token cost; it’s the larger one.

Compliance cost. Compiling a wiki of regulated data requires pushing that data through an LLM. For an Allianz or DWS or any publicly-listed entity, that’s a regulator conversation, a security review, possibly a "no" from legal. It is not a "buy more inference credits" problem.

$85K/month today.
$214K in three years.

At 36% annual growth (the current industry rate) a typical mid-market AI deployment compounds into a nine-figure annual problem within a single planning horizon.

Today (baseline)$85K/mo
+12 months$116K/mo
+24 months$158K/mo
+36 months$214K/mo
Source: CloudZero State of AI Costs 2026 · 36% CAGR applied to $85K/month median baseline
52%
of CFOs rank cost management their #1 concern in Q1 2026, ahead of talent, regulation, and geopolitical risk. AI token costs are a fast-growing, poorly-understood contributor.
Deloitte CFO Signals, Q1 2026
10 pts
The margin improvement Gartner projects by 2029 for enterprises that deploy AI with semantic governance in place, versus those that don’t.
Gartner AI Governance Forecast, 2024

The math at every scale.

The wiki cost curve and the semantic layer cost curve diverge by roughly two orders of magnitude at enterprise scale, because they’re compressing different objects.

Scale Wiki. Setup Wiki. Monthly LazyFox. Setup LazyFox. Monthly
Personal (Karpathy) ~$50 ~$200 n/a n/a
Mid-market enterprise ~$90K $45K–120K ~$1–2K $300–700
Large enterprise (Mercedes-tier) ~$1M+ $500K+/mo ~$10–50K $2–10K/mo

The contrast: a different cost function

LazyFox compiles the definition graph, not the content corpus. It operates at the field-and-metric level, not the document level.

At the same mid-market enterprise: setup enrichment across ~10,000 distinct fields, schema plus sample values mapped to semantic type, language mapping, and candidate definition, runs about 30–50M tokens total, roughly $1,000–2,000 one-time. Per-query cost is roughly $0: queries are generated deterministically from the governed graph and executed against the source systems, with no LLM in the hot path. Drift handling, schema changes triggering targeted re-enrichment of only affected slices, runs $50–200/month.

Mid-market on LazyFox: ~$2K setup, ~$300–700/month ongoing. And the semantic problem is actually solved, because we targeted the level the conflict lives at.

Why the curves diverge

The wiki’s cost is governed by corpus size (grows with operational data), query volume (grows with adoption), and refresh frequency (grows with schema change rate). None of those is bounded. All three grow as the company grows or as AI adoption grows.

The semantic layer’s cost is governed by the definition surface, bounded, because your enterprise has thousands of fields, not billions. Content scales unboundedly with the enterprise. Meaning doesn’t, there are only so many ways to define "revenue" in a company before someone has to make a decision.

The wiki is excellent architecture compressing the wrong object for the enterprise workload. The right move is not to argue with Karpathy. It’s to ask which surface your bill is actually being burnt against, and to compile that surface.

For most enterprises, the dominant surface is semantic. That’s why we built LazyFox to compress meaning instead of content, and why the monthly cost stops scaling with query volume at the level you’d otherwise be paying.

That’s the layer we’re at.

"From nothing two years ago, this is the largest single external spend that every software company is making."
Jason Lemkin, SaaStr: 20VC Podcast, 2026

Find out which layer your bill is being burnt against.

Share your email and we’ll run a free token cost audit, showing where your knowledge worker queries are burning budget and what a 90-day reduction path looks like.