How every number is calculated
The credibility of this index rests entirely on transparency. Nothing here is a black box: the rules below, the source data, and the code fully determine every figure.
1. Evidence ledger first, index second
We do not keep a flat (date, company, tokens/day) table. Disclosures are
heterogeneous: “monthly tokens,” “fiscal-year tokens,” API-only, router pass-through,
app-level consumption. So we store observations that preserve exactly what
each source said, then derive the series from them.
- Unit: everything is converted to trillions of tokens per day (monthly ÷ 30.44, yearly ÷ 365.25).
- Date: each observation is anchored to the midpoint of its reporting period, never the announcement date. A “980T in July 2025” disclosure sits in July 2025.
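The two normalization rules above can be sketched in a few lines. This is a minimal illustration, not the index's actual code: the `Observation` class and its field names are hypothetical, and only the divisors (30.44 and 365.25) and the midpoint rule come from the text.

```python
from dataclasses import dataclass
from datetime import date

# Divisors stated in the rules above: average days per month and per year.
DAYS_PER_PERIOD = {"day": 1.0, "month": 30.44, "year": 365.25}


@dataclass(frozen=True)
class Observation:
    """One disclosure, stored as the source stated it (hypothetical schema)."""
    source: str
    tokens_trillions: float  # raw figure as disclosed, in trillions of tokens
    period: str              # "day", "month", or "year"
    period_start: date
    period_end: date

    @property
    def tokens_per_day(self) -> float:
        """Normalize the raw figure to trillions of tokens per day."""
        return self.tokens_trillions / DAYS_PER_PERIOD[self.period]

    @property
    def anchor_date(self) -> date:
        """Midpoint of the reporting period, not the announcement date."""
        return self.period_start + (self.period_end - self.period_start) / 2


# The "980T in July 2025" example from the text.
obs = Observation("example disclosure", 980.0, "month",
                  date(2025, 7, 1), date(2025, 7, 31))
print(round(obs.tokens_per_day, 2), obs.anchor_date)  # 32.19 2025-07-16
```

Keeping the raw figure and period on the record, and deriving the per-day rate on demand, means a change to a divisor or anchoring rule never requires touching the stored evidence.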
2. Two ledgers, never mixed in a headline
- Supply: who processed the tokens (Google, OpenAI, Microsoft Foundry, Fireworks, the China aggregate).
- Demand: who generated the demand (Harvey, Cursor, consumer apps). These tokens are also served by supply providers.
- Router: pass-through traffic (OpenRouter), served by the supply-side providers, so it would double-count if summed in.
Demand and router tokens are already served by supply providers, so summing across ledgers double-counts. The reported floor sums supply + national-aggregate only.
3. Overlap handling: the double-count trap
Two explicit fields resolve overlap by human judgment, not silent summation:
contained_in (an observation’s superset) and include_in_floor.
The floor sums only the curated, non-overlapping set.
Observations marked contained_in the aggregate are excluded from the floor. Summing them in would invent 120 T/day of phantom usage.
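A minimal sketch of how the two fields could drive the floor calculation follows. The `Row` class, the ledger labels, and every number in the sample rows are made up for illustration; only the field names contained_in and include_in_floor, and the rule that the floor sums supply plus national-aggregate rows, come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Per the text, only these ledgers may contribute to the reported floor.
FLOOR_LEDGERS = {"supply", "national_aggregate"}


@dataclass(frozen=True)
class Row:
    name: str
    ledger: str                          # hypothetical labels, see FLOOR_LEDGERS
    tokens_per_day: float                # trillions of tokens per day
    contained_in: Optional[str] = None   # name of the superset observation
    include_in_floor: bool = False       # set by human curation


def reported_floor(rows: List[Row]) -> float:
    """Sum only the curated, non-overlapping set."""
    return sum(r.tokens_per_day for r in rows if r.include_in_floor)


def curation_problems(rows: List[Row]) -> List[str]:
    """Flag rows whose flags contradict the overlap rules."""
    problems = []
    for r in rows:
        if not r.include_in_floor:
            continue
        if r.contained_in is not None:
            problems.append(f"{r.name}: in floor but contained in {r.contained_in}")
        if r.ledger not in FLOOR_LEDGERS:
            problems.append(f"{r.name}: {r.ledger} ledger must not enter the floor")
    return problems


# Illustrative values only, not real disclosures.
rows = [
    Row("provider_a", "supply", 10.0, include_in_floor=True),
    Row("aggregate_x", "national_aggregate", 20.0, include_in_floor=True),
    Row("provider_b", "supply", 5.0, contained_in="aggregate_x"),
    Row("app_c", "demand", 3.0),
    Row("router_d", "router", 1.0),
]

floor = reported_floor(rows)
naive_total = sum(r.tokens_per_day for r in rows)
print(floor, naive_total, curation_problems(rows))  # 30.0 39.0 []
```

The gap between the naive total and the floor (9.0 in this toy data) is exactly the usage that blind summation would count twice.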
4. Estimated band (layer 2)
Above the floor sit estimates for names with no usable disclosure, each produced by the method that fits its business: revenue-implied for usage-billed providers, usage-implied for free/consumer surfaces. They are additive (none duplicates a floor row), low-confidence, and every assumption is recorded. See the estimates.
5. Compute-implied capacity (layer 3)
An independent cross-check from the hardware side, bounding demand without reference to any disclosure or revenue figure:
capacity = accelerators × tokens/sec/accelerator × inference_fraction × utilization × 86,400
| Parameter | Unit | Low | Mid | High | Basis |
|---|---|---|---|---|---|
| Accelerator installed base | million_h100_equiv | 12 | 16 | 20 | Epoch AI: operational ~15-16M H100e end-2025 (~20M cumulative sold) |
| Throughput per accelerator | tokens_per_sec | 1500 | 2500 | 5000 | NVIDIA InferenceMAX + Artificial Analysis (H200 ~2.5k/GPU; B200 6-10k) normalized per H100-equivalent |
| Inference fraction | fraction | 0.40 | 0.50 | 0.65 | NVIDIA CFO ~40% of data-center revenue is inference (FY24); McKinsey >50% of AI compute by 2030 |
| Utilization | fraction | 0.25 | 0.40 | 0.50 | FlexPipe production trace 29% median / 43% mean GPU util; Meta Llama-3 MFU ~40% |
| Capacity | T/day | 156 | 691 | 2,808 | low/high compound extreme assumptions |
The point is triangulation, not a precise capacity number:
- Demand is physically comfortable. The estimated total (~329.8 T/day) is only ~48% of mid-case capacity, which is consistent with much of the fleet being used for training or sitting idle, and with the floor being a lower bound.
- The models constrain each other. The reported floor (300.1 T/day, a hard fact) exceeds the low-capacity scenario (156 T/day), so the four pessimistic hardware assumptions cannot all hold at once and that scenario is ruled out.
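The capacity formula and both triangulation checks can be reproduced directly from the table. The sketch below uses only the parameter values and the two demand figures quoted in this section; the function and variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400

# (accelerators, tokens/sec/accelerator, inference_fraction, utilization)
# taken from the Low / Mid / High columns of the table.
SCENARIOS = {
    "low": (12e6, 1500, 0.40, 0.25),
    "mid": (16e6, 2500, 0.50, 0.40),
    "high": (20e6, 5000, 0.65, 0.50),
}


def capacity_t_per_day(accelerators, tokens_per_sec, inference_fraction, utilization):
    """Compute-implied capacity in trillions of tokens per day."""
    tokens = (accelerators * tokens_per_sec * inference_fraction
              * utilization * SECONDS_PER_DAY)
    return tokens / 1e12


capacity = {name: capacity_t_per_day(*params) for name, params in SCENARIOS.items()}

ESTIMATED_TOTAL = 329.8  # T/day, as quoted in the text
REPORTED_FLOOR = 300.1   # T/day, as quoted in the text

# Check 1: how much of mid-case capacity does estimated demand use?
share_of_mid = ESTIMATED_TOTAL / capacity["mid"]

# Check 2: does the reported floor exceed the all-pessimistic capacity?
low_ruled_out = REPORTED_FLOOR > capacity["low"]

for name, value in capacity.items():
    print(f"{name}: {value:,.0f} T/day")
print(f"estimated total is {share_of_mid:.0%} of mid-case capacity")
print(f"low scenario ruled out: {low_ruled_out}")
```

Because the four parameters multiply, the low and high cases compound every extreme at once, which is why the range spans more than an order of magnitude.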
Sourcing discipline
Every observation carries a source and (where captured) an archived snapshot, because disclosures move, get edited, or vanish. Numbers are re-verified on a dated pass; see the verified_date column in the data.
The full method also lives in the repo’s methodology/methodology.md.