Sources & references
Every figure on this site is traceable. Below: the independent estimates we compare against, the primary company disclosures behind the reported floor (each with an archived snapshot), and the research, analyst and hardware literature we draw on.
Independent estimates of the global total, and how each was derived
Third-party estimates, shown for comparison, not summed into our figures. Only the current global-scope estimates feed our headline band; forecasts and narrower or older figures are shown for context.
| Source | Scope | T/day | As of | Methodology |
|---|---|---|---|---|
| Epoch AI / Exponential View (in band) | All providers (global) | ~432 | mid-2026 | Demand estimate relayed by Epoch from Exponential View (~5 billion tokens/sec); the underlying derivation is not published. |
| OpenRouter (1% extrapolation) (in band) | Global inference | ~400 | late-2025 | Extrapolation: OpenRouter's measured ~1T/day framed by its team as roughly 1% of global inference, implying ~400T/day. |
| Tomasz Tunguz (The Token Race) | Global (all providers) | ~88 | Sep-2025 | Top-down sum of provider disclosures (Google + Azure + neoclouds), plus an estimated OpenAI/Anthropic figure the author openly labels 'from thin air'. |
| a16z / OpenRouter | LLM API market | ~50 | late-2025 | First-party telemetry of routed tokens (measured, request-level). Authors explicitly decline to extrapolate to a global total. |
| Epoch AI (compute-derived) | Frontier lab (OpenAI) | 10–100 | late-2025 | Compute budget: installed H100-equivalents x FLOP/day / FLOP-per-token (2 x active params), at 5-30% inference utilization. Cross-checked against ~4B messages/day x ~4k tokens. |
| Goldman Sachs (Agentic Economy) | Global forecast 2030 | ~4,000 | forecast-2030 | Bottom-up agent-task simulation: tokens per task step x task frequency x knowledge-worker adoption (~12% by 2030); 120 quadrillion tokens/month. |
A normalization caveat. "Tokens processed" (Google's metric: input + reasoning + free and internal usage) and "tokens generated" (output only) differ by a large factor, so estimates built on different definitions are not directly comparable. We flag each estimate's scope rather than forcing one definition.
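
The estimates above are quoted in mixed time units (tokens per second, per minute, per week, per month). A minimal sketch of putting them on the T/day scale used in the table follows. The function names are illustrative, and the flat 30-day month is an assumption, not something the cited sources specify.

```python
"""Convert token rates quoted in mixed time units to trillions of tokens per day."""

SECONDS_PER_DAY = 86_400
MINUTES_PER_DAY = 1_440
DAYS_PER_WEEK = 7
DAYS_PER_MONTH = 30  # assumption: a flat 30-day month
TRILLION = 1e12


def per_second_to_t_per_day(tokens_per_second: float) -> float:
    return tokens_per_second * SECONDS_PER_DAY / TRILLION


def per_minute_to_t_per_day(tokens_per_minute: float) -> float:
    return tokens_per_minute * MINUTES_PER_DAY / TRILLION


def per_week_to_t_per_day(tokens_per_week: float) -> float:
    return tokens_per_week / DAYS_PER_WEEK / TRILLION


def per_month_to_t_per_day(tokens_per_month: float) -> float:
    return tokens_per_month / DAYS_PER_MONTH / TRILLION


# Figures quoted on this page, converted to T/day.
print(f"{per_second_to_t_per_day(5e9):.1f}")    # ~5 billion tokens/sec -> 432.0
print(f"{per_minute_to_t_per_day(15e9):.1f}")   # 15B tokens/min -> 21.6
print(f"{per_week_to_t_per_day(25e12):.1f}")    # 25T tokens/week -> 3.6
print(f"{per_month_to_t_per_day(120e15):.1f}")  # 120 quadrillion tokens/month -> 4000.0
```

Unit conversion only aligns the time base. It does not reconcile "tokens processed" with "tokens generated", so converted figures built on different definitions remain non-comparable.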
Primary company disclosures
The sources behind the reported floor and observations, with archived snapshots.
| Source | Original | Archive |
|---|---|---|
| Google I/O 2025 (restated in Alphabet Q2 2025 CEO remarks) | link | snapshot |
| Sundar Pichai at Google I/O 2026 | link | snapshot |
| Microsoft FY2025 Q3 earnings (50T in March) | link | pending |
| Microsoft FY2025 Q4 earnings call | link | snapshot |
| Sam Altman via Epoch AI usage dataset | link | pending |
| OpenRouter/a16z State of AI 2025 | link | snapshot |
| OpenAI statistics (15B tokens/min) | link | snapshot |
| Fireworks AI ($315M ARR; 10T/day) via press | link | pending |
| Fast Company (Lin Qiao interview) | link | snapshot |
| Fireworks AI homepage | link | pending |
| Tomasz Tunguz (The Token Race) | link | pending |
| National Data Bureau (Liu Liehong) via China News Service | link | snapshot |
| Volcano Engine via Robonomics Token Tracker | link | pending |
| Volcano Engine via China Daily | link | snapshot |
| Volcano Engine (Tan Dai) via KuCoin | link | pending |
| Business Insider (CEO Weinberg) via AOL | link | snapshot |
| Alphabet Q3 2025 CEO remarks | link | pending |
Estimate sources
The revenue/usage anchors behind the implied estimates.
| Entity | Basis | Source |
|---|---|---|
| Anthropic | ~$47B annualized run-rate (May 2026) / $3-8 per Mtok blended | link |
| OpenAI | 2.5B messages/day x 1,000-4,000 tokens/message (Epoch's full-context anchor); consumer only, API counted in the floor | link · archive |
| xAI | ~7-10M DAU x 5-20 msgs x 1.5-3k tokens; revenue is ~$0.5B but usage is mostly free, so a revenue-implied estimate fails | link · archive |
| Meta | 1B MAU (May 2025) x 8-13% DAU/MAU (~80-130M DAU) x 3-4 msgs x 1-2k tokens/message; Meta discloses users, never tokens | link |
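
The anchors in this table are products or quotients of ranges. The sketch below shows one way to turn them into low/high bounds in T/day. It assumes the simplest combination (all-low inputs for the floor, all-high for the ceiling), treats the full run-rate as token revenue, and uses hypothetical helper names; the site's actual method may weight or combine the inputs differently.

```python
"""Rough low/high bounds for the implied estimates, in trillions of tokens per day."""

TRILLION = 1e12
TOKENS_PER_MTOK = 1e6
DAYS_PER_YEAR = 365


def usage_implied(*factors: tuple[float, float]) -> tuple[float, float]:
    """Multiply (low, high) factors: all-low gives the floor, all-high the ceiling."""
    low, high = 1.0, 1.0
    for factor_low, factor_high in factors:
        low *= factor_low
        high *= factor_high
    return low / TRILLION, high / TRILLION


def revenue_implied(annual_revenue_usd: float,
                    price_per_mtok: tuple[float, float]) -> tuple[float, float]:
    """Daily revenue divided by a blended price; a higher price implies fewer tokens."""
    daily_revenue = annual_revenue_usd / DAYS_PER_YEAR
    price_low, price_high = price_per_mtok
    low = daily_revenue / price_high * TOKENS_PER_MTOK / TRILLION
    high = daily_revenue / price_low * TOKENS_PER_MTOK / TRILLION
    return low, high


estimates = {
    # ~$47B run-rate at $3-8 per Mtok (assumes all revenue is token revenue)
    "Anthropic": revenue_implied(47e9, (3.0, 8.0)),
    # 2.5B messages/day x 1,000-4,000 tokens/message
    "OpenAI (consumer)": usage_implied((2.5e9, 2.5e9), (1_000, 4_000)),
    # 7-10M DAU x 5-20 msgs x 1.5-3k tokens
    "xAI": usage_implied((7e6, 10e6), (5, 20), (1_500, 3_000)),
    # 1B MAU x 8-13% DAU/MAU x 3-4 msgs x 1-2k tokens/message
    "Meta": usage_implied((1e9, 1e9), (0.08, 0.13), (3, 4), (1_000, 2_000)),
}

for entity, (low, high) in estimates.items():
    print(f"{entity}: {low:.2f} to {high:.2f} T/day")
```

Multiplying the extremes of every factor gives the widest possible band; a method that treats the inputs as independent distributions would produce a narrower one.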
Research & academic
| Source | Year | What it contributes |
|---|---|---|
| Epoch AI: AI Companies token-usage dataset (open CSV) | 2026 | Structured open dataset of per-company daily token figures with sources & confidence; corroborates our Google and OpenAI numbers |
| Epoch AI: Is a compute crunch coming? | 2026 | Estimates global throughput at ~5 billion tokens/sec (~432T/day) across all providers |
| Epoch AI: How many digital workers could OpenAI deploy? | 2025 | Compute-derived estimate of frontier token output (10–100T/day; GPT-5 ~19T/day) |
| Epoch AI: Frontier labs don't use most AI compute | 2026 | Global installed base ~15–16M H100-equivalents operational (end-2025) |
| Epoch AI: AI chip production / installed base | 2026 | Accelerator stock growing ~3.3x/year; NVIDIA >60% of total compute |
| NBER WP 34255: How People Use ChatGPT | 2025 | OpenAI-coauthored study: 700M weekly users sending ~2.5B messages/day |
| a16z + OpenRouter: State of AI: 100 Trillion Token Study | 2025 | Empirical study of 100T+ tokens of real LLM usage; LLM API market ~50T/day |
| State of AI empirical study (arXiv:2601.10088) | 2026 | arXiv-archived version of the OpenRouter/a16z 100-trillion-token study |
| Stanford HAI: AI Index Report 2025 | 2025 | Inference price fell ~280x (Nov 2022–Oct 2024); context for volume growth |
| Photons = Tokens (arXiv:2603.06630): a global token balance sheet | 2026 | Energy divided by Wh-per-token supply estimate (~6.5e17 tokens/yr capacity by 2028) with a 2024 demand anchor of ~1e12 to 1e13 tokens/day |
| Erdil / Epoch: Inference economics of language models (arXiv:2506.04645) | 2025 | Roofline per-GPU throughput building blocks (e.g. H100 memory-bound floor ~42 tokens/sec/GPU single-stream) |
| Epoch AI: How much energy does ChatGPT use? | 2025 | Tokens-per-message anchors: ~500 output tokens/query (measured avg ~269), ~4,000 full-context tokens/message; 0.3 Wh/query |
| Meta: Llama usage doubled May through July 2024 | 2024 | Meta's only direct token-volume disclosure: Llama tokens served via CSP partners grew 10x Jan-Jul 2024 (growth multiples only, no absolute number) |
| NBER w34608: The Emerging Market for Intelligence | 2026 | API usage econometrics (OpenRouter + Azure) on token demand and price elasticity |
| Google: Measuring the Environmental Impact of AI at Google Scale (arXiv:2508.15734) | 2025 | 0.24 Wh per median Gemini prompt; energy divided by prompt count, not tokens (no per-token figure) |
| Epoch AI: Computing capacity (installed FLOP) | 2025 | ~20M H100-equivalents global compute stock, from revenue divided by chip price |
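
The compute-derived approach cited on this page (installed H100-equivalents x FLOP/day / FLOP-per-token, with FLOP-per-token taken as 2 x active params, at 5-30% inference utilization) can be sketched as follows. The per-chip peak of ~1e15 FLOP/s, the fleet size and the active-parameter count are placeholder assumptions chosen for illustration; they are not Epoch's inputs, and the printed range is not Epoch's result.

```python
"""Compute-derived token throughput: fleet FLOP/day divided by FLOP per token."""

SECONDS_PER_DAY = 86_400
TRILLION = 1e12

# Assumption: ~1e15 FLOP/s peak per H100-equivalent (roughly dense 16-bit throughput).
H100_PEAK_FLOP_PER_SEC = 1e15


def tokens_per_day_trillions(h100_equivalents: float,
                             active_params: float,
                             utilization: float,
                             peak_flop_per_sec: float = H100_PEAK_FLOP_PER_SEC) -> float:
    """Tokens/day (in trillions) a fleet can serve at a given inference utilization."""
    flop_per_day = h100_equivalents * peak_flop_per_sec * SECONDS_PER_DAY * utilization
    flop_per_token = 2 * active_params  # forward pass: ~2 FLOP per active parameter
    return flop_per_day / flop_per_token / TRILLION


# Hypothetical placeholders, not figures from any cited source.
fleet_h100e = 1_000_000
active_params = 100e9

low = tokens_per_day_trillions(fleet_h100e, active_params, utilization=0.05)
high = tokens_per_day_trillions(fleet_h100e, active_params, utilization=0.30)
print(f"compute-derived: {low:.1f} to {high:.1f} T/day")

# Demand-side cross-check quoted alongside the estimate: ~4B messages/day x ~4k tokens.
cross_check = 4e9 * 4_000 / TRILLION
print(f"cross-check: {cross_check:.1f} T/day")
```

The result scales linearly with fleet size and utilization and inversely with active parameters, so the width of the output band is driven almost entirely by the utilization range.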
Analyst & industry
| Source | Year | What it contributes |
|---|---|---|
| Bond Capital: Trends in Artificial Intelligence (Mary Meeker) | 2025 | Inference cost down ~99.7% over two years; demand flywheel |
| Goldman Sachs: AI Agents Forecast to Boost Tech Cash Flow | 2025 | Forecasts ~24x token-demand growth by 2030 as agentic usage dominates |
| Tomasz Tunguz: The Token Race | 2025 | Cross-provider token tallies (Google |
| Tomasz Tunguz: Is Token Consumption Slowing Down? | 2025 | Google's additive monthly growth roughly halved mid-2025 (rate |
| Azeem Azhar: Exponential View (Magnitudes of intelligence) | 2026 | Source of the ~5 billion tokens/sec (~432T/day) global estimate; per-user token growth |
| OpenRouter / a16z: token usage by billing geography | 2025 | US ~47% / Europe ~18% / Asia ~13% of token spend; basis for the consumption-geography split in the country view |
| AI 2027: Compute Forecast | 2025 | Install-base-derived OpenAI token estimate: ~2T/day in 2024 scaling to ~80T/day by 2027 |
| YipitData: cloud and LLM pricing trends | 2025 | Alt-data market-wide estimate of ~150T tokens/month (~2 quadrillion annualized) |
| Menlo Ventures: State of Generative AI in the Enterprise | 2025 | Enterprise survey (~495 US decision-makers); $37B enterprise AI spend; token usage up ~320x |
| OpenRouter: Series B announcement | 2026 | Platform processing 25T tokens/week (May 2026), up from 5T in six months |
| Morgan Stanley: AI market trends | 2026 | Uses token throughput (6.4T to 22.7T tokens/week) as a demand proxy; forecasts hardware in dollars, not token volume |
| Anthropic via Sentisight: GenAI usage by hour and day | 2026 | Weekday peak 8am-2pm ET (12-18 UTC) per Anthropic Claude usage; weekend visits down ~23% |
| Cloudflare Radar: AI Insights (time-of-day) | 2026 | Infrastructure-level GenAI traffic by time of day; weekdays consistently outpace weekends |
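
Several entries above describe growth between two points in time, such as OpenRouter's 5T to 25T tokens/week over six months. A minimal sketch of the implied compound monthly growth and doubling time, assuming smooth exponential growth between the two observations (an assumption of this sketch; the sources report only the endpoints):

```python
"""Implied compound growth between two token-volume observations."""

import math


def monthly_growth_factor(start: float, end: float, months: float) -> float:
    """Constant monthly multiplier that takes `start` to `end` over `months`."""
    return (end / start) ** (1 / months)


def doubling_time_months(growth_factor: float) -> float:
    """Months needed to double at a constant monthly growth factor."""
    return math.log(2) / math.log(growth_factor)


# OpenRouter: 5T -> 25T tokens/week over six months (figures from the table above).
factor = monthly_growth_factor(5e12, 25e12, months=6)
print(f"monthly growth: {(factor - 1) * 100:.1f}%")
print(f"doubling time: {doubling_time_months(factor):.1f} months")
```

On these endpoints the implied growth is roughly 31% per month. A single platform's growth reflects its own market share gains as well as overall demand, so it should not be read as the global rate.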
Compute & hardware
| Source | Year | What it contributes |
|---|---|---|
| NVIDIA: Blackwell leads on SemiAnalysis InferenceMAX | 2025 | B200 ~10 |
| Artificial Analysis: Hardware benchmarks | 2026 | Per-GPU production throughput across models and accelerators |
| FlexPipe (arXiv:2510.11938): serving-pipeline efficiency | 2025 | GPU reservation falls from ~75% to ~30% with better pipelining (a serving/allocation paper
| Meta: The Llama 3 Herd of Models (arXiv:2407.21783) | 2024 | Best-case training MFU ~38–43% even on optimized 16K-H100 runs |
| Epoch AI via The Decoder: global AI compute ~15M H100e | 2026 | Global operational AI compute >15M H100-equivalents; >10 GW power |
| TokenPowerBench (arXiv:2512.03024): energy per token | 2025 | Whole-system ~40-60 J/token at high load on a 32xH100 cluster; GPU is >60% of energy |
| LMSYS: Large-scale expert parallelism (DeepSeek R1) | 2025 | ~2,788 tokens/sec/GPU decode for DeepSeek-R1 671B MoE on H100 |
| SemiAnalysis: InferenceMAX open-source inference benchmark | 2025 | Per-GPU throughput primitives (B200 ~60,000 tokens/sec/GPU); aggregate Tokenomics model is subscriber-gated |
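
The per-GPU throughput figures above are quoted in tokens/sec/GPU, while the rest of this page works in T/day. The sketch below converts between the two and shows how many GPUs one trillion tokens per day would need at each quoted rate. It assumes every GPU sustains the benchmark rate around the clock unless a utilization factor is supplied, which real fleets do not; the function names are illustrative.

```python
"""Convert per-GPU throughput (tokens/sec/GPU) to daily volume and GPU counts."""

SECONDS_PER_DAY = 86_400
TRILLION = 1e12


def tokens_per_gpu_day(tokens_per_sec_per_gpu: float, utilization: float = 1.0) -> float:
    """Tokens one GPU produces in a day at the given sustained utilization."""
    return tokens_per_sec_per_gpu * SECONDS_PER_DAY * utilization


def gpus_needed(target_t_per_day: float,
                tokens_per_sec_per_gpu: float,
                utilization: float = 1.0) -> float:
    """GPUs required to serve `target_t_per_day` trillion tokens per day."""
    per_gpu = tokens_per_gpu_day(tokens_per_sec_per_gpu, utilization)
    return target_t_per_day * TRILLION / per_gpu


# Per-GPU rates quoted on this page (different models, hardware and serving setups).
quoted_rates = {
    "H100 single-stream floor (~42 tok/s)": 42,
    "DeepSeek-R1 on H100 (~2,788 tok/s)": 2_788,
    "B200 on InferenceMAX (~60,000 tok/s)": 60_000,
}

for label, rate in quoted_rates.items():
    print(f"{label}: {gpus_needed(1.0, rate):,.0f} GPUs per 1 T/day")
```

The three rates span more than three orders of magnitude, which is why a fleet-size estimate depends far more on the assumed serving configuration than on the chip count itself.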
A note on comparability. These sources do not all measure the same thing: input vs output tokens, one provider vs all surfaces, marketplace samples vs global totals, multimodal vs text. We preserve what each source reported and flag the differences rather than forcing them into one number. See the methodology for how we normalize and what we exclude.