Annotated Sources

The bibliography. Organized by tier of authority — primary research first, then high-quality secondary, then journalism, then everything else. Each entry has a one-line "what it gives you" note so future-me knows why it's here without re-reading.

Tier 1 — Primary research

  • Li, Yang, Islam, Ren — Making AI Less "Thirsty": Uncovering and Addressing the Secret Water Footprint of AI Models (April 2023, arXiv). arxiv.org/pdf/2304.03271 The paper. Source of the "10–50 mL per medium GPT-3 response" finding; introduces direct vs scope-2 distinction; methodology section is the foundation for every credible per-query number that followed. Pull the PDF and re-cite directly — do not rely on summaries.

  • EESI — Data Centers and Water Consumption. eesi.org/articles/view/data-centers-and-water-consumption Source for the 449 M gal/day (2021) and 200–250 M gal/day (2023) US data center totals. Also useful for explaining why those numbers diverge (changing scope of measurement).

  • Mytton — Data centre water consumption, npj Clean Water (2021). nature.com/articles/s41545-021-00101-w — Author Correction: nature.com/articles/s41545-021-00140-3 The pre-AI baseline academic paper. Confirms 1.7 B L/day = ~449 M gal/day for US data centers. Useful for showing that data-center water as a category was already studied and well-bounded before the AI-water moral panic took off. Cite the corrected version (the author correction is minor — methodology and headline figure stand).

  • HARC / U Houston — Thirsty Data: The Hidden Water and Energy Costs of Texas' Data Center Boom (2025). harcresearch.org/news/thirsty-data-the-hidden-water-and-energy-costs-of-texas-data-center-boom Primary source for the Texas projection. Key numbers: ~49 B gal/year today, with a 2030 projection of 29–161 B gal/year depending on scenario (low / mid / high build-out). Earlier drafts of this wiki carried a 399 B figure for 2030; that was wrong, and the actual upper bound is 161 B. Cite the full scenario range, not a single number.

  • IEA — Energy and AI (2025). iea.org/reports/energy-and-ai Authoritative global figures: ~560 B L/year data-center water consumption today, projected ~1,200 B L/year by 2030. Splits ~60% indirect (power-plant cooling) / ~40% direct (on-site cooling), which independently corroborates the scope-2 dominance argument in scope_2_water.md.

  • Microsoft — Sustainable by design: Next-generation datacenters consume zero water for cooling (Dec 2024). microsoft.com/en-us/microsoft-cloud/blog/2024/12/09/sustainable-by-design-next-generation-datacenters-consume-zero-water-for-cooling Original announcement of the closed-loop / zero-water cooling design. Pilot deployments in Phoenix AZ and Mt Pleasant WI in 2026; FY2024 fleet WUE 0.30 L/kWh. Important for the "remedies are systemic" argument — if the largest hyperscaler can hit zero-water in arid sites, per-query water becomes effectively a transitional concern.

  • Google — 2024 Environmental Report (Jul 2024). blog.google/company-news/outreach-and-initiatives/sustainability/2024-environmental-report Per-facility water disclosure. Total 6.1 B gal across the fleet; Council Bluffs IA highest at ~1 B gal; Pflugerville TX (air-cooled) lowest at ~10 K gal. Important: this is the closest any hyperscaler currently comes to the per-facility transparency that Li et al. (first entry in this tier) and OECD.AI ask for, and it shows the order-of-magnitude range across siting choices.

  • Macknick, Newmark, Heath, Hallett — Operational water consumption and withdrawal factors for electricity generating technologies: a review of existing literature (NREL/TP-6A20-50900, 2012). docs.nrel.gov/docs/fy11osti/50900.pdf / iopscience.iop.org/article/10.1088/1748-9326/7/4/045802 Authoritative per-source water consumption (not withdrawal) factors used in analysis/models.py and the methodology table. Median consumption factors in gal/MWh (recirculating cooling towers for the thermal plants): coal subcritical 479, gas CC 205, nuclear 672, solar PV 1, wind 0, hydro 4,491 (wide range). Distinguishing consumption from withdrawal is load-bearing for the editorial: many secondary sources quote withdrawal numbers (often 10–50× higher) and label them as consumption.
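
Sanity-check sketch for the Tier 1 numbers. This is a minimal illustration, not the contents of analysis/models.py: the function names and the example grid mix are hypothetical, and only the per-source factors and the 1.7 B L/day figure come from the entries above. It re-derives the Mytton litres-to-gallons conversion and shows how the Macknick consumption factors turn a generation mix into an indirect (scope-2) water intensity.

```python
from typing import Mapping

# Exact definition of the US liquid gallon in litres.
LITERS_PER_US_GALLON = 3.785411784

# Median consumption factors in gal/MWh, as quoted in the Macknick et al. entry.
CONSUMPTION_GAL_PER_MWH = {
    "coal_subcritical": 479,
    "gas_cc": 205,
    "nuclear": 672,
    "solar_pv": 1,
    "wind": 0,
    "hydro": 4491,
}


def liters_to_gallons(liters: float) -> float:
    """Convert litres to US liquid gallons."""
    return liters / LITERS_PER_US_GALLON


def indirect_water_l_per_kwh(mix: Mapping[str, float]) -> float:
    """Share-weighted water consumption of a generation mix, in L/kWh.

    `mix` maps a source name to its share of generation; shares must sum to 1.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("generation shares must sum to 1")
    gal_per_mwh = sum(CONSUMPTION_GAL_PER_MWH[src] * share for src, share in mix.items())
    # gal/MWh -> L/MWh -> L/kWh
    return gal_per_mwh * LITERS_PER_US_GALLON / 1000.0


# Hypothetical mix for illustration only; not a real grid.
EXAMPLE_MIX = {"gas_cc": 0.5, "nuclear": 0.2, "wind": 0.2, "solar_pv": 0.1}

if __name__ == "__main__":
    # Mytton: 1.7 B L/day should land near 449 M gal/day.
    print(f"{liters_to_gallons(1.7e9) / 1e6:.0f} M gal/day")
    print(f"{indirect_water_l_per_kwh(EXAMPLE_MIX):.2f} L/kWh indirect")
```

With the hypothetical mix above, the indirect figure comes out at roughly 0.9 L/kWh, about three times the 0.30 L/kWh on-site WUE in the Microsoft entry. Swapping in a real regional mix is the obvious next step; the shares here are placeholders.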

Tier 2 — Strong analysis / commentary

  • Andy Masley — The AI water issue is fake (Oct 2025). andymasley.com/writing/the-ai-water-issue-is-fake The single most thorough debunking. Source of the "0.008% of US freshwater" national figure, the WaPo critique, and the "1.3 M prompts per t-shirt" comparison set. The editorial leans heavily on this; cite generously.

  • Andy Masley — Empire of AI is wildly misleading about AI water use. andymasley.com/writing/empire-of-ai-is-wildly-misleading Forensic critique of Karen Hao's book. Source for the "off by a factor of 1,000" claim. Use carefully — one specific error in a 400-page book doesn't invalidate the whole; cite the specific finding, not a global verdict.

  • Karen Hao — Empire of AI: Changes and corrections (Dec 2025). karendhao.com/20251217/empire-water-changes Author's corrections page. Confirms the 1,000× error Masley flagged on the Chile section; page 288 corrected from a global comparison to "more water than the entire population of Cerrillos." Cite alongside Masley to show the correction was made. Doing so supports the framing "the inflated number was wrong, but the broader local-impact concern is real" rather than a global "the book is unreliable" verdict.

  • Sean Goedecke — Talking to ChatGPT costs 5 mL of water, not 500 mL (28 Oct 2024). seangoedecke.com/water-impact-of-ai The cleanest independent per-query re-derivation. Source of the 5 mL conservative anchor used throughout this wiki. Methodology is fully shown; that's why we use his number rather than Altman's.

  • Simon Willison — annotation of Masley (Oct 2025). simonwillison.net/2025/Oct/18/the-ai-water-issue-is-fake Useful as credibility cross-validation — Willison endorsing the argument signals it's crossed into the technical mainstream.

  • OECD.AI — How much water does AI consume? The public deserves to know. oecd.ai/en/wonk/how-much-water-does-ai-consume Good policy-flavoured framing. Source for the "indirect use is 80%+ of total" and scope-2 disclosure asymmetry framing.

  • OECD.AI — Ways to minimise water use related to AI operations. oecd.ai/en/wonk/ways-to-minimise-water-use-related-to-ai-operations-not-what-you-think Companion piece. Useful for the cooling-tradeoff discussion in cooling_explained.md.

  • IEEE Spectrum — The Real Story on AI Water Usage at Data Centers. spectrum.ieee.org/ai-water-usage Mainstream technical-press treatment. Useful as a "professionals talking to professionals" cite that doesn't sound advocacy-coded.

Tier 3 — Journalism

Tier 4 — Useful for comparison data, not for the AI-water claim itself

To-find / to-verify before publication

All items previously on this list have been verified and folded into the bibliography above. The two material corrections that fell out of that pass:

  1. Texas HARC projection upper bound was wrong. Earlier drafts cited "49 B → 399 B gal/year by 2030." The actual range from the HARC report is 29–161 B gal/year by 2030 depending on scenario. The 49 B 2025 figure stands. local_vs_national.md and legitimate_concerns.md have been updated.
  2. Karen Hao's Empire of AI corrections page is live. The 1,000× Chile figure has been corrected to "more water than the entire population of Cerrillos." Cite Hao's correction alongside Masley's critique rather than treating the original number as still in print.