Annotated Sources

The bibliography. Organized by tier of authority — primary research first, then high-quality secondary, then journalism, then everything else. Each entry has a one-line "what it gives you" note so future-me knows why it's here without re-reading.

Tier 1 — Primary research

Li, Yang, Islam, Ren — Making AI Less "Thirsty": Uncovering and Addressing the Secret Water Footprint of AI Models (April 2023, arXiv). arxiv.org/pdf/2304.03271 The paper. Source of the "10–50 mL per medium GPT-3 response" finding; introduces direct vs scope-2 distinction; methodology section is the foundation for every credible per-query number that followed. Pull the PDF and re-cite directly — do not rely on summaries.
EESI — Data Centers and Water Consumption. eesi.org/articles/view/data-centers-and-water-consumption Source for the 449 M gal/day (2021) and 200–250 M gal/day (2023) US data center totals. Also useful for explaining why those numbers diverge (changing scope of measurement).
Mytton — Data centre water consumption, npj Clean Water (2021). nature.com/articles/s41545-021-00101-w — Author Correction: nature.com/articles/s41545-021-00140-3 The pre-AI baseline academic paper. Confirms 1.7 B L/day = ~449 M gal/day for US data centers. Useful for showing that data-center water as a category was already studied and well-bounded before the AI-water moral panic took off. Cite the corrected version (the author correction is minor — methodology and headline figure stand).
HARC / U Houston — Thirsty Data: The Hidden Water and Energy Costs of Texas' Data Center Boom (2025). harcresearch.org/news/thirsty-data-the-hidden-water-and-energy-costs-of-texas-data-center-boom Primary source for the Texas projection. Headline numbers: ~49 B gal/year in 2025, with 2030 projections ranging from 29 to 161 B gal/year depending on scenario (low / mid / high build-out). Earlier drafts of this wiki carried a 399 B figure for 2030; that was wrong, and the actual upper bound is 161 B. Cite the full range, not a single headline number.
IEA — Energy and AI (2025). iea.org/reports/energy-and-ai Authoritative global figures: ~560 B L/year data-center water consumption today, projected ~1,200 B L/year by 2030. Splits ~60% indirect (power-plant cooling) / ~40% direct (on-site cooling), which independently corroborates the scope-2 dominance argument in scope_2_water.md.
Microsoft — Sustainable by design: Next-generation datacenters consume zero water for cooling (Dec 2024). microsoft.com/en-us/microsoft-cloud/blog/2024/12/09/sustainable-by-design-next-generation-datacenters-consume-zero-water-for-cooling Original announcement of the closed-loop / zero-water cooling design. Pilot deployments in Phoenix AZ and Mt Pleasant WI in 2026; FY2024 fleet WUE 0.30 L/kWh. Important for the "remedies are systemic" argument: if a hyperscaler of Microsoft's size can hit zero-water cooling in arid sites, per-query water becomes effectively a transitional concern.
Google — 2024 Environmental Report (Jul 2024). blog.google/company-news/outreach-and-initiatives/sustainability/2024-environmental-report Per-facility water disclosure. Total 6.1 B gal across the fleet; Council Bluffs IA highest at ~1 B gal; Pflugerville TX (air-cooled) lowest at ~10 K gal. Important: this is the closest any hyperscaler currently comes to the per-facility transparency Li et al. and OECD.AI ask for, and it shows the order-of-magnitude range across siting choices.
Macknick, Newmark, Heath, Hallett — Operational water consumption and withdrawal factors for electricity generating technologies: a review of existing literature (Environmental Research Letters, 2012; originally NREL/TP-6A20-50900, 2011). docs.nrel.gov/docs/fy11osti/50900.pdf / iopscience.iop.org/article/10.1088/1748-9326/7/4/045802 Authoritative per-source water consumption (not withdrawal) factors used in analysis/models.py and the methodology table. Median values in gal/MWh, with recirculating cooling towers for the thermal plants: coal subcritical 479, gas CC 205, nuclear 672, solar PV 1, wind 0, hydro 4,491 (wide range). Distinguishing consumption from withdrawal is load-bearing for the editorial: many secondary sources quote withdrawal numbers (often 10–50× higher) and label them as consumption.
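
The Tier 1 entries mix units (L/day, gal/day, gal/MWh, L/kWh), so it helps to cross-check them on a common scale. A minimal sketch, assuming US liquid gallons throughout; the helper names are hypothetical and this is not the code in analysis/models.py. It re-derives the Mytton conversion and expresses the Macknick medians quoted above in L/kWh.

```python
# Unit cross-checks for the Tier 1 figures.
# Helper names are illustrative assumptions, not taken from analysis/models.py.

LITERS_PER_US_GALLON = 3.785411784  # exact, by definition of the US liquid gallon


def liters_to_gallons(liters: float) -> float:
    """Convert a volume in liters to US liquid gallons."""
    return liters / LITERS_PER_US_GALLON


def gal_per_mwh_to_l_per_kwh(gal_per_mwh: float) -> float:
    """Convert a water-intensity factor from gal/MWh to L/kWh (1 MWh = 1,000 kWh)."""
    return gal_per_mwh * LITERS_PER_US_GALLON / 1000.0


# Mytton (2021): 1.7 B L/day for US data centers.
mytton_gal_per_day = liters_to_gallons(1.7e9)

# Median consumption factors as quoted in the Macknick et al. entry (gal/MWh).
MEDIAN_GAL_PER_MWH = {
    "coal_subcritical": 479,
    "gas_cc": 205,
    "nuclear": 672,
    "solar_pv": 1,
    "wind": 0,
    "hydro": 4491,
}

median_l_per_kwh = {
    source: gal_per_mwh_to_l_per_kwh(factor)
    for source, factor in MEDIAN_GAL_PER_MWH.items()
}

if __name__ == "__main__":
    print(f"Mytton: {mytton_gal_per_day / 1e6:.0f} M gal/day")
    for source, value in median_l_per_kwh.items():
        print(f"{source}: {value:.2f} L/kWh")
```

The first line printed is "Mytton: 449 M gal/day", matching the EESI and Mytton entries. In L/kWh the generation factors share units with the 0.30 L/kWh fleet WUE in the Microsoft entry, but the two measure different things: WUE covers on-site water per kWh of data-center energy, while the Macknick factors cover water consumed at the power plant per kWh generated.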

Tier 2 — Strong analysis / commentary

Andy Masley — The AI water issue is fake (Oct 2025). andymasley.com/writing/the-ai-water-issue-is-fake The single most thorough debunking. Source of the "0.008% of US freshwater" national figure, the WaPo critique, and the "1.3 M prompts per t-shirt" comparison set. The editorial leans heavily on this; cite generously.
Andy Masley — Empire of AI is wildly misleading about AI water use. andymasley.com/writing/empire-of-ai-is-wildly-misleading Forensic critique of Karen Hao's book. Source for the "off by a factor of 1,000" claim. Use carefully — one specific error in a 400-page book doesn't invalidate the whole; cite the specific finding, not a global verdict.
Karen Hao — Empire of AI: Changes and corrections (Dec 2025). karendhao.com/20251217/empire-water-changes Author's corrections page. Confirms the 1,000× error Masley flagged on the Chile section; page 288 corrected from a global comparison to "more water than the entire population of Cerrillos." Cite alongside Masley to show the correction was made — strengthens "the inflated number was wrong, but the broader local-impact concern is real" framing rather than a global "the book is unreliable" framing.
Sean Goedecke — Talking to ChatGPT costs 5 mL of water, not 500 mL (28 Oct 2024). seangoedecke.com/water-impact-of-ai The cleanest independent per-query re-derivation. Source of the 5 mL conservative anchor used throughout this wiki. Methodology is fully shown; that's why we use his number rather than Altman's.
Simon Willison — annotation of Masley (Oct 2025). simonwillison.net/2025/Oct/18/the-ai-water-issue-is-fake Useful as credibility cross-validation — Willison endorsing the argument signals it's crossed into the technical mainstream.
OECD.AI — How much water does AI consume? The public deserves to know. oecd.ai/en/wonk/how-much-water-does-ai-consume Good policy-flavoured framing. Source for the "indirect use is 80%+ of total" and scope-2 disclosure asymmetry framing.
OECD.AI — Ways to minimise water use related to AI operations. oecd.ai/en/wonk/ways-to-minimise-water-use-related-to-ai-operations-not-what-you-think Companion piece. Useful for the cooling-tradeoff discussion in cooling_explained.md.
IEEE Spectrum — The Real Story on AI Water Usage at Data Centers. spectrum.ieee.org/ai-water-usage Mainstream technical-press treatment. Useful as a "professionals talking to professionals" cite that doesn't sound advocacy-coded.

Tier 3 — Journalism

Sam Altman — The Gentle Singularity (10 Jun 2025). blog.samaltman.com/the-gentle-singularity Primary source for the 0.000085 gal / 0.34 Wh per-query figures. Note the caveats: unaudited, "average" undefined, training not included.
DCD — Sam Altman: ChatGPT queries consume 0.34 Wh of electricity and 0.000085 gal of water (Jun 2025). datacenterdynamics.com/en/news/sam-altman-chatgpt-queries-consume-034-watt-hours-of-electricity-and-0000085-gallons-of-water Trade-press write-up of the same Altman blog post; useful as a secondary citation.
CNBC — Sam Altman defends AI resource usage (Feb 2026). cnbc.com/2026/02/23/openai-altman-defends-ai-resource-usage-water-concerns-fake-humans-use-energy-summit The "completely untrue, totally insane" quote. Useful as the highest-profile rhetorical shot from the industry side.
TechCrunch — ChatGPT users send 2.5 billion prompts a day (Jul 2025). techcrunch.com/2025/07/21/chatgpt-users-send-2-5-billion-prompts-a-day Source for the ChatGPT 2.5 B prompts/day baseline. Altman's statement to Axios; also reported in Yahoo Finance, Slashdot, TechRadar.
Stanford / & The West — Thirsty for power and water, AI-crunching data centers sprout across the West. andthewest.stanford.edu/2025/thirsty-for-power-and-water-ai-crunching-data-centers-sprout-across-the-west Source for the Phoenix-metro 60-data-center / 177 M gal/day / 86%-agriculture figures. Stanford-affiliated, useful for institutional credibility.
Lincoln Institute of Land Policy — Data Drain: The Land and Water Impacts of the AI Boom. lincolninst.edu/publications/land-lines-magazine/articles/land-water-impacts-data-centers Good for the local-policy / siting / permitting framing.
Undark — How Much Water Do AI Data Centers Really Use? (Dec 2025). undark.org/2025/12/16/ai-data-centers-water Even-handed long-form. Decent journalism that doesn't lean either way.
Circle of Blue — Data Center Energy Demand Is Putting Pressure on U.S. Water Supplies. circleofblue.org/2025/water-energy/data-center-energy-demand-is-putting-pressure-on-u-s-water-supplies Specialist water-policy outlet. Useful for the scope-2 / power-plant water angle.
Pirate Wires — The Data Center Water Crisis Isn't Real (interview with Masley). piratewires.com/p/andy-masley-ai-water-crisis-isnt-real Long-form interview with Masley. Useful for direct quotes.
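
The per-query figures cited in this tier and in the Goedecke entry can be put on one scale with a few lines of arithmetic. A minimal sketch, assuming US liquid gallons and assuming, purely for illustration, that each per-query figure applies uniformly to all 2.5 B daily prompts, which none of the sources claims. All names are hypothetical.

```python
# Put the cited per-query water figures on a common scale.
# Assumption (illustrative only): one per-query figure applies to every prompt.

ML_PER_US_GALLON = 3785.411784  # exact, by definition of the US liquid gallon

ALTMAN_GAL_PER_QUERY = 0.000085   # Altman, "The Gentle Singularity" (unaudited)
GOEDECKE_ML_PER_QUERY = 5.0       # Goedecke's conservative anchor
PROMPTS_PER_DAY = 2.5e9           # TechCrunch / Axios, Jul 2025

# Altman's figure expressed in millilitres per query.
altman_ml_per_query = ALTMAN_GAL_PER_QUERY * ML_PER_US_GALLON


def daily_gallons(ml_per_query: float, prompts_per_day: float = PROMPTS_PER_DAY) -> float:
    """Total US gallons per day if every prompt used ml_per_query of water."""
    return ml_per_query * prompts_per_day / ML_PER_US_GALLON


altman_daily_gal = daily_gallons(altman_ml_per_query)
goedecke_daily_gal = daily_gallons(GOEDECKE_ML_PER_QUERY)

if __name__ == "__main__":
    print(f"Altman:   {altman_ml_per_query:.2f} mL/query, {altman_daily_gal:,.0f} gal/day")
    print(f"Goedecke: {GOEDECKE_ML_PER_QUERY:.2f} mL/query, {goedecke_daily_gal:,.0f} gal/day")
```

Under these assumptions Altman's figure works out to about 0.32 mL per query, roughly one fifteenth of the 5 mL anchor, and the implied daily totals are on the order of 0.2 M and 3.3 M gal/day respectively. Both are scaling exercises rather than measurements; the caveats on the Altman entry (unaudited, "average" undefined, training excluded) still apply.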

Tier 4 — Useful for comparison data, not for the AI-water claim itself

Water Footprint Calculator — Water Footprint of Food Guide. watercalculator.org/water-footprint-of-food-guide Source for hamburger / almond / coffee / chocolate water footprints used in comparisons.md.
FoodPrint — The Water Footprint of Food. foodprint.org/issues/the-water-footprint-of-food Companion source.

To-find / to-verify before publication

All items previously on this list have been verified and folded into the bibliography above. The two material corrections that fell out of that pass:

Texas HARC projection upper bound was wrong. Earlier drafts cited "49 B → 399 B gal/year by 2030." The actual range from the HARC report is 29–161 B gal/year by 2030 depending on scenario. The 49 B 2025 figure stands. local_vs_national.md and legitimate_concerns.md have been updated.
Karen Hao's Empire of AI corrections page is live. The 1,000× Chile figure has been corrected to "more water than the entire population of Cerrillos." Cite Hao's correction alongside Masley's critique rather than treating the original number as still in print.