AI and Energy: The Numbers Nobody Gets Right

Everyone's talking about AI's energy problem. Most of the numbers floating around are wrong, outdated, or missing context. Let me try to fix that.

Training Gets the Headlines. Inference Is the Story.

Most people assume AI training is the energy hog. That was true in 2020-2022, when roughly 70-80% of AI energy went to training. But the balance has flipped dramatically.

By 2025-2026, inference accounts for an estimated 80-90% of total AI compute energy.

Why? Training happens once. Inference — every ChatGPT query, every Copilot suggestion, every API call — happens billions of times per day. The cumulative cost of serving those requests dwarfs the one-off training runs.

To put training in perspective: GPT-3 used about 1,287 MWh to train — equivalent to powering 130 US homes for a year. GPT-4 used an estimated 50,000+ MWh. These are big numbers in isolation. They're rounding errors compared to the energy cost of serving billions of queries across millions of users, every day, indefinitely.
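
To see why the balance flips, here's a minimal back-of-envelope sketch in Python. It uses the GPT-3 training figure above plus two assumed numbers (roughly 3 Wh per served query and one billion queries a day across a fleet), both illustrative rather than measured:

```python
# Back-of-envelope: one-off training energy vs ongoing inference energy.
# The per-query energy and daily volume are illustrative assumptions.

TRAINING_MWH = 1_287              # GPT-3 training estimate cited above
WH_PER_QUERY = 3                  # assumed energy per served query
QUERIES_PER_DAY = 1_000_000_000   # assumed fleet-wide daily query volume

inference_mwh_per_day = QUERIES_PER_DAY * WH_PER_QUERY / 1_000_000  # Wh -> MWh
days_to_match_training = TRAINING_MWH / inference_mwh_per_day

print(f"Inference energy per day: {inference_mwh_per_day:,.0f} MWh")
print(f"Days of serving to equal the training run: {days_to_match_training:.1f}")
```

On those assumptions, serving burns through the entire GPT-3 training budget in under half a day.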

The Data Centre Problem

Global data centres already consume about 1-1.5% of world electricity — roughly 460 TWh in 2024. AI is projected to push this to 4.6% by 2030 (IEA estimate). Goldman Sachs projects data centre power demand will increase 160% by 2030.

A single ChatGPT query uses roughly 10x the energy of a Google search. That doesn't sound like much until you multiply it by billions of daily queries. The IEA projects that AI-related electricity demand alone could reach 800 TWh by 2030, more than Germany's entire annual electricity consumption.
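
If you want to sanity-check the scale of the per-query claim, multiply it out. The figures below are the commonly cited rough estimates (about 0.3 Wh for a web search, roughly 10x that for a chatbot query) and an assumed daily volume, not official numbers:

```python
# Rough annualised electricity demand from per-query energy and daily volume.
# Per-query figures are commonly cited rough estimates, not official numbers.

SEARCH_WH = 0.3                   # assumed energy per traditional web search
CHATBOT_WH = 3.0                  # assumed ~10x that for a chatbot query
QUERIES_PER_DAY = 1_000_000_000   # assumed daily query volume (illustrative)

def annual_twh(wh_per_query: float, per_day: int) -> float:
    """Convert per-query watt-hours into terawatt-hours per year."""
    return wh_per_query * per_day * 365 / 1e12

print(f"Search-style queries:  {annual_twh(SEARCH_WH, QUERIES_PER_DAY):.2f} TWh/yr")
print(f"Chatbot-style queries: {annual_twh(CHATBOT_WH, QUERIES_PER_DAY):.2f} TWh/yr")
```

On these assumptions, chat traffic alone adds up to only a few TWh a year; the 800 TWh projection is for AI-related demand across the board, which is part of why the aggregate numbers get misread.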

The hyperscalers know this. Amazon, Google, and Microsoft are all investing in nuclear energy. Microsoft signed a deal with Constellation Energy for the Three Mile Island nuclear plant. Google invested in small modular reactors. Amazon acquired a nuclear-powered data centre campus.

When tech companies start buying nuclear plants, you know the energy problem is real.

The Water Problem Nobody Mentions

It's not just electricity. Data centres need cooling, and cooling needs water. Microsoft's water consumption increased 34% in one year, largely attributed to AI workloads. Google reported a 20% increase. A single data centre can use as much water as a small town.

In drought-prone regions, this creates real tension with communities. It's one thing to abstract "AI energy use" as a global statistic. It's another when the data centre next door is consuming water your local farmers need.

The Local Model Counterargument

Here's where it gets interesting. Google just released Gemma 4, an open-weight model family that runs on consumer hardware. The smallest variant runs on a Raspberry Pi. The largest runs on a single consumer GPU.

I've been running local models for over a year. The experience has improved dramatically — Ollama, LM Studio, and similar tools make it genuinely practical to run capable AI on your own machine. No cloud. No data centre. No water cooling.
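
To give a sense of how little ceremony this takes, here's a minimal sketch that sends one prompt to a locally running Ollama server over its REST API. It assumes Ollama is installed and serving on its default port, and that a model has already been pulled; the model name below is just an example:

```python
# Minimal local-inference call against a locally running Ollama server.
# Assumes `ollama serve` is running and the named model has been pulled.
import json
import urllib.request

payload = {
    "model": "llama3.2",   # example model name; any locally pulled model works
    "prompt": "Summarise in one sentence why inference now dominates AI energy use.",
    "stream": False,       # ask for a single JSON response instead of a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",   # Ollama's default local endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```

No API key, no cloud round-trip; the request never leaves the machine.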

The energy cost of a local inference is a fraction of the equivalent cloud call. Your MacBook Pro draws maybe 30 watts while running a local model; a data centre GPU draws 300-700 watts per card, plus cooling overhead. Watts aren't the whole picture, though: what matters per query is power multiplied by time, which the sketch below works through.
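
Here's that per-query arithmetic as a rough sketch. Every number is an illustrative assumption, and the cloud side is treated as if one accelerator served one request at a time; real serving stacks batch many requests per accelerator, which narrows the gap, so read the cloud figure as an upper bound:

```python
# Rough per-query energy: laptop vs a data-centre accelerator.
# All figures are illustrative assumptions, not measurements.

LOCAL_WATTS = 30       # laptop package power while generating (figure from the text)
LOCAL_SECONDS = 30     # assumed time for a local model to answer one query

CLOUD_WATTS = 500      # one accelerator under load (mid-range of 300-700 W)
CLOUD_PUE = 1.3        # assumed facility overhead for cooling and power delivery
CLOUD_SECONDS = 15     # assumed accelerator time per query, treated as dedicated

local_wh = LOCAL_WATTS * LOCAL_SECONDS / 3600            # watts x seconds -> Wh
cloud_wh = CLOUD_WATTS * CLOUD_PUE * CLOUD_SECONDS / 3600

print(f"Local: {local_wh:.2f} Wh per query")
print(f"Cloud: {cloud_wh:.2f} Wh per query")
```

The cloud figure comes out around 2.7 Wh, in the same ballpark as the commonly cited per-query estimates, and roughly ten times the local figure under these assumptions.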

Of course, local models are less capable than frontier cloud models. You're not running GPT-5 on a laptop. But for 70-80% of daily AI tasks — drafting emails, summarising documents, code suggestions, simple analysis — local models are good enough. And they're getting better fast.

The 90/10 Split

I think the future looks like this: 90% of AI queries run locally on devices, using small efficient models. 10% of queries hit cloud infrastructure for tasks that genuinely need frontier-scale intelligence.

This would dramatically reduce the data centre footprint of AI. Not necessarily because AI uses less energy in total, but because most of that energy would come from your existing device's power supply rather than a water-cooled data centre in Virginia.

The phone on your desk, the laptop in your bag, the GPU in your workstation: these are already built and already plugged in. Running a local model on them adds only a marginal draw, tens of watts for a few seconds per query, rather than new data centre demand.
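
To make the split concrete, here's a minimal sketch of what it does to fleet-level energy, reusing the illustrative per-query figures from the sketch above (all assumptions, not measurements):

```python
# Fleet-level daily energy under different local/cloud routing splits.
# Per-query figures reuse the illustrative assumptions from the earlier sketch.

LOCAL_WH = 0.25                   # assumed Wh per locally served query
CLOUD_WH = 2.7                    # assumed Wh per cloud query, incl. facility overhead
QUERIES_PER_DAY = 1_000_000_000   # assumed daily volume (illustrative)

def daily_mwh(local_share: float) -> float:
    """Total daily energy in MWh when this fraction of queries runs on-device."""
    local = QUERIES_PER_DAY * local_share * LOCAL_WH
    cloud = QUERIES_PER_DAY * (1 - local_share) * CLOUD_WH
    return (local + cloud) / 1_000_000   # Wh -> MWh

for share in (0.0, 0.5, 0.9):
    print(f"{share:4.0%} local -> {daily_mwh(share):,.0f} MWh/day")
```

On these assumptions the total drops by more than 5x, and the slice that actually lands on data centre infrastructure, the part that strains grids and water supplies, drops by 90%.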

Where I Land

AI's energy problem is real but solvable. The solution isn't "use AI less" — that's not going to happen. The solution is moving compute to the edge, improving model efficiency, and reserving cloud infrastructure for the tasks that genuinely need it.

The worst outcome is the current trajectory: every query routed to cloud data centres, energy demand growing exponentially, and hyperscalers buying nuclear plants to keep up. The best outcome is a world where your devices are smart enough to handle most tasks locally, and the cloud is reserved for the 10% of work that truly requires it.

We're closer to the best outcome than most people realise. The models are getting smaller and more efficient fast enough that running AI locally will be the norm, not the exception, within a few years.

The greenest AI query never leaves your desk.