Bits & Atoms: The Shifting Tectonics of Compute
Booting up a dusty old desktop computer was a childhood ritual for me, though it ran more like a microwave than a modern PC. As my brother and I prepared to immerse ourselves in the latest Age of Empires edition, it didn’t matter that the computer took an hour to turn on or was prone to crash at any moment. It was all I knew, and I couldn’t get enough. Years later, when I first experienced a gaming laptop, I was in sheer disbelief at the step-change in efficiency and user experience. There was a time when the future of computing was beautifully simple – as it appeared to me.
A desktop computer from the 90s-00s, depicted in 3D.
For years, computing progress was predictable and tangible for the average person – aside from a handful of technological paradigm shifts, of course, like the invention of the internet or touch screens. Computers got faster (and smaller) every cycle, enabling better operating systems, more capable everyday apps and smoother gaming experiences. Cloud computing got cheaper, as AWS and Azure could consistently drop prices alongside hardware improvements. Each generation of smartphones improved, with more transistors powering better performance, larger storage and longer battery life. Overall, it was a great ride – and one that followed a relatively predictable trend.
Moore's Law defined compute’s trajectory for half a century. Gordon Moore, co-founder of Intel, observed in 1965 that transistor density was doubling roughly every year with minimal cost increase (he later revised the cadence to every two years). In simple terms, this meant chip performance consistently improving without a material increase in manufacturing costs. This wasn't just a technical observation; it was the foundation of the entire tech economy. The "free lunch" of exponential improvement drove new waves of great products, turning startups into decacorns and Fortune 500 staples. Think Apple, Sony & Canon – and each iterative generation of iPhone, PlayStation and digital camera. Or indeed Samsung’s smartphones, Microsoft’s Xbox or Nikon’s DSLRs. We all had our favourite devices and eagerly awaited each new generation.
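To get a feel for how quickly that compounding runs away, here is a minimal back-of-the-envelope sketch of what a doubling every two years implies. The Intel 4004's 2,300 transistors (1971) as a baseline and the two-year cadence are illustrative assumptions for the sketch, not figures from this piece or any vendor roadmap.

```python
# Back-of-the-envelope Moore's Law projection (illustrative, not a precise model).
# Assumes the classic formulation: transistor counts double roughly every two years.

BASELINE_YEAR = 1971          # Intel 4004
BASELINE_TRANSISTORS = 2_300
DOUBLING_PERIOD_YEARS = 2     # Moore's 1975 revision; his 1965 paper observed ~1 year

def projected_transistors(year: int) -> float:
    """Project transistor count for a given year under idealised Moore's Law scaling."""
    doublings = (year - BASELINE_YEAR) / DOUBLING_PERIOD_YEARS
    return BASELINE_TRANSISTORS * 2 ** doublings

for year in (1971, 1991, 2011, 2021):
    print(f"{year}: ~{projected_transistors(year):,.0f} transistors")

# The 2021 projection lands around 77 billion transistors -- the same order of
# magnitude as NVIDIA's ~80-billion-transistor GH100 die, which is why the "law"
# held up as an industry planning tool for so long.
```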
Semiconductor evolution has been the engine at the heart of Silicon Valley’s enterprise success. Intel would release a roadmap, and everyone could plan accordingly. The next CPU generation would be 15% faster, 10% more efficient, and arrive largely on schedule. Nvidia’s graphics cards followed the same predictable cadence (until they didn’t). The knock-on effects for devices were straightforward: wait for the next hardware generation, get automatic efficiency gains, release a better product, rinse and repeat. But that world is now behind us – in the land of compute, three seismic changes have converged.
Classical transistor scaling – the engine behind Moore’s Law – has hit fundamental physical limits. As transistors shrank through the 20nm and then 10nm nodes, they approached atomic dimensions. The short explanation is that heat dissipation and current leakage (via a phenomenon known as quantum tunnelling) become very difficult to manage at this scale. The techniques required to shrink transistors further on frontier chips changed fundamentally. Even though TSMC is now producing at 3nm-class nodes and pushing beyond, costs are significantly higher relative to the performance gained. We can no longer grow transistor counts on frontier chips without substantial relative cost increases. The old economics of transistor scaling are dead and buried.
AI compute demand arrived with exponential fury just as Moore’s Law ran out of steam. ChatGPT burst onto the consumer scene in 2022 and changed everything. Since OpenAI’s first model, model size (i.e. number of parameters) has increased by 500-1,000x, and the compute required to train leading models has grown around 5x per year. From almost nothing a decade ago, the data centres harbouring foundation models have rapidly claimed as much global energy demand as the cloud computing sector – and this appetite is still growing.
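The mismatch between those two curves is the crux of the problem. A rough sketch below compares the 5x-per-year demand growth quoted above against a hypothetical "ideal Moore's Law" hardware baseline of a doubling every two years (~1.41x per year) – the baseline is an assumption for illustration, not a claim about any specific chip roadmap.

```python
# Rough sketch of the widening gap between AI compute demand and classical
# hardware scaling. The 5x/year demand figure is the growth rate quoted above;
# the hardware line assumes an idealised doubling every two years (~1.41x/year).

DEMAND_GROWTH_PER_YEAR = 5.0
HARDWARE_GROWTH_PER_YEAR = 2 ** 0.5   # doubling every two years

for years in range(1, 6):
    demand = DEMAND_GROWTH_PER_YEAR ** years
    hardware = HARDWARE_GROWTH_PER_YEAR ** years
    shortfall = demand / hardware
    print(f"Year {years}: demand x{demand:,.0f}, "
          f"hardware x{hardware:.1f}, shortfall x{shortfall:,.0f}")

# After five years, demand is up ~3,125x while transistor scaling alone delivers
# ~5.7x -- the rest has to come from more chips, more power and more money.
```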
Geopolitics became even more volatile. Once again, the importance of the semiconductor industry to the global economy came to the fore – raising issues of sovereignty, supply chain fragility and deep interdependence. In 2022, the US CHIPS Act deployed $280 billion in industrial policy. In 2023, a €43 billion European initiative followed suit – and South Korea, Taiwan & Japan moved to protect their own national champions. Export controls – American restrictions on advanced chips flowing into China, and Chinese restrictions on critical raw materials flowing out – created artificial scarcity. TSMC's concentrated production was reinforced as a single point of failure for the global economy – and a chokepoint that could trigger a major global recession. The free flow of semiconductor innovation and its deeply complex supply chain suddenly faced borders, tariffs, and national security reviews.
So, why is this important?
Compute workloads encountered the perfect storm. Maximum demand met crippling constraint, all while the geopolitical landscape threatened the lifeblood of the industry. Of huge concern is that the collision of these forces has created an energy crisis hiding in plain sight. The rising cost of – and the investment pouring into – each generation of GenAI models is well documented. Grid constraints, the need for baseload power and a swathe of other energy-related issues stack up on the other side of the equation – and warrant an article in themselves. Thankfully, Irena wrote one!
High-performance computing is underpinned by thirsty hardware. Released in 2017, NVIDIA's V100 GPU consumed 300W at peak. The H100, which hit the market in 2023, pulls 700W. The B200 generation pushes 1,000W and beyond. The power appetite of one cutting-edge GPU running around the clock is now comparable to the average continuous draw of an entire household – every light, appliance and screen – channelled into a single semiconductor.
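The household comparison holds up to simple arithmetic. The sketch below uses the peak board powers quoted above, run flat out 24/7; the ~10,500 kWh/year household figure is an assumed, commonly cited US average rather than a number from this piece.

```python
# Quick arithmetic behind the "one GPU ≈ one household" comparison.
# GPU figures are the peak board powers mentioned above, run continuously;
# the household figure assumes ~10,500 kWh/year, a commonly cited US average.

HOUSEHOLD_KWH_PER_YEAR = 10_500
household_kwh_per_day = HOUSEHOLD_KWH_PER_YEAR / 365   # ~28.8 kWh/day

gpus = {"V100 (2017)": 300, "H100 (2023)": 700, "B200": 1_000}  # watts at peak

for name, watts in gpus.items():
    kwh_per_day = watts * 24 / 1_000
    share = kwh_per_day / household_kwh_per_day
    print(f"{name}: {kwh_per_day:.1f} kWh/day ≈ {share:.0%} of an average home")

# A B200 running around the clock lands at ~24 kWh/day -- in the same ballpark
# as the ~29 kWh/day an average household consumes.
```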
Data centres account for roughly 3% of global electricity consumption today.
Data centre infrastructure can barely keep up. Traditional server racks consume 4-6kW. Modern AI clusters are hitting 50-100kW per rack. Where a decade ago a server rack drew about as much power as a couple of electric kettles, today's AI racks gulp down as much as dozens of homes at dinnertime – requiring entirely new cooling and electrical systems. The liquid cooling systems adopted by leading-edge clusters bear closer resemblance to the technology used in nuclear reactors than to the air conditioning units of old. When Microsoft signs 20-year deals to revive nuclear facilities and xAI's latest cluster draws more power than a steel mill, you know we've entered uncharted territory.
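For a sense of scale, the sketch below puts the rack figures quoted above against a household's evening peak; the ~2kW per-home dinnertime draw is an illustrative assumption, not a measured figure.

```python
# Sketch of the rack-level comparison: how many homes' worth of dinnertime peak
# demand does a modern AI rack represent? The per-home evening peak is assumed.

TRADITIONAL_RACK_KW = 5      # midpoint of the 4-6 kW range above
AI_RACK_KW = 100             # upper end of today's AI training racks
HOME_EVENING_PEAK_KW = 2     # assumed per-household dinnertime draw

print(f"Traditional rack ≈ {TRADITIONAL_RACK_KW / HOME_EVENING_PEAK_KW:.1f} homes at peak")
print(f"AI rack ≈ {AI_RACK_KW / HOME_EVENING_PEAK_KW:.0f} homes at peak")

# ~50 homes per rack -- and a single AI data hall holds hundreds or thousands of
# such racks, which is why cooling and electrical systems had to be redesigned.
```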
The evolution of compute carries profound yet complex implications for our climate crisis. The reality is that the rise of GenAI has driven the first increase in electricity demand across the developed world in 20 years – whether or not you buy the argument, pushed by the Mag7 (Apple, Microsoft, Google, Amazon, Tesla, Meta and Nvidia), that the value they create will outweigh the externalities. There are valid points on both sides, though the Mag7 clearly have a vested interest. Within five years, data centres globally are projected to consume as much electricity as the entirety of India does today. But there is hope even for the cynical, because the quest for efficiency is not just a climate imperative – it's also a strategic imperative to preserve margins, maintain competitive (and, many would argue, sovereign) AI capabilities, and avoid regulatory backlash. Trillions of dollars in value are on the line.
Necessity is the mother of invention after all – and now, invention is needed more than ever. For innovators, this paradigm shift is the greatest opportunity in the computing world since the internet. There are exceptional founders building in this hypercompetitive space, and we are looking to partner with them to drive systems change.
Watch this space.