Monday, April 26, 2021

The Rise of Datacenter Computing

The coming decades will see a significant change in the usage and deployment of “compute” (here meaning CPU cycles, storage, and the like), specifically relating to the rise of datacenter computing. I don’t know the exact form this change will take - though I will point to the first phases of it below - but significant paradigm shifts seem inevitable.

Note that I’m not talking about IoT or anything in that space, though it’s peripherally related. Here I’m specifically talking about the rise of datacenters as the majority share of compute power, and how the economy of computing and our individual access will be reshaped by it.

Part 1: The Data Center of Gravity

Two trends here seem clear: first, the balance of compute power is going to continue shifting towards datacenters and away from personal devices (even as those devices proliferate); and second, economics will dictate a shift in how this capacity is delivered to meet the ever-growing demand.

The balance of compute power will shift inexorably towards Centralized: as the fraction of GDP put towards compute rises, the portion of it situated in the home and on the person will fall, as will non-datacenter corporate compute.

  1. Centralized is efficient: better power utilization (PUE), better resource utilization, and more cost-efficient devices (e.g. larger processors, rack-oriented computer designs) all favour datacenter compute. High-density computing also enables more efficient processing in its own right: lots of computers, memory, and storage together on a high-bandwidth fabric permits algorithms ill-suited to - or impossible on - highly distributed small-scale devices.

  2. Our willingness to have compute resources poorly utilized and at high risk will decline. Most of the compute capability of a cell phone or personal computer already goes to waste, which is tolerable for the convenience these devices bring at their relative cost. But would you want a $100k computer sitting idle at home, and a $10k cell phone in your pocket, if that’s what it took to keep up with compute demand growth? Almost certainly not. Relatedly, we’re seeing a rapid shift from low-efficiency corporate computing - on-prem, colo, and private small datacenters - towards larger datacenters and especially hyperscalers, with cost-effectiveness as one of the driving motivators.

  3. If the electricity consumption of global compute rises from the present-day ~3% (including consumer devices, datacenters, and networks) to, say, 21% by 2030 (or perhaps a few years later once efficiencies are played out), we couldn’t carry all those watt-hours in our collective pockets even if we wanted to - see the back-of-envelope sketch below. At best the compute could shift to remote devices that live in the home, but efficiency still pushes strongly for situating it near cheap electricity instead.
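
Here is roughly the back-of-envelope I have in mind, as a minimal sketch in Python. The generation, population, and battery figures are illustrative assumptions I picked for the exercise, not sourced projections; only the ~21% share comes from the scenario above.

    # Back-of-envelope: what would ~21% of global electricity for compute
    # look like per person? All inputs below are illustrative assumptions.

    GLOBAL_GENERATION_TWH = 30_000   # assumed global generation circa 2030, TWh/year
    COMPUTE_SHARE = 0.21             # the scenario's share going to compute
    WORLD_POPULATION = 8.5e9         # assumed population circa 2030
    PHONE_BATTERY_WH = 15            # typical smartphone battery, watt-hours

    compute_twh = GLOBAL_GENERATION_TWH * COMPUTE_SHARE            # ~6,300 TWh/year
    per_person_kwh_year = compute_twh * 1e9 / WORLD_POPULATION     # 1 TWh = 1e9 kWh
    per_person_watts = per_person_kwh_year * 1000 / (365 * 24)     # continuous draw
    batteries_per_day = per_person_kwh_year * 1000 / 365 / PHONE_BATTERY_WH

    print(f"~{compute_twh:,.0f} TWh/year for compute")
    print(f"~{per_person_kwh_year:,.0f} kWh/person/year, or ~{per_person_watts:.0f} W continuous")
    print(f"equivalent to ~{batteries_per_day:.0f} phone batteries per person per day")

Under those assumptions that works out to roughly 85 W of continuous draw per person - over a hundred phone batteries’ worth of energy every day - which is clearly not something that lives in a pocket.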

This is not to say that individuals won’t continue to grow their net compute usage. Rather, the trend of comparatively weak computing devices on the person and in the home (e.g. 2-4 core cell phones) will continue, augmented by a growing fraction of the work being offloaded to centralized (datacenter) environments and their hundred-plus-core machines.

To make this concrete: today the majority of compute capabilities (e.g. CPU cores) remain in the hands of consumers, in the form of cell phones, tablets, laptops, PCs, game consoles, and other such devices. But server-oriented compute devices are growing rapidly, with disks leading the charge. Server-oriented disks do not yet dominate by unit count, but they already do by capacity as the server- and consumer-oriented devices diverge. GPUs are soon to tip towards server-dominant as well (and may have already when measured by capabilities, if we account for the GPUs used for crypto mining that appear in the consumer “Graphics” segment, the capability segmentation between consumer and enterprise parts, and the generally higher prices / revenue in the consumer market skewing revenue-based breakdowns). CPUs and RAM are the most difficult to pinpoint, but anecdotally, servers helping to hold up the RAM market and AMD’s datacenter-first strategy suggest that we’re nearing the tipping point there too.

By usage - i.e. utilization - datacenters are already likely dominant, and growing rapidly. The average consumer device has an extremely low overall utilization (single-digit percentages) once you account for the time-on-device, core scaling, and frequency scaling used to keep power consumption low. Datacenters, by contrast, have entire teams dedicated to keeping utilization high and timing capacity delivery against demand, achieving overall utilizations reaching 50% and likely averaging 30-40% industry-wide. Combined with the relative share of compute capabilities, this suggests that the balance of computing already happens in datacenters, and that the incremental compute coming online is skewed towards datacenters even more heavily.
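
To make the utilization argument concrete, here is a toy calculation; the capability shares and utilization figures are assumptions in the spirit of the rough numbers above, not measured data.

    # Toy model: even if consumers hold most of the raw compute capability,
    # the utilization gap puts most *delivered* compute in datacenters.
    # All numbers are illustrative assumptions.

    consumer_capability_share = 0.60    # assumed share of raw capability (e.g. cores)
    datacenter_capability_share = 0.40  # the remainder

    consumer_utilization = 0.05         # single-digit-percent overall utilization
    datacenter_utilization = 0.35       # ~30-40% industry-wide average

    consumer_usage = consumer_capability_share * consumer_utilization        # 0.03
    datacenter_usage = datacenter_capability_share * datacenter_utilization  # 0.14
    datacenter_fraction = datacenter_usage / (consumer_usage + datacenter_usage)

    print(f"Share of delivered compute happening in datacenters: {datacenter_fraction:.0%}")

With those assumptions, datacenters deliver roughly 80% of the compute actually consumed despite holding a minority of the raw capability - that is the sense in which the balance has already tipped.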

On the economic side, how this datacenter capacity gets produced will continue to shift as well. “Supply” will tend towards fixed over the coming decades: deployed not according to short-term needs, but at a pace determined over increasingly long time-scales. Consider the recurring semiconductor shortages (bottlenecked on fabrication capacity) combined with the rising price of chip fabs, the lead times on electric grid scaling, the lead times on adjusting mining output for rare earth elements, the cost of a modern datacenter building, and the general inertia of any industry as its share of global GDP rises.

This will see datacenter compute capabilities continue their shift into “base infrastructure” - IaaS / PaaS plus higher levels layered above - with Public Clouds being only the first step on this road. One might expect planning to become more centralized as well: akin to the oil industry, I could imagine the producers and consumers of bottleneck infrastructure resources - power, computer chips, flash, and disk drives today - coming together to commit to a particular curve years or decades in advance, pre-committing demand in a way that makes the funding of mega fabs ($20B today; $50B in a couple more generations) possible.

Beyond that, I’m not sure. Do whole countries start to take a seat at the negotiating table? The world’s reliance on TSMC for top-end chips, and its collaboration with the US government to land a fab on US soil, suggest it’s decently likely. Does the cryptocurrency share get reined in? I guess we’ll see.

Part 2: “Personal” computing

Let’s run a thought experiment: what does it look like for centralized “compute” to become available not only to the corporations who pay for it, but also directly and practically to the citizens of the world? What will it be used for? How will that be decided?

“Practically” is important - it’s true that individuals can sign up for accounts at public Clouds today and use the many services available, but the vast majority of people do not have the knowledge necessary to do so successfully, whereas almost everyone knows how to download and run an app on a local device. As such, the predominant consumer compute patterns today are:

  1. “Local general-purpose”: apps and programs running on the capabilities of a device you provide.

  2. “Ad-supported”: websites, social networks, etc, where the load you personally contribute is small enough to be funded entirely by advertising or offered for free, requiring no payment either way.

  3. "Remote fixed-cost” services: costly enough to host that they’re not offered for free, but where the load you personally contribute is still sufficiently bounded that a flat-rate subscription pricing model suffices.

  4. Plus a comparatively small amount of “remote compute services”: e.g. photo storage; paid for intentionally by the user, with variable costs or tiered plans.

The paradigm of yore was “local general-purpose” - you had a computer or device and used it however you saw fit; the networks didn’t really exist to support anything else. The internet then ushered in an era of “ad-supported” services, e.g. search engines, maps, and social networks. More recently, “remote fixed-cost” has been on the rise: services like Spotify and Netflix led the charge into the home on the media front, with VPNs riding on their coattails; team-oriented services like Zoom gained significant mindshare in the “free for home, paid for business” space; and most recently “cloud gaming” has started to take off as well. And finally, “remote compute services” exist in limited form, particularly for storage, but have not yet taken off.

(Tangent: I looked up Adobe Creative Cloud and its competitors to see if “paid cloud photo/video editing” is a thing, and it seems not yet? I’m a little surprised, as data-heavy applications that idle at zero but could easily spike to hundreds of cores during active use seem inherently well suited to this model.)

You’ll note that I did not include “remote general-purpose” in the list above, as I do not see it as a prevalent paradigm today - there is a bit of niche usage of “virtual desktop in the Cloud” style services, and little else. But it’s likely coming next. I expect the consumer portion of computing to shift in decent part towards remote variable-cost compute as the local slice of compute consumption becomes relatively smaller: through some mix of proliferating “remote compute services” offering pay-as-you-go or subscription access to myriad services for storage, photo/video editing, data analysis, model generation, data generation, etc; as well as new “remote general-purpose” paradigms that have yet to be developed, such as “apps” executing on dynamic remote resources.

While this could be presented as “desktop in the Cloud”, I doubt it will be - the desktop paradigm is suited to a small, always-on, fixed-size, pre-purchased unit of compute (one computer), and less clearly suited to expressing irregular bursts of short-lived, high-consumption activity. Solving for burst usage with cost predictability will likely require developing paradigms more inherently suited to that task, rather than simply transplanting a (relatively) dying paradigm into “the Cloud”. It seems much more likely to me that subscription services will take the broadest share, and decently likely that a cost-predictable paradigm for “remote apps” will be found to fill the remaining gap.
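
As a hypothetical illustration of why burst workloads fit poorly into a pre-purchased unit of compute, here is a toy cost comparison; the prices, machine cost, and usage profile are all made up for the example.

    # Hypothetical: renting burst capacity vs owning a machine sized for the peak.
    # Prices and the usage profile are invented purely for illustration.

    # Workload: ~200 cores for 3 hours, about 4 times a month.
    burst_core_hours = 200 * 3 * 4                  # 2,400 core-hours/month

    # Option A: rent on demand at an assumed $0.05 per core-hour.
    rented_cost = burst_core_hours * 0.05           # ~$120/month

    # Option B: buy a 200-core machine for an assumed $20k, amortized over 3 years.
    owned_cost = 20_000 / 36                        # ~$555/month, mostly idle

    owned_utilization = burst_core_hours / (200 * 24 * 30)   # ~2% of the month busy

    print(f"Rented burst: ~${rented_cost:.0f}/month")
    print(f"Owned peak-sized machine: ~${owned_cost:.0f}/month at ~{owned_utilization:.0%} utilization")

Under those made-up numbers the remote model wins by a wide margin; the open question is how to package that kind of variable cost so it feels as predictable to a consumer as a subscription does.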

As a final thought to leave you with: what happens to the donation of spare cycles towards causes one cares about in this future world? Today that means collectivist projects like Folding@Home, or seeding torrents to challenge content “ownership”. Will these still exist, and in what form?