It used to be that every company kept an IT closet or wing in its office building to support digital operations. In the late 1990s and early 2000s, companies realized it was more effective to move those servers into co-located data centers, which could cool, network, and maintain them more efficiently. Then, during the 2010s, the “cloud” took off, letting companies stop owning or hosting hardware altogether. What used to be a huge CAPEX endeavor became a recurring monthly fee paid to hyperscale data centers.
Data center capacity trends
More and more companies have moved to the cloud for the flexibility it provides, hugely benefiting the “hyperscale” behemoths of Amazon (AWS), Microsoft (Azure), and Google (GCP). However, big cloud computing carries a hidden and rising cost, and the pressure it puts on margins can start to outweigh the benefits, particularly as a company scales.
In a prominent article by Andreessen Horowitz (A16Z), “The Cost of Cloud, a Trillion Dollar Paradox,” the authors detail how large a share of total cost of revenue (COR), or cost of goods sold (COGS), companies incur when running their workloads on large cloud providers. A16Z estimates that $100B of market value is being lost among the top 50 public software companies, and more than $500B among public companies overall, due to the impact of cloud spend on margins.
A billion-dollar private software company told A16Z that its public cloud spend amounted to 81% of COR, and that “cloud spend ranging from 75 to 80% of cost of revenue was common among software companies” (see chart below). A16Z also shared that it is common for startups to spend more than 80% of their total capital raised on compute resources. Anthropic, a rival to OpenAI, spent more than half of the revenue it generated last month paying cloud providers such as Amazon and Google to run its large language models.
Estimated annualized committed Cloud spend as % of Cost of Revenue
The question is whether the 30% margins currently enjoyed by cloud providers will eventually be competed away. This is unlikely, given that the majority of cloud spend is currently directed toward an oligopoly of three companies. And here’s a bit of dramatic irony: part of the reason Amazon, Google, and Microsoft (representing a combined ~$6 trillion market cap) are all buffered from competition is that they have high profit margins driven in part by running their own infrastructure, enabling ever greater reinvestment into product and talent while buoying their own share prices.
We’re helping companies train AI at the lowest cost—enabling you to save more.
So, how big can the cloud get before it starts raining?
At Build AI, we think the hyperscale cloud players are big enough, and that it is time for new competition in the market to reduce costs, particularly for expensive AI training workloads.
We are locking in structural advantages in CAPEX and OPEX to drastically reduce the cost of interruptible / batchable workloads, like training AI models:
- CAPEX - Because AI model training is interruptible (it can be paused at a checkpoint and resumed later), we don’t have to overbuild backup power systems like a traditional data center. We are deploying modular data centers that can be built in as little as 4 weeks (vs. the ~2-3 years it takes to build a traditional data center) and are using more affordable chip types.
- OPEX - We’re deploying our modular data centers in parts of the country with low-cost renewable energy (e.g. West Texas), since energy typically accounts for ~60% of the ongoing cost of operating a data center. We’re also leveraging demand response programs with local utilities, which pay Build AI to power down when energy prices are highest (and the grid is dirtiest).
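To make the idea of an interruptible workload concrete, here is a minimal Python sketch of a job that checkpoints its progress and pauses when power gets expensive. It is an illustration only, not a description of Build AI's actual scheduler: the price threshold, the `get_price` callback, the JSON checkpoint format, and every function name are hypothetical, and a real training job would also save model and optimizer state, not just a step counter.

```python
import json
import os
import tempfile
from typing import Callable

# Hypothetical threshold: pause whenever power costs more than this.
PRICE_CAP_USD_PER_MWH = 100.0


def load_step(path: str) -> int:
    """Return the last checkpointed step, or 0 if no checkpoint exists."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["step"]


def save_step(path: str, step: int) -> None:
    """Write the checkpoint atomically so a power cut cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"step": step}, f)
    os.replace(tmp_path, path)  # atomic rename over the old checkpoint


def run(path: str, total_steps: int, get_price: Callable[[], float],
        checkpoint_every: int = 10) -> int:
    """Resume from the last checkpoint and train until done or paused.

    Returns the step reached. Calling run() again later resumes from there.
    """
    step = load_step(path)
    while step < total_steps:
        if get_price() > PRICE_CAP_USD_PER_MWH:
            save_step(path, step)  # persist progress, then yield the power
            return step
        step += 1  # stand-in for one real training step
        if step % checkpoint_every == 0:
            save_step(path, step)
    save_step(path, step)
    return step
```

Because progress is persisted before the job stops, a pause costs at most the time spent idle rather than any completed work, which is what allows this kind of workload to run without heavy backup power.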
Lower cost and lower environmental impact: start training your AI models with Build AI today. Cut your AI model training costs by over 50% vs. what you pay today. To get started, please complete our short questionnaire detailing your requirements.