The cost and complexity of deploying large-scale GPU clusters for generative AI training will drive many enterprises to the public cloud. Most enterprises will use pre-trained foundation models to reduce computational overheads.
Dr. Owen Rogers is Uptime Institute’s Senior Research Director of Cloud Computing. Dr. Rogers has been analyzing the economics of cloud for over a decade as a chartered engineer, product manager and industry analyst. Rogers covers all areas of cloud, including AI, FinOps, sustainability, hybrid infrastructure and quantum computing.
orogers@uptimeinstitute.com
While FinOps aims to manage cloud costs alone, technology business management seeks to aggregate all costs of IT, including data centers, servers, software and labor, to identify savings and manage return on investment.
Enterprises have various options on how and where to deploy their AI training and inference workloads. This report explains how these different options balance cost, complexity and customization.
To meet the demand driven by AI workloads, a new breed of cloud provider has emerged, delivering inexpensive GPU infrastructure as a service. Their services are in high demand today, but in the longer term the market is ripe for consolidation.
While GPUs are the power-hungry devices that enable effective AI training, it is innovations in software that are fueling the recent surge in interest and investment. This report explains how neural networks power generative AI.
Although quantum computing promises a revolution in scientific discovery, its use is still confined to research and ongoing development. However, a new IBM quantum data center in Germany signals a growing interest in its capabilities.
Reserved instances are a pricing model for virtual machines offered by cloud providers. As they offer savings of up to 70% compared with on-demand pricing, organizations should use them liberally, especially in challenging times.
The key benefit of cloud computing lies in its on-demand pricing model. This enables organizations to grow or shrink their applications at will without giving the cloud provider any advance notification. Cloud providers can only offer such…
The Uptime Institute Global Data Center Survey highlights the experiences and strategies of data center owners and operators in areas of resiliency, sustainability, efficiency, staffing, cloud and innovative technologies.
In recent conversations with both regulators and some enterprises, a concept borrowed from the financial sector has been discussed with growing frequency: concentration risk. In finance, the term refers to the level of risk arising from the…
Organizations encounter a bewildering assortment of cloud storage platforms. The difference between the offerings lies in who is responsible for scaling, resiliency and performance: the provider or the customer.
The public cloud's on-demand pricing model is vital in enabling application scalability — the key benefit of cloud computing. Resources need to be readily available for a cloud application to scale when required without the customer having to give…
Low latency is the main reason cloud providers offer edge services. Only a few years ago, the same providers argued that the public cloud (hosted in hyperscale data centers) was suitable for most workloads. But as organizations have remained…
Cloud providers divide the technologies that underpin their services into two "planes", each with a different architecture and availability goal. The control plane manages resources in the cloud; the data plane runs the cloud buyer's application. In…
This report shows how rising market concentration and poor visibility drive risk exposure, and explains why organizations should prioritize resilient single-cloud architectures before they consider dual-cloud implementations.