UII UPDATE 492 | MAY 2026
For enterprises testing generative AI in operational systems, differences in service reliability between providers are becoming more visible, and may prove a critical differentiator in platform choice. Poor reliability may slow the adoption of AI in mission-critical enterprise applications.
Recent status data illustrates the challenge. Over the 90 days to late April 2026, Anthropic reported Claude API availability of around 99.05%, equivalent to roughly 20 hours of service disruption. By typical enterprise standards, this is substandard.
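As a sanity check, downtime follows directly from the availability percentage and the length of the reporting window. The minimal sketch below uses the 90-day window and 99.05% figure cited above; it is illustrative arithmetic, not Anthropic's measurement methodology.

```python
# Convert a reported availability percentage into downtime over a window.
# Illustrative arithmetic only, using the figures cited above.

WINDOW_DAYS = 90
AVAILABILITY = 0.9905  # reported Claude API availability

window_hours = WINDOW_DAYS * 24                     # 2,160 hours
downtime_hours = window_hours * (1 - AVAILABILITY)

print(f"Downtime: {downtime_hours:.1f} hours")      # ~20.5 hours
```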
Hyperscale cloud providers, including Amazon Web Services (AWS), Microsoft Azure and Google Cloud, do not publish AI-service availability in a comparable way. However, recent incident reporting suggests fewer visible disruptions at the platform level, despite operating at much larger scale and handling significantly higher aggregate demand.
Inference relies on multiple layers: data center infrastructure, IT hardware and the software that serves and orchestrates model execution. The model itself has limited direct influence on availability; reliability depends more on how these supporting layers are operated. In cloud-based inference services, these layers are typically managed by a provider, but responsibility for different parts of the stack can be split between organizations.
Hyperscalers such as AWS and Google Cloud operate the infrastructure while hosting models developed by third parties. At the same time, model developers such as Anthropic and OpenAI also offer inference services directly to customers. They rely on hyperscale cloud providers (AWS and Microsoft Azure, respectively) for infrastructure, but remain responsible for how their models are served, scaled and delivered as services.
This difference in availability is unlikely to be driven by infrastructure reliability, as these services often share the same underlying cloud platforms. Uptime Intelligence analysis shows that dual-zone cloud architectures typically achieve availability of around 99.97% (see Figure 1 and AWS outage: what are the lessons for enterprises?). Instead, the variation is more likely to arise from the serving layer or middleware (including request orchestration, scaling policy and capacity management), which is controlled by the model provider rather than the underlying cloud infrastructure provider.
Figure 1 Availability for different application architectures
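The dual-zone figure can be approximated with the standard redundancy calculation, assuming zone failures are independent. The sketch below is illustrative only: the per-zone availability is a hypothetical input chosen to reproduce the roughly 99.97% result, and real zone failures are partially correlated, so independence is an optimistic assumption.

```python
# Standard redundancy calculation for a dual-zone deployment.
# Assumes zone failures are independent, which is optimistic: real
# zones share some failure modes. The per-zone figure is hypothetical,
# chosen to reproduce the ~99.97% dual-zone availability cited above.

single_zone = 0.983                      # hypothetical per-zone availability
dual_zone = 1 - (1 - single_zone) ** 2   # service fails only if both zones fail

print(f"Dual-zone availability: {dual_zone:.2%}")  # ~99.97%
```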

Anthropic's run-rate revenue grew from $9 billion at the end of 2025 to $30 billion in April 2026. Its low availability may partly reflect the rapid growth of Claude services, with demand potentially increasing faster than capacity can be brought online. The company is increasingly partnering with cloud providers to expand that capacity.
In October 2025, Anthropic announced plans to use up to one million of Google Cloud's tensor processing units (TPUs), in a deal worth tens of billions of dollars that is expected to bring more than a gigawatt (GW) of capacity online in 2026.
In April 2026, Amazon and Anthropic significantly expanded their partnership, with Anthropic committing to spend more than $100 billion on AWS technologies over the next decade to train and serve its AI models. As part of the agreement, Amazon will invest an additional $5 billion in the company, with the potential for up to $25 billion in total future investment. Anthropic will also secure up to 5 GW of AWS compute capacity using AWS' custom Trainium and Graviton chips.
Anthropic's customers may have limited recourse when reliability falls short of expectations. Anthropic offers service level agreements (SLAs) for Claude to some customers, but not as standard. AWS, Google Cloud and Microsoft Azure offer monthly availability guarantees of 99.9%, although the service credits offered will rarely compensate for the customer dissatisfaction and financial impacts that result from poor inference availability.
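To put those guarantees in perspective, the sketch below converts availability levels into allowed downtime per 30-day month, alongside the roughly 99.05% figure reported above. It is illustrative only; actual SLAs define their own measurement windows and exclusions, and the 99.95% tier is included purely for comparison.

```python
# Monthly downtime implied by different availability levels.
# Illustrative only; real SLAs define measurement windows and exclusions.

MONTH_MINUTES = 30 * 24 * 60  # 43,200 minutes in a 30-day month

for availability in (0.999, 0.9995, 0.9905):
    downtime = MONTH_MINUTES * (1 - availability)
    print(f"{availability:.2%} -> {downtime:,.0f} minutes/month")

# Output:
# 99.90% -> 43 minutes/month
# 99.95% -> 22 minutes/month
# 99.05% -> 410 minutes/month (~6.8 hours)
```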
AI inference services should be evaluated not only on model capability, but also on operational maturity, including availability history, service level agreements, scaling behavior and support processes. As AI becomes more embedded in business processes, differences in service reliability are likely to become a more significant commercial and operational risk.