Increasing digital infrastructure complexity, coupled with staffing concerns, is leading operators to consider more artificial intelligence (AI) and automation in their data centers. In this roundtable, attendees discussed the impacts and considerations of using AI in a mission-critical facility.
Uptime Institute Intelligence analyst Max Smolaks started the discussion by highlighting that AI in the data center remains in the early stages of development and adoption. The technology is expected to transform data center management, but six or seven years after the start of a major resurgence in AI research, case studies detailing its uses in the data center are few and far between.
In practical terms, AI should be seen as an extension of traditional automation technologies – which form an essential component of any modern data center. AI, at least in theory, should allow data center operators to automate increasingly complex operations – those that previously required human oversight. In many ways, this is an evolution of familiar automation tools, rather than the revolution promised by overzealous marketing professionals.
None of this means the AI-based tools on the market today are not worth the investment; machine learning (ML) models have proven their ability to speed up analysis, enable organizations to process huge amounts of data, and surface trends or patterns that are invisible to humans.
In the data center, this means forecasts may be more precise or project further into the future; recommendations may be more accurate, timely and detailed; and alerts may be raised much earlier.
The problem is that none of this functionality seems as exciting as the output of the machine learning models dominating the news, such as ChatGPT or Stable Diffusion. It is hard to set realistic expectations when we see technological outcomes that seem almost like magic.
Today, AI in data centers is primarily used for dynamic power or cooling optimization, anomaly detection, alert prioritization, predictive maintenance and other types of predictive analytics. All of this functionality is available as part of commercial, off-the-shelf products.
Dynamic cooling optimization in particular is reported to deliver measurable reductions in the power consumption of facilities equipment, and is now part of several data center infrastructure management (DCIM) offerings. In this scenario, the ML model learns the placement of data center cooling equipment using existing temperature sensors, then makes minute adjustments to temperature setpoints and observes the effects. This helps avoid hotspots in the server room and reduces the need for operators to over-provision their cooling capacity.
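For illustration only, the sketch below shows the general shape of such a closed loop: read the existing temperature sensors, nudge the setpoint in small steps and observe the effect. The sensor feed is simulated and the names (read_temperatures, TARGET_MAX_TEMP_C) are invented for this example; a commercial tool would rely on a learned model of the room's thermal response rather than the simple rule used here.

```python
import random

# Minimal sketch of a dynamic cooling optimization loop. The sensor feed is
# simulated and all names are hypothetical; a real system would use a learned
# thermal model and push setpoints to the building management system.

TARGET_MAX_TEMP_C = 27.0   # assumed upper rack-inlet limit from the facility's thermal policy
STEP_C = 0.25              # size of each minute adjustment to the cooling setpoint

def read_temperatures(setpoint_c: float) -> list:
    """Stand-in for existing rack-inlet sensors: returns simulated readings."""
    return [setpoint_c + random.uniform(1.0, 4.0) for _ in range(8)]

def optimize_setpoint(initial_setpoint_c: float, steps: int = 50) -> float:
    """Nudge the setpoint up or down, observing the hottest reading each step."""
    setpoint = initial_setpoint_c
    for _ in range(steps):
        hottest = max(read_temperatures(setpoint))
        if hottest < TARGET_MAX_TEMP_C - STEP_C:
            setpoint += STEP_C   # room is cooler than necessary: save energy
        elif hottest > TARGET_MAX_TEMP_C:
            setpoint -= STEP_C   # hotspot risk: add cooling back
        # otherwise hold the current setting and keep observing
    return setpoint

if __name__ == "__main__":
    print(f"Recommended cooling setpoint: {optimize_setpoint(22.0):.2f} C")
```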
Alert prioritization is another interesting use case, in which the model ‘learns’ the criticality of various system alerts and helps staff avoid the ‘alert fatigue’ that could cause them to miss a critical issue.
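As a rough illustration only (not the method of any particular product), the snippet below trains a simple classifier on operator-labelled historical alerts and uses the predicted probability as a priority score. The features (severity code, redundancy flag, repeat count) are invented for this example, and scikit-learn stands in for whatever model a vendor might actually use.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Minimal sketch of ML-based alert prioritization, assuming historical alerts
# have been labelled by operators as critical (1) or routine (0).
# Each row: [severity_code, affected_system_is_redundant, repeats_last_hour]
X_train = np.array([
    [3, 0, 5],   # high severity, no redundancy, repeating -> critical
    [1, 1, 1],   # low severity, redundant system          -> routine
    [2, 1, 0],
    [3, 1, 2],
    [1, 0, 0],
    [2, 0, 4],
])
y_train = np.array([1, 0, 0, 1, 0, 1])

model = LogisticRegression().fit(X_train, y_train)

# Score incoming alerts and surface the highest-priority ones first.
new_alerts = np.array([[2, 1, 1], [3, 0, 6]])
priorities = model.predict_proba(new_alerts)[:, 1]
for alert, score in sorted(zip(new_alerts.tolist(), priorities),
                           key=lambda pair: -pair[1]):
    print(f"alert features {alert}: criticality score {score:.2f}")
```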
A more recent, and probably lesser known, application employs machine learning models to find the best place to deploy a particular server, based on constraints such as power, cooling or network availability.
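A minimal sketch of this idea, assuming per-rack power, cooling and network figures are already tracked by the management system: rule out racks that violate any constraint, then rank the remaining candidates. The field names and the headroom-based ranking are illustrative; a real tool might learn its ranking from past placements instead.

```python
from dataclasses import dataclass
from typing import List, Optional

# Minimal sketch of constraint-aware server placement. All field names are
# invented for the example; a learning-based tool would score candidates with
# a trained model rather than the plain headroom rule used here.

@dataclass
class Rack:
    name: str
    spare_power_kw: float
    cooling_headroom_kw: float
    free_network_ports: int

def place_server(racks: List[Rack], power_kw: float, ports: int) -> Optional[Rack]:
    # Keep only racks that satisfy every constraint.
    feasible = [r for r in racks
                if r.spare_power_kw >= power_kw
                and r.cooling_headroom_kw >= power_kw
                and r.free_network_ports >= ports]
    if not feasible:
        return None
    # Prefer the rack that keeps the most balanced power/cooling headroom
    # after the new server is installed.
    return max(feasible, key=lambda r: min(r.spare_power_kw - power_kw,
                                           r.cooling_headroom_kw - power_kw))

if __name__ == "__main__":
    racks = [Rack("A01", 3.0, 2.5, 4), Rack("B07", 6.5, 5.0, 2), Rack("C12", 1.0, 4.0, 8)]
    best = place_server(racks, power_kw=2.0, ports=2)
    print(f"Suggested rack: {best.name if best else 'none available'}")
```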
Max noted that according to Uptime Institute research, operator confidence in AI was growing: in the 2022 Global Data Center Survey, 57% of respondents said they would trust an adequately trained machine learning model to make operational decisions in their facility.
The majority of data center operators were also confident that AI-based tools would eventually reduce their staffing requirements – with 19% predicting this would happen in the next five years, and 52% expecting it to take longer.
Max concluded his presentation by noting that risk-averse organizations don’t have to experiment with AI and build their own models – they can wait until this technology gets integrated into existing types of management and operations software. AI will become a standard feature, eventually.
During the roundtable, one Uptime Institute member working for a colocation company shared their experience with Vigilent, a software system used for dynamic cooling optimization. They said they were initially skeptical, but the ML-powered system delivered a noticeable improvement in energy efficiency. They added that AI in its current state was likely best used to give direction or advice, rather than to take direct control of facilities equipment.
Another member said their colocation organization was actively investigating the applications of AI but had yet to adopt any in production.
Max added that in the future, once AI models are more prevalent in the data center, they could help with the skills shortage and enable knowledge transfer. In this scenario, models would be trained on the daily routines of the most experienced staff, then use this information to guide employees who have just joined the data center workforce.
One concern that often accompanies AI-based tool deployments is the security of training data. This is not an issue when models are trained in-house, but some products enable data center operators to share anonymized facility data over the network so that software vendors can build more complex models. Sharing this data makes many organizations uncomfortable.
An Uptime member present at the roundtable said they would consider sharing their data if there was a clear payback for their organization. Another member noted that the development of better models would be valuable to the industry as a whole – even if the direct benefit to their own organization was unclear.
Mohamed Hashem, a technical consultant with Uptime Institute, highlighted that no matter the use case, operators would have to do the work to adapt AI models to their specific facility. No two data centers are the same, and those who hope to get major improvements from generic ‘out-of-the-box’ models would be disappointed. He also reiterated that the current generation of AI tools was best placed to advise operators, rather than control aspects of the facility.
Key takeaways:
- The adoption of AI in the data center is increasing, albeit slowly, and many operators now have first-hand experience of this technology;
- Unlike most types of data center software, AI-based tools require careful handling and attention (and data) over time;
- It is fine to adopt a ‘wait and see’ approach to AI, since the technology needs to be tested extensively, and any issues fixed, before it can be embraced by a risk-averse industry;
- Development of AI in the data center is likely to remain slow as the sector is dealing with other critical matters, like the energy security issues in Europe, and supply chain delays affecting operators across much of the world.