Event Recap
RECAP | ROUNDTABLE | De-energized Work in the New Normal
De-energized work has been a hot button (pun intended) in the data center industry for close to a decade. The restrictions around hot work have been discussed time and time again. However, the pandemic appears to have created a new set of operating parameters that causes even more consternation around de-energizing equipment. IT workloads are becoming even more critical, which reduces the work hours available for maintenance. In this roundtable, attendees talked about how they are now addressing de-energized work in the New Normal.
Discussion:
In the opening remarks, attendees mentioned how they have adopted strict policies around de-energized work, and how safety is the priority. They stated that they wanted to listen and learn from others, and to understand how cultural shifts have changed the thinking. Others indicated they are still dealing with cultural pushback and that there is major concern about making errors when adopting a non-hot work process.
Chris Brown, Uptime Institute CTO, kicked off the discussion by stating Uptime Institute’s position: work on electrical equipment should be conducted when the equipment is de-energized. Safety is the most important aspect, and de-energized work provides that protection to workers. If anything goes wrong, it could impact people in a big way, as well as damage equipment. Uptime Institute is seeing more actions from OSHA against people conducting energized work, and is also seeing contractors refuse to do energized work.
For most companies, equipment can be shut down when work is being conducted at and above the UPS system. Downstream, below the UPS system, appears to be more challenging because there is little to no protection for the critical load. PDUs are typically not set up so that you can safely work in the individual panels. Therefore, you have a lot of branch circuits to transfer in order to move the load, creating additional risk to IT loads. IT's avoidance of risk, and the time it takes to move the loads and perform the work, appear to be the major issues when seeking IT's concurrence and approval to perform the work.
One attendee stated moving loads around to essentially single source the affected IT equipment is a critical work effort with notifications going out to customers. Identifying all the customers impacted is the challenge. You need to do the research first to make sure you get all the impacted customers, communicate with them, and then get their buy-in. It can take months before you get all these buy-ins and approvals. Even with approvals and buy-in, this attendee still has had issues with IT causing back-outs at the last minute. If you still have single corded equipment installed, this effort gets even more complicated and lengthy. Through governance and standards, it is recommended to not allow single corded IT equipment.
Another attendee talked about how there definitely needs to be a partnership between data center facilities and IT. It goes back to the importance of data to track IT server owners. In their environment, the data center team is the driver behind the need to perform the work and any maintenance, so they are dictating to the business their needs. They have standards in place as well to not allow single corded equipment. Even though you have a partnership, at some point you may have to draw a line in the sand with safety coming first.
An attendee chimed in about how the historical divide between facilities and IT has caused a different cultural mindset. The IT mindset is that they cannot allow single sourcing because it creates an undue risk. Yes, you can have a standard saying no single corded IT equipment, but single corded and tri-corded equipment still lurks in the environment, allowed by exception. All this has contributed to reluctance from IT to allow equipment to be isolated and de-energized for maintenance and work.
An attendee then stated that the industry has spoiled its customers in the past by conducting hot work to deploy branch circuits without any issues. In their environment today, when they get down to the stakeholder level and explain the risk profile and how they mitigate it, IT seems to always come around.
Chris Brown reiterated it is all about risk. He agreed we have spoiled IT in the past. Uptime Institute has seen that if you explain the higher risks and your mitigation processes, most IT teams will agree to de-energize the equipment. The key is to communicate and explain all levels of risk, which would include personnel safety. An attendee agreed and added that it helps to use IT’s language when discussing risk. For decades, IT thought power delivery was behind-the-scenes magic.
An attendee chimed in that there is a risk in NOT doing the proper maintenance work. It is on us in data center facilities to make that case as well. Even though their site is concurrently maintainable, it’s still a hard sell, adds time for approval, and jeopardizes getting the work done. Education is needed around temporary redundancy and what it entails, but in the end it just comes back to personnel safety. The attendee pointed out how history helps us with this, and how the data center trend is our friend per the following:
• Arc flash risks and personnel safety have caused compliance and legal issues
• There is a clear linkage between electrical safety and reliability (preventive maintenance)
Another attendee stated they host a monthly global risk management calendar review with all their customers to provide transparency and so they can ask questions. This has proven to be very successful and provides proactive communication.
The discussion then shifted to the increasing trend where IT wants this type of work to be performed during times when there is less IT risk. One attendee indicated they received buy-in, but the challenge of making sure everything is done right, along with the narrow IT windows, makes performing the de-energized work more and more challenging each year. Everyone wants safety and resiliency, but non-work windows are increasing. Down windows are driven by system availability, but facilities is not always included in those discussions. It takes time to get those windows changed, and nobody wants the windows to be any less than what they presently are.
Lastly, an attendee stated that when it comes to upgrading equipment, like PDUs and RPPs, the desire to perform de-energized work is driving equipment selection decisions.
To summarize, all attendees seemed to agree: safety versus IT business risk is the flash point. The key is being proactive with communication and education about the need for personnel safety and the ability to perform the work in a de-energized fashion.
For additional information on this topic, below is the link to the Intelligence Collection titled NFPA70E and De-Energized Work.