Event Recap
RECAP | ROUNDTABLE | New Digital Resiliency Architectures
Participants at Inside Track’s May 13th roundtable shared their experiences with digital resiliency, with two North America-based companies reporting that they had broadly transformed their businesses away from traditional “five-nines” concepts of resiliency. Others on the call reported high levels of organizational interest.
At the outset, Uptime Institute VP IT Optimization and Strategy Todd Traver described the benefits of digital architectures. He noted that traditional measures of resiliency fail when organizations offer services using multiple applications that reside on different platform types and in different environments. Traver said that two-thirds of service outages are now caused by network or service providers, with only one-third due to data center failures, a complete reversal of historic patterns.
One participant from a traditional telco provider supported Traver’s position. He said that nines are a less relevant measure for his company because a single transaction can sometimes involve as many as 37 applications, each running in geographically distinct, and sometimes different, facilities. He noted that measuring equipment reliability in these circumstances is difficult, and that the same techniques become infinitely harder when looking at components such as ports and are probably impossible for software. It is, he said, preferable to consider all parts of the system to be unreliable, with an architecture that includes failovers and backups for all situations.
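The arithmetic behind this point can be sketched briefly. The example below is only an illustration, not a calculation presented at the roundtable: it assumes, hypothetically, that each of the 37 applications independently achieves "five-nines" (99.999%) availability and that a transaction succeeds only if every application is up. The function names are illustrative as well.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def serial_availability(per_component: float, count: int) -> float:
    """Availability of a chain in which every component must be up.

    Assumes component failures are independent (a simplifying assumption).
    """
    return per_component ** count


def redundant_availability(per_replica: float, replicas: int) -> float:
    """Availability of a group of replicas where any one replica suffices.

    Assumes independent failures and instantaneous, perfect failover.
    """
    return 1.0 - (1.0 - per_replica) ** replicas


def downtime_minutes_per_year(availability: float) -> float:
    """Expected unavailable minutes in a 365-day year."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    five_nines = 0.99999  # hypothetical per-application availability
    chain = serial_availability(five_nines, 37)
    print(f"One application: {downtime_minutes_per_year(five_nines):.1f} min/year")
    print(f"37-application transaction: {chain:.5%} available, "
          f"{downtime_minutes_per_year(chain):.0f} min/year")
```

Under these assumptions, the end-to-end transaction is available only about 99.963% of the time, or roughly 3.2 hours of downtime a year, compared with about 5 minutes for a single five-nines application. The `redundant_availability` helper shows the other side of the argument: two independent replicas that are each 99% available give about 99.99% when either one can serve the request, which is why the participant favored treating every part as unreliable and designing in failover.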
A participant from a large financial-services company largely agreed. Her firm had successfully transitioned all its merchant gateway services to a new architecture and wanted to transition its core banking services. The challenges, she noted, included changing the corporate culture and moving away from siloed visions of reliability. It was also very important to have C-level engagement.
Traver noted that the task of moving to robust resiliency architectures could be quite challenging. It involved, for example, increased automation, especially in performance monitoring, as well as frequent testing. Participants pointed out that it was important to “design in” resiliency as part of application development, with validation testing performed during the application release process and ongoing testing of the end-to-end digital infrastructure environment to ensure the quality of service provided to the end user.
Individual participants shared areas of concern, based on their experiences. These included the availability of compatible components and software, especially when addressing longer latencies due to distance or because backup systems may not perform as well as primary systems. Kevin Heslin, who facilitated the meeting, asked how enterprises could be certain that third-party systems or facilities in hybrid cloud environments could meet demanding SLAs.
The enterprise participants also shared tips. They agreed on the need for greater involvement from infrastructure teams, more personnel with cross-competency skills, and better-labeled and more capable off-the-shelf software and tools.