Strategic approaches for strengthening digital resilience
IT outages are part and parcel of the digital world. What's important is that organisations have sufficient IT choices that enable them to respond quickly and withstand such disruptions.
When you pull the plug, the lights go off. Thankfully, in the "analogue world", a backup power supply kicks in so that essential services can continue normally while we try to resolve the issue at hand.
When it comes to the digital world, things are not that simple.
The backup plan
When a simple programming error wreaked havoc on IT systems in India and around the world during the CrowdStrike global IT outage, the company was able to provide a remedy quickly, albeit one that did not scale because it required a manual reboot of each affected machine.
Imagine a scenario where the company was still struggling to identify the cause of the issue. How would customers mitigate the impact? Would they have "digital backup power"?
Our smartphones have multiple chat, messaging, email, banking, and browser apps. Why? So that if one goes down, we can quickly switch to another without downtime. In effect, we are building digital resilience into our mobile devices.
What is digital resilience?
Digital resilience refers to an organisation's ability to adapt, recover, and maintain its essential functions and integrity in the face of challenges, disruptions, and cyber threats in the digital environment.
In today's hyper-connected world, technology failures cause major problems: losing access to banking services, disrupted air travel, and being unable to carry out everyday tasks such as buying groceries. In a country of 1.4 billion people, the scale of these problems, and the cost of them, increases significantly.
According to a report, the median cost of IT outages at Indian organisations stands at around Rs 520 crore per year.
What can compromise an organisation's digital resilience?
There are a variety of threats that can disrupt IT services. Outages can result from power failures, hardware issues, software glitches, and human errors. Cybersecurity attacks, such as software supply chain exploits, paid "as-a-service" attacks, and Gen AI-based attacks, can also bring down IT services. Finally, disruptions can also result from issues with cloud providers or third-party vendors.
Role of the government in preserving business continuity
Governments worldwide are introducing regulations and legislation that prioritise digital resilience.
On April 30, 2024, the Reserve Bank of India (RBI) released an updated Guidance Note on operational risk management and operational resilience. The guidance emphasises the importance of sound operational risk management for financial institutions, stressing the need to proactively identify, assess, and manage risks in order to strengthen operational resilience and safeguard against potential losses.
Singapore’s central bank, the Monetary Authority of Singapore (MAS), has business continuity management guidelines to help financial institutions be resilient against service disruptions arising from IT outages, pandemic outbreaks, cyberattacks and physical threats.
Singapore also has a Cybersecurity (Amendment) Bill, which requires the owners of critical services to report a wide range of cybersecurity incidents.
In Europe, the Digital Operational Resilience Act (DORA) is a regulatory framework for the financial sector on digital operational resilience, whereby all financial firms need to ensure they can withstand, respond to, and recover from all types of IT-related disruptions and threats.
In the US, the Cybersecurity and Infrastructure Security Agency (CISA) has a Resilience Services Branch that provides guidance to secure and enhance the resilience of the nation's critical infrastructure systems.
However, regulations alone can't fully ensure business continuity. Organisations must also make informed IT decisions when developing their IT strategy.
What can organisations do to improve their digital resilience posture?
First and foremost, organisations need to be able to make IT choices with ease. Consolidating with one IT vendor can ease engagement and perhaps reduce costs. However, it can also increase software concentration risk, since you have put all your eggs in one basket.
A smarter choice would be to adopt a multi-vendor strategy for the key software infrastructure powering your essential services, for example, running more than one type of operating system on your servers or using more than one distribution of Kubernetes for your containerised applications.
In this way, you can quickly switch to alternate solutions during disruptions.
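As a rough illustration of what switching to an alternate solution can look like in practice, the sketch below tries a primary provider and falls back to a secondary one when the first fails. The provider functions and the `call_with_failover` helper are hypothetical names invented for this example; they are not part of any vendor's product, and a production setup would add health checks, timeouts, and monitoring.

```python
from __future__ import annotations

from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every configured provider has failed."""


def call_with_failover(
    providers: Sequence[tuple[str, Callable[[], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, result) of the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call()
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for two independent vendor stacks.
def primary_vendor() -> str:
    raise ConnectionError("primary stack is down")


def secondary_vendor() -> str:
    return "payment processed"


used, result = call_with_failover([
    ("primary", primary_vendor),
    ("secondary", secondary_vendor),
])
print(used, result)
```

The point of the sketch is the ordering: the essential service keeps running as long as at least one independent stack is healthy, which is only possible if a second stack exists before the disruption occurs.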
Secondly, organisations should use open, interoperable, and vendor-agnostic solutions that give them the flexibility to adopt different vendors' products without being locked in to any one vendor. This enables them to create IT architectures that support digital resilience.
Thirdly, ensure that your infrastructure is sized appropriately. For example, the RBI has emphasised the need for banks to invest adequately in IT infrastructure as per business growth and transaction volume to reduce the frequency of IT outages.
Next, frameworks must be established to identify mission-critical services and operations and develop an alternative technology stack for them.
Finally, ensure that there is ready access to the technical expertise needed to support the IT infrastructure.
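The step of identifying mission-critical services and giving each an alternative technology stack can start as a simple inventory check. The sketch below is a minimal example under stated assumptions: the `Service` record, the `resilience_gaps` helper, and the service and stack names are all hypothetical, chosen only to show the idea of flagging critical services that have no fallback.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Service:
    name: str
    mission_critical: bool
    primary_stack: str
    alternative_stack: Optional[str] = None  # None means no fallback exists yet


def resilience_gaps(services: List[Service]) -> List[str]:
    """Return names of mission-critical services that lack an alternative stack."""
    return [
        s.name
        for s in services
        if s.mission_critical and s.alternative_stack is None
    ]


# Hypothetical inventory; the names are placeholders, not real products.
inventory = [
    Service("payments", True, "vendor-a-kubernetes", "vendor-b-kubernetes"),
    Service("core-banking", True, "vendor-a-linux"),
    Service("intranet-wiki", False, "vendor-a-linux"),
]

gaps = resilience_gaps(inventory)
print(gaps)
```

Here only "core-banking" is reported, because it is marked mission-critical and has no alternative stack, while the non-critical wiki is ignored. A real framework would also record recovery objectives and how often the alternative has been tested.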
IT outages are part and parcel of the digital world. What's important is that organisations have sufficient IT choices that enable them to respond quickly and withstand such disruptions.
Vishal Ghariwala is the Senior Director and CTO - Asia Pacific at SUSE.
Edited by Suman Singh
(Disclaimer: The views and opinions expressed in this article are those of the author and do not necessarily reflect the views of YourStory.)