A risk management playbook for improving organizational resilience

MIT Sloan experts offer a systematic approach to organizational resilience that can help leaders manage risk and rebound rapidly when catastrophic events strike.

Beth Stackpole

Mar 7, 2023

Consider the recent system failures at Southwest Airlines that put the brakes on holiday travel. Or the Deepwater Horizon oil spill or the Norfolk Southern freight train derailment, both of which rapidly became hazardous-waste disasters.

What do these high-profile incidents have in common? They were unexpected, catastrophic events where system responses consistently lagged behind the incident despite numerous opportunities for intervention. In fact, some of the underlying problems had existed and were known about for a prolonged time.

With so many public examples of the debilitating consequences of inadequate resilience planning, why do so many organizations continue to fail at mitigating risk?

The reason, according to MIT Sloan professors Levi and Carrier, is that most companies operate without proper awareness of the operational boundaries of their systems and therefore fail to detect in a timely manner that their regular operations have been disrupted.

Organizations lack insight into what their enabling operating conditions are (be they significant or seemingly innocuous) and whether they are threatened, the scholars explained during a recent webinar hosted by MIT Sloan Executive Education.

Hospitals offer just one recent example, they said. Masks and gowns are among the cheapest supplies in a hospital yet are critical enablers for surgery. But before the COVID-19 pandemic, few if any health care organizations would have prioritized personal protective equipment, because there was no prior indicator that supplies would be at risk.

Resilience, the professors maintain, is an organization’s ability to:

Cultivate awareness of its own systems boundaries and enablers.
Continuously monitor changes over time and identify risks and threats that could impact enabling operating conditions.
Establish a plan of action if one or more of the enabling conditions is violated.

“Resiliency is the moment when your company is punched in the mouth — it’s about how prepared you are for that moment and how well you can recover,” said Levi, a professor of operations management. “Often, the very small things you take for granted can be enabling or boundary conditions of your system. And once that’s violated, your system fails to operate.”

In their executive education course, “Building Organizational Resilience: A System Approach to Mitigating Risk and Uncertainty,” Carrier and Levi lay out a playbook for improving organizational resilience. They advocate for an approach that leans on systems thinking and continuous improvement to help organizations identify problems before they occur.

They also detail how to identify the right intervention points in operations and the supply chain to help an organization stop or recover from an evolving critical situation before its impact turns catastrophic. Here is key advice from their resilience framework.

Establish structured decision processes

In order to understand vulnerabilities, organizations need to understand the operational boundaries of their systems. It starts with leaders asking the right questions and following up on the answers in a timely manner.

While organizations have created many processes that tell people how to perform tasks, most lack a structured decision-making process, from ensuring all relevant aspects are considered to creating proper debriefing protocols so they can learn from and evolve in response to near misses and mistakes.

“You need to scrutinize everything you do and identify the weak points rather than waiting for [a] failure to investigate what happened,” said Levi.

Design systems for both resilience and steady-state operation

Initiatives that ensure success when everything is working according to plan — for example, process tweaks intended to drive efficiency or increase profits — can be the very same things that wreak havoc during unforeseen disruptions.

Organizations need to design systems with enough flexibility and resilience so they can weather a storm of irregular operations. It’s important to carefully evaluate the trade-offs between what it takes to be successful in a steady state versus maintaining operations in the face of disruption.

Understand the possibilities and limits of technology’s enabling role

Sensors, analytics, and applications play an outsize role in any resilience strategy; however, technology and data can go only so far in delivering results. Technology is a tool to automate or enhance existing processes, but it will only generate bad outcomes faster if the quality of those processes isn’t up to snuff, said Carrier, a senior lecturer in system dynamics.

In the same vein, collecting myriad data will generate a lot of signals, but it’s just noise unless you have the ability to interpret the signals and use them to trigger appropriate actions. “You can’t just drop a point solution in and expect things to get better,” Carrier said. “You have to think of the system effects that technology will have.”

Create a culture of resilience

It starts at the top with C-suite executives who are fully committed to understanding what’s happening on the front lines so they can ask the right questions and be open to the right recommendations and actions.

Too often, top execs are blinded by return on investment, which encourages them to ignore or downplay what’s required for resilient operations, Carrier said. “You can get into this cycle of ‘Since it didn’t happen this period, we can get away with it a bit longer’ — until you can’t,” he said.

Oftentimes, organizational culture discourages the sharing of bad news — another cue from senior leadership. If employees are punished for reporting on near-miss issues or failures, they are less likely to deliver bad news that could be a leading indicator of a problem. “If you’re not hearing bad news from your organization, you’re allowing it to build up in your systems,” Carrier said.

Practice makes perfect

Talking about and planning a resilience strategy isn’t enough; organizations need to deploy multidisciplinary teams to simulate problems, game out solutions, and learn from dry runs. Testing and learning should be infused into the culture, rather than leaving risk and resiliency to be explored by a separate business function only after a decision has been made.

“Everyone needs to understand the severity [of various risks], what’s possible, and what actions they need to take,” Carrier said. “Recognizing that will start building resilience. You don’t need to start a fire to run a fire drill.”
