Wolfcast

Event 13 August 2024

How to Respond to a Large-Scale Outage: Lessons Learned from the CrowdStrike Incident

As an insurance executive, you know that disruptions can happen at any time, and it’s important to be prepared for them. The recent CrowdStrike outage serves as a reminder of the need for agility and a speedy response when running a crisis management plan. In this blog post, we’ll explore the lessons learned from the CrowdStrike incident and provide a step-by-step guide on how to respond to a large-scale outage.

Introduction

The CrowdStrike outage was a large-scale disruption that affected many organizations worldwide. Despite the severity of the incident, customers were largely accepting of it, and blame fell only on individual firms that took noticeably longer to recover than their peers. This highlights the need for a speedy response and agility when running a crisis management plan.

Step 1: Acknowledge the Incident

The first step in responding to a large-scale outage is to acknowledge the incident. It’s important to communicate with your customers and stakeholders as soon as possible to let them know that you are aware of the situation and are working to resolve it. CrowdStrike issued a statement on the first day of the outage, but it came across as an attempt to downplay the unfolding events, which caused confusion and frustration among its customers.

Step 2: Provide Regular Updates

The second step is to provide regular updates to your customers and stakeholders. This keeps them informed of the situation and of the progress being made towards resolving it. CrowdStrike rapidly built a support portal page on the first day of the outage, but its homepage remained unaltered, leaving some visitors confused. It’s important to ensure that all communication channels are updated with the latest information.

Step 3: Take Responsibility

The third step is to take responsibility for the incident. This helps to build trust with your customers and stakeholders and shows that you are committed to resolving the issue. CrowdStrike’s CEO appeared on The Today Show, where he apologized and showed evident agitation on behalf of his customers. This helped to appease many of them.

Step 4: Provide a Solution

The fourth step is to provide a solution to the problem. This could involve offering workarounds or alternative processes to help customers continue their operations. CrowdStrike issued guidance for workarounds and worked with their customers to reverse the faulty update. It’s important to work closely with your customers to find a solution that works for them.

Step 5: Learn from the Incident

The final step is to learn from the incident. This helps to prevent similar incidents from happening in the future. CrowdStrike’s Chief Security Officer, Shawn Henry, owned the failures and promised to do better in the future. This resonated with the security, risk, and resilience community and showed that the company was taking the incident seriously.

Conclusion

The CrowdStrike incident serves as a reminder of the need for agility and a speedy response when running a crisis management plan. By following these five steps, you can respond to a large-scale outage effectively and build trust with your customers and stakeholders. If you’re interested in learning more about how to turn real-time data into insurance, get in touch with Riskwolf to develop parametric insurance for your case. With Riskwolf, you can ensure that you’re prepared for any disruption that comes your way.
