No one likes to talk about outages. They’re horrible to experience as an employee and they take a heavy toll on customer confidence and future revenue. But they do happen. Even publicly traded tech powerhouses such as eBay and Microsoft, which have more technical resources than you’ll ever have, fall prey to outages. And when they do, they are closed for business, much to the chagrin of their shareholders and executive teams.
It’s not so much a question of whether an outage will occur in your company but when. The secret to surviving outages is to get better at handling them and to learn from the mistakes of others. Nobody is perfect all the time (my current employer, LogicMonitor, included), but I hope that by talking about these mistakes, we can all begin the hard work required to avoid them in the future.
A well-formed process for handling outages must define who is accountable for resolving issues, who is in the escalation path and who is responsible for communicating about issues. It also includes a post-mortem process for analyzing the root cause of the outage and addressing any gaps. Those fixes can range from building redundancy into systems to changing monitoring settings so that issues are caught and resolved before they cause another outage.
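To make those roles concrete, here is a minimal Python sketch of how an outage policy might be written down. Everything in it is an assumption for illustration: the class names, the role titles and the 15- and 45-minute thresholds are hypothetical, not LogicMonitor settings or a recommended standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EscalationStep:
    contact: str        # hypothetical role to page at this step
    wait_minutes: int   # minutes into the outage before this step is paged


@dataclass
class OutagePolicy:
    owner: str          # accountable for resolving the issue
    communicator: str   # responsible for communicating about the issue
    escalation: List[EscalationStep] = field(default_factory=list)

    def contacts_to_page(self, minutes_elapsed: int) -> List[str]:
        """Return everyone who should have been paged by now, in order."""
        paged = [self.owner]  # the owner is paged immediately
        for step in self.escalation:
            if minutes_elapsed >= step.wait_minutes:
                paged.append(step.contact)
        return paged


# Illustrative values only; adjust roles and timings to your organization.
policy = OutagePolicy(
    owner="on-call engineer",
    communicator="support lead",
    escalation=[
        EscalationStep("engineering manager", 15),
        EscalationStep("vp of engineering", 45),
    ],
)
```

Calling `policy.contacts_to_page(20)` would list the on-call engineer and the engineering manager. The point is less the code than the habit: if the owner, the communicator and the escalation path are written down before an outage, nobody has to work them out during one.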
Improving management of outage incidents can produce better outcomes for your company’s employees, customers and shareholders. It won’t be easy. But it will be worth it. And it all starts with avoiding some basic mistakes that others have made before you.
Want more on this topic? Read the E-Book
The Top 10 Mistakes Companies Make Handling Outages and How to Avoid Them All
Scott Barnett is an employee at LogicMonitor.