The Three Rules of Meaningful Alerts
Monday 8 October 2007
There’s just one simple reason why most monitoring implementations fail: they send out too many alerts.
Most implementers start from the premise that the worst thing that can happen is missing a critical problem. It’s not. Too many alerts is the bigger problem, by far. Once you have too many, your administrators will start ignoring all of them, the critical ones included.
When you design monitoring, follow these three rules:
- Every alert must mean “Run to the console now!”
- All alerts must be actionable.
- Alerts must cover every part of an application.
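As a rough illustration, here is a minimal sketch of how you might audit an existing set of monitors against these rules. Everything in it is hypothetical: the `Monitor` fields, the `audit` function, and the sample monitor and component names are assumptions made for the example, not part of any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class Monitor:
    """Hypothetical description of one monitor (all fields are assumptions)."""
    name: str
    component: str    # part of the application this monitor watches
    urgent: bool      # rule 1: does a firing alert mean "run to the console now"?
    action: str = ""  # rule 2: what the administrator should do; empty if nothing


def audit(monitors, components):
    """Split monitors into keep/disable and list components left uncovered."""
    keep, disable = [], []
    for m in monitors:
        # A monitor survives only if it is both urgent and actionable.
        if m.urgent and m.action.strip():
            keep.append(m)
        else:
            disable.append(m)
    # Rule 3: every component should still be watched by a surviving monitor.
    covered = {m.component for m in keep}
    uncovered = sorted(set(components) - covered)
    return keep, disable, uncovered


# Sample data, invented for illustration only.
monitors = [
    Monitor("db-down", "database", True, "fail over to the standby"),
    Monitor("cpu-above-80", "web", False),
    Monitor("disk-full", "web", True),  # urgent, but no action defined
]
keep, disable, uncovered = audit(monitors, ["database", "web", "queue"])

print("keep:", [m.name for m in keep])
print("disable:", [m.name for m in disable])
print("uncovered:", uncovered)
```

In this made-up example only one of the three monitors survives, and two components end up with no coverage at all, so the audit produces work in both directions: monitors to switch off and gaps to fill.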
To follow all three rules, most shops have to turn off a great many monitors, because most of them fall short. Each rule needs some explanation, so we’ll cover them one at a time in coming articles. Stay tuned.
