Agile Monitoring Support

We recently had a customer come into trial looking for a new monitoring solution.  This is always good for us.  We love the takeaway (customers defecting from other monitoring systems to us).  As in most takeaway situations, this customer had specific needs.  There are the obvious ones where LogicMonitor easily fits the bill, such as alerting, dashboards, performance monitoring, etc. (and if you fall into that VMware, Cisco, NetApp sweet spot, game over!).  This guy, however, had a very specific need we didn’t fulfill right out of the gate.  I think anyone who has ever worked with a monitoring solution knows that it’s hard to find one that does everything.  LogicMonitor is no different.  We don’t do EVERYTHING.  I know, you thought I was going to get all high and mighty and talk about how LogicMonitor is the one monitoring tool that CAN do everything.  Well, we don’t (not out of the gate, anyway).  But just about everything we don’t do out of the gate we can get to pretty easily, because we provide an easy-to-use framework that lets us quickly build almost any kind of monitoring you can think of.

Back to the issue!  This guy must have been bitten by interface flapping at some point, because he needed to be alerted if an interface went down and came back up within the one-minute interval between our polls.  I get it.  As a former admin myself, I know we all have that one quirky thing (OK, more like 20 things) that bit us badly once and that we vowed would never bite us again.  That attitude is what makes a good admin.  I think this was his one thing.  It also happened to be an alert that we didn’t provide by default, and he definitely wasn’t going to trust our product if he couldn’t get it to provide this alert.  With his other product he accomplished it by collecting some trap info and running a calculation on it that told him the interface was flapping.

It sounded complicated, and I give him kudos for making the most of what he had.  I knew we could help him set up this same solution, but at LogicMonitor we prefer not to trust traps.  We would much rather poll SNMP counters, because traps tend to get lost in transit, especially during times of duress, not to mention the configuration headache.  Anyhow, I digress.  What I didn’t know was how we could make his solution better and more reliable while applying it to every interface he planned to monitor.  So I posed the problem to our support department (of which I am also a member).  Two days later I got a response from one of our smartest engineers, a.k.a. “that tech Steve”.  He suggested that we collect the SNMP value that records the uptime at which the interface last changed status, but monitor it for changes rather than for the time itself.  Genius!  We could now alert whenever that value changes, even if the interface is already back up by the time we poll.  This is also a value that we can poll, rather than a trap that may get lost in the ether, and for good measure I graphed it.  Awesome, now we have ammo to use during the takeaway.
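
For the curious, here is a minimal Python sketch of the idea, not our actual datasource.  It assumes something else has already polled the "last change" value for each interface (on most devices that would probably be ifLastChange from the standard IF-MIB, but that is my assumption here).  The class name, the interface name and the sample readings are all made up for illustration.  The only logic is: remember the previous reading and flag the interface when the new one differs.

```python
from typing import Dict, Hashable, Optional


class LastChangeWatcher:
    """Hypothetical helper: flags an interface when its polled
    'last change' value differs from the value seen on the previous poll."""

    def __init__(self) -> None:
        # Most recent reading seen for each interface.
        self._previous: Dict[Hashable, int] = {}

    def observe(self, interface: Hashable, last_change: int) -> bool:
        """Record a new reading and return True if it changed since the last poll."""
        previous: Optional[int] = self._previous.get(interface)
        self._previous[interface] = last_change
        # First poll for this interface: nothing to compare against, so no alert.
        return previous is not None and previous != last_change


watcher = LastChangeWatcher()
# Made-up readings, one per polling cycle. The interface is "up" at every
# poll, but it bounced somewhere between the third and fourth cycle.
polls = [1000, 1000, 1000, 4200, 4200]
alerts = [watcher.observe("eth0", value) for value in polls]
print(alerts)  # [False, False, False, True, False]
```

The point of the design is that the alert does not depend on catching the interface while it is down.  A status check at each poll would have seen "up" five times in a row; comparing the last-change value catches the bounce anyway.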

This scenario is a good example of a case where we didn’t provide something by default, but our flexible framework and expert support let us deliver a solution quickly.  Our potential customer has since come back with an additional problem he was hoping to see us solve before making a final decision on LogicMonitor.  We were able to get him a solution quickly again, and we are hoping this will help close the deal, as it has many times before.

Where was LogicMonitor when I was a Network Admin?