This is an interview with Premesh Purayil, VP of Engineering for Ranker.com, a social media website that makes it easy to create lists on any topic imaginable, and then have visitors vote and rank their favorites. Since launching in 2009, hundreds of thousands of lists have been created, ranging from the Best Films of All Time to The Top Inexpensive Cars. Headquartered in Los Angeles, Ranker is experiencing significant growth, now with over 3.5 million unique visitors a month.
What monitoring issues were you having prior to using LogicMonitor?
I joined Ranker a year ago, and they had already set up Nagios on local servers in our data center. The person who set that system up was a contract sysadmin. Whenever we added a new server to our production or QA environment, we had to jump through hoops to get him to add the new server into monitoring.
For server issues, our engineers were only receiving alerts via email – SMS alerts weren’t working properly. So we watched our email a lot more closely, but there were times – such as on weekends – when we weren’t always online checking email, and the site might be down for an hour before someone realized it.
Once we knew there was a problem, engineers were spending a lot of time figuring out what was going on because we weren’t getting the historical data that we wanted from Nagios. We had to devote half a day to a full day of developer time to back-trace the issue, which took away from our normal development cycle and the features we were supposed to be working on.
How have things been different with LogicMonitor?
For us, the main thing right off the bat was how easy LogicMonitor was for a non-sysadmin to set up.
The initial setup took less than 10 minutes. We were monitoring a couple of our servers within a half hour. We were up and going at a comfortable place within a day, and then within a week had our whole environment monitored and running the way we needed it to.
With LogicMonitor, all of my engineers have been involved from the beginning. They are familiar with the UI and can set things up as we bring on new boxes. I no longer have someone who’s unfamiliar with our infrastructure working on system monitoring. We have engineers who know our environment and can tell quickly when things are breaking down.
When we do have an issue, it’s very easy to go in, look at metrics across all our servers, and start pinpointing where something broke down. A lot of times it’s a waterfall effect: something very base level, like a database issue, might creep up, which in turn affects a web server, which then affects something else. Instead of taking half a day to diagnose, we can usually figure it out within 15 minutes by looking at the history of metrics leading up to the issue.
How has LogicMonitor’s support been?
There have been times when we may have set something up incorrectly, and it’s been helpful to be able to open the chat window and have a LogicMonitor engineer help diagnose the problem.
Also, we wanted some custom Java monitoring that wasn’t available in LogicMonitor out of the box. We posted a request in their support forum and within a week, they had it implemented and on our servers. We couldn’t have done that without a service like LogicMonitor.
What about cost?
We don’t need a separate contract sysadmin anymore to handle our monitoring. We replaced that cost with just the cost of the LogicMonitor service.
