“If we can’t monitor, predict, and work on the business of the University, then why are we doing what we’re doing?” – Ethan Bateman, Manager of Network Operations Center at LSU
Louisiana State University, located in Baton Rouge, Louisiana, is home to 35,000 students and more than 250 buildings. The Network Operations Center (NOC) is the University’s 24/7 system operations and monitoring center, running 365 days a year through holidays, hurricanes, and more to make sure everything runs smoothly for staff and students.
Like many academic institutions, LSU experiences a significant spike in network demand when students return to campus every August. This cyclical demand creates challenges for Ethan Bateman, Manager of the NOC: his team is expected to deliver 100% connectivity and uptime across a large environment, all at once.
“We needed a monitoring platform that was going to work for leadership all the way down. In order to have a true single pane of glass view, you have to have everything in there,” Bateman explained.
To improve visibility and monitor more proactively, LSU needed a unified observability platform that matched the needs of their growing hybrid environment and offered quick anomaly detection and automation.
LSU has been moving toward a modern hybrid IT infrastructure. Some parts of their environment remain on-premises, like their private branch exchange (PBX) phone system and the mainframe that hosts the student information system. Other resources, like their Office 365 (O365) environment, are in the cloud.
To help migrate the University’s systems to the cloud, Bateman and his team needed to ensure that the business kept functioning properly, get ahead of any bottlenecks, and deliver top-quality service.
While searching for a platform that met their visibility needs, LSU turned to LogicMonitor for a holistic, modern approach to monitoring. By implementing LogicMonitor, LSU was able to address several major business challenges caused by their previous open-source monitoring solution.
“We needed to find something we weren’t going to have to dedicate personnel to, something that was hosted so we don’t have to take care of any infrastructure on site, and something that could give us unified observability in a single pane of glass,” Bateman said.
LogicMonitor’s SaaS-based approach and unified display gave Bateman and his team much-needed visibility into their modern hybrid IT environment, allowing them to get ahead of potential issues before they arose. Instead of reacting and scrambling to identify and resolve issues, the NOC team could now implement proactive alert structures to reduce downtime.
Unified observability allowed the NOC team to engage in more value-add operations for their organization. They have been able to help the Engineering and Architecture teams with lifecycle upgrades, install and configure new switches for their Technical Architecture Group, and support strategic initiatives for the institution.
Furthermore, accurate root cause analysis helped LSU not only to identify issues quickly, but also to escalate them to the proper administrators for speedy resolution.
“When you have all parts and pieces of your systems and your infrastructure in the platform, and you see a failure higher up in the stack, but the root cause is lower in the stack, you can go to that admin or engineer team with that information,” Bateman explained. “What it does for us, is we can attack the problem immediately and get it resolved faster, instead of going to the app admin, for example, and saying your app is running slow, when that isn’t the issue.”
LogicMonitor has reduced downtime and outages, which contributes to better productivity. Additionally, the NOC team can now predict when issues are going to arise, which helps keep services continuously available. With predictive monitoring, the team can plan how to tackle issues ahead of time, which leads to more process improvements and fewer mistakes.
“If it’s the fiber line that we need to get tested or cleaned, we can get that specific group involved and resolve that quickly,” Bateman said.
Without unified observability, LSU would have a much harder time maintaining the 24/7 uptime that is now industry standard. The University leveraged LogicMonitor metrics and data to proactively work against potential system or service failures, instead of waiting to hear about an issue when an end user experienced it.
“If you don’t know until your users are calling that they can’t get to a specific resource or a service isn’t working the way it’s supposed to, you’re just dealing with what the universe gives you and spending time mitigating after the fact. You’re not being proactive in keeping that uptime standard,” Bateman said.
LSU has also been able to decrease the manual workload of administrative tasks and platform management. Their previous open-source tool required more work and valuable time from people inside their organization to maintain. With LogicMonitor accurately predicting potential issues in their environment, Bateman and his team are able to prioritize strategic projects over general platform upkeep.
“My vision for observability is to be able to see all of the data points and metrics of everything that exists in the infrastructure, because once you know everything that is going on, you can then begin to predict things,” Bateman explained. “You can see things that are going to go wrong, before they fully break.”
Bateman’s team monitors Cyclic Redundancy Check (CRC) errors over time, because they are indicative of an impending optical outage that would cause networking to fail in a building. Downtime has significant implications for students: even 30 minutes of downtime could result in 500 hours of lost productivity, equivalent to 1,000 people each losing half an hour. With LogicMonitor, the team has gotten ahead of those instances nearly 100 times.
These CRC errors also degrade the ability to transmit data, causing users to experience a “slow network,” which greatly affects productivity.
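As a rough illustration of what watching CRC errors over time can look like, the sketch below turns cumulative error-counter readings for a single interface into per-interval increases and raises a warning when several consecutive intervals exceed a threshold. It is not LogicMonitor’s implementation or LSU’s configuration; the function names, the threshold of 10 errors per interval, and the three-interval rule are all assumptions made for the example.

```python
from typing import List


def crc_deltas(counter_samples: List[int]) -> List[int]:
    """Convert cumulative CRC error counter readings into per-interval increases."""
    deltas = []
    for prev, curr in zip(counter_samples, counter_samples[1:]):
        # A counter that goes backwards is treated as a reset (for example,
        # after a device reboot), so the new reading is used as the increase.
        deltas.append(curr - prev if curr >= prev else curr)
    return deltas


def crc_warning(
    counter_samples: List[int],
    max_errors_per_interval: int = 10,  # hypothetical threshold
    consecutive: int = 3,               # hypothetical persistence rule
) -> bool:
    """Return True if each of the last `consecutive` intervals exceeds the threshold."""
    deltas = crc_deltas(counter_samples)
    if len(deltas) < consecutive:
        return False  # not enough history to judge a trend
    return all(d > max_errors_per_interval for d in deltas[-consecutive:])


if __name__ == "__main__":
    # Made-up readings: errors start climbing in the last three polls.
    samples = [0, 5, 5, 40, 90, 150]
    print(crc_deltas(samples))
    print(crc_warning(samples))
```

Requiring several consecutive bad intervals, rather than one, is a simple way to avoid alerting on a single burst of errors while still catching a sustained climb before the link fails.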
In one case, a registration portal that is heavily used at the beginning of each school year would crash after some time. The time to failure wasn’t consistent enough to track a trend, but the team was able to determine that the portal would load more slowly before it crashed completely. LogicMonitor was able to alert once the load time began to increase, and the admin was able to go in and trigger a restart within 30 seconds. Addressing the issue proactively meant the team didn’t have to wait for an outage to occur, then spend valuable time getting in touch with the admin, and then more time resolving the issue. This created a better experience for the University and built trust between students and the IT team.
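The idea of alerting on rising load time rather than on the crash itself can be sketched as a comparison between recent measurements and a longer baseline. The example below is a minimal sketch under stated assumptions, not a description of how LogicMonitor detects anomalies: the function name, the window sizes, and the 1.5x factor are hypothetical.

```python
from statistics import mean, median
from typing import List


def load_time_alert(
    load_times_s: List[float],
    baseline_window: int = 20,  # hypothetical: older samples used as the baseline
    recent_window: int = 5,     # hypothetical: newest samples being judged
    factor: float = 1.5,        # hypothetical: how much slower counts as degraded
) -> bool:
    """Return True when recent page load times are well above the baseline."""
    if len(load_times_s) < baseline_window + recent_window:
        return False  # not enough history to compare against

    recent = load_times_s[-recent_window:]
    baseline = load_times_s[-(baseline_window + recent_window):-recent_window]

    # The median keeps a single slow request in the baseline from skewing it.
    return mean(recent) > factor * median(baseline)


if __name__ == "__main__":
    # Made-up measurements in seconds: steady, then gradually slowing down.
    healthy = [1.0] * 25
    degrading = [1.0] * 20 + [1.4, 1.6, 1.8, 2.0, 2.2]
    print(load_time_alert(healthy))
    print(load_time_alert(degrading))
```

Because the check is relative to the portal’s own recent behavior, it does not depend on the time to failure being consistent, which matches the situation described above.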
“The biggest benefit to unified observability is 100% service uptime, and that’s the new standard,” Bateman said.
By partnering with LogicMonitor for unified observability and predictive monitoring, LSU will continue their migration to a modern hybrid infrastructure – all while maintaining 100% service availability.