Speakers

Sarah Terry

Sarah Terry is the Vice President of Product Management at LogicMonitor, where she leads strategy and development for the core IT infrastructure monitoring capabilities of the LM Envision platform. With over a decade at LogicMonitor, Sarah has played a pivotal role across many facets of the product organization, including launching and scaling LM Cloud, LogicMonitor’s cloud monitoring solution, and helping to expand the company’s capabilities in hybrid and multi-cloud environments. Sarah holds a Bachelor’s degree in Engineering Physics from Cornell University. Based in Santa Barbara, she enjoys running, cycling, and swimming along the California coast. Outside of work, she loves spending time with her husband, their one-year-old son, and their dog.

Karthik Sj

Karthik is currently General Manager of AI at LogicMonitor, where he leads the global product and go-to-market teams. Prior to that, he was at Aisera, a generative AI startup, and at SAP, where he led product and engineering teams specializing in AI. He has successfully launched several zero-to-one products and scaled them to over $50 million in ARR.

Andrew Keating

Andrew Keating leads the Product Marketing, Customer Marketing, and Analyst Relations teams to drive go-to-market strategy for the leading hybrid observability platform powered by AI, collaborating with leadership, product teams, and partners to shape messaging, fuel growth, and elevate customer engagement.

Video Transcript

“Well, good morning. Thank you, Brooke. And it’s so great to be here with all of you.

So Christina’s keynote really inspired us, excited me, and we’re gonna spend the next forty-five minutes or so digging into the vision for transforming IT operations, helping you navigate that modern data center. And we’re gonna go a little bit deeper than the high-level vision that Christina framed out, to talk about what we’re doing from the product and engineering perspective to help you, day in, day out, manage and navigate your modern data centers.

So let’s talk about the modern data center a bit. It’s really everything we’ve been focused on since the very beginning of LogicMonitor.

We’ve highlighted the need for cloud and multi cloud visibility, but our roots are in giving you visibility across the entire IT environment, all of your infrastructure.

And that only becomes more exigent in this era of AI.

With the needs of AI and customers’ AI initiatives, the initiatives that you all are planning in your organizations, whether you’re doing things like training a machine learning model on prem or directly leveraging GPUs, whether you’re working with generative platforms like Bedrock, or maybe you’re still trying to figure out how this is gonna work in your environment, what your organization is going to need, and how you’re going to fit all these pieces together. Wherever you are in this age of exploring AI, we’re really here to help you. And flexibility and extensibility in the infrastructure is really what’s most important, and that’s why LogicMonitor is perfectly positioned to help you. And that’s what we’re gonna go into for the rest of this platform keynote.

Now our platform has come a long way just over the past few years. And if you’ve been with LogicMonitor for a few years now, you probably remember several years ago when we introduced some of these disparate products like logs, cloud, and on prem. These were kind of separate things. But over the past few years, they’ve really been stitched together into a single cohesive platform. And so if you’re just coming in today and you’re new to LogicMonitor, you’re seeing something that has been years in the making and that has really let us pioneer a new chapter in observability with all of our investments in AI and in agentic AIOps, which you’re gonna hear a lot more about throughout the course of the day today.

So let’s talk about cloud for a few minutes. Christina mentioned all the excitement around SaaS. And ten to fifteen years ago, with Salesforce and everything they were doing with SaaS, it was really an exciting time. When we think about cloud infrastructure at that time, there was this talk that maybe we wouldn’t need data centers anymore. Maybe we were just going to do away with everything on prem, and we were just going to be using IaaS in the cloud.

It was meant to be maybe a one way trip that we’d start using Salesforce, we’d start using all these SaaS apps, and our infrastructure would no longer be servers in racks in on prem data centers, and it would just be you know, everything in infrastructure in the cloud or maybe platform as a service. We were gonna shut down the on prem servers. We weren’t gonna need on prem servers or data centers anymore. Right?

Well, it worked out that way for some organizations. So some organizations were cloud native and born in the cloud and never really spun up an on prem data center. Some workloads have migrated to the cloud and and never gone back, like Salesforce, for example.

SaaS applications have certainly proliferated, but the vast majority of us have found that working in the cloud and the journey to the cloud was a lot more complicated than that one way trip that was part of the vision ten or fifteen years ago. And we at LogicMonitor have been hearing things from all of you about topics like repatriating workloads, about bringing things back on prem, about expanding your data centers, about orchestrating your on prem data centers with cloud technology so that you can get the benefit of all of that together. So hybrid workloads are here. Multiple cloud workloads are here. Bringing things to and from various clouds is here to stay. So the cloud journey has become a lot more complicated than we thought maybe ten or fifteen years ago.

And so cloud coverage is really key to the modern data center. The modern data center is multi cloud. You need to have options. You need to be able to use AWS, Azure, GCP, other public clouds, other private clouds. Around forty percent of our customers today are monitoring cloud infrastructure with LogicMonitor, and this is growing really rapidly. So I fully expect in a year, maybe even less than that, that that’s flipped around, and fifty, sixty, seventy percent of you are monitoring cloud with LogicMonitor.

What we’ve also seen is that multi cloud adoption continues to grow. And so twenty-five percent are looking at multiple clouds in LogicMonitor. I think this number is actually pretty low. If you’re not looking at multiple clouds in LogicMonitor, you need to tell us what’s holding you back so we can help you get that full cloud visibility, because it’s very powerful when you can have a single pane of glass across multiple clouds, and you’re not locked into a vendor’s proprietary tool. We can help give that to you. So this number as well, I expect to continue to increase, and we’ve seen dramatic growth in it just over the past few years.

Customers are growing with us and investing with us. The growth rate in cloud for us has been extraordinary, seventy-seven percent over the past three years. And so this is really a moment to thank all of you for coming on this cloud journey with us.

And what we’ve seen is that customers continue to get value, and they continue to expand their presence with our cloud offerings.

And then there’s AI. It’s a buzzword. There’s a lot of hype around it. We’re all hearing about it all the time.

But it’s here, and it’s part of the IT landscape today. It’s actually something that my team and I are using every day at LogicMonitor, and so we’re relying on the IT ops heroes at our organization to help make that possible. If we don’t have access to AI tools, our work is gonna go more slowly. We’ve already incorporated it into all of our workflows, and I imagine the same thing is happening for your users at your organizations.

LogicMonitor customers are evaluating AI services all the time and spinning up new AI initiatives really on almost a daily basis. So we need to go beyond just monitoring, and we need to be able to look at efficient, reliable, and scalable infrastructure in this AI era.

And so everything I’ve been talking about, cloud, AI, the modern data center, relies heavily on networking. And so we’ve continued to see network modernization growth, whether it’s measured in wireless access points, all the different coverage that we have, or when you start looking at SD-WAN adoption. These numbers just keep going up and up, and it’s about exposing richer telemetry and deeper insights and providing real time value to all of you and your teams. We really expect this to continue to grow.

So as I wrap up and ask Sarah to come up and talk in more detail, there are really three areas that we’re gonna go into in the rest of this product keynote and three themes that really motivate and drive everything that we’re working on from a product and engineering perspective. The first is making it very easy to get data into the platform. We’ve always done that with our collector based approach. We’re continuing to expand on that.

Then once the data is in the platform, we wanna make it very easy for you to troubleshoot and give you the context on what’s going on. Our North Star is that you should have everything you need to troubleshoot an issue in that single pane of glass inside of LogicMonitor. You shouldn’t have to log in to multiple tools, file a ticket to get access to logs, and things like that. We wanna make it very easy for you to do contextual troubleshooting right from a single pane of glass.

And then, of course, we’re pioneering the future with our agentic AIOps solution and bringing together this rich observability dataset and all the telemetry we have in the platform to power the next generation of agentic AIOps. And Karthik’s gonna talk much more about that in a few minutes. So this is what’s guiding our product and engineering efforts. We’ve got some amazing stuff to show you in the rest of this keynote and in our breakout sessions today, and we’re really excited to talk with you more about that.

And with that, I’d like to introduce our VP of product management, Sarah Terry, to go into this in much more detail.

Alright. Welcome, everyone. Glad to be here. So Andrew just walked us through the evolution of the LogicMonitor platform.

And fast forward to today, we’re looking at over one point one trillion metrics in our custom time series database across almost four and a half million devices.

So we actually wanna give you guys a behind the scenes look into how we support this scale. Normally, we really like to highlight customer stories, and you’re absolutely gonna get a ton of those today. I promise. But we actually wanna give you insight into something a little bit different, which is how we at LogicMonitor use LogicMonitor.

So with that, we will roll the demo.

Alright. So what you can see here is actually starting with how LogicMonitor helps IT operations teams. This is the account that our IT operations team uses here at LogicMonitor to monitor things like Jira, Slack, Confluence, as well as the on prem equipment for our local offices.

Of course, availability of these resources is of the utmost concern as it directly correlates to employee efficiency.

Here we are on our resource explorer, and what we wanna do is actually start by breaking down the breadth of really what we’re monitoring in this account.

So what we’ll actually do is we will group first by provider and then by resource type and take a look at what we’ve got. You can see that we have a number of physical servers, firewalls, switches, again, these are for our local offices, as well as a number of resources in the cloud.
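The Resource Explorer view Sarah describes is essentially a two-level group-by over tagged resources: first by provider, then by resource type. A minimal sketch of that idea in Python, using made-up resource records (the field names here are illustrative, not LogicMonitor’s actual schema):

```python
from collections import defaultdict

def group_resources(resources):
    """Group a flat list of resources by provider, then by resource type."""
    grouped = defaultdict(lambda: defaultdict(list))
    for r in resources:
        grouped[r["provider"]][r["type"]].append(r["name"])
    return grouped

# Hypothetical sample inventory mirroring the demo: office gear plus cloud resources
resources = [
    {"name": "office-fw-01", "provider": "on-prem", "type": "firewall"},
    {"name": "office-sw-01", "provider": "on-prem", "type": "switch"},
    {"name": "jira-prod",    "provider": "aws",     "type": "ec2"},
]

groups = group_resources(resources)
# groups["aws"]["ec2"] == ["jira-prod"]
```

The same nesting generalizes to any metadata dimension (application tag, region, owner), which is why rich metadata on each resource matters so much for this kind of exploration.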

So let’s go ahead and take a look at one of these cloud resources.

We will take a look at this resource supporting our production Jira instance and look at what we’re monitoring.

You can see that we do have an alert. We’ll get back to this in just a minute. But I actually wanna start with the breadth of metrics that we have here. So we do have data coming from the CloudWatch API for this particular EC2 instance, but we’re really complementing that with data from a LogicMonitor collector to fill in the gaps where CloudWatch isn’t reporting anything, like memory usage or our Tomcat statistics.

On the info tab, we can see all of the rich metadata that we have coming from AWS and the collector as it’s auto recognized and detected this device.

And then if we continue on, we can actually take a look at the logs from this resource as well, which when paired with the metrics, really give us a more complete view of what’s happening with this resource and why.

So let’s go ahead and look at how the metrics, logs, and metadata are all coming together to help us with this alert here.

So we can see that we have an issue with our Tomcat request time. If we look at the shaded blue range, the request time that we’re seeing is actually not unusual given the past values of this resource.

On the graphs tab, we can actually look at some of the other metrics for our Tomcat instance here.

And then the logs will show us any anomalies that we have to review. It doesn’t look like we have any. And we can also toggle to all logs to get all of the logs for this EC2 instance. It might help us uncover what’s going on.

The maps tab helps us quickly understand the blast radius. Now while the availability of this instance doesn’t seem to be impacted, we can see that we do have a downstream EBS volume.

And then the history tab shows us how often this alert has occurred in the past. As you can see, it’s occurred quite often. And so in this case, it might be a situation where we need to take a look at the static thresholds we have set that triggered this, perhaps even take advantage of LogicMonitor’s dynamic thresholds, which are only going to trigger an alert when the value of the request time is outside of that blue range and truly anomalous.
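Conceptually, a dynamic threshold like the one described here learns an expected band from recent history and only alerts on values outside it, instead of using a fixed static cutoff. A toy sketch of that idea using a rolling mean plus or minus k standard deviations (an illustration of the general technique, not LogicMonitor’s actual algorithm):

```python
import statistics

def is_anomalous(history, value, k=3.0):
    """Return True if value falls outside mean ± k*stdev of recent history."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)
    return abs(value - mean) > k * stdev

# Hypothetical recent request times (ms) forming the "expected" band
recent = [120, 130, 125, 118, 127, 122, 131, 124]

is_anomalous(recent, 126)   # within the band -> no alert
is_anomalous(recent, 400)   # far outside the band -> alert
```

A static threshold set at, say, 125 ms would have fired repeatedly on the first value even though it is perfectly normal for this resource, which is exactly the noisy-alert pattern the history tab exposed.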

So this is how the metrics, logs, and metadata come together to give IT operations teams insight to troubleshoot faster.

Now let’s take a look at another example. I’m gonna show you how LogicMonitor can help platform engineering teams.

We’re gonna take a look at this account here, which is used by our technical operations teams to actually support the LM Envision platform.

Of course, they care about availability as well, but performance is also a concern. We wanna make sure that the application, LogicMonitor’s LM Envision product, is working well for all of you guys. And so that’s something that the team keeps an eye on here. Here we are in our Resource Explorer. Instead of starting with the breadth of what we’re monitoring, we’re gonna start with a slightly different view. Let’s look at all of the resources that are tagged with the application they support, and then we can actually group the display of those resources by the application name.

This should give us a really good view of all of the applications that we’re monitoring that support the LM Envision platform.

You can see there are quite a few.

And so what we’re actually gonna do is we’re gonna go and look at one of these applications and take a look at what we’re monitoring.

We can pick the metrics processor. This is an application on our back end that is actually processing the metrics that come from all of the collectors in your environment.

We’ll go ahead and we’ll take a look at one of the Kubernetes pods that’s running this particular metrics processing application.

And just like before, let’s take a look at what we’re actually monitoring.

Now in this case, we’re gonna actually start on the info tab and take a look at all of the rich metadata that we have. We’ve got a lot of custom properties set to support data collection.

We also have metadata coming in from the Kubernetes API, such as our Helm chart versions, our application labels, and more. And then we’ve got information like our namespace and node name and pod name coming in as well. And all of this metadata, again, powers features like our resource explorer and, as we’ll show, dashboard filtering as well.

Now let’s take a look at the metrics that we’re monitoring here. You can see that we’ve got an array of metrics, including metrics coming from the Kubernetes API and kube state metrics.

But what I also wanna call your attention to are the application metrics that we have here, and I’ve just expanded that group. You can see that we actually have a ton of application metrics that give us insight into the performance of this metrics processing application.

Our team writes custom data sources for all of our applications to get insight simply through metrics. For this metrics processing application, this is gonna look like how many complex data points are we processing and how long is that taking on average.

Now, if we go back up to the pod level, we can also take a look at the logs that are coming in for this Kubernetes resource.

That’s also gonna give us really great insight into how the application is performing.

We know that a lot of valuable application data is written to logs. You can see that we’ve got all of the logs for the pod here, including what the application itself is writing. And as before, when this is combined with metrics, this can really give us a more complete view of what’s happening with the application and the underlying infrastructure.

The maps tab, again, will tell us what this metrics processing pod relates to. So we can see that we have a metrics processing container running on it. That makes sense. We’re also supporting a metrics processing service. So let’s go ahead and take a look at that service.

And this is actually powered by Service Insight. Service Insight allows you to get an aggregate service level view. So rather than looking at the ten or so pods that are running the metrics processor application and piecing together overall how that application is performing across those pods, I can use service insight to aggregate the metrics that I care about and get this consolidated view that shows me how the metrics processing application is performing as a whole.
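The aggregation Service Insight performs can be pictured as rolling per-pod data points up into a single service-level series, interval by interval. A simplified sketch with made-up pod metrics (illustrative only, not the Service Insight implementation):

```python
def aggregate_service_metric(pod_metrics, op="sum"):
    """Aggregate one metric across pods into a single service-level series.

    pod_metrics: {pod_name: [values per interval]}; all pods share timestamps.
    """
    series = list(zip(*pod_metrics.values()))  # one tuple of pod values per interval
    if op == "sum":
        return [sum(vals) for vals in series]
    if op == "avg":
        return [sum(vals) / len(vals) for vals in series]
    raise ValueError(f"unsupported op: {op}")

# Hypothetical per-pod throughput for the metrics processor application
pods = {
    "metrics-processor-0": [100, 110, 120],
    "metrics-processor-1": [ 90, 105, 115],
}
aggregate_service_metric(pods, "sum")   # [190, 215, 235]
```

Summing suits throughput-style metrics (data points processed), while averaging suits rates and latencies; the point of the service-level view is that you pick the aggregation once instead of eyeballing ten pod graphs.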

The status page is gonna use those metrics that I’ve indicated to tell me whether the application is okay.

And then on the graphs tab, I can get some of those key aggregate metrics that I care about, like how many data points of different types are actually being processed by this application.

We also have health metrics that tell us in aggregate how many containers are running this application, and we can see that we’ve got about thirteen of them. We also have some of our key Kafka metrics that the team really cares about to keep an eye on how this application is performing.

So, again, this service level view enables you to get a consolidated view into how your services or applications are performing as a whole so you don’t have to piece that together over the resources.

Let’s take a look at one of the dashboards that the team tends to leverage when we’re looking at applications like this.

We’ll go ahead and stick with the same example. We’ll look at our metrics processing application, and we’ll take a look at one of the dashboards by region.

Now you can see that the team has a number of widgets here that they really care about when they’re keeping an eye on what they care about for this application.

Everything from CPU and memory to the number of times the pods have restarted this application, the number of containers we actually have running, some of the metrics processing times that we really care about, some of the key Kafka metrics, as well as some of the Kubernetes metrics like how many replicas are actually running.

As we continue to scroll down, you can see some of the insight into how the horizontal pod auto scaling works, any of the alerts that we have for the pods that are actually running this metrics processing application, And again, just more KPIs that the team really wants to keep an eye on to make sure that the performance is optimal.

If we scroll back up to the top, you can actually see that we have our dashboard filters in use here. This dashboard filter allows us to narrow the scope of the data displayed to a particular pod, and this is really powerful because it means that on the fly, we can have this dynamic dashboard where we can narrow the scope to a particular pod without having to recreate this dashboard for every single value you see listed here. So these dashboard filters really make our team more efficient and save us time.

Now the last thing that I wanna call your attention to is, you know, aside from maximizing availability and performance, how is LogicMonitor really helping our team achieve more business value? Well, one of the ways is through our new cost optimization products that we launched last year. So let’s go take a look at the overview of our multi cloud spend and what that looks like.

You can see that LogicMonitor provides this nice billing overview across cloud providers and breaks it down by a number of valuable dimensions, like the types of resources that the spend is across, the regions that we’re spending across, and then by a few key valuable tags, like environment, application, owner, and business unit.

These are intended to give insight into how the cloud spend is allocated across the different ways it could be spent.

Let’s go ahead and take a look at the weekly trend up top to take a look at how the spend is changing over time. Is it increasing? Is it decreasing? Where are we gonna end up at the end of the month? These are the views that our team is using to actually keep an eye on how much we’re spending in AWS and how we might need to adjust on the fly.

You can see that we have views by provider, by account, as we do use multiple AWS accounts that the spend is scattered across. These views really allow us to get granular into where we need to make adjustments.

So let’s take a look at the recommendations that LogicMonitor provides as far as what adjustments we might actually be able to make to minimize this spend.

LogicMonitor is using the performance data that we have alongside the cost data to help recommend where we can cut our spend. For example, we might be able to resize a few instances that are underutilized.

We might be able to clean up a few unattached or forgotten EBS volumes. Now the amount that we can save annually is listed to the left here. And if we expand these rows, we would actually see the instances that the recommendation is for, as well as some of the contextual data that helps explain why we’re making this recommendation.
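The underlying pattern pairs utilization telemetry with price data: instances whose sustained utilization sits well below capacity become resize candidates, and the annual savings estimate comes from the price delta between the current and smaller size. A simplified illustration (the prices, threshold, and field names are all made up for this sketch, not LogicMonitor’s actual model):

```python
def rightsizing_recommendations(instances, cpu_threshold=20.0):
    """Flag instances whose average CPU sits below the threshold as resize candidates."""
    recs = []
    for inst in instances:
        if inst["avg_cpu_pct"] < cpu_threshold:
            # Annual savings = hourly price delta * hours in a year
            annual_savings = (inst["hourly_cost"] - inst["smaller_hourly_cost"]) * 24 * 365
            recs.append({
                "instance": inst["id"],
                "action": f"resize {inst['type']} -> {inst['smaller_type']}",
                "annual_savings_usd": round(annual_savings, 2),
            })
    return recs

# Hypothetical fleet: one underutilized instance, one busy instance
fleet = [
    {"id": "i-abc", "type": "m5.2xlarge", "smaller_type": "m5.xlarge",
     "avg_cpu_pct": 8.0,  "hourly_cost": 0.384, "smaller_hourly_cost": 0.192},
    {"id": "i-def", "type": "c5.xlarge",  "smaller_type": "c5.large",
     "avg_cpu_pct": 74.0, "hourly_cost": 0.170, "smaller_hourly_cost": 0.085},
]
rightsizing_recommendations(fleet)  # only i-abc is flagged
```

The contextual data Sarah mentions (the utilization history behind `avg_cpu_pct` here) is what makes a recommendation like this trustworthy rather than a blind price comparison.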

We’ve got tons of breakout sessions on this topic as well as the other things that I’ve shown in this demo, so make sure to attend those if you are interested.

But hopefully this gives you a good idea of how we at LogicMonitor are using LogicMonitor to support our IT operations and platform engineering teams, really helping them be more efficient and effective.

So now that I’ve walked through what the platform can do today, I’d like to talk through a little bit about what we’re doing on the road map this year to make it even better.

We’ve got five key solutions that we’re hoping to deliver this year, ranging from expanding our coverage to make sure that we maintain this best in class visibility into these complex hybrid environments to how we can help you all accelerate troubleshooting as these environments get more complex and have a growing amount of data.

We’ll talk about how Edwin AI can be used to take that to the next level and help you move even faster.

And then how we can make our dashboards and our reports more dynamic and interactive, and at the same time more usable, to help you achieve more meaningful views.

And then lastly, how can we provide you with enough application visibility that doesn’t come at the high cost or intensive setup that something like application tracing comes at? And that’s a lot along the lines of what I just showed in that video. We were using metrics and logs to get that application visibility, and as you can see, it was very comprehensive.

Now we just announced our first half launch last month in March, so I’m actually gonna talk you through what specifically we announced. We did focus on these first three solutions. And then hopefully, you’ll all be back in the afternoon for the keynote where I’ll unpack those last two solutions and go into what we’ll be working on for the rest of the year.

So with that, let’s go into what we announced in our first half launch.

So when it comes to expanding coverage, we’ve got four key things that we announced. The first is enhancements to our cost optimization product. I’ll get into that in a moment. We also announced, whoops, additional monitoring coverage for AI hardware and services.

That is to provide you all with more visibility into your AI stack. So hopefully, you can continue to adopt AI more confidently and really accelerate your AI adoption.

We’ve announced more monitoring coverage for our cloud and container coverage.

And then we do have more coverage coming on the IT infrastructure monitoring side as well, particularly focused on cloud managed networks.

So let’s jump into these. On the cost optimization side, we actually have new dashboard widgets coming that you can reference within the LogicMonitor dashboards themselves to get that cost billing view that I was showing earlier in the demo. You’ll be able to see that in your dashboards alongside all of your performance, health, and availability metrics. These are new widget types that really facilitate that BI-like drill in and drill out that you’d expect when looking at cost data.

We also have announced, and are about to introduce, a recommendations lifecycle for these cost optimization recommendations.

You can see on the right hand side, we’ve got the ignore, snooze, and acknowledge buttons.

Those are gonna help make sure that these recommendations and this list of recommendations is staying relevant over time for your team.

When it comes to new monitoring coverage for AI hardware and services, we kinda think about this in three tiers. Starting at the bottom, it all starts with the infrastructure.

We know LogicMonitor has great infrastructure monitoring, but recently, we announced and released new monitoring for NVIDIA GPUs. So I’ll go into that.

At the LLM layer, we wanna make sure that you guys have insight into the LLMs that you’re leveraging, including token usage and spend. So we just released a new monitoring integration for OpenAI that reveals that.

And then at the application layer, we just introduced a new monitoring integration with Amazon Q. We’re about to release one with OpenLit as well that works through OpenTelemetry to provide insight into token usage that isn’t available through more traditional mechanisms.

So when it comes to GPU monitoring, this is a snapshot of of what we can collect.

It’s a lot of metrics that you’d probably expect to see, like GPU utilization, memory utilization, temperature.

And this is intended, again, to give you insight into the AI stack that’s supporting the AI that you’re leveraging so you can adopt that AI more confidently and really accelerate your AI adoption.

When it comes to our cloud and container monitoring, we have a very comprehensive Kubernetes monitoring solution.

It does require a proprietary application be installed via Helm directly in the cluster. And what we found from talking to you all is that sometimes that direct cluster access just isn’t available.

And so what we’re launching is a zero touch monitoring option through the cloud providers’ APIs for Amazon and Azure respectively, EKS and AKS, that’ll give you insight into your Kubernetes clusters in these cloud providers without requiring access to the cluster itself.

Now, this will give you great insight into what’s running in the cluster. You’re not gonna get down to the application level, but it’s really easy to deploy.

On the IT infrastructure monitoring side, as I mentioned a few slides ago, we are particularly focused on cloud managed networking. We wanna make sure that we have coverage for all of the leading vendors in this space.

Last year, we announced a huge overhaul to our Cisco Meraki integration. This year, we just released new monitoring for Aruba, specifically those Aruba Central managed access points. Later this year, we have updates to Juniper Mist, as well as coverage for a few new vendors in the access point space.

So that was what we announced in terms of expanding our monitoring coverage. Now I’m gonna talk a little bit about how we are helping to accelerate troubleshooting.

We know that downtime is expensive. We don’t want you guys switching between tools or even having to context switch outside of LogicMonitor. And so I’m gonna show you what we’re doing to try and alleviate that.

We are making improvements to our log solution to help surface insights faster, hopefully to expedite your troubleshooting process.

The middle pillar here is all about a new detailed diagnostics feature that will be going into beta at the end of the first half here, where we’ll actually be able to collect additional information that you would normally have to get outside of LogicMonitor when an alert occurs, such as what are the top processes consuming CPU when you get a high CPU alert. We want to collect that information and display it directly within LogicMonitor to help you troubleshoot faster.
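Conceptually, the detailed-diagnostics flow is: when a high-CPU alert fires, snapshot the process table, rank by CPU, and attach the top offenders to the alert. A minimal sketch of the ranking step on a pre-captured snapshot (purely illustrative; this is not the feature’s implementation, and the process names are hypothetical):

```python
def top_cpu_processes(samples, n=3):
    """Given (pid, name, cpu_pct) samples captured when an alert fired,
    return the top-n processes by CPU to attach to the alert."""
    ranked = sorted(samples, key=lambda s: s[2], reverse=True)
    return [{"pid": pid, "name": name, "cpu_pct": cpu} for pid, name, cpu in ranked[:n]]

# Hypothetical snapshot taken the moment a high-CPU alert triggered
snapshot = [
    (3121, "java",      91.5),   # e.g. a runaway Tomcat JVM
    (412,  "sshd",       0.2),
    (2204, "postgres",  14.7),
    (990,  "collector",  3.9),
]
top_cpu_processes(snapshot, n=2)
# [{'pid': 3121, 'name': 'java', 'cpu_pct': 91.5},
#  {'pid': 2204, 'name': 'postgres', 'cpu_pct': 14.7}]
```

The value is in the timing: capturing this at alert time preserves evidence that is usually gone by the time someone can SSH in and run top manually.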

And then lastly, we will be revamping our alert details UI. Whether you’re coming in from an SMS message or Edwin, we want to make sure that you’re provided with the most relevant details and can troubleshoot that much faster.

Of course, with our logs product, we have our anomalies to really surface insights in what’s different about your log data, what might need your attention, as well as our log analysis feature that helps perform natural language processing to show you the key sentiment and keywords across large volumes of log data in a very visual and interactive way. The idea here, again, is to surface insights without you having to write complex queries.

This is the revamped alert UI that I spoke of, specifically the graphs tab. You might recognize it from what I just showed in the demo. The idea here is to show you graphs beyond the data point that triggered the alert. So rather than just seeing the Tomcat graphs in that example, I would have been able to also see CPU and memory. That will allow you to stay in the alerts UI for longer and hopefully expedite your troubleshooting.

On the logs product, we plan to introduce a feature at the end of the first half into beta called Queryless Logs, where you can actually enter your logs query in natural language. We’ll translate that for you to our logs query language.

Ultimately, we hope this helps you find the logs you’re looking for faster and surface those insights faster for troubleshooting.

We’ll also be making improvements to our query sharing library, again, to make it easier for you to find those queries that you need, find the logs you need, without having to construct a complex query from scratch every single time.

And then lastly, we do have new visualizations coming to the Logs page to really help you interpret your log data faster. Right now, it’s a lot of bar charts and line charts. We’ll have pie charts, big number widgets, and donut options that hopefully help you interpret your log data that much faster and speed your troubleshooting.

So that’s it for the accelerating troubleshooting first half announcements.

I’d like to bring Karthik up to the stage to talk about the third pillar for the first half launch.

Good morning.

We heard this morning from Christina, you know, what an incredible journey we’re on. And, you know, Andrew came on and talked about, you know, how cloud turned out to be more complicated than we thought.

And AI, I can tell you, is gonna be a whole different journey. You know, this AI generated image, I can tell you some parts of it are accurate.

I live in the Bay Area, so you can see a lot of AI billboards on the 101. That is accurate. A lot of drones. But I don’t think we’ll ever get to these highways. Not happening in the Bay Area.

So switching to AI reality. Right? There is a lot of AI distortion happening today in the market, and it’s been more than two years since ChatGPT launched. And, obviously, there’s this mad rush.

Everybody’s trying to do AI, and everybody’s like, hey, I have AI. And it’s really hard for customers to kind of cut through the noise and figure out what’s real. Right? And I think it’s time this year to kind of talk about, you know, not all AI is created equal.

Right? And so just because you have AI, it doesn’t mean it’s really giving you ROI. So there’s a lot of questioning happening now more than ever. And I’m hearing from CIOs that they’re cutting down on POCs. AI POCs, they’re, like, done with that.

And I want to give you an example of this one company. It happened early this year. It’s a large financial institution.

They had, like, a major power outage and service outage that lasted almost five days. Right? We’re talking about a bank.

Think about what it means to everyday consumers.

You can’t make deposits. You can’t, you know, make withdrawals.

And when they did an investigation, turns out that the issue was not with them, but their vendor.

The vendor experienced a power outage in the data center hosting the hardware, and you would ask, like, well, don’t they have backup?

The backup also failed. Right? So, like, how do you deal with these kinds of disruptions? And this is one of the companies that is at the forefront of AI.

They’re already doing microservices. How could they experience a five day outage?

Right? Obviously, you know what happens with that. There’s gonna be a lot of customers jumping ship and lawsuits. But the reason I give this example is just because you are doing AI, it’s not a silver bullet. You gotta go back to the fundamentals, like you saw on the cloud.

What is your data strategy? What is your governance strategy? What use case are you solving? And so when you look at LogicMonitor, we already have the best data.

If you think about the data we collect in hybrid, this is all very mission critical data. And then doing AI on top is a lot easier than for a company just trying to bolt on AI. So that’s something I wanted to unpack. It is not just about the AI, it’s also about the data that you collect.

And so when we thought about AIOps, and I spoke to so many of you over the last year and, you know, analysts, there was clearly a lot of frustration. And we said, okay, how do we do this not just as a bolt on, but from the ground up? Right? And one of the things we heard is many customers are frustrated that even though you buy an AI tool, you have to hire two people to manage the AI. Like, what’s the point?

So it has to be completely different.

Agentic first. We talked about agents even before this became a buzzword last year.

And the whole point about agents is that now the AI can do eighty percent of your work as opposed to humans doing eighty percent of the work. So that’s the flip.

And bringing all this data that we collect, we’ve been collecting some of the most difficult data in on-prem data centers. The collector technology we have is second to none.

Combine that with all the data you have in your enterprise, your incident data, your change data, your problem data, on-call transcripts, knowledge bases, external knowledge bases, and we’re gonna stitch all those together. Why? Because that’s exactly what customers want. They don’t wanna just look at observability data and then incident data in silos. They wanna stitch it together. This is a hard problem, but, you know, from an AI perspective, absolutely every customer is thinking about it this way.

And then finally, I mean, a customer asked me yesterday, like, hey, how do you price?

And, you know, is it by alert noise? And I’m like, we’re not in the business of charging you by alert noise. We wanna help you with outcomes.

Whatever business you are in, whether it is making more boxes or healthcare, we wanna make sure that we’re tying this AI technology to your outcomes. Of course, there’s a lot of different ways, but I wanna be very clear that this is really our commitment to tie this technology to your business outcomes.

And so we built Edwin AI. It’s our answer to agentic AIOps. And one of the things that was interesting back in the day when we thought about Edwin is that it’s not just a chatbot. Right? Chatbots are table stakes. Where we really wanna take Edwin is towards actions.

And so we went about this with, you know, three use cases.

The first one is our bread and butter, which is how do we reduce noise in the data center?

There’s so much noise there, and, you know, you have this already with LogicMonitor, but you also have other tools.

You may be using an APM tool. You may be using a log tool. You may be using a cloud monitoring tool.

We’re gonna do event intelligence across the stack, not just at the network and infra layer, but across the stack. That’s what customers are asking us for: I want one tool that can do cross domain correlation.

And it’s a lot easier for us to do that because we’re already collecting the data. We already have your infrastructure network data. It’s a lot easier for us to stitch that with the APM vendor’s data. And now, suddenly, you have a golden insight.

I already know the network was slow. How did that tie to some of your service insights?

Right?
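The correlation idea described above can be sketched in a few lines: alerts from different tools are folded into one insight when they hit the same resource within a short time window. The alert shape, window size, and sources below are invented for illustration, not Edwin’s actual logic.

```python
# Illustrative sketch of cross-domain event correlation: alerts from
# different tools (network, APM, cloud) are grouped into one insight when
# they share a resource and arrive close together in time.
from collections import defaultdict

WINDOW_SECONDS = 300  # correlate events within five minutes of each other

def correlate(alerts):
    """Group (timestamp, resource, source, message) alerts into insights."""
    by_resource = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a[0]):
        by_resource[alert[1]].append(alert)

    insights = []
    for resource, events in by_resource.items():
        group = [events[0]]
        for event in events[1:]:
            if event[0] - group[-1][0] <= WINDOW_SECONDS:
                group.append(event)  # same burst: fold into current insight
            else:
                insights.append((resource, group))
                group = [event]
        insights.append((resource, group))
    return insights

alerts = [
    (100, "router-1", "network", "BGP peer down"),
    (160, "router-1", "apm", "checkout latency high"),
    (4000, "router-1", "network", "BGP peer up"),
]
print(len(correlate(alerts)))  # 2: one correlated burst, one standalone event
```

Even this toy version shows the payoff: the network alert and the APM latency alert land in one insight, which is the "golden insight" linking infrastructure to service impact.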

And we’re doing some next level innovation with some of the customers here, where we talk about using some of that advanced network correlation, like looking at BGP peers and making sure we understand them. So this is something where we’re really innovating, and we’ve got a lot of customers deployed. And we’ve been doubling down now on GenAI. Right?

So GenAI agents is our next foray, where we talk about how do we really give your team productivity. You heard from Christina what happens: you get forty five minutes back. That’s game changing, folks.

Think about all the other projects you could do. Right? And so this is nothing but a ChatGPT-like interface on your data.

Right? And to close the loop, where we really wanna invest is self healing data centers.

This has been promised for a long time, but we are closer to this than ever because our technology is really mature now.

So I’m excited to announce that all the agents that you’ve been hearing about, and that we’ve been collaborating with customers on, you know, they are coming. Right? Actually, let’s make it better. They’re here today.

Right? So you can go request early access for these agents. We have some of the demo booths where you can check it out. And if you’re really interested, come talk to us.

We can set it up in your environment. And these agents are really exciting.

You can talk to a general purpose IT ops chatbot.

Right? Anything in the IT ops domain. You can ask about public knowledge. Is there an outage with Slack? Is there an outage with Cisco? We can go out there on the web and come back with real time outage information.

You can talk about your insights. Talk to my data. Right? What’s going on with this insight? Give me a timeline of what happened.

You don’t need to build a dashboard for everything. You could literally ask these questions in plain language and get answers.

I had this complex alert, help me summarize it so I can send it to my team that’s been asking me. Right? How many of you get pinged out there, hey, what’s the status of this p one?

Just give me a summary. I can send it to my boss.

And the last one is really exciting: charts and filters.

How many of you have a backlog of dashboards to build? Right? There’s always a new dashboard that’s waiting. Now, with AI, you could literally ask the question, we will convert that to a query, and build a dashboard on the fly.

This is just a start, folks. So we’re gonna see a lot of agents coming up: with logs, which Sarah Terry talked about, with similar incidents from ServiceNow, with playbooks from automation platforms. But let’s now take a look at the demo of what that really looks like.

So you see here the Edwin interface, where you have your query bar and you can really ask questions. And I’m gonna be like, hey, show me some of the alerts by device or CI. And what it’s gonna do is break down that query into something that we understand behind the scenes and literally build a chart on the fly.

How cool is that?

And now you can interact with this chart and ask some follow-up questions. Okay. So how many incidents were created on this in ServiceNow?

Notice I didn’t say ServiceNow, but it understands.

And now it’s gonna go and query the ServiceNow table and see, hey, how many incidents did we get?

You could do a follow-up question and say, okay, how many of these are not correlated? They’re just single alerts.

And it’s going to build a second chart.

Right? So these are, like, really advanced reasoning models that are thinking about how exactly to crunch this data and let you go to the next step.

Right? And so you could really think about all the productivity your team can have where you’re asking questions. And this is just a start. We wanna do it in a way where there are low hanging use cases, where you see there’s a GenAI summary, which is pretty widely requested. Tell me more. What’s going on with this insight? Give me a whole timeline.

Right? Copy paste that, put that in your ticket.

You could also ask, how do I fix this? Is there a fix that somebody has done before that I’m not aware of? I’m new to the company.

And the AI would generate you a step by step runbook.

If you have the runbook, great. If you don’t have it, it’s gonna show you step by step how exactly to fix this issue.

And think about that, folks.

I mean, I want to be a network engineer now.

Right? And then obviously, you can close the loop with creating a post mortem report in ServiceNow and pushing it. So we are really enriching your incident intelligence.

Incidents are placeholders. We’re gonna really enrich it with all the rich data that we collect.

Alright. Let’s go back to the future, what’s coming. Right? So here’s an example where customers are like, I know I like Edwin, but can I just do this in Teams and Slack? Because that’s where my users hang out, and my CIO never really logs in to the portal. Absolutely.

Edwin is your agent that can sit inside Slack or Teams, and guess what the future is gonna be: agent to agent.

If you already are deploying agents internally with employee experience or customer experience, you’re gonna start thinking about how do these agents talk to each other.

Most of these employee agents do not have the access to the data that we collect.

And so you absolutely need that. I heard somebody saying the most common IT ticket is, network is down. What the heck is network is down? Everybody blames the network, but it’s not the network.

Edwin is gonna tell you it’s not the network.

Right? So we reduce your incidents and let the network engineers do what they’re doing. Do not bug me with this noise. And so you see examples here of where we really wanna go with this automation.

So with YAML playbooks, if you have something, we will search and find that, or we will also generate runbooks for you on the fly. That’s really a foray into self healing data centers. This is not science fiction. This is real today.

We can do it.
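The search-or-generate flow just described can be sketched as a simple lookup: match an incident summary against a catalog of existing playbooks, and fall back to generation only when nothing matches. The catalog, names, and scoring below are invented for illustration; the real product would use an LLM for both matching and generation.

```python
# Hypothetical sketch of playbook lookup: find the best-matching existing
# runbook for an incident by keyword overlap, else fall back to generation.
# The playbook catalog is invented for illustration.
PLAYBOOKS = {
    "restart-tomcat": {"tomcat", "restart", "memory"},
    "rotate-disk-logs": {"disk", "full", "logs"},
}

def find_playbook(summary: str):
    """Return the playbook whose keywords best overlap the summary."""
    words = set(summary.lower().split())
    best, score = None, 0
    for name, keywords in PLAYBOOKS.items():
        overlap = len(words & keywords)
        if overlap > score:
            best, score = name, overlap
    # In the pitch, a missing playbook would be generated on the fly by an
    # LLM; here we just return a placeholder marker instead.
    return best or "generate-runbook-with-llm"

print(find_playbook("tomcat memory pressure, needs restart"))  # restart-tomcat
```

The interesting design point is the fallback: existing, human-vetted playbooks are preferred, and generation is the escape hatch, which is what makes the "self healing" claim incremental rather than all-or-nothing.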

Let me unpack what’s going on behind the scenes. So I’m really excited. I mean, being in enterprise software, this is unlocking the kind of productivity, you know, for customers. So we’re gonna be shipping hundreds of agents.

Make no mistake. Right? So, like what I showed you, there’s gonna be an insight agent, there’s gonna be a metric agent, there’s gonna be a logs agent. We’re gonna keep shipping those.

All of that wraps around the data that we collect and also third party tools.

What we really wanna do, if you’re a service provider or a customer, is expose an agent studio.

You could now build on our platform.

If you have a unique data source, a proprietary data source, we want you to build agent workflows on this platform.

Instead of going to yet another tool and trying to stitch it, we will give you the platform to customize the AI outputs. We’ll give you a platform to build your own AI solutions, because we believe there needs to be a central, state of the art AI platform where you can build. Not everything is something LogicMonitor will be able to deliver, but we want you to also build on this platform.

And that’s our vision of agentic AIOps, where we’re doing the heavy lifting for you.

We’re building these connectors.

Why? Because this is what customers care about. I don’t want to go build another connector.

I mean, this is table stakes. You want to bring in as much data as you can. So we’re gonna be building these connectors. We already started on this journey with some of the connectors, and we’re gonna build on change systems and on-call transcripts and runbooks and knowledge bases.

So you’re gonna have a very wide richness of data. And we’re also gonna go deep, going deep in the network. Can I look into that packet data? Can I do synthetics?

I mean, we’re just scratching the surface, folks. We have not figured out everything, but we know the north star of where we’re going, and we’re just gonna keep attacking this problem. And so the second layer is security. You don’t want to build this stuff yourself.

I can guarantee you, it’s really hard. I mean, I have been doing this for five years, and we’re still figuring it out. Right?

It’s really hard to build secure RAG solutions. How do we make sure that an employee doesn’t get, you know, access to a knowledge base they shouldn’t? Right?

How do you make sure you’re not locked into one LLM vendor?

What if you wanna switch LLM vendors?

You saw what happened, you know, when some new shiny LLM comes up and suddenly everyone’s scrambling. What do you do in those situations? So you wanna build an LLM gateway.

You wanna build RAG so that you can access the proprietary information in your system.

You wanna build a knowledge graph that connects all this data. So all of these are important building blocks, and our vision is to give you these building blocks so you can build on top of them.
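The LLM gateway idea mentioned above is essentially one interface in front of interchangeable model providers, so switching vendors is a configuration change rather than a rewrite. Here is a minimal sketch under that assumption; the provider classes are stand-ins, not real vendor SDKs.

```python
# Minimal sketch of an LLM gateway: callers talk to one interface, and the
# concrete provider behind it can be swapped without touching caller code.
# VendorA/VendorB are placeholders for real model clients.
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    @abstractmethod
    def complete(self, prompt):
        ...

class VendorA(LLMProvider):
    def complete(self, prompt):
        return f"[vendor-a] {prompt}"  # a real client call would go here

class VendorB(LLMProvider):
    def complete(self, prompt):
        return f"[vendor-b] {prompt}"

class LLMGateway:
    """Routes requests to the configured provider; callers never change."""
    def __init__(self, providers, default):
        self.providers = providers
        self.default = default

    def complete(self, prompt, provider=None):
        return self.providers[provider or self.default].complete(prompt)

gateway = LLMGateway({"a": VendorA(), "b": VendorB()}, default="a")
print(gateway.complete("summarize this alert"))        # routed to vendor A
print(gateway.complete("summarize this alert", "b"))   # overridden per call
```

A production gateway would add the other building blocks from the talk on top of this seam: retrieval (RAG) before the call, access control on what each caller may retrieve, and usage logging per provider.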

And that leads me to some of the exciting innovations we’ve been doing, starting with OpenAI and the announcement we did early this year. They are at the forefront of agentic innovation.

And part of the partnership we’re doing with them is getting access to the technology.

So whether it’s a reasoning model, like the o-series models, or some of the GPT models, our team is working closely with them to make sure we can start previewing these much before they hit the market.

Right? And we also get access to some of the stuff that has not yet released, and give feedback on some of this technology.

The second one I’m excited to talk about is in the early innings: we’re talking to Red Hat about what’s next.

If you talk about data centers and hybrid, I mean, this is really a great fit in terms of, like, how do I use the knowledge of these patterns to help you generate runbooks? So we are talking directly to the Red Hat team about how we get a tighter integration between the LogicMonitor platform and the Ansible platform.

Enough talking about AI, let’s talk about customers. Right? This is really where the rubber hits the road.

And since we launched Edwin last year, we’ve had amazing success in production. Right? I know some customers are still saying, I wanna deploy. But if you use some of the out of the box capabilities of Edwin, you saw that with Chris Manning from Syngenta, what amazed them was the time to value.

One hour is what it took for them to start seeing the value.

Right? In fact, he said Edwin pointed out a config change the team had overlooked for three years.

And Edwin kind of, you know, really gave visibility into that.

Chemist Warehouse is one of the largest retailers in the APAC region. They have two data centers and, I think, over six thousand devices.

And they went live with an eighty eight percent noise reduction in the data center.

And they’ve been collaborating with us on GenAI. So that’s where they’re really going. Like, how do you really give the summaries? And you’ll hear later from Jesse how excited they are.

And then for service providers, Nexon, one of the largest service providers, gets one point four million, you know, events in a month.

We reduced their ServiceNow tickets by seventy percent. Think about the time we’re giving back to each of them. Think about the client experience.

I need to look at seventy percent fewer tickets on a monthly basis.

Devoting, they went live in eight weeks. This is not something that takes, like, one year to deploy.

It’s eight weeks.

And then Markel, you know, was using a traditional AIOps vendor. They decided to work with LogicMonitor because they see the vision of where we are going. Right?

So I’d love to show the video and hear directly from some of the customers on the impact Edwin is having on their business.

Oh, I think it exceeded my expectations. The results were dramatic. You know, like you said, an eighty eight percent, you know, reduction in alert volume after enabling Edwin AI.

To us, that’s not just a number, it’s a daily quality of life improvement for my team. So I’d say that’s probably the biggest impact: it’s meant that we spend less time firefighting.

We now focus more on building new solutions for the business, improving existing ones.

So really delivering value to the business.

One of the key benefits of LogicMonitor’s platform is also its AI module called Edwin AI. That brings a lot of value with predictive analytics, but also self remediation, event correlation, and so on.

Almost within the hour, we started to see a marked reduction in tickets. The tickets that were coming through had data in the description fields that actually accurately told us what was going on. It gave us a sense for the impact as well, because impact is quite often a really difficult thing to measure. What we want is for the engineers to be fixing the problem, not spending huge amounts of time analyzing the data. So, you know, one of the things that Edwin gave us almost out of the box was a much deeper analysis of the data.

I think my favorite quote from Chris was, he said he’s been in networking for, like, twenty five years, and he said, networking is cool again. You know? So this is a really exciting time. And, you know, it’s a new way of working. Like, I really believe that, you know, this is a high pressure job.

Of all the functions that are out there, this one has a lot of pain. There’s a lot of redundancy. There’s a lot of high volume work. There needs to be relief. Right? And so maybe you’re on the fence, on the sidelines, saying, hey, I don’t think I’m ready for AI.

We get it. You know, you can talk to some of these customers and hear what their journey has been. Right?

But if you think about the pain: you’re spending twenty, thirty minutes analyzing these incidents. Highly technical information. Extremely hard to interpret. Very error prone.

You’re relying on your best engineers to kind of save the day. Right?

And think about the war rooms.

How many war rooms have you guys sat in? Right? And I know, Sean, we were doing a workshop, and just before the workshop, he had to go into a war room. And I guess there were, like, what, forty people, fifty people?

A show of hands, what’s the biggest war room size you guys have been in?

Can anybody beat forty people?

Yeah?

What was the size?

Hundred and fifty?

Nine hundred? Did I hear that right?

What happened?

I heard one customer say they had more than a thousand, and that was during CrowdStrike. So, okay, that explains it. So this is gonna be the new way of working. You’re gonna have agents side by side. We’re not replacing.

They’re basically sitting side by side, making you productive. We use it in our own day to day work.

And so it’s a new way of working. It’s a new way of learning how to work with agents to be more productive. You don’t need to spend time doing ticket triage. That is not really value added work.

In fact, I would say, you’re not even looking at warnings. I know many customers today don’t look at warnings, like, okay, we’ll wait for something to become critical. What if your thresholds are not right? How do you know what’s the right threshold to set?

AI can now tell you, like, what is the right threshold to set for your devices.

Right? So it’s a really exciting time where we think the engineering team and your SREs and your agents can work side by side.
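One simple way an AI system can suggest a threshold, as described above, is to derive it from each device’s own recent history rather than a hand-picked static value. The sketch below uses mean plus k standard deviations; the k=3 choice and the sample data are assumptions for illustration, not LogicMonitor’s actual algorithm.

```python
# Sketch of data-driven thresholding: suggest an alert threshold from a
# metric's historical samples (mean + k standard deviations). The k=3
# multiplier is an illustrative assumption.
import statistics

def suggest_threshold(samples, k=3.0):
    """Suggest an alert threshold from historical metric samples."""
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)  # population std dev of the window
    return mean + k * stdev

cpu_history = [41, 44, 39, 42, 45, 40, 43]  # e.g. recent CPU % samples
print(suggest_threshold(cpu_history))  # 48.0
```

A device that normally idles at 42% CPU gets a ~48% warning line, while a device that normally runs hot would get a correspondingly higher one, which is the point: the threshold adapts to each device instead of being one global guess.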

And, you know, talk about value. Right? So this is super important. I mean, amongst a lot of AI, I’m sure all of you are doing AI in different departments.

But this use case stacks really high in terms of ROI.

I mean, it’s hitting the business from all angles. It talks to your OPEX efficiency. If you’re a service provider, you can do more with less. You saw that in the seventy percent reduction. It’s gonna improve your employee productivity, thirty to forty five minutes of time back. Customer experience: if you have an outage, we can try to get that back online and save you millions of dollars.

And guess what, icing on the cake: we will help you consolidate your tools and reduce license cost.

So work with us on some of this business value. We can do it for you. And, you know, just to wrap up, whether you’re going with LogicMonitor or any vendor, this is something where, you know, how do you pick a future proof AI vendor? Right? And I boil it down to three things, because I’ve been talking to many customers.

AI native products are gonna beat AI add ons.

You can’t just bolt on AI and say, hey, we declare victory. Doesn’t work that way. You gotta think about all the layers. Agents are a whole different stack.

Think about SaaS. When SaaS was coming, you had different layers of architecture. Exact same point. We are building those layers.

What is your memory? What is your storage? What is your agent observability?

That’s something we’ve been doing with Edwin. It’s an AI native app.

And you can’t just bolt on an LLM and say, I got AI. It doesn’t work that way. The second thing is, many customers are like, hey, should I just build this or should I buy?

And maybe the answer is both.

If you are in an industry where this is not your core competency, we don’t recommend you build this.

If you’re in financial services and you want to build a credit model, that absolutely makes sense. But if this is IT ops, and correlation and root cause analysis are not your core competency, it is not worth it, because the tech is changing really fast.

And so I say, like, go with the platform and build on top. But you also can’t go in and say, okay, there’s no opportunity to build. So think about it as both. Right? And the last one, really important: AI, like I said in the beginning, is not a silver bullet. You need the data.

And there are companies going right, and we chose to shift left. We wanna go closer to the logs, closer to the metrics, closer to the metadata, so we can give engineers what they need to solve the issue faster.

No more command centers that are just delaying the problem. We want to give engineers timely, just in time information to fix issues so we can get systems back online. And so the combination of LogicMonitor’s hybrid observability with Edwin is the next level of observability you need. Thank you.

Wow.

That was a lot to take in. Really incredible stuff from Karthik and Sarah.

So briefly recapping, we’ve talked a lot about our product, our platform, but it really comes down to three key themes I want you to take away.

We’re expanding our hybrid coverage. We’re making it as easy as possible to get as much data into the platform.

You heard from Sarah about all the advancements in that area, around cost optimization, around supporting your AI workloads, container monitoring, enhancing cloud managed networking, and more.

Once that data is in the platform, our mission is to help you accelerate troubleshooting with it. We’re doing that with logs, bringing all that telemetry in so that you’re not reliant on logging into multiple different systems. You can retire legacy monitoring tools. You don’t have to use vendor specific tools for different cloud workloads. You can accelerate your troubleshooting.

And then we just heard from Karthik, the future.

We are the agent for IT ops, and the agents are here today, as Karthik told you. And we’re pioneering this agentic AIOps future. So everything we’re doing from a product and engineering perspective is designed to help all of you. And you’re gonna hear more about that throughout the rest of today in the breakout sessions. And then this afternoon, Sarah is gonna talk some more about the future, what’s coming, and what you can expect from us in the second half of the year. And then you’re gonna hear from Andrea about how this is transforming McKesson’s observability posture.

So everything we’ve talked about, all of our continued investments in cloud, in networking, supporting your AI initiatives, improving those troubleshooting workflows, modernizing dashboards and reports, taking service insights and logs to the next level, unlocking application visibility for IT ops, and then, of course, Edwin and Agentic AI ops in the future. We’re here to support all of you, the IT ops heroes and CIOs, to navigate the modern data center in the AI era.”

Ready to get started?