If you’re reading this article, you’re most likely looking for a simple one-stop-shop way to understand logs.
I’m sorry to be the one to tell you this, but logs are not simple enough to deal with easily. In fact, as you start approaching this topic on a practical level you’ll quickly realize how complex and annoying it truly is.
In this post, I’ll share the experiences from my 10-ish years of working with logs to help you understand the components necessary to establish a log infrastructure where you maintain full control of the logs during their flow through your environment.
Setting up Your Log Infrastructure
What do you mean by log infrastructure?
I’ll go through each of the layers you’ll have to think about one by one here, and I’ll do it without diving into specific tools for the job since the tools you should use for the various layers will be highly dependent on your specific environment and your specific needs. Instead, I’ll explain a bit about the things you need to consider for each layer so that you have the best conditions in mind when designing your log infrastructure. It doesn’t matter if it’s a small infrastructure or a large one. You’ll have to design it eventually, and it really does help to do it right from the start.
A log infrastructure is the sum total of a few layers of component types that together help you accomplish your goals and fulfill your compliance requirements in one single environment. For some organizations this means a big pile of different components, for others it means a small pile of them, and sometimes it’s even one massive tool. The layers you usually have in mind as your end goals, storage and analysis, come last in the pipeline.
It’s important to know that some of these layers might contain multiple components each and, to make it even more confusing, some components cover multiple layers too. It all depends on the needs of your environment, any requirements set out in your policies, and the components you choose to include.
The Different Layers of Log Infrastructure
Most IT components generate logs. Some of them have great ways to send them out natively. Some are limited in how they can send out logs, and some can’t send logs anywhere at all; their logs instead need to be collected locally and sent on to the next layer. Collecting logs locally is often a straightforward process of installing an agent on the device to read the logs. Sometimes there’s already an agent present that only needs to be slightly reconfigured to forward the logs, and sometimes an agent can be used to fetch logs from a source that has no native method of forwarding them, so that those logs can be processed, stored, and analyzed anyway.
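To make the idea of a local agent a bit more concrete, here is a minimal sketch in Python of what such an agent does at its core: it remembers how far it has read in a log source and forwards only the complete lines that have been appended since. The function name `collect_new_lines` and the `forward` callback are made up for illustration; real agents also deal with file rotation, retries, and much more.

```python
import io
from typing import Callable, TextIO


def collect_new_lines(source: TextIO, offset: int,
                      forward: Callable[[str], None]) -> int:
    """Forward every complete line written after `offset`; return the new offset.

    Hypothetical helper for illustration only, not any real agent's API.
    """
    source.seek(offset)
    while True:
        line = source.readline()
        if not line or not line.endswith("\n"):
            # Nothing new, or a half-written line: stop and retry it next time.
            break
        forward(line.rstrip("\n"))
        offset = source.tell()
    return offset


# Usage: an in-memory "log file" stands in for a real file on disk.
log_file = io.StringIO("service started\nuser logged in\nhalf-writ")
forwarded = []
position = collect_new_lines(log_file, 0, forwarded.append)
```

After this call, `forwarded` holds the two complete lines and `position` points at the start of the unfinished one, so the next run picks it up once the source has finished writing it.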
Your logs need to be sent somewhere. This “somewhere” is what we typically refer to as the ingestion layer, the immediate recipient of your logs. This layer is responsible for taking the logs sent to it and forwarding them on to one of two places: some log infrastructures are set up with a buffering layer, where data is streamed through, but most use a simple routing layer. The ingestion layer is also often responsible for extracting data from the events or enriching them with metadata where applicable. That means there’s a lot of custom parsing that can, and often does, happen at this layer. It’s possible to do it in other layers, but my recommendation would be to do it either here or at the collection layer.
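As a rough illustration of the extraction and enrichment described above, the following Python sketch parses a raw line into fields and attaches metadata about where and when it was received. The line format (`timestamp host app: message`), the field names, and the `ingest` function are all assumptions made for this example; your own sources will dictate the real parsing rules.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Assumed line format for this sketch: "<timestamp> <host> <app>: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<host>\S+) (?P<app>[^:\s]+): (?P<message>.*)$"
)


def ingest(raw_line: str, received_from: str,
           now: Optional[datetime] = None) -> dict:
    """Extract fields from a raw line and enrich the event with metadata."""
    match = LINE_PATTERN.match(raw_line)
    # Keep unparseable lines instead of dropping them, but flag them.
    event = match.groupdict() if match else {"message": raw_line}
    event["parsed"] = match is not None
    # Enrichment: metadata the source itself did not provide.
    event["received_from"] = received_from
    event["ingested_at"] = (now or datetime.now(timezone.utc)).isoformat()
    return event


event = ingest(
    "2024-05-01T12:00:00Z web01 sshd: Failed password for root",
    received_from="10.0.0.5",
)
```

Flagging unparsed events rather than discarding them is a deliberate choice in this sketch: later layers can still store them, and you can see which sources need better parsing rules.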
A buffering layer (sometimes referred to as a streaming layer) is a slightly more advanced layer that’s not always present. It can, however, be a very useful one. When implemented, it is typically a layer that both ingestion and routing interact with: the ingestion layer sends logs to it for queueing, and the routing layer connects to it to retrieve logs to forward on.
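The relationship between the three layers can be sketched with a simple in-memory queue in Python. This is only a toy model under my own assumptions (the `LogBuffer` class and its method names are hypothetical); a production buffering layer would be a separate, persistent system, but the interaction pattern is the same: ingestion publishes, routing fetches.

```python
import queue


class LogBuffer:
    """Toy bounded buffer sitting between ingestion and routing (illustrative)."""

    def __init__(self, max_events: int) -> None:
        self._queue: queue.Queue = queue.Queue(maxsize=max_events)

    def publish(self, event: dict) -> bool:
        """Called by the ingestion layer. Returns False when the buffer is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False

    def fetch(self, max_batch: int) -> list:
        """Called by the routing layer. Returns up to `max_batch` queued events."""
        batch = []
        while len(batch) < max_batch:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch


buffer = LogBuffer(max_events=1000)
buffer.publish({"message": "disk usage at 91%"})
pending = buffer.fetch(max_batch=100)
```

The point of the bounded size is that the ingestion side gets an explicit signal when routing falls behind, instead of silently losing events or running out of memory.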
A routing layer will take the logs it receives (or fetches, when a buffering layer is involved) and, according to a rule specification, forward them to the intended destination. This layer is conceptually straightforward, but it is usually one of the more complex ones to build, depending on the complexity of your log infrastructure as a whole. To me, personally, this is the most important layer of a proper log infrastructure. The simple reason for this is that it enables you to forward different logs to different analysis components that are designed for different tasks. Security logs go to a security log analysis component, operational logs to an operational log component, and application logs to application monitoring components. The right log to the right place.
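A rule specification like the one described above can be sketched in Python as an ordered list of rules, each pairing a match condition with a destination. The `category` field, the destination names, and the fallback to an archive are assumptions for the sake of the example, not a recommendation for any particular tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Route:
    name: str
    matches: Callable[[dict], bool]  # rule: does this event belong here?
    destination: str                 # hypothetical destination label


# Illustrative rule set; real rules would come from your own requirements.
ROUTES = [
    Route("security", lambda e: e.get("category") == "security", "security-analytics"),
    Route("operations", lambda e: e.get("category") == "operations", "ops-monitoring"),
    Route("application", lambda e: e.get("category") == "application", "app-monitoring"),
]
DEFAULT_DESTINATION = "archive"


def route(event: dict, routes: list = ROUTES) -> list:
    """Return every destination whose rule matches, or the default if none do."""
    destinations = [r.destination for r in routes if r.matches(event)]
    return destinations or [DEFAULT_DESTINATION]


targets = route({"category": "security", "message": "Failed password for root"})
```

Returning a list rather than a single destination means one event can be sent to several analysis components at once, and adding a new component later only requires a new rule.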
A storage layer will receive logs routed to it and store them for however long you need them to be stored. This is usually related to either compliance requirements or the policies your organization decides to adopt.
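How long “however long you need” is will differ per log type, so a retention policy is often expressed per category. Here is a small Python sketch of that idea. The categories and retention periods are placeholder values I made up for illustration; your compliance requirements and policies decide the real ones.

```python
from datetime import datetime, timedelta, timezone

# Placeholder retention periods; not a compliance recommendation.
RETENTION = {
    "security": timedelta(days=365),
    "application": timedelta(days=30),
}
DEFAULT_RETENTION = timedelta(days=90)


def is_expired(category: str, stored_at: datetime, now: datetime) -> bool:
    """True once an event has been stored longer than its category allows."""
    return now - stored_at > RETENTION.get(category, DEFAULT_RETENTION)


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
stored = datetime(2024, 4, 1, tzinfo=timezone.utc)  # 61 days earlier
expired_app = is_expired("application", stored, now)
expired_sec = is_expired("security", stored, now)
```

In this example the application event is past its 30-day window while the security event, stored on the same day, is still well within its own.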
The analysis layer is the layer where a lot of the cool magic will happen with your logs. Most of the above layers are just a means to this end. To provide value outside of compliance, logs need to be analyzed. A lot of organizations analyze very little of their log data. Most don’t even have their log data readily available. The analysis layer is there to remedy that. Some organizations settle for a simple searching layer, giving the end-users a way to query their log storage for historical logs matching certain parameters. This can be useful for investigations in many scenarios. Some organizations will send their logs through a live analysis system, capable of alerting on certain event types, tracking metrics over time, or performing security auditing on live data as opposed to historical searches.
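To give a feel for what live analysis means compared to a historical search, here is a minimal Python sketch of an alert rule that watches events as they arrive and fires when one host produces too many failed logins within a time window. The class name, the `"Failed password"` match string, and the threshold values are assumptions chosen for the example.

```python
from collections import defaultdict, deque


class FailedLoginAlert:
    """Fires when a host reaches `threshold` failed logins within `window_seconds`."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # host -> timestamps of recent failures

    def observe(self, host: str, timestamp: float, message: str) -> bool:
        """Feed one live event in; returns True if the alert should fire."""
        if "Failed password" not in message:
            return False
        recent = self._seen[host]
        recent.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.threshold


alert = FailedLoginAlert(threshold=3, window_seconds=60)
results = [
    alert.observe("web01", t, "Failed password for root")
    for t in (0, 10, 20)
]
```

Only the third failure within the minute trips the alert. A historical search would find the same three events afterwards, but only if someone thought to go looking for them.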
Bringing the Layers Together
Having these layers covered means that you’ll have the best possible circumstances to work with your log data. If you forget about your routing layer, for example, you’ll find yourself with issues trying to implement a new analysis layer component into your environment. Miss out on the rewriting aspects of your ingestion system and you’ll run into issues when you try putting your log metric tracking components into play.
As you start working with your logs, always make sure you think about the full pipeline. Don’t just think about that one specific component you’re looking at implementing next. Instead, plan ahead on the assumption that you will need or want to implement more of them in the future, and consider how the environment can be designed from the beginning to deal with that. Because at some point you’re going to have to implement new solutions. It’s a law of nature in IT: we’re never done.