Tracking HashiCorp Consul with LogicMonitor


Companies around the world are going through digital transformation in one way or another. They are breaking monolithic applications into microservices for more natural development and deployment. HashiCorp Consul is one of the tools helping companies with this transformation. Consul helps with the shift from static infrastructure to dynamic infrastructure through service-based networking instead of host-based networking. With the added complexity in a company’s environment, it is increasingly important not just to monitor the underlying server infrastructure, but also to get crucial information about the Consul application itself.

LogicMonitor can automatically track the important parts of your Consul deployment through its DataSources (Consul Agent, Consul Server, and Consul Cluster) to help keep it healthy.

Monitoring Consul Agent

The Consul agent has two modes, client and server, and an agent runs on every node of the Consul environment. Agents in client mode hand off the majority of their operations to the server nodes, which keeps them lightweight. Several key metrics are beneficial to keep an eye on when monitoring the health of the agents:

If you notice that the nodes chart is flapping, this can indicate overloaded agents, configuration errors, or network problems. With these metrics available at a glance over a set timeframe, you can tell whether the agents are behaving as expected or struggling to keep up with demand.
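
To make "flapping" concrete, here is a minimal sketch of one way to spot it in a series of node counts. The function names, the sample data, and the `max_changes` threshold are all hypothetical, and this is not how LogicMonitor itself evaluates the chart.

```python
from typing import Sequence


def count_membership_changes(node_counts: Sequence[int]) -> int:
    """Count how often the node count differs between consecutive polls."""
    return sum(1 for prev, cur in zip(node_counts, node_counts[1:]) if prev != cur)


def is_flapping(node_counts: Sequence[int], max_changes: int = 3) -> bool:
    """Flag a window of polls whose node count changed more than max_changes times.

    max_changes is an illustrative threshold, not a LogicMonitor default.
    """
    return count_membership_changes(node_counts) > max_changes


if __name__ == "__main__":
    # Hypothetical node counts, one value per poll.
    window = [5, 5, 4, 5, 4, 5, 5, 4]
    print(count_membership_changes(window), is_flapping(window))
```

A steady window such as `[5, 5, 5, 5]` yields zero changes, while a node count that keeps bouncing between two values quickly crosses the threshold.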

View of HashiCorp Consul agent DataSources collected via API and from the agent in LogicMonitor

What to Look for When Monitoring Consul Server

Server nodes are critical because they process the operations of the Consul platform. When tracking the health of a server, you will most likely want to look at the following data:

These metrics can help your team size your servers appropriately and decide whether you need to add another server to keep up with demand. Troubleshooting also becomes simpler, since these server-specific data points let you pinpoint and focus on the Consul server that is struggling.

Display of Consul server metrics in LogicMonitor

Tracking Consul Clusters

Per HashiCorp documentation, it is highly advisable to have 3 or 5 servers in a cluster. A cluster provides redundancy and avoids data loss if one of the servers fails. With this many servers, you need to keep track of the leader and any leader changes that happen. If you start noticing a high number of leader changes, this could be an indicator of network issues between the Consul servers. As a bonus, if you have your network infrastructure added to LogicMonitor, you can see the dependency in the LogicMonitor topology view and drill down to the affected network component.

At the cluster level, it is vital to keep track of the key-value store update time, transaction operation time, and the number of Raft transactions. These values can help you understand the current state of the cluster, and you can be alerted if any of these metrics passes a desired threshold. LogicMonitor also consumes the autopilot data, which indicates the overall health of the server cluster. If the value goes to zero, LogicMonitor can send an alert.
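
The cluster checks described here can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than LogicMonitor's implementation: the function names are hypothetical, the leader samples are made-up addresses, and the mapping of autopilot health to 1 or 0 is our guess at what "the value goes to zero" means. The `Healthy` field is the one Consul's autopilot health endpoint (`/v1/operator/autopilot/health`) returns.

```python
import json
from typing import Sequence


def failure_tolerance(server_count: int) -> int:
    """Servers a Raft cluster can lose while still keeping a majority (quorum)."""
    quorum = server_count // 2 + 1
    return server_count - quorum


def count_leader_changes(leader_samples: Sequence[str]) -> int:
    """Count polls where the reported leader differs from the previous poll."""
    return sum(
        1 for prev, cur in zip(leader_samples, leader_samples[1:]) if prev != cur
    )


def autopilot_value(health_json: str) -> int:
    """Return 1 when the autopilot document reports Healthy, otherwise 0.

    The 1/0 mapping is an assumption made for this sketch.
    """
    return 1 if json.loads(health_json).get("Healthy") is True else 0


if __name__ == "__main__":
    # Hypothetical leader addresses, one sample per poll.
    leaders = ["10.0.0.1:8300", "10.0.0.1:8300", "10.0.0.2:8300", "10.0.0.1:8300"]
    print(failure_tolerance(3), failure_tolerance(5))
    print(count_leader_changes(leaders))
    print(autopilot_value('{"Healthy": false, "FailureTolerance": 0}'))
```

The `failure_tolerance` helper also shows why odd cluster sizes are recommended: four servers tolerate only one failure, the same as three.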

These are some examples of how LogicMonitor can provide insights into your HashiCorp Consul environment. There are many more use cases and data points that are collected. If you are attending HashiConf this year, make sure to visit our booth and we will be happy to answer any questions. Not attending? Don’t worry: request a free trial or visit our blog for more information that can be helpful as you manage your HashiCorp Suite.

Monitoring HashiCorp Vault with LogicMonitor


HashiCorp Vault is an open-source secret management tool that allows organizations to easily “secure, store and tightly control access to tokens, passwords, certificates, encryption keys for protecting secrets and other sensitive data using a UI, CLI, or HTTP API.” This solution prevents sensitive information from being stored in unsecured places, sometimes in plaintext, throughout the organization’s infrastructure. HashiCorp Vault and all of its components play a critical role in a company, which makes it vital to monitor its health and status. Enter LogicMonitor.

LogicMonitor has the necessary DataSources (Vault Health, Leader, and Replication) to make sure your Vault deployment is running as intended.

Monitoring Your HashiCorp Vault Health and Status

Aside from the usual host metrics (CPU, Memory, Disk, and Network), LogicMonitor can display the current status of your Vault servers and send alerts if any changes occur. LogicMonitor tracks the initialization status of all your servers. If a Vault server is uninitialized, it has not gone through the configuration process, meaning encryption keys have not been generated, unseal keys have not been created, and the initial root token has not been set up. LogicMonitor also lets you know the seal state of your servers. A sealed Vault performs almost no operations and can hinder other applications’ performance. Unsealing is the process of constructing the master key needed to read the decryption key that decrypts the data, allowing access to the Vault. You can receive an alert when a server changes status unexpectedly.
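
As a rough illustration of these states, the sketch below classifies a server from the JSON that Vault's health endpoint (`/v1/sys/health`) returns, using its `initialized`, `sealed`, and `standby` fields. Whether LogicMonitor's DataSource reads this exact endpoint is an assumption on our part, and the function names and sample payloads are hypothetical.

```python
import json


def classify_vault_health(health_json: str) -> str:
    """Reduce a Vault health document to a single state label."""
    doc = json.loads(health_json)
    if not doc.get("initialized", False):
        return "uninitialized"
    # Treat a missing "sealed" field as sealed, the more cautious reading.
    if doc.get("sealed", True):
        return "sealed"
    if doc.get("standby", False):
        return "standby"
    return "active"


def status_changed(previous_json: str, current_json: str) -> bool:
    """True when the state label differs between two polls (an alert candidate)."""
    return classify_vault_health(previous_json) != classify_vault_health(current_json)


if __name__ == "__main__":
    # Hypothetical payloads from two consecutive polls of the same server.
    before = '{"initialized": true, "sealed": false, "standby": false}'
    after = '{"initialized": true, "sealed": true, "standby": true}'
    print(classify_vault_health(before), classify_vault_health(after))
    print(status_changed(before, after))
```

Checking `initialized` before `sealed` mirrors the order in which a server comes up: it must be initialized before it can be unsealed.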

Key HashiCorp Vault health metrics monitored in LogicMonitor.

Vault Leader and High Availability

A key offering of Vault Enterprise is the high availability (HA) feature. If you are running Vault on multiple servers within multiple data centers, it is essential to keep track of the leader and any possible failover events. When running in HA mode, a Vault server can be in one of two states: standby or active. Only the active server in an HA topology will process requests. You will be able to display the standby status of all your servers and make sure there is always an active server. LogicMonitor will alert you when there is a change in the standby status of a Vault server.
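
The "always one active server" rule can be expressed as a small check over the standby flags of each server. This is a sketch only: the server names, function names, and alert messages are hypothetical, and the input is assumed to be a mapping from server name to its standby status as gathered by whatever polls your servers.

```python
from typing import List, Mapping, Optional


def active_servers(standby_by_server: Mapping[str, bool]) -> List[str]:
    """Names of servers that are not in standby, sorted for stable output."""
    return sorted(
        name for name, is_standby in standby_by_server.items() if not is_standby
    )


def ha_alert(
    previous: Mapping[str, bool], current: Mapping[str, bool]
) -> Optional[str]:
    """Return an alert message for the current poll, or None if all is well."""
    active = active_servers(current)
    if not active:
        return "no active server"
    if len(active) > 1:
        return "multiple active servers: " + ", ".join(active)
    if active != active_servers(previous):
        return "failover: active server is now " + active[0]
    return None


if __name__ == "__main__":
    # Hypothetical standby flags from two consecutive polls.
    before = {"vault-1": False, "vault-2": True}
    after = {"vault-1": True, "vault-2": False}
    print(ha_alert(before, after))
```

Comparing each poll with the previous one is what turns a static status into a failover event you can alert on.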

Tracking Vault Replication Status

With multiple servers and data centers, it is essential to make sure all the data gets replicated across your environment. LogicMonitor can track the performance replication status (disabled, secondary, and primary) of each server and alert when there is an unexpected change. Along with the status, you can also see the last Write-Ahead Log (WAL) position. The WALs are used to perform log shipping between Vault clusters. By monitoring the WAL position, you can determine whether the servers are struggling to stay synced, helping you get ahead of an out-of-sync situation. If the servers fall out of sync, other applications may be unable to access the data they require.
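
One way to reason about WAL positions is to compare the primary's position with the position a secondary has caught up to, and watch how that gap moves over time. The sketch below does this with plain integers. We are assuming you have already pulled the two positions from each cluster's replication status (in Vault's API these are typically reported as `last_wal` on the primary and `last_remote_wal` on a secondary, but treat those field names as an assumption), and the threshold and function names are hypothetical.

```python
from typing import Sequence


def wal_lag(primary_last_wal: int, secondary_last_remote_wal: int) -> int:
    """WAL entries the secondary still has to catch up on (never negative)."""
    return max(0, primary_last_wal - secondary_last_remote_wal)


def is_falling_behind(lags: Sequence[int], threshold: int) -> bool:
    """True when the latest lag exceeds the threshold and has not shrunk
    compared with the start of the window.

    threshold is an illustrative value to tune for your environment.
    """
    if not lags:
        return False
    return lags[-1] > threshold and lags[-1] >= lags[0]


if __name__ == "__main__":
    # Hypothetical (primary, secondary) WAL positions from three polls.
    polls = [(1000, 990), (1200, 1160), (1500, 1410)]
    lags = [wal_lag(primary, secondary) for primary, secondary in polls]
    print(lags, is_falling_behind(lags, threshold=50))
```

A lag that is large but shrinking usually means the secondary is catching up, which is why the check looks at the trend and not just the latest value.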

These are just a few examples of how LogicMonitor can provide insights into your HashiCorp Vault environment. There are many more use cases and data points that are collected. If you are attending HashiConf this year, make sure to visit our booth, and we will be happy to answer any questions. Not attending? Don’t worry: request a free trial or visit our blog for more information that can be helpful as you manage your HashiCorp Suite.