What is HAProxy, and what is it used for?

HAProxy 2.7.0 was released in December 2022. This open-source software is both a proxy and a load balancer, and it is immensely popular because of the wide range of features it provides to reduce or even avoid downtime and to manage web traffic.

Website or application downtime is disastrous for businesses. You want to serve as many users as possible, but if you have nothing in place to manage traffic, then your web applications can quickly become overwhelmed and fail. That’s when those users will simply look for an alternative, usually from one of your competitors. HAProxy is a free software solution that promotes more efficient management of web traffic.

What is HAProxy?

HAProxy is an open-source proxy, reverse proxy, and load balancing solution for HTTP- and TCP-based applications. Load balancing is a technique for distributing traffic across multiple servers based on rules set up during configuration. Those rules might always pick the server currently handling the fewest connections, or they might simply tell the proxy to send connections to each server in turn.

HAProxy Technologies, the company behind the project, also offers a range of enterprise-level tools and platforms, but we’re focusing on the free proxy and load balancing software today.

What is HAProxy used for?

HAProxy has a number of uses. It’s ideal for websites or web applications that expect high volumes of traffic or sporadic traffic that may spike on occasion. Websites with constant heavy loads of traffic require persistent load balancing to avoid downtime, and HAProxy can help developers achieve this.

HAProxy increases availability at both the application and network layers, ultimately improving user experience and simplifying the process of managing web traffic.

HAProxy is used and recommended by a number of large organizations, including JPMorgan Chase, Boeing, and Amazon Web Services.

How does HAProxy work?

Most users can install HAProxy for free through their system’s package manager or, alternatively, run it as a Docker container. Some companies may prefer to use the enterprise version of HAProxy.

Developers configure HAProxy to determine which IP addresses and ports the proxy should bind to. This essentially states exactly what traffic the proxy is managing.
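As a rough illustration of the binding step, the fragment below shows two `bind` lines inside a frontend. The frontend name `www` and the address `192.0.2.10` are placeholders chosen for this sketch, not values from a real deployment.

```haproxy
frontend www
    # Accept connections on port 80 of every IPv4 address on this host
    bind *:80
    # Accept connections on port 8080, but only on one specific address
    bind 192.0.2.10:8080
```

This is only a fragment: a working configuration would also need timeouts and at least one backend to send the traffic to.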

The configuration also states which servers the traffic goes to and the rules surrounding this. Each configuration defines a frontend that focuses on receiving traffic and a backend that contains the receiving server information. Rules can be as detailed as sending requests on a specific port to a single server only, or they can be as general as an algorithm that instructs requests to route to each server in turn.
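The frontend and backend split described above might look something like the following minimal sketch. All section names, server names, and addresses (`www`, `web_servers`, `admin`, `192.0.2.x`) are assumptions made for illustration. It shows one specific rule, where traffic on port 8080 goes to a single server, and one general rule, where everything else is spread across two servers in turn.

```haproxy
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind *:80
    bind *:8080
    # Specific rule: anything that arrived on port 8080 goes to one server only
    acl is_admin_port dst_port 8080
    use_backend admin if is_admin_port
    # General rule: everything else is shared between the web servers
    default_backend web_servers

backend web_servers
    # Send requests to each server in turn
    balance roundrobin
    server web1 192.0.2.11:80 check
    server web2 192.0.2.12:80 check

backend admin
    server admin1 192.0.2.20:8080 check
```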

Load balancing

We’ve explained load balancing in brief above, but how does it work in an actual web application environment? It really depends on the type of load balancing you decide to use.

Web server load balancing

The simplest load balancing solution for multiple web servers is something called layer 4 or transport layer load balancing. The load balancer forwards traffic based only on IP address and port, without inspecting the content of the request. In this type of setup, it’s important that every server in the pool can provide identical and consistent information in response to users’ requests.
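A layer 4 setup could be sketched roughly as follows, using `mode tcp` so that HAProxy forwards connections without looking at the HTTP content. The names `tcp_in` and `tcp_servers` and the `192.0.2.x` addresses are hypothetical.

```haproxy
defaults
    # tcp mode: forward raw connections, no inspection of request content
    mode tcp
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend tcp_in
    bind *:80
    default_backend tcp_servers

backend tcp_servers
    balance roundrobin
    # Both servers are expected to serve the same content
    server s1 192.0.2.11:80 check
    server s2 192.0.2.12:80 check
```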

Application server load balancing

Application layer load balancing or layer 7 load balancing works slightly differently to layer 4 balancing. Layer 7 can route requests to different backend servers depending on the content of the request. This type of balancing involves more complex rules that will connect the user request to the right backend server depending on what they need. These rules might route a request for a blog article to a server that just delivers the blog content, while an e-shop request might go to a completely different server.
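The blog and e-shop example could be expressed with path-based rules along these lines. This is a sketch under assumptions: the URL prefixes `/blog` and `/shop`, the backend names, and the server addresses are all invented for illustration.

```haproxy
defaults
    # http mode lets HAProxy inspect the request before choosing a backend
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind *:80
    # Match on the start of the request path
    acl is_blog path_beg /blog
    acl is_shop path_beg /shop
    use_backend blog_servers if is_blog
    use_backend shop_servers if is_shop
    # Anything that matches neither rule
    default_backend web_servers

backend blog_servers
    server blog1 192.0.2.31:80 check

backend shop_servers
    server shop1 192.0.2.41:80 check

backend web_servers
    server web1 192.0.2.11:80 check
```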

Reverse proxy

A reverse proxy sits between clients and the backend servers, ensuring user requests reach the right servers while also providing other features such as security, reliability improvements, or, in the case of HAProxy, traffic management.

HAProxy features

HAProxy is often chosen over alternative proxies (e.g., NGINX, LoadMaster) because of the extensive features it offers. These include HTTP/2 protocol support, SSL/TLS termination, native SSL support, detailed logs for monitoring and observability, RDP cookie support, and a CLI for in-depth server management.
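To make a few of these features concrete, here is a hedged sketch of a frontend that terminates TLS, offers HTTP/2 to clients, and enables HTTP-format logging. The certificate path, section names, and server address are placeholders, and the logging line only produces output if a `log` target is configured elsewhere.

```haproxy
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www_https
    # Terminate TLS here; the PEM file (placeholder path) must hold the
    # certificate and its private key. alpn advertises HTTP/2 and HTTP/1.1.
    bind *:443 ssl crt /etc/haproxy/certs/example.pem alpn h2,http/1.1
    # Use the detailed HTTP log format for requests
    option httplog
    default_backend web_servers

backend web_servers
    # Traffic to the server is plain HTTP because TLS ended at the proxy
    server web1 192.0.2.11:80 check
```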

Support for HAProxy is unusually generous: the developers maintain the most recent version and the one before it, and they also provide critical fixes for earlier versions. There is also plenty of documentation available, from HTML starter guides to full configuration manuals, and helpful articles on the HAProxy blog.

Key terminology

ACL (Access Control List): ACLs test conditions and perform actions based on the outcomes of those tests. For example, discovering if a connection arrived via SSL might require the following ACL:

acl ssl_was_used ssl_fc  

Frontend: The part of the configuration that defines how requests will be forwarded to the backend (see below). This configuration section includes IP addresses, a port, ACLs, and rules. The rules define what will happen depending on the results of the ACL tests.

Backend: The backend defines what load balancing algorithm will be used and what servers and ports are active for this configuration.

Roundrobin: The default load balancing algorithm, which selects each server in turn.

Leastconn: An alternative load balancing algorithm that selects the server in the backend with the fewest active connections.

Source: A load balancing algorithm that hashes the client’s source IP address so that the same client is consistently sent to the same server.

Health check: An automatic process that checks if servers are available for processing requests. If a server is unavailable, HAProxy automatically disables it in the backend. Automation of this type is becoming more vital in all aspects of development and programming.

High availability: Also simply called HA (hence HAProxy, High Availability Proxy), this type of setup ensures there is no single point of failure that could cause potential downtime if the load balancer becomes overwhelmed or goes down. This type of setup normally uses a secondary load balancer or an active/passive HAProxy pair.
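Several of the terms above can be seen together in one small configuration. The sketch below is illustrative only: the backend names, the `192.0.2.x` addresses, and the `/health` check path are assumptions, and only the first backend is actually wired to the frontend.

```haproxy
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind *:80
    default_backend app_roundrobin

# roundrobin: each server is used in turn
backend app_roundrobin
    balance roundrobin
    # Health check: request this (hypothetical) path to decide if a server is up
    option httpchk GET /health
    server app1 192.0.2.11:8080 check
    server app2 192.0.2.12:8080 check

# leastconn: pick the server with the fewest connections
backend app_leastconn
    balance leastconn
    server app1 192.0.2.11:8080 check
    server app2 192.0.2.12:8080 check

# source: choose the server from a hash of the client's IP address
backend app_source
    balance source
    server app1 192.0.2.11:8080 check
    server app2 192.0.2.12:8080 check
```

The `check` keyword on each `server` line is what turns on health checking; a server that fails its checks stops receiving traffic until it passes again.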

In conclusion, HAProxy is a simple, scalable solution for dealing with heavy or unpredictable web traffic. As a free, open-source piece of software, it’s reassuring for smaller or growing organizations to know that it’s the same proxy used by some of the biggest corporations in the world.