Have you ever wondered how a web application can scale to serve millions of users worldwide? To serve a vast number of user requests, web applications must deploy their services across multiple instances. You might then wonder: how can an application distribute user requests evenly so that all of them are handled with peak efficiency? The short answer to that question is load balancing. The complete answer is…well, please reserve a few minutes to go through the article! You will learn all about load balancing: what a load balancer is, how it works, its benefits, methods of load balancing, and how to implement a load balancer for your use cases.
A load balancer is a hardware device or software application responsible for evenly distributing the requests across multiple application instances. (An “instance” is a single deployment of an application or service running on a server.) As a result, the application can cope with a high volume of requests efficiently.
If an additional app instance is introduced, the load balancer will redistribute the requests to include the new instance, thus reducing the workload on the existing instances. If an app instance goes down, all the requests to the problematic instance will be redistributed to other operational instances. As a result, the app is highly available and fault tolerant, offering users an uninterrupted service.
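To make the redistribution behavior concrete, here is a minimal sketch of an instance pool that only routes to instances currently marked as operational. The class name `InstancePool`, its methods, and the instance names are hypothetical and chosen for illustration; a real load balancer would detect failures through health checks rather than a manual call.

```python
from itertools import count


class InstancePool:
    """Toy pool that hands out requests only to instances marked healthy."""

    def __init__(self, instances):
        # Map each instance name to a health flag (True = operational).
        self.health = {name: True for name in instances}
        self._counter = count()

    def add(self, name):
        """Register a new instance so it starts receiving requests."""
        self.health[name] = True

    def mark_down(self, name):
        """Flag an instance as failed so it stops receiving requests."""
        self.health[name] = False

    def route(self):
        """Return the instance that should receive the next request."""
        healthy = [name for name, ok in self.health.items() if ok]
        if not healthy:
            raise RuntimeError("no healthy instances available")
        # Rotate over whatever is healthy right now.
        return healthy[next(self._counter) % len(healthy)]


pool = InstancePool(["app-1", "app-2"])
pool.add("app-3")        # a new instance joins the rotation
pool.mark_down("app-2")  # a failed instance drops out of the rotation
targets = [pool.route() for _ in range(4)]
print(targets)
```

After `app-2` is marked down, every request goes to `app-1` or `app-3`, so users keep getting responses while the failed instance is out of rotation.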
Load balancers can be categorized into different types based on how they manage and redistribute incoming requests. The two primary types are network load balancers and application load balancers. Another way to categorize them is by physical form, in which case we divide them into hardware and software load balancers. Let’s take a look at each of these in depth.
Network load balancers forward requests at the transport layer, layer 4 of the Open Systems Interconnection (OSI) model. The forwarding mechanism is based solely on network attributes, such as the IP addresses of the clients and the corresponding application instances.
Network load balancers do not consider the contents of the requests when forwarding them to the app instances, which allows them to offer low latency when redistributing the requests. Network load balancers are a great fit for applications with extreme performance requirements, such as streaming or gaming applications.
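As a rough illustration of forwarding on network attributes alone, the sketch below hashes the addresses and ports of a connection to choose a backend. Flow hashing is one possible scheme and is an assumption here, not something the article prescribes; the function name `pick_backend` and the addresses are hypothetical.

```python
import hashlib


def pick_backend(src_ip, src_port, dst_ip, dst_port, backends):
    """Choose a backend from connection attributes only (no payload inspection)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the hash as a stable integer for this flow.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
first = pick_backend("203.0.113.7", 51432, "198.51.100.1", 443, backends)
again = pick_backend("203.0.113.7", 51432, "198.51.100.1", 443, backends)
print(first, again)
```

Because the decision depends only on the connection's addresses and ports, packets from the same connection always map to the same backend, and the request body never has to be parsed.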
Application load balancers forward requests at the application layer, also known as layer 7 of the OSI model. They examine the content of the requests, such as HTTP headers, request paths, or request methods. This way, the application load balancer can flexibly distribute the requests to different app instances to match the business requirements.
Application load balancers are appropriate for e-commerce or social network applications that need support for custom HTTP responses and health checks for the app instances but do not require extremely low latencies.
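Content-based routing of this kind can be sketched as a function that inspects the path, headers, and method of a request and returns a target pool. The `Request` type, the pool names, and the `X-Beta-User` header are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    method: str
    path: str
    headers: dict = field(default_factory=dict)


def choose_pool(request):
    """Pick a pool of app instances from the request content (layer 7)."""
    if request.path.startswith("/api/"):
        return "api-pool"
    if request.headers.get("X-Beta-User") == "1":
        return "beta-pool"
    if request.method == "GET" and request.path.startswith("/static/"):
        return "static-pool"
    return "web-pool"


print(choose_pool(Request("GET", "/api/orders")))
print(choose_pool(Request("GET", "/home", {"X-Beta-User": "1"})))
print(choose_pool(Request("GET", "/static/logo.png")))
print(choose_pool(Request("POST", "/checkout")))
```

The rules are evaluated in order, so a request matching several rules goes to the first pool that applies. This flexibility comes at the cost of parsing each request, which is why it is typically slower than layer 4 forwarding.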
Network and application load balancers are distinguished by how they manage and redistribute incoming requests. Hardware and software load balancers, by contrast, are distinguished by their physical form.
Hardware load balancers are purpose-built devices designed to redistribute requests among app instances. They are often used in on-premises infrastructure alongside the company’s network systems and application servers. Hardware load balancers are a good choice for applications that must keep all data on self-managed servers or that require special hardware customization when forwarding requests to the target instances. They also offer enhanced security options.
A load balancer can help with application performance in a number of ways, including scalability, cost reduction, availability, and request processing speed. Let’s take a closer look at each of these in turn.
When more user requests are sent to the application server instances, the CPU utilization of those instances rises.
An e-commerce application would benefit from the scalability the load balancers offer. Typically, the volume of user requests for e-commerce applications escalates far above normal levels during Black Friday sales.
If one application instance goes down, the load balancer will forward the requests to other instances so the end user does not encounter any error or stoppage in service. The load balancer helps to ensure an application’s high availability by circumventing problematic instances.
To create a load balancing system that effectively forwards requests to the application instances, it’s first essential to understand how a load balancer works. Let’s review the inner workings of load balancing and explore some popular load balancing methods.
Load balancers use different algorithms, alone or in combination. Which ones they use depends on the complexity and features of the load balancer in question. A basic load balancer usually uses an algorithm called Round Robin to assign requests to the app instances. The Round Robin algorithm distributes the requests to the app instances one by one, resulting in an equal load distribution. No single app instance is overly taxed.
Let’s say you have four application instances. The first user request will be sent to instance number one. The second request will be sent to instance number two. The third request will be sent to instance number three. The fourth request will be sent to instance number four. Since only four instances are available, request number five will be sent back to app instance one, and so on.
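The rotation described above, with four instances and five requests, can be sketched in a few lines. The instance names are placeholders, and `itertools.cycle` stands in for the bookkeeping a real load balancer would do.

```python
from itertools import cycle

instances = ["instance-1", "instance-2", "instance-3", "instance-4"]
rotation = cycle(instances)  # endlessly repeats the list in order

# Assign five incoming requests, one instance at a time.
assignments = [next(rotation) for _ in range(5)]
print(assignments)
# ['instance-1', 'instance-2', 'instance-3', 'instance-4', 'instance-1']
```

The fifth request wraps around to the first instance, which is exactly the behavior that keeps the load evenly spread.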
Instead of interacting directly with the application server, your application’s end users send requests to the load balancer.
A typical load balancer consists of four parts:
Besides the Round Robin algorithm already discussed, other load balancing methods and algorithms exist, including Weighted Round Robin and resource-based methods. In general, load balancing methods can be divided into two categories: static load balancing and dynamic load balancing. Let’s take a closer look at each.
With static load balancing, the load balancers forward the requests to the app instances without examining the current state of these app instances. This makes static load balancing easy to implement. The drawback of static load balancing is that it cannot adapt to the states of the app instances, which could be very different at runtime from what you anticipated, potentially affecting performance and thus user experience. Some static load balancing methods are:
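Weighted Round Robin, mentioned earlier, is one example: each instance gets a fixed weight, and instances with a higher weight receive proportionally more requests. The sketch below uses the simplest possible approach, repeating each instance name according to its weight; the weights and instance names are assumptions made for illustration, and production implementations usually interleave the picks more smoothly.

```python
from itertools import cycle, islice


def weighted_rotation(weights):
    """Build an endless rotation in which each instance appears `weight` times per cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


# "large" is assumed to have three times the capacity of "small".
weights = {"large": 3, "small": 1}
first_eight = list(islice(weighted_rotation(weights), 8))
print(first_eight)
```

Note that the weights are fixed in advance and never consult the instances themselves, which is what makes this a static method.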
With dynamic load balancing, load balancers forward requests to the app instances based on the current state of these instances. As a result, the dynamic load balancers can adapt to the ongoing changes of the app instances and tend to be more efficient than the static load balancers. However, dynamic load balancing is more complicated to implement. Some examples of dynamic load balancing methods are:
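One widely used dynamic method, not named above and shown here only as an illustration, is least connections: each new request goes to the instance that currently has the fewest requests in flight. The class `LeastConnectionsBalancer` and its method names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Route each request to the instance with the fewest active connections."""

    def __init__(self, instances):
        # Track how many requests each instance is handling right now.
        self.active = {name: 0 for name in instances}

    def acquire(self):
        """Pick the least-loaded instance and count the new connection."""
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        """Call when a request finishes so the instance's count drops."""
        self.active[name] -= 1


lb = LeastConnectionsBalancer(["a", "b", "c"])
first, second, third = lb.acquire(), lb.acquire(), lb.acquire()
lb.release("b")        # instance "b" finishes its request early
fourth = lb.acquire()  # "b" is now the least loaded
print(first, second, third, fourth)
```

Unlike the static methods, the choice here depends on live state, so the balancer must be told when each request starts and ends. That extra bookkeeping is part of what makes dynamic methods more complicated to implement.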
Setting up and maintaining a group of load balancers is a challenging task. To create and manage load balancers efficiently, you need to:
At Gcore, we understand the difficulties and challenges of setting up a load balancer from scratch. There are a huge number of options available, and your choice directly affects performance and user experience, for better or for worse. Gcore’s Load Balancer solves these challenges and comes with built-in support for:
To learn more about how to get started, configure, and troubleshoot the Gcore Load Balancer, please take a look at our knowledge page.
With a growing number of users coming to your app, having a load balancer to distribute user requests to your instances appropriately is essential for performance and user experience. However, setting up a load balancer that appropriately distributes user requests takes a lot of work. The Gcore Load Balancer helps you to distribute your user workload in the most elegant and efficient way possible.