How Load Balancing Algorithms Really Work
You are now 148,001+ subscribers strong. Let's try to reach 149k subscribers by 21 May. Share this post & I'll send you rewards for the referrals. Get the powerful template to approach system design for FREE on newsletter signup.

This post outlines 6 popular load balancing algorithms. You will find references at the bottom of this page if you want to go deeper.
Note: This post is based on my research and may differ from real-world implementations.

Once upon a time, there lived 2 QA engineers named John and Paul. They worked for a tech company named Hooli. Although bright, they never got promoted, so they were sad and frustrated. Until one day, they had a smart idea to build a photo-sharing app. And their growth skyrocketed every day. So they scaled by installing more servers. But uneven traffic distribution caused server overload. So they set up a load balancer for each service. Think of a load balancer as a component that distributes traffic evenly among servers. Yet each service has a different workload and usage pattern. Onward.
Load Balancing Algorithms

Here's how they load balance traffic across different services:

1. Round Robin

One weekend, their app became trending on the Play Store. Because of this, many users tried to log in at the same time. Yet it's necessary to distribute traffic evenly among the auth servers. So they installed the round robin algorithm on the load balancer. Here's how it works:
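A minimal sketch of the idea in Python (the server names are made up for illustration):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hands each request to the next server in the list, in order."""

    def __init__(self, servers):
        self._servers = cycle(servers)  # endless repeating iterator

    def pick(self):
        return next(self._servers)

lb = RoundRobinBalancer(["auth-1", "auth-2", "auth-3"])
print([lb.pick() for _ in range(5)])
# ['auth-1', 'auth-2', 'auth-3', 'auth-1', 'auth-2']
```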
The load balancer forwards each request to the next server in the list. Once it reaches the end of the list, it starts again from the first server. This approach is simple to set up and understand. Yet it doesn't consider how long a request takes, so slow requests might overload a server. Life was good.

2. Least Response Time

Until one day, a celebrity uploaded a photo on the app. Because of that, millions of users checked their feeds. Yet some servers handling the feed might be slow due to garbage collection and CPU pressure. So they used the least response time algorithm to route the requests. Here's how it works:
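One way to sketch this in Python: track a moving average of each server's response time and route to the fastest. The server names and the smoothing factor are illustrative assumptions, not part of any specific product:

```python
class LeastResponseTimeBalancer:
    """Routes each request to the server with the lowest recent average latency."""

    def __init__(self, servers):
        # Exponentially weighted moving average of latency (ms) per server.
        self.avg_latency = {s: 0.0 for s in servers}

    def record(self, server, latency_ms, alpha=0.3):
        # Smoothing means one latency spike doesn't dominate routing decisions.
        self.avg_latency[server] = (1 - alpha) * self.avg_latency[server] + alpha * latency_ms

    def pick(self):
        return min(self.avg_latency, key=self.avg_latency.get)

lb = LeastResponseTimeBalancer(["feed-1", "feed-2"])
lb.record("feed-1", 120.0)  # slow: mid-GC pause
lb.record("feed-2", 15.0)
print(lb.pick())  # feed-2
```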
This approach has the lowest latency, yet there's an overhead from server monitoring. Besides, latency spikes might cause wrong routing decisions. Life was good again.

3. Weighted Round Robin

But one day, they noticed a massive spike in photo uploads. Each photo gets processed to reduce storage costs and improve user experience. Yet processing some photos is complex and expensive. So they installed the weighted round robin algorithm on the load balancer. Here's how it works:
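A simple way to sketch it: expand the weights into a repeating schedule, so a server with weight 3 appears three times per cycle (server names and weights are made up):

```python
class WeightedRoundRobinBalancer:
    """Round robin where a server with weight w gets w requests per cycle."""

    def __init__(self, weights):
        # weights: {server: capacity}; expand into a flat repeating schedule.
        self._schedule = [s for s, w in weights.items() for _ in range(w)]
        self._i = 0

    def pick(self):
        server = self._schedule[self._i % len(self._schedule)]
        self._i += 1
        return server

# photo-1 is a bigger machine, so it gets 3 of every 4 requests.
lb = WeightedRoundRobinBalancer({"photo-1": 3, "photo-2": 1})
print([lb.pick() for _ in range(4)])
# ['photo-1', 'photo-1', 'photo-1', 'photo-2']
```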
Imagine weighted round robin as an extension of the round robin algorithm: servers with higher capacity get more requests in sequential order. This approach offers better performance. Yet scaling needs manual updates to server weights, thus increasing operational costs.

4. Adaptive

Their growth became unstoppable; they added support for short videos. A video gets transcoded into different formats for low-bandwidth usage. Yet transcoding is expensive, and some videos could be lengthy. So they installed the adaptive algorithm on the load balancer. Here's how it works:
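A rough sketch, assuming an agent on each server periodically reports its load (the reporting mechanism and server names are illustrative):

```python
class AdaptiveBalancer:
    """Routes each request to the server reporting the lowest current load."""

    def __init__(self, servers):
        # Load as a fraction of capacity (0.0 = idle, 1.0 = saturated).
        self.load = {s: 0.0 for s in servers}

    def report(self, server, load):
        # Called periodically by a monitoring agent running on each server.
        self.load[server] = load

    def pick(self):
        return min(self.load, key=self.load.get)

lb = AdaptiveBalancer(["video-1", "video-2"])
lb.report("video-1", 0.9)  # busy transcoding a long video
lb.report("video-2", 0.2)
print(lb.pick())  # video-2
```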
An agent on each server reports its current load to the load balancer. Put simply, servers with lower load receive more requests. It means better fault tolerance. Yet it's complex to set up, and the agent adds extra overhead. Let's keep going!

5. Least Connections

Until one day, users started to binge-watch videos on the app. This means long-lived server connections. Yet a server can handle only a limited number of them. So they installed the least connections algorithm on the load balancer. Here's how it works:
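A minimal sketch: count active connections per server and send each new connection to the least busy one (server names are made up):

```python
class LeastConnectionsBalancer:
    """Sends each new connection to the server with the fewest active ones."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def connect(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def disconnect(self, server):
        self.active[server] -= 1

lb = LeastConnectionsBalancer(["stream-1", "stream-2"])
a = lb.connect()      # stream-1 (tie broken by list order)
b = lb.connect()      # stream-2
lb.disconnect(a)      # a long-lived stream finally ends
print(lb.connect())   # stream-1 again: it now has the fewest connections
```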
It ensures a server doesn't get overloaded during peak traffic. Yet tracking the number of active connections makes it complex. Also, session affinity needs extra logic. Life was good again.

6. IP Hash

But one day, they noticed a spike in usage of their chat service. Yet session state is necessary to track conversations in real time. So they installed the IP hash algorithm on the load balancer. It allows sticky sessions by routing requests from a specific user to the same server. Here's how it works:
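A minimal sketch using simple modulo hashing (the IP and server names are illustrative; real load balancers often use consistent hashing so that adding or removing a server doesn't remap most clients):

```python
import hashlib

def pick_server(client_ip, servers):
    """Hash the client IP so the same client always lands on the same server."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["chat-1", "chat-2", "chat-3"]
server = pick_server("203.0.113.7", servers)
# The mapping is deterministic, so session state can stay local to that server.
assert pick_server("203.0.113.7", servers) == server
```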
This approach avoids the need for external storage for sticky sessions. Yet there's a risk of server overload if IP addresses aren't evenly distributed. Also, many clients might share the same IP address, thus making it less effective.

There are 2 ways to set up a load balancer: a hardware load balancer or a software load balancer. A hardware load balancer runs on a separate physical server. Although it offers high performance, it's expensive. So they set up a software load balancer. It runs on general-purpose hardware. Besides, it's easy to scale and cost-effective. And everyone lived happily ever after.

References