The power of two random choices

(brooker.co.za)

68 points | by zdw 1274 days ago

4 comments

  • anschwa 1272 days ago
    I did my undergrad CS thesis on this topic and implemented a "power of two choices" load balancing module in NGINX.

    When simulating load distribution across servers, two-choices does an outstanding job at keeping things evenly distributed. However, I was unable to observe a meaningful difference between load balancing algorithms when measuring request latency from servers under load.

    I think there is still a lot to be gained from exploring two-choices, but coming up with an appropriate strategy for "real-world" performance benchmarking is quite challenging.

    • mjb 1271 days ago
      > I did my undergrad CS thesis on this topic and implemented a "power of two choices" load balancing module in NGINX.

      Cool!

      > When simulating load distribution across servers, two-choices does an outstanding job at keeping things evenly distributed. However, I was unable to observe a meaningful difference between load balancing algorithms when measuring request latency from servers under load.

      That's very interesting. Were you serving static content? Or running code, too?

      Intuitively, I'd expect that NGINX serving static content would tend to scale fairly linearly up to the saturation point, and then get rapidly bad. That's a pretty pessmisic case for load balancer performance measurement, because you won't see effects until you're running the whole system very hot. That 'very hot' modality is important to real-world applications, but certainly makes benchmarking challenging. Does that sound right to you?

      In my experience, that tends to change when running complex business logic code, or generating dynamic content. There, the load-latency graph has a higher linear slope. The effects of good load balancing are more immediately visible. It's also more visible in things which explicitly have longer latency with higher load, like some kinds of queue-based and stream-based systems.

      • anschwa 1268 days ago
        > Intuitively, I'd expect that NGINX serving static content would tend to scale fairly linearly up to the saturation point, and then get rapidly bad

        That was essentially what I saw. I was serving very minimal dynamic content that had some random sleeps. I even reached out to the NGINX mailing list and got a similar response about how they wouldn't expect much measurable difference. However, I still think it would be possible to construct a better test.

  • mjb 1271 days ago
    Hey, this is my blog post! Great to see it here. Best-of-two is one of my favorite things. If you enjoy the simplicity of it, check out the related paper "How asymmetry helps load balancing" by Berthold Vöcking (https://dl.acm.org/doi/10.1145/792538.792546).

    Back in 2018 I wrote a follow-up to this, with a little more about the general balls-into-bins problem: http://brooker.co.za/blog/2018/01/01/balls-into-bins.html

  • guyu96 1272 days ago
    Link to the paper mentioned in the blogpost: http://www.eecs.harvard.edu/~michaelm/postscripts/handbook20...
  • stevefan1999 1271 days ago
    i read it as (the (power of 2) (random choices))...for some reason