Title: Geometric Generalizations of the Power of Two Choices
Authors: John Byers, Jef Considine, and Michael Mitzenmacher
Abstract:
A well-known paradigm for load balancing in distributed systems is the
``power of two choices,'' whereby an item is stored at the less loaded of
two (or more) random alternative servers. We investigate the power of two
choices in natural settings for distributed computing where items and
servers reside in a geometric space and each item is associated with the
server that is its nearest neighbor. This is in fact the backdrop for
distributed hash tables such as Chord, where the geometric space is
determined by clockwise distance on a one-dimensional ring.
Theoretically, we consider the following load balancing problem. Suppose
that servers are initially hashed uniformly at random to points in the
space. Sequentially, each item then considers d candidate insertion
points also chosen uniformly at random from the space, and selects the
insertion point whose associated server has the least load. For the
one-dimensional ring, and for Euclidean distance on the two-dimensional
torus, we demonstrate that when n data items are hashed to n servers, the
maximum load at any server is log log n / log d + O(1) with high
probability. While our results match the well-known bounds in the
standard setting in which each server is selected equiprobably, our
applications do not have this feature, since the sizes of the
nearest-neighbor regions around servers are non-uniform. Therefore, the
novelty in our methods lies in developing appropriate tail bounds on the
distribution of nearest-neighbor region sizes and in adapting previous
arguments to this more general setting. In addition, we provide
simulation results demonstrating the load balance that results as the
system size scales into the millions.