Authors: Michalis Potamias, Francesco Bonchi, Carlos Castillo, and Aristides Gionis
Title: Fast shortest path distance estimation in large networks
Abstract:
We study the problem of preprocessing a large graph so that
point-to-point shortest-path queries can be answered very fast.
Computing shortest paths is a well studied problem, but exact
algorithms do not scale to huge graphs encountered on the web, social
networks, and other applications. In this paper we focus on
approximate methods for distance estimation, in particular using
landmark-based distance indexing. This approach involves selecting a
subset of nodes as landmarks and computing (offline) the distances
from each node in the graph to those landmarks. At run time, when the
distance between a pair of nodes is needed, we can estimate it quickly
by combining the precomputed distances of the two nodes to the
landmarks. We prove that selecting the optimal set of landmarks is an
NP-hard problem, and thus heuristic solutions need to be employed.
Given a budget of memory for the index, which translates directly into
a budget of landmarks, different landmark selection strategies can
yield dramatically different results in terms of accuracy. A number of
simple methods that scale well to large graphs are therefore developed
and experimentally compared. The simplest methods choose central nodes
of the graph, while the more elaborate ones select central nodes that
are also far away from one another. The efficiency of the suggested
techniques is tested experimentally using five different real world
graphs with millions of edges; for a given accuracy, they require as
much as 250 times less space than the current approach in the
literature which considers selecting landmarks at random. Finally, we
study applications of our method in two problems arising naturally in
large-scale networks, namely, social search and community detection.