next up previous
Next: Appendix: Parsing Support Routines Up: BRITE: Universal Topology Generation Previous: Conclusions

Appendix: Heavy-tailed Distributions

Heavy-tailed distributions (also known as power-law distributions) have been observed in many natural phenomena including both physical and sociological phenomena. One example is the geographic distribution of people around the world. Most places in the world are completely empty or barely populated, while there are a relatively small number of geographical locations which are very densely populated.

A distribution is said to have a heavy-tail if:

\begin{displaymath}P[X > x] \sim x^{-\alpha}, \mbox{ as x } \rightarrow \infty, 0 < \alpha < 2 \end{displaymath}

This means that regardless of the distribution for small values of the random variable, if the asymptotic shape of the distribution is hyperbolic, it is heavy-tailed [7]. The simplest heavy-tailed distribution is the Pareto distribution which is hyperbolic over its entire range and has probability mass function:

\begin{displaymath}p(x) = \alpha k^{\alpha}x^{-\alpha - 1}, \alpha, k > 0, x \ge k. \end{displaymath}

and its cumulative distribution function is given by:

\begin{displaymath}F(x) = P[X \le x] = 1 - (k/x)^{\alpha} \end{displaymath}

where $k$ represents the smallest value the random variable can take.

Heavy-tailed distributions have properties that are qualitatively different to commonly used (memoryless) distributions such as the exponential, normal or Poisson distribution.

In the Internet, heavy-tailed distributions have been observed in the context of traffic characterization and in the context of topological properties. In the area of traffic characterization, evidence indicates that Ethernet traffic exhibits self-similar properties [14]; also WAN traffic exhibits self-similar properties [20], as is the case for traffic specifically associated with WWW transfers [7]. The main implication of such discoveries is that most previous analytic work done in Internet studies adopted assumptions such as exponentially-distributed packet interarrivals. Conclusions reached under such exponentiality assumptions may be misleading or incorrect in the presence of heavy-tailed distributions.

In the context of topological properties, recent empirical studies [9] have shown that Internet topologies exhibit power laws of the form $y = x^{a}$ for the following relationships: (P1) outdegree of node (domain or router) versus rank, (P2) number of nodes versus outdegree, (P3) number of node pairs within a neighborhood versus neighborhood size (in hops), and (P4) eigenvalues of the adjacency matrix versus rank. Prior to this discovery, most Internet studies and analyses had been done using underlying topologies that lack such properties (e.g. random networks). Several possible causes and plausible analytical models that explain the appearance of these properties in Internet topologies have been proposed. However, the area of topology characterization has not been explored so extensively and causes for the appearance of such power laws have not been convincingly given. In [2] the authors indicate that two possible causes are (F1) preferential connectivity and (F2) incremental growth. In [16] the authors examine these two factors in the formation of Internet topologies, plus (F3) distribution of nodes in space, and (F4) locality of edge connections.

There is agreement in that heavy-tailed distributions are ubiquitous in the Internet. To the best of our knowledge, node placement and the distribution of bandwidth and delays have not been conclusively established, although Paxson observed wide variability in path characteristics such as losses, round-trip times and bandwidth [19], and high variability is one of the landmarks of heavy-tailed distributions. BRITE incorporates heavy-tails in the topology generation for some models. In particular, for models such as Waxman and BarábasiAlbert, the user can select to place the nodes in the plane according to a heavy-tailed distribution. Furthermore, for all the models provided, included the Imported file model, the user can select a heavy-tailed distribution of bandwidths to links. The idea is to generate annotated graphs with bandwidth information to study the effect that such distributions may have on the performance of certain protocols and algorithms. Finally, for the experimental generation model Bottom-up hierarchical, the user can select to assign routers to ASs according to a heavy-tailed distribution.

next up previous
Next: Appendix: Parsing Support Routines Up: BRITE: Universal Topology Generation Previous: Conclusions
Alberto Medina 2001-04-12