Appendix: Heavy-tailed Distributions

Heavy-tailed distributions (also known as power-law distributions) have been observed in many natural phenomena including both physical and sociological phenomena. One example is the geographic distribution of people around the world. Most places in the world are completely empty or barely populated, while there are a relatively small number of geographical locations which are very densely populated.

A distribution is said to have a heavy-tail if:

This means that regardless of the distribution for small values of the random variable, if the asymptotic shape of the distribution is hyperbolic, it is heavy-tailed [7]. The simplest heavy-tailed distribution is the Pareto distribution which is hyperbolic over its entire range and has probability mass function:

and its cumulative distribution function is given by:

where represents the smallest value the random variable can take.

Heavy-tailed distributions have properties that are qualitatively different to commonly used (memoryless) distributions such as the exponential, normal or Poisson distribution.

In the Internet, heavy-tailed distributions have been observed in the
context of **traffic characterization** and in the context of
**topological properties**. In the area of traffic
characterization, evidence indicates that Ethernet traffic exhibits
self-similar properties [14]; also WAN traffic exhibits
self-similar properties [20], as is the case for traffic
specifically associated with WWW transfers [7]. The main
implication of such discoveries is that most previous analytic work
done in Internet studies adopted assumptions such as
exponentially-distributed packet interarrivals. Conclusions reached
under such exponentiality assumptions may be misleading or incorrect
in the presence of heavy-tailed distributions.

In the context of topological properties, recent empirical studies [9] have shown that Internet topologies exhibit power laws of the form for the following relationships: (P1) outdegree of node (domain or router) versus rank, (P2) number of nodes versus outdegree, (P3) number of node pairs within a neighborhood versus neighborhood size (in hops), and (P4) eigenvalues of the adjacency matrix versus rank. Prior to this discovery, most Internet studies and analyses had been done using underlying topologies that lack such properties (e.g. random networks). Several possible causes and plausible analytical models that explain the appearance of these properties in Internet topologies have been proposed. However, the area of topology characterization has not been explored so extensively and causes for the appearance of such power laws have not been convincingly given. In [2] the authors indicate that two possible causes are (F1) preferential connectivity and (F2) incremental growth. In [16] the authors examine these two factors in the formation of Internet topologies, plus (F3) distribution of nodes in space, and (F4) locality of edge connections.

There is agreement in that heavy-tailed distributions are ubiquitous
in the Internet. To the best of our knowledge, node placement and the
distribution of bandwidth and delays have not been conclusively
established, although Paxson observed wide variability in path
characteristics such as losses, round-trip times and bandwidth
[19], and high variability is one of the landmarks of
heavy-tailed distributions. BRITE incorporates heavy-tails in the
topology generation for some models. In particular, for models such as
*Waxman* and *BarábasiAlbert*, the user can select to place
the nodes in the plane according to a heavy-tailed
distribution. Furthermore, for all the models provided, included the
*Imported file model*, the user can select a heavy-tailed
distribution of bandwidths to links. The idea is to generate *annotated* graphs with bandwidth information to study the effect that
such distributions may have on the performance of certain protocols
and algorithms. Finally, for the experimental generation model *Bottom-up hierarchical*, the user can select to assign routers to ASs
according to a heavy-tailed distribution.