Introduction
This web site provides links and very brief summaries for some of the
work that has been done in the areas of self-similarity and long range
dependence in computer networks. Self-similar and long range dependent
characteristics in computer networks present a fundamentally different set
of problems to people doing analysis and/or design of networks, and many of
the previous assumptions upon which systems have been built are no longer
valid in the presence of self-similarity.
Papers documenting self-similarity and long range dependence
in networks
- Will Leland and Daniel Wilson,
High Time-Resolution Measurement and Analysis of LAN Traffic: Implications
for LAN Interconnection IEEE INFOCOM 1991.
This paper presents the first analysis of high resolution Ethernet LAN
traffic. The data set discussed in this work is the now famous Bellcore
data which consisted of measurements of hundreds of millions of packets.
The paper shows packet bursts over many time scales and is the precursor
to the Leland, Taqqu, Willinger, Wilson paper "On the Self-Similar Nature of
Ethernet Traffic".
- Will Leland, Murad Taqqu, Walter Willinger, and Daniel Wilson,
On the Self-Similar Nature of Ethernet Traffic (Extended
Version),
IEEE/ACM Transactions on Networking, Vol. 2, No. 1, pp. 1-15, February 1994.
(From an earlier version of the paper in SIGCOMM 93, Sept. 1993.)
This is it - THE paper which establishes that network packet traffic has
self-similar characteristics. It utilizes a very large set of Ethernet packet
traces taken at Bellcore and does a very thorough job of analyzing the data
statistically. The reason for the significance of this work is that up to
this point, network traffic was modeled as a Poisson process and all
analysis of networks were based on that assumption. This paper has won the
IEEE's highest award for research and was also part of ACM SIGCOMM's
Computer Communications Review 25th anniversary issue of top papers.
- Vern Paxson and Sally Floyd,
Wide-Area Traffic: The Failure of Poisson Modeling
IEEE/ACM Transactions on Networking, Vol. 3 No. 3, pp. 226-244, June 1995.
This paper evaluates a number of wide-area TCP arrival processes
(session and connection arrivals, FTPDATA connection arrivals within
FTP sessions, and TELNET packet arrivals) using 21 wide area traffic
traces. It shows that some are well modeled using Poisson process and
others are not. It also offers some results regarding how the
findings relate to the possible self-similarity of wide-area traffic.
- Mark E. Crovella and Azer Bestavros,
Self-Similarity
in World Wide Web Traffic: Evidence and Possible Causes
in IEEE/ACM Transactions on Networking, 5(6):835--846, December 1997.
(An earlier version of this work appeared under the BU technical report
TR-95-015
Boston University Computer Science Department, Revised, October 12, 1995)
This paper presents evidence of self-similarity in WWW traffic. It uses
a large set of WWW traces taken in a university computer lab over a period
of months. It shows that self-similarity in Web traffic can be explained
based on the underlying distribution of transferred document sizes, the
effects of caching and user preference in file transfer, the effect of
user ``think time'', and the superimposition of many such transfers in
a local area network.
- Ashok Erramilli, Onuttom Narayan, and Walter Willinger,
Experimental Queuing Analysis with Long-Range Dependent Packet Traffic
IEEE/ACM Transactions on Networking, Vol. 4, No. 2, pp. 209-223, April 1996.
- Walter Willinger, Murad Taqqu, Robert Sherman, and Daniel Wilson,
Self-Similarity Through High-Variability: Statistical Analysis of Ethernet
LAN Traffic at the Source Level IEEE/ACM Transactions on
Networking, Vol. 5, No. 1, pp. 71-86, February 1997
This paper is the follow-on work to the "On the Self-Similar Nature of
Ethernet Traffic" paper and presents an explanation for the occurrence of
self-similarity in LAN traffic. The paper shows that the superposition of
many ON/OFF traffic sources (with strictly alternating ON and OFF periods
where either the ON or the OFF period have very high or infinite
variability) produces aggregate network traffic which is self-similar.
There is a rigorous statistical analysis which supports this ON/OFF model
and a discussion of its implications.
- Walter Willinger, Vern Paxson, and Murad Taqqu,
Self-similarity and Heavy Tails: Structural Modeling of
Network Traffic. In
A Practical Guide to Heavy Tails: Statistical Techniques and
Applications, Adler, R., Feldman, R., and Taqqu, M.S.,
editors, Birkhauser, 1998.
This book chapter gives a very good review of measurements that
demonstrate network traffic to have self-similar characteristics. It also
presents structural models for this behavior which have physical
meaning in the network context.
- A. Feldmann, A. C. Gilbert, W. Willinger, and T. G. Kurtz,
The Changing Nature of Network Traffic: Scaling Phenomena
,
ACM Computer Communication Review, vol. 28, pp. 5-29, Apr. 1998.
"In this paper, we report on some preliminary results from an
in-depth, wavelet-based analysis of a set of high-quality, packet-level
traffic measurements, collected over the last 6-7 years from a
number of different wide-area networks (WANs). We first validate
and confirm an earlier finding, originally due to Paxson and Floyd
[14], that actual WAN trafficc is consistent with statistical
self-similarity for sufficiently large time scales... We present
here original results about a detailed statistical analysis of
Web-session characteristics."
Self-similarity and real-time network traffic
- Mark Garrett and Walter Willinger,
Analysis,
Modeling and Generation of Self-Similar VBR Video Traffic
In Proceedings of ACM SIGCOMM, September, 1994.
This paper presents a statistical analysis of a VBR video sample. The
paper shows that the marginal bandwidth distribution can be described as
being heavy-tailed and that the video sequence itself is long-range
dependent and can be modeled using a self-similar process. The paper
presents a new source model for VBR video traffic and describes how it may
be used to generate VBR traffic synthetically.
- Jan Beran, Robert Sherman, Murad Taqqu, and Walter Willinger,
Long-Range Dependence in Variable-Bit Rate Video Traffic,
IEEE Transactions on Communications, Vol. 43, No. 2/3/4, pp. 1566-1579,
February/March/April 1995.
- B. K. Ryu and A. Elwalid,
The
Importance of Long-range Dependence of VBR Video Traffic in ATM Traffic
Engineering: Myths and realities
ACM Computer Communication Review, vol. 26, pp. 3-14, Oct. 1996.
This paper studies the effects of long-range dependence in the context of
VBR video traffic through an ATM switch. It shows through a simulation
study that cell loss rates for realistically configured ATM switchs (buffer
sizes), are not adversely affected by LRD traffic. It goes on to show
that in fact the short term correlations have a much more dominant
effect on performance and, thus, prior Markovian models are sufficient for
analyzing performance.
Self-similarity and end-to-end congestion control
- Kihong Park, Gi Tae Kim, and Mark E. Crovella,
On the
Relationship Between File Sizes, Transport Protocols, and
Self-Similar Network Traffic. In Proceedings of the International
Conference on Network Protocols, pages 171-180, October, 1996.
(Also available as
Technical Report BU-CS-96-016, Boston University,
Computer Science Dept., July, 1996.)
This paper presents experimental results from simulations which examine
network traffic generated by transfers of files whose sizes are drawn from
a heavy-tailed distribution. The paper shows that the degree to which file
sizes are heavy-tailed can directly determine the degree of traffic
self-similarity. The paper also shows that reliable transmission and flow
control mechanisms in TCP serve to maintain long-range dependence in
network traffic.
-
Jon M. Peha,
Retransmission mechanisms and self-similar traffic models,
IEEE/ACM/SCS Communications
Networks and Distributed Systems Modeling and Simulation
Conference,
Phoenix, Arizona, pp. 47-52, Jan. 1997.
"In this paper, it is shown that even with the traditional Poisson
arrival of packets into a network, simple retransmission mechanisms
can make traffic appear self similar over time scales of engineering
interest. Moreover, some techniques intended to decrease the
likelihood of congestion also have the effect of prolonging congestion
when it does occur. This increases burstiness over large time-scales,
reinforcing the appearance of self-similarity."
- Kihong Park, Gitae Kim, and Mark E. Crovella,
On the Effect of Traffic Self-Similarity on Network Performance
. In Proceedings of SPIE International Conference on
Performance and Control of Network Systems, November, 1997. (Also available as
Technical Report CSD-TR-97-024, Purdue University,
Dept. of Computer Sciences, April, 1997.)
This paper investigates the effects of bursty network traffic when
transport layer functionality and the interaction of traffic sources
sharing bounded network resources are considered. The paper uses
simulations to shows how throughput and loss are affected as the degree of
burstiness is increased. It also shows how changes in network resources
such as link and buffer capacity affect performance when traffic is
bursty. It also investigates how various versions of TCP congestion
control affect network performance when traffic is bursty.
- Matthias Grossglauser and Jean Bolot,
On
the Relevance of Long Range Dependence in Network Traffic,
IEEE/ACM Transactions on Networking, 1998.
In this work we have examined under which conditions the salient long
range dependence feature of network traffic must be taken into account in
network performance evaluation. Our results confirm that "it is all a
matter of time scales". Specifically, when studying the performance of a
networking system or an application, many time scales must be taken into
account -- the time scales in the input traffic, but also the time scales
of the system (they show up for example because of finite buffer queues)
and the time scales of the performance metric of interest (J-C Belot).
- Tsunyi Tuan and Kihong Park.
Multiple Time Scale Congestion Control for Self-Similar Network
Traffic To appear in Performance Evaluation, 1999.
This paper presents a method of congestion control called multiple time
scale congestion control (MTSC). MTSC considers network traffic to be
self-similar and exploits this characteristic to adjust congestion control
on two different time scales. Simulation results are presented which show
that network performance under MTSC can be improved.
Other references relating to self-similarity and network traffic
-
Bong K. Ryu and Steve B. Lowen,
Point-Process Approaches to the
Modeling and Analysis of Self-Similar Traffic, in Proceedings of the
Conference on Computer Communications (IEEE Infocom), (San Fransisco,
California), Mar. 1996.
"In this paper, we present five fractal stochastic point processes
(FSPPs) as novel approaches to modeling and analyzing various types
of self-similar traffic: the fractal renewal process (FRP), the
superposition of several fractal renewal processes (Sup-FRP), the
fractal-shot-noise-driven Poisson process (FSNDP), the
fractal-binomial-noise-driven Poisson process (FBNDP), and the
fractal-Gaussian-noise-driven Poisson process (FGNDP). ... All
but the FRP permit finer control of the resulting process, in
particular the time scales over which self-similarity becomes
apparent."
- Vern Paxson,
Fast, Approximate Synthesis of Fractional Gaussian Noise for Generating
Self-Similar Network Traffic. Computer Communications
Review, V. 27 N. 5, October 1997, pp. 5-18.
A fast Fourier transform method for generating an approximation for
fractional Gaussian noise (FGN) is presented in this paper. FGN is known
to be one type of self similar process. The method presented is shown to
be as fast or faster than other methods of generating self-similar sample
paths and its output can be used in simulations as traces of self-similar
network traffic.
- A. Feldmann, A. C. Gilbert, and W. Willinger,
Data networks as cascades: Investigating the multifractal nature of
Internet WAN traffic, ACM Computer Communication Review, vol. 28, pp.
42-55, Sept. 1998.
"In apparent contrast to the well-documented self-similar (i.e.,
monofractal) scaling behavior of measured LAN traffic, recent
studies have suggested that measured TCP/IP and ATM WAN traffic
exhibits more complex scaling behavior, consistent with multifractals.
To bring multifractals into the realm of networking, this paper
provides a simple construction based on cascades (also known as
multiplicative processes) that is motivated by the protocol hierarchy
of IP data networks."
- Nick Duffield,
Applications of Large Deviations to Performance Analysis with Long-range
Dependent traffic, Keynote talk for Workshop on Stochastic
modeling and analysis of communication networks, Lund, Sweden, October
1-3, 1998.
This set of slides just has a great overview of terms like self-similarity,
long range dependence, etc. as well as what they mean to network
performance
analysis.
- Matthew Roughan and Darryl Veitch,
Measuring Long-Range Dependence under Changing Traffic
Conditions, In Proceedings of INFOCOM '99, New York, NY, 1999.
This paper examines the issue of stationarity in a process and how the new
Abry-Veitch (AV) estimator for long range dependence is affected by various
types of non-stationarity.
- Patrice Abry and Darryl Veitch,
Wavelet Analysis of Long-Range Dependence Traffic , IEEE
Transactions on Information Theory, 44(1), pp. 2-15, January, 1998.
This paper presents a wavelet-based method for estimating the Hurst
parameter in LRD data. The authors demonstrate that their tool/method is
unbiased, computationally efficient and robust against deterministic
trends. The paper gives a short background on wavelets and a
construction for their wavelet-based estimation method. They then compare
their method against the classic Whittle estimator for the Hurst parameter
and show that the wavelet method has some significant advantages. The
paper presents a discussion of the issue of stationarity in data sets and
shows that the wavelet method is still able to make very accurate estimates
of the Hurst parameter when there are trends in the data. They demonstrate
their tool/method using the Bellcore data for Ethernet packet interarrival
times and also briefly discuss mono versus multi-fractal modeling of this
data.
Heavy tailed distributions in network traffic
- Gordon Irlam,
Unix File Size Survey, Posted to USENET newsgroup
comp.os.research November 1993.
This posting gives the results of an on-line survey of Unix file systems.
It contains data on over 12 million files and shows evidence for the heavy
tailed nature of their sizes. This page also contains links to the data
files themselves.
- Will Leland and Teun Ott,
Load-balancing Heuristics and Process Behavior, PERFORMANCE '86
and ACM SIGMETRICS 1986 Joint conference on Computer Performance Modelling,
Measurement and Evaluation, North Carolina State Univ., pp. 54-69, May 1986.
This paper presents a model for process runtimes in an interactive
environment. The model (for processes which last longer than 3 seconds)
is: the probability that a process will run for more than t seconds is
proportional to rT**k, with k in the range -1.25 < k < -1.05 (r normalizes
the distribution).
- Mor Harchol-Balter and Allen Downey,
Exploiting Process Lifetime Distributions for Dynamic Load Balancing
ACM Transactions on Computer Systems vol. 15, no. 3, August
1997.
This paper examines the issue of load balancing between CPU's in a network
of workstations. It investigates the issue of whether/when to migrate
processes from one CPU to another which is less busy by evaluating process
lifetime data (as in the Leland and Ott cite above). They show that
process lifetimes measured on systems in a university setting are well
modeled by a heavy tailed distribution (Pareto) and show through a
simulation study that a preemptive migration strategy based on process
lifetimes is effective.
- Carlos Cunha, Azer Bestavros, Mark Crovella,
Characteristics of WWW Client-based Traces Boston University
Technical Report BUCS-95-010, July 18, 1995.
This paper presents some of the first Web client measurement ever made. It
characterizes traces taken using an instrumented version of Mosaic from a
university computer lab and shows that a number of Web properities can be
modeled using heavy tailed distributions. These properities include
document size, user requests for a document, and document popularity.
- Paul Barford, Azer Bestavros, Adam Bradley, Mark Crovella,
Changes in Web Client Access Patterns: Characteristics and Caching
Implications in World Wide Web, Special Issue on
Characterization and Performance Evaluation, Vol. 2, pp. 15-28, 1999..
This paper characterizes a set of Web client traces taken in a university
computer lab in 1998. It compares the traces with those taken in a similar
setting in 1995 (see reference above). It shows that while certain
characteristics such as document sizes are still well modeled with heavy
tailed distributions, the weight of the tail is diminished. The results
are then used in a study of the potential for the effectiveness of caching
in the Web. The paper shows that "between 1995 and 1998, (1) the benefits
of using size-based caching policies have diminished; and (2) the potential
for caching requested files in the network has declined."
General references
- Walter Willinger, Murad Taqqu, and Ashok Erramilli,
A
Bibliographical Guide to Self-Similar Traffic and Performance Modeling for
Modern High-Speed Networks Stochastic Networks: Theory and
Applications, Royal Statistical Society Lecture Notes Series, Vol. 4,
Oxford University Press, 1996.
This is a VERY extensive bibliographical guide to work done in the area of
self-similar network traffic and network performance modeling. Although it
is now just a little out of date, it provides an very large number of
citations on a wide variety of work done in these general areas through 1996.
-
Zafer Sahinoglu and Sirin Tekinay,
Self-similar traffic and network performance,
IEEE Communications Magazine, vol. 37, pp. 48-52, Jan. 1999.
"In this article we present a survey of the self-similarity phenomenon
observed in multimedia traffic and its implications on network
performance. ... An immediate consequence of this study is the
demonstration of the limitations or validity of conventional resource
allocation methods in the presence of self-similar traffic."
-
Nick
Duffield
has a number of additional publications dealing with long range
dependence in network traffic.
- Amara Graps,
Wavelet Page
This is a very nice Web page that has links to all kinds of information on
Wavelets. Wavelets are a useful tool for analyzing self-similar behavior.
Paul Barford <barford@cs.bu.edu>
Sally Floyd <floyd@aciri.org>