These datasets are the basis for many published studies. They are
available from the
Internet Traffic Archives. The format of the traces
and collection process are documented in our tech report
BUCS-TR-1995-010.
This material is based upon work supported by the National Science Foundation under Grant no. CCR-9501822.
In 1998 we captured a new set of client logs, using a method different
from that used for the 1995 set. The format of the trace data and collection process
are documented in our tech report
BUCS-TR-1999-011,
and the trace itself is
here.
This material is based upon work supported by the National Science Foundation under Grant no. CCR-9501822.
Tcpeval constructs critical path analyses of TCP transactions. It was developed by Paul Barford during his PhD thesis research. Its algorithms
are described in the paper
Critical Path Analysis of TCP Transactions in
Proceedings of the 2000 ACM SIGCOMM Conference, Stockholm, Sweden, September 2000.
The source code is available in a compressed tarfile here. Included in the tarfile is a HOWTO with installation instructions.
If you download this code, please send an email to Paul Barford
(pb at cs.wisc.edu) to let him know you are using it, and whether you find it useful.
This material is based upon work supported by the National Science Foundation under Grant no. CCR-9706685.
Surge, which generates Web requests intended to mimic measured
statistical properties, is available
here.
The
paper describing Surge's rationale and design is Generating Representative Web
Workloads for Network and Server Performance Evaluation in Proceedings of Performance '98/ACM SIGMETRICS '98.
However, the default models and parameter settings used in this version of Surge are based on analyses of the 1998 dataset, documented in
Changes in Web Client Access Patterns: Characteristics and Caching Implications in World Wide Web, Special Issue on Characterization and Performance Evaluation, Vol. 2, pp. 15-28, 1999.
This is the HTTP/1.1 compliant version of the code (HTTP/1.0 is still
supported in this release). There is a detailed HOW-TO included which
should get you going.
If you download this code, please send an email to Paul Barford
(pb at cs.wisc.edu), who developed it. He will put your
name on the SURGE interest mailing list so that you will be notified about
future updates. We'd also be interested in what you will be
using the code for; if you could give him a brief overview, we would
appreciate it.
While the HOW-TO suggests using the MIT pthreads, if you are using a 2.2
Linux kernel, we recommend you compile using kernel threads (make sure
your thread limit is set high enough!). To do that, make the following
modifications:
- In the Makefile, replace PGCC by CC. Add the pthread library
to the Surgeclient.c rule (i.e., -lpthread and -D_REENTRANT).
- Remove pthread_init() from Surgeclient.c.
- Add #include <netinet/in.h> to Surgeclient.c (it is needed for
struct sockaddr_in). (For Linux this should be #include <linux/in.h>.)
- In Surgeclient.c you may need to change sys/times.h to sys/time.h (no
plural).
This material is based upon work supported by the National Science Foundation under Grant nos. CCR-9501822 and CCR-9706685.
BPROBE is a tool for measuring bottleneck bandwidth of an Internet path,
using the packet-pair technique. It was developed by Bob Carter
during his PhD research. Source code for
BPROBE is available
here and the paper describing the design of BPROBE is
here.
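The core idea of the packet-pair technique can be illustrated with a short sketch. When two back-to-back packets of the same size cross the bottleneck link, the gap between their arrivals is roughly the time the bottleneck needs to transmit one packet, so packet size divided by gap approximates the bottleneck bandwidth. The Python below is only a minimal illustration under that assumption. The function name and the use of a median to damp the effect of cross traffic are hypothetical choices made here, and they are not BPROBE's own filtering method, which is described in the paper.

```python
import statistics


def bottleneck_bandwidth_bps(packet_size_bytes, arrival_gaps_s):
    """Estimate bottleneck bandwidth in bits/s from packet-pair arrival gaps.

    Hypothetical helper for illustration; not taken from the BPROBE source.
    """
    # Each packet pair yields one estimate: packet size / inter-arrival gap.
    estimates = [
        8 * packet_size_bytes / gap for gap in arrival_gaps_s if gap > 0
    ]
    if not estimates:
        raise ValueError("need at least one positive inter-arrival gap")
    # Assumption: take the median so a few pairs disturbed by cross
    # traffic do not dominate the result.
    return statistics.median(estimates)


# Example: 1500-byte probes with gaps of 1.2 ms, 1.0 ms and 1.25 ms.
estimate = bottleneck_bandwidth_bps(1500, [0.0012, 0.0010, 0.00125])
print(f"estimated bottleneck bandwidth: {estimate / 1e6:.1f} Mbit/s")
```

Real measurements are noisier than this example suggests, which is why a practical tool needs more careful filtering than a single median.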
This material is based upon work supported by the National Science Foundation under Grant no. CCR-9501822.
This tool provides an estimation of the tail index alpha for empirical
heavy-tailed distributions, such as have been encountered in
telecommunication systems. It uses a method (called the "scaling
estimator") based on the scaling properties of sums of heavy-tailed
random variables. The software is available
here, and the paper describing aest is available
here.
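For readers who want a quick feel for what a tail index estimate looks like, here is a sketch in Python of the classical Hill estimator. This is not the scaling estimator implemented in aest. It is a different, widely known method, swapped in for illustration only. The function name, the choice of k, and the synthetic Pareto data are all assumptions made for this example.

```python
import math
import random


def hill_estimate(samples, k):
    """Hill estimate of the tail index alpha from the k largest samples.

    Illustrative stand-in only; aest uses the scaling estimator instead.
    Assumes all samples are positive.
    """
    if not 0 < k < len(samples):
        raise ValueError("k must satisfy 0 < k < len(samples)")
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest observation
    # Average log-excess of the top k observations over the threshold.
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


# Synthetic check: Pareto samples with a known tail index of 1.5.
rng = random.Random(0)
data = [rng.paretovariate(1.5) for _ in range(20000)]
alpha_hat = hill_estimate(data, k=2000)
print(f"estimated alpha: {alpha_hat:.2f}")
```

The Hill estimate depends on the choice of k, which has to be picked by the user in this sketch.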
In the paper
Mining
Anomalies Using Traffic Feature Distributions we used data from two
networks: GEANT and Abilene. The GEANT data was provided to us under
NDA so we can't distribute it, but the Abilene data is freely
distributable. It can be downloaded
here as a Matlab file with associated
metadata and instructions. Note: this data consists of byte counts per
unit time (not the entropy measures used in the paper).
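To make the distinction between byte counts and entropy measures concrete, the sketch below computes a Shannon sample entropy over a list of observed feature values. It assumes that the entropy measures take this form (entropy of the empirical distribution of a traffic feature in a time bin); the paper gives the exact definition. The function name and the example port values are hypothetical. Because the downloadable file holds only byte counts, this cannot be computed from that file.

```python
import math
from collections import Counter


def sample_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values.

    Hypothetical helper; assumes the standard definition
    H = -sum(p_i * log2(p_i)) over observed feature values.
    """
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Made-up destination ports seen in one time bin.
concentrated = [80, 80, 80, 80]    # all traffic to one port
dispersed = [80, 443, 22, 25]      # traffic spread evenly over four ports
print(sample_entropy(concentrated), sample_entropy(dispersed))
```

A concentrated distribution gives an entropy near zero and a dispersed one gives a larger value, which a plain byte count cannot show.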
This material is based upon work supported by the National Science Foundation under Grant no. CCR-0325701.
Virtual Landmarks uses Lipschitz embedding of network nodes based on distances to landmarks, along with dimensionality reduction via PCA. The method is described in
this paper. The datasets used in that paper are
here.
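As a rough illustration of the first step, a Lipschitz embedding simply assigns each node the vector of its distances to a fixed set of landmarks. The Python sketch below shows only that step, under toy assumptions: the function name, the node names, and the one-dimensional "distance" are all made up for the example, and the PCA dimensionality reduction is not shown.

```python
def lipschitz_embed(distance, nodes, landmarks):
    """Map each node to its vector of distances to the landmarks.

    Hypothetical helper; `distance` is any callable returning a
    measured distance (for example, a round-trip time) between two nodes.
    """
    return {
        node: [distance(node, landmark) for landmark in landmarks]
        for node in nodes
    }


# Toy example: nodes placed on a line, with distance = absolute difference.
positions = {"a": 0.0, "b": 3.0, "c": 10.0}


def toy_distance(u, v):
    return abs(positions[u] - positions[v])


coords = lipschitz_embed(toy_distance, nodes=["a", "b", "c"], landmarks=["a", "c"])
print(coords["b"])  # distances from node b to landmarks a and c
```

With many landmarks the resulting vectors are high-dimensional, which is where a reduction step such as PCA becomes useful.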
This material is based upon work supported by the National Science Foundation under Grant no. ANI-0322990.
Constraint-Based Geolocation (CBG) uses measured round-trip-time delays to estimate geographic position. The technique is described in
this paper. The code for CBG is
here as a collection of routines in R (you can get R itself
here).
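The general idea of turning delay measurements into geographic constraints can be sketched as follows. A round-trip time bounds how far away a target can be from the measuring landmark, so each landmark defines a circle, and a candidate location is feasible only if it lies inside all of them. The Python below is a simplified illustration, not the distributed R code. It assumes a fixed propagation speed of about 200 km/ms (roughly two thirds of the speed of light), and the paper's own delay-to-distance conversion may differ. All function names and coordinates are hypothetical.

```python
import math

FIBER_KM_PER_MS = 200.0  # assumption: about 2/3 of the speed of light
EARTH_RADIUS_KM = 6371.0


def max_distance_km(rtt_ms):
    """Upper bound on distance implied by a round-trip time."""
    return (rtt_ms / 2.0) * FIBER_KM_PER_MS


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def satisfies_constraints(candidate, measurements):
    """True if candidate (lat, lon) lies within every landmark's bound.

    measurements: list of ((landmark_lat, landmark_lon), rtt_ms) pairs.
    """
    return all(
        haversine_km(*candidate, *landmark) <= max_distance_km(rtt_ms)
        for landmark, rtt_ms in measurements
    )


# Example: is a point near New York consistent with a 4 ms RTT from Boston?
boston, new_york = (42.36, -71.06), (40.71, -74.01)
print(satisfies_constraints(new_york, [(boston, 4.0)]))
```

Tighter distance bounds shrink the feasible region, so the quality of the delay-to-distance conversion largely determines the accuracy of the estimate.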
This material is based upon work supported by the National Science Foundation under Grant no. ANI-0322990.
Multidimensional Scaling in the Poincare Disk is a method of
embedding a set of points equipped with interpoint distances
(or dissimilarities) into the Poincare model of hyperbolic space,
in a way that seeks to minimize the difference between the
input distances and the distances as measured in the embedding.
Matlab code for MDS-PD is
here and the
method is described in
this paper.
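To show what "distances as measured in the embedding" means here, the Python sketch below computes the standard hyperbolic distance between two points of the Poincare disk (curvature -1, points represented as complex numbers) and a simple sum-of-squared-differences error against the input distances. This is not the distributed Matlab code. The squared-error form of the objective and both function names are assumptions for illustration; the exact objective is described in the paper.

```python
import math


def poincare_distance(z1, z2):
    """Hyperbolic distance between two points of the open unit disk.

    Points are complex numbers with modulus < 1 (curvature -1 convention).
    """
    if abs(z1) >= 1 or abs(z2) >= 1:
        raise ValueError("points must lie strictly inside the unit disk")
    return 2.0 * math.atanh(abs(z1 - z2) / abs(1 - z1.conjugate() * z2))


def embedding_error(points, target):
    """Sum of squared differences between target and embedded distances.

    Assumed objective for illustration; target[i][j] is the input
    distance (or dissimilarity) between items i and j.
    """
    n = len(points)
    return sum(
        (poincare_distance(points[i], points[j]) - target[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


# Example: two points whose hyperbolic distance is ln(3).
pts = [0j, 0.5 + 0j]
print(poincare_distance(pts[0], pts[1]))
print(embedding_error(pts, [[0.0, math.log(3)], [math.log(3), 0.0]]))
```

An embedding method would then move the points inside the disk to drive this error down.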
This material is based upon work supported by the National Science Foundation under Grant no. CNS-1018266.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

All code on this page is licensed under a
Creative Commons License.
Mark Crovella /