BU/NSF Workshop on
Internet Measurement, Instrumentation and Characterization

Boston University, Boston, Massachusetts, USA
Monday, August 30, 1999

Final Report


Objectives and Overview

Because of its growth in size, scope, and complexity---as well as its increasingly central role in society---the Internet has become an important object of study and evaluation. Many significant innovations in the networking community in recent years have been directed at obtaining a more accurate understanding of the fundamental behavior of the complex system that is the Internet. These innovations have come in the form of better models of components of the system, better tools that enable us to measure the performance of the system more accurately, and new techniques, coupled with performance evaluation, that have delivered better system utilization. Continued development and improvement of our understanding of the properties of the Internet is essential to guide designers of hardware, protocols, and applications for the next decade of Internet growth.

An important next step for the research community is a comprehensive look at the challenges that lie ahead in this area. This includes an evaluation of both the current unsolved challenges and the challenges the Internet will present us with in the near future, as well as a discussion of the promising new techniques that innovators in the field are currently developing. To this end, the Web and InterNetworking Research Group at Boston University (WING@BU), with support from the National Science Foundation (grant #9985484), organized a one-day workshop held at Boston University on Monday, August 30, 1999 (immediately preceding ACM SIGCOMM '99).


Attendance Report

We were extremely pleased to find that there was greater than expected interest in the IMIC workshop as soon as it was announced. We put up a Web page with information about the workshop and sent out announcements via e-mail only six weeks before the workshop. Eighty people pre-registered, including thirty-three students. The NSF support that allowed us to waive registration fees for students clearly had an impact in driving student registrations; almost all students specifically requested the waiver. Excluding the students, the remaining pre-registrants were almost evenly split between industry and academia, demonstrating the wide appeal of the workshop subject.

Attendance at IMIC exceeded all expectations of the organizers. Some statistics follow.

The IMIC workshop attracted some 135 participants (excluding organizers and speakers), split almost evenly between academia (73 participants) and industry (62 participants). Roughly three-fifths of IMIC participants (82) were not local (i.e., not from the Greater Boston area).

Holding this workshop on the day before the start of SIGCOMM '99 had a positive effect on outreach: over 100 of the participants indicated that they planned to attend SIGCOMM '99, while the rest indicated that they had traveled exclusively to attend the IMIC workshop.

Participation of graduate students was particularly encouraging. Of the participants from academia, 61 identified themselves as graduate students; 22 of these were women, and 12 identified themselves as belonging to an under-represented group. Free registration at IMIC for all student participants was made possible by NSF funding as well as Boston University's matching funds.


Technical Program Report

The BU/NSF Internet Measurement, Instrumentation and Characterization (IMIC) workshop featured four technical sessions, each consisting of three invited presentations followed by an open discussion. The workshop concluded with a panel of researchers from academia, industry, and NSF, which discussed the opportunities and challenges that lie ahead and the initiatives to be undertaken. Brief reports on the sessions and the panel follow.

Session I: Instrumentation and Measurement


Chair: Paul Barford

The three presentations in this session dealt broadly with wide-area Internet measurement. Issues involving measurement of Internet topology and performance were discussed in some detail, as were applications of the measurement data. The session was made up of the following presentations:

Ramesh Govindan discussed a set of heuristic methods for generating a router-level topology map of the Internet. The method for gathering route information in this work is realized in a tool called Mercator, which uses informed random address probing to determine which parts of the IP address space to examine. Mercator also employs alias resolution, which enables multiple advertised IP interfaces to be recognized as belonging to a single router. Dr. Govindan also discussed some of his group's experiences during their path probing study, including social and practical issues involved with sending active probes into networks. Probe discovery results were also presented, including the distribution of the number of routers per ISP (approximately 2700 ISPs had been discovered, with the largest having over 1000 routers), the effectiveness of alias resolution (approximately 15% of interfaces were not reachable), topology maps for certain ISPs, and results in validating what had been discovered.
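
As a rough illustration of the alias-resolution idea, the following sketch (in Python) groups interface addresses by the source address of the reply each probe elicits; addresses that draw replies from the same source are taken to belong to one router. The probe function here is a hypothetical placeholder, not Mercator's actual mechanism, and the fragment sketches the concept rather than the published algorithm.

    # Sketch: group interface addresses into routers via alias resolution.
    # Simplified premise: probing interface address X can elicit a reply
    # whose source address Y is the router's canonical outgoing interface;
    # X and Y then belong to the same router.

    def send_probe(addr):
        """Hypothetical placeholder: probe addr and return the source
        address of the reply, or None if no reply arrives."""
        raise NotImplementedError

    def resolve_aliases(interface_addrs):
        routers = {}                   # reply address -> set of aliases
        for addr in interface_addrs:
            reply_src = send_probe(addr)
            key = reply_src if reply_src is not None else addr
            routers.setdefault(key, set()).add(addr)
        return routers                 # each value: one router's interfaces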

Sugih Jamin discussed methods for determining where in the Internet to place measurement systems focused on collecting distance metrics (hop count, round-trip time, etc.). The data gathered by these systems is used to generate IDMaps, which clients could use to determine, for example, which web server mirror site is the closest. Dr. Jamin presented a set of heuristic placement methods based on network topology features and then analyzed their effectiveness in simulation. The results showed that information from IDMaps can significantly improve the accuracy of nearest-mirror selection over random selection. Results were also presented on how the number of measurement systems used affects the accuracy of mirror selection.
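
The estimate underlying such a service can be sketched simply: approximate the distance between a client and a mirror as the client's distance to its nearest measurement box (tracer), plus the measured tracer-to-tracer distance, plus the mirror's distance to its own nearest tracer. The sketch below assumes precomputed distance tables; the names are illustrative and not taken from the IDMaps implementation.

    # Sketch: nearest-mirror selection from tracer-based distance estimates.
    # nearest_tracer[x] = (tracer closest to x, distance to that tracer);
    # tracer_dist[(t1, t2)] = measured tracer-to-tracer distance.

    def estimated_distance(client, server, nearest_tracer, tracer_dist):
        t_c, d_c = nearest_tracer[client]
        t_s, d_s = nearest_tracer[server]
        return d_c + tracer_dist[(t_c, t_s)] + d_s

    def nearest_mirror(client, mirrors, nearest_tracer, tracer_dist):
        return min(mirrors, key=lambda m: estimated_distance(
            client, m, nearest_tracer, tracer_dist))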

Matt Zekauskas described the Surveyor IP measurement project and discussed a variety of lessons learned and results from that project. The Surveyor project is based on a measurement infrastructure (consisting of 56 systems) distributed throughout the global Internet. This infrastructure is used to make highly synchronized measurements of route, packet delay, and packet loss characteristics between all of its systems. Daily summaries of the continuous measurements made by Surveyor are available on-line. Results from Surveyor show that routing is often asymmetric, that queuing along symmetric paths is often asymmetric, and that certain paths exhibit significant variability in both packet delay and loss.
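
One-way metrics of this kind depend on tightly synchronized clocks at both endpoints; given synchronized timestamps, forward and reverse delays can be computed independently, which is what makes asymmetry observable. A minimal sketch of the computation (illustrative only):

    # Sketch: one-way delay and delay asymmetry from synchronized clocks.
    # Each probe record is a (send_time, recv_time) pair whose timestamps
    # come from clocks assumed synchronized at the two endpoints.

    def one_way_delays(probes):
        return [recv - send for (send, recv) in probes]

    def mean_delay_asymmetry(fwd_probes, rev_probes):
        fwd = one_way_delays(fwd_probes)
        rev = one_way_delays(rev_probes)
        return sum(fwd) / len(fwd) - sum(rev) / len(rev)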

Impact:

The presentations in this session highlighted the fact that Internet measurements can provide valuable insight to network researchers. It is clear that Internet topology, while complex and not well understood, can be systematically discovered without direct information from ISP's. It is also clear that carefully placed measurement systems can provide end users with information which can enhance their performance, and that paths between systems can show very different characteristics over time. The presentations also showed that there are many challenges to making effective measurements in the Internet. These difficulties principally arise from the Internet's immense size, continued growth and complexity. Finally, the session outlined a number of significant opportunities for future, measurement-based research. These include development of a repository of topological data, careful deployment and management of measurement systems which can be used by a variety of applications and the need for new analysis techniques for measurement data.


Session II: Modeling and Characterization


Chair: Mark Crovella

The three presentations in this session dealt with modeling and characterization problems in the Internet. The session included the following presentations:

Walter Willinger led off the session with a provocative take on the state of Internet modeling and characterization. His review of current practices included examples of each of "the good, the bad, and the ugly." He appealed for the "good" --- characterization that focuses on invariants and yields transparent models. Examples of the "good" include efforts to characterize data traffic and Internet topologies via scaling laws and heavy-tailed distributions. He also warned against "bad and ugly" approaches; examples include fitting statistical models (such as time-series models) without seeking any physical basis, and experimental evaluations that ignore the multifarious sources of variability in the Internet environment.
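
For reference, the heavy-tailed characterization he advocated takes the standard power-law form

    P[X > x] \sim c\,x^{-\alpha}, \quad x \to \infty, \quad 0 < \alpha < 2,

under which the variance of X is infinite (and, when \alpha <= 1, the mean as well); this is precisely what separates such models from the exponential tails of traditional Markovian assumptions.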

Don Towsley then described his recent work in Internet tomography -- the process of inferring network-internal state from measurements taken only at the edges. Despite the seeming inscrutability of the Internet when viewed from the vantage point of end systems, his results show that considerable information can be obtained by comparing observations made at different points along the edge of the network. In particular, multicast protocols induce correlation among packet streams; these correlations can be observed at the edge, suggesting elegant estimation methods with remarkably accurate results.
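
The flavor of this inference can be conveyed by the simplest case: a multicast source feeding two receivers, A and B, through one shared link. If losses on different links are independent, the shared link's success probability a satisfies P(A) P(B) = a P(A and B), which yields a simple moment estimator. (The estimators in this line of work are maximum-likelihood and more refined; the sketch below only illustrates the principle.)

    # Sketch: estimate the success probability of a shared multicast link
    # from end-to-end receive traces at two receivers, assuming losses on
    # distinct links are independent. recv_a[i] and recv_b[i] are 1 if
    # probe i arrived at the corresponding receiver, else 0.

    def shared_link_success(recv_a, recv_b):
        n = len(recv_a)
        p_a = sum(recv_a) / n
        p_b = sum(recv_b) / n
        p_ab = sum(a & b for a, b in zip(recv_a, recv_b)) / n
        return p_a * p_b / p_ab        # requires p_ab > 0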

Finally, Jim Pitkow closed the session with a summary of his recent work in characterizing user "surfing" behavior on the Internet (i.e., the sequences of Web requests made by individual users). His approach models a user's link-following as a random walk with a decreasing utility function; these models predict long-tailed distributions in the lengths of sequences of URL requests. This feature of his model agrees with empirical studies showing long tails in the lengths of user click-sequences.
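
A minimal simulation conveys why such a model produces long tails: treat the perceived value of continuing to surf as a random walk, and stop when it first drops below zero. The parameters below are arbitrary illustrations; the published analysis derives the resulting distribution of click-sequence lengths analytically.

    import random

    # Sketch: surfing depth as a stopped random walk. The user keeps
    # following links while the perceived value of continuing remains
    # positive; value evolves as a Gaussian random walk with slight
    # negative drift.

    def surf_length(start=1.0, drift=-0.05, sigma=1.0, cap=10**6):
        value, clicks = start, 0
        while value > 0 and clicks < cap:
            value += random.gauss(drift, sigma)
            clicks += 1
        return clicks

    lengths = [surf_length() for _ in range(10000)]   # long-tailed sample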

Impact:

This session showed that sophisticated models for Internet characteristics are starting to emerge. Like Session I, this session showed that considerable information about the internal state of the Internet can be obtained without cooperation from service providers. In addition, the session showed that the models used for characterizing Internet properties are substantially different from those commonly encountered in performance evaluation; in particular, the long-tailed properties of the models proposed for user behavior and network traffic represent radical departures from traditional Markovian models.


Session III: End-to-End Protocols and Services


Chair: John Byers

The three presentations in this session dealt with end-to-end protocols and services: mechanisms by which applications and transport protocols at the network edge can select, coordinate, and manage their use of network resources. The session was made up of the following presentations:

Mostafa Ammar discussed a novel solution to the challenging problem of selecting a server from a number of mirror sites with an eye toward minimizing transfer time. Using the technique of network-layer anycasting (described in RFC 1546), Dr. Ammar demonstrated how a client could make use of a DNS-like anycast hierarchy, in conjunction with special anycast addresses, to resolve a request for service into an appropriate server from among the anycast group. The proposed selection procedure is measurement-driven, based on response times from past requests. The three most significant challenges for this architecture are scalability, as it is impractical for each resolver to maintain measurement estimates to all servers in all anycast groups; keeping estimates up-to-date, which is accomplished by a combination of periodic server updates and resolver probes; and eliminating oscillatory behavior, which is achieved by hysteresis and randomization in the selection process.
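
The interaction between the last two challenges can be made concrete with a small sketch: keep a smoothed estimate of each server's response time, switch away from the current server only when an alternative looks better by a hysteresis margin, and choose randomly among the near-best to avoid synchronized oscillation. The structure and parameters below are illustrative assumptions, not the architecture presented in the talk.

    import random

    # Sketch: measurement-driven server selection with smoothed (EWMA)
    # response-time estimates, hysteresis, and randomized tie-breaking.

    ALPHA = 0.25        # weight of each new measurement
    MARGIN = 0.9        # switch only if an alternative is >10% better
    estimates = {}      # server -> estimated response time
    current = None

    def record_sample(server, response_time):
        old = estimates.get(server, response_time)
        estimates[server] = ALPHA * response_time + (1 - ALPHA) * old

    def select_server():
        global current
        best = min(estimates.values())
        if current is None or estimates[current] * MARGIN > best:
            near_best = [s for s, e in estimates.items() if e <= best * 1.1]
            current = random.choice(near_best)   # randomized tie-breaking
        return current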

Venkata Padmanabhan argued for coordination across concurrent data streams sharing both a common endpoint and common network resources. For example, in the context of multiple TCP streams emanating from a common source, coordinating congestion control enables faster shared learning of network conditions and more predictable performance across streams. Dr. Padmanabhan demonstrated a strong correlation in queuing delay among connections that shared a single congested intranet link, but also pointed out that this correlation can be weakened in the presence of multiple congested links. He then went on to describe a router tagging mechanism that used a form of explicit congestion notification (ECN) to mark packets sharing a congested link. On a related point, Padmanabhan argued that not all packets are created equal (TCP SYN packets being one example), and recommended shielding such packets against loss.
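
The detection side of this observation reduces to a simple statistic: if two connections share a congested link, their queuing-delay samples should be strongly positively correlated. A minimal sketch using the Pearson correlation over paired samples (the 0.5 threshold is an illustrative choice, not a figure from the talk):

    from statistics import mean, pstdev

    # Sketch: flag a likely shared bottleneck when the queuing delays
    # observed concurrently by two connections are strongly correlated.

    def pearson(xs, ys):
        mx, my = mean(xs), mean(ys)
        cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
        return cov / (pstdev(xs) * pstdev(ys))

    def likely_shared_bottleneck(delays_1, delays_2, threshold=0.5):
        return pearson(delays_1, delays_2) > threshold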

Vaduvur Bharghavan discussed the challenges associated with supporting heterogeneous packet flows (HPF) at the transport layer. One emerging example of an HPF that Dr. Bharghavan cited arises in multimedia applications, in which video, audio, and text may have different QoS policies regarding reliability, priority, sequencing, and deadlines. The architecture he defined allows applications to specify the QoS policies they desire, while the transport layer implements those policies on a per-frame basis. Rather than segregate these subflows under separate administrative control, Bharghavan motivated a unified, integrated transport architecture that determines what to send, and when, across the entire HPF. As examples of how one might achieve this, he provided details of a selective reliability option, a goodput control mechanism, and some preliminary performance results.
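
One way to picture the per-frame policy interface is a frame descriptor that carries its own QoS attributes, which a unified sender consults when ordering and discarding frames across the whole flow. The fields and scheduler below are illustrative assumptions, not Dr. Bharghavan's API.

    import time
    from dataclasses import dataclass, field

    # Sketch: per-frame QoS attributes in a heterogeneous packet flow.
    # The transport orders frames by priority and silently drops frames
    # whose deadlines have already passed.

    @dataclass(order=True)
    class Frame:
        priority: int                              # lower = more urgent
        deadline: float = field(compare=False)     # absolute send-by time
        reliable: bool = field(compare=False)      # retransmit on loss?
        payload: bytes = field(compare=False)

    def schedule(frames):
        live = [f for f in frames if f.deadline > time.time()]
        return sorted(live)                        # most urgent first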

Impact:

The presentations in this session highlighted the rich array of problems that can be addressed by enabling emerging applications to better manage their use of network resources from an end-to-end perspective. One theme of this session was that these policies should be implemented as middleware services or as enhanced transport-level mechanisms that applications can leverage without specific application-level customization. Dr. Ammar's talk motivated a new form of hierarchical directory service kept up to date with scalable point-to-point measurements. Dr. Padmanabhan's talk motivated the problem of shared bottleneck detection, and coincided with the initiation of several other studies seeking to identify bottlenecks using purely end-to-end methods. Dr. Bharghavan's talk motivated end-to-end methods of improving the quality of service of flows he termed heterogeneous packet flows, treating them not as a collection of independent objects but as an ensemble that can be coordinated. Talks from this session have already been cited extensively in the literature, as they all study new problems of fundamental importance while emphasizing the end-to-end methods that form a cornerstone of the current Internet architecture.


Session IV: Network Support for Next Generation Applications

Chair: Ibrahim Matta

This session consisted of the following three presentations, which were concerned with network services needed to effectively support next-generation applications.

Henning Schulzrinne discussed network support for adaptive multimedia applications. Henning argued that the network should provide interactive multimedia applications with not only the capability of reserving resources, but also the incentive to adapt their reservations, trading off blocking of their requests against the quality of service they will experience if they request a smaller reservation. To this end, Henning presented a scalable resource reservation approach, called YESSIR, that addresses the scalability problems of RSVP. YESSIR sets up reservations by having routers process RTCP messages to estimate the resources to reserve for an RTP flow; it also implements reservation aggregation. Henning also presented RNAP, a pricing-based adaptation protocol, in which a service is offered at a certain price for a limited time, after which a renegotiation takes place. Both a centralized and a distributed implementation of RNAP were presented.
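
The incentive mechanism in a pricing-based scheme can be sketched as a periodic renegotiation loop: each epoch the network quotes a price per unit of bandwidth, and the application requests the rate that maximizes its utility net of cost. The concave utility function below is purely an illustrative assumption, not part of the RNAP specification.

    import math

    # Sketch: pricing-driven rate adaptation. Each negotiation epoch,
    # the user picks the rate maximizing utility(rate) - price * rate;
    # a concave log utility is assumed for illustration only.

    def best_rate(price, candidate_rates, weight=10.0):
        def net_benefit(rate):
            return weight * math.log(1 + rate) - price * rate
        return max(candidate_rates, key=net_benefit)

    # As the quoted price rises, the requested rate falls:
    for price in (0.5, 1.0, 2.0, 5.0):
        print(price, best_rate(price, range(1, 65)))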

Bala Rajagopalan discussed practical issues in the development and deployment of constraint-based (QoS/policy) unicast and multicast routing mechanisms. Bala described the role of these mechanisms in the overall QoS framework of the Next-Generation Internet, and current developments in the Internet protocol areas that facilitate their deployment. Bala discussed the provision of VPNs and diff-serv SLAs using LSPs. For diff-serv, he pointed to the difficulties that arise from the lack of knowledge of the traffic demand matrix, which has to be estimated or measured. Bala raised several practical issues, including scalability and multicast. He presented a flexible methodology for constraint-based routing (CBR) based on distributed overlays, in which the underlying IGP is used by CBR entities to communicate.
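
The basic constraint-based routing computation can be sketched as "prune, then route": discard links whose available bandwidth cannot satisfy the request, and run an ordinary shortest-path search over what remains. This is a common simplification of CBR, offered here as an illustration rather than the methodology presented in the talk.

    import heapq

    # Sketch: bandwidth-constrained shortest path. graph maps each node
    # to a list of (neighbor, cost, available_bandwidth) edges.

    def constrained_shortest_path(graph, src, dst, demand):
        dist = {src: 0}
        heap = [(0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if u == dst:
                return d
            if d > dist.get(u, float("inf")):
                continue                   # stale heap entry
            for v, cost, bandwidth in graph[u]:
                if bandwidth < demand:
                    continue               # prune infeasible links
                nd = d + cost
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        return None                        # no feasible path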

John Zinky discussed the need for a network resource status service. This service would allow a distributed application to know the expected network performance before it starts using the resources, so that it can choose among several alternatives or know when to switch to a new alternative. John described how applications could use the service within the BBN Quality Objects (QuO) framework, which adds QoS control and measurement to CORBA. John also discussed options for implementing such a network resource status service, and experience with several prototype implementations, including CMU's Remos.
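
From the application's point of view, such a service reduces to a query issued before committing to a transfer. The sketch below assumes a hypothetical query function; it is a placeholder for whatever interface a Remos-like service would expose, not the actual Remos or QuO API.

    # Sketch: choosing among alternative replicas by consulting a
    # (hypothetical) network resource status service before transferring.

    def predicted_bandwidth(src, dst):
        """Placeholder for a status-service query; would return the
        predicted available bandwidth from src to dst."""
        raise NotImplementedError

    def pick_replica(client, replicas):
        return max(replicas,
                   key=lambda r: predicted_bandwidth(r, client))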
 

Impact:

This session raised a number of challenging open questions that the community needs to address. First, the network, even with diff-serv, needs some form of admission control to support controlled degradation in quality for multimedia; in addition, users need economic incentives to adapt. Second, constraint-based routing has to address the practical issues of scalability, flexible deployment, and measurement/estimation of traffic demands. Finally, it was felt that applications need a wide-area network resource status service (a QoS layer) that measures and predicts QoS.


Panel: IMIC Challenges, Opportunities & Initiatives


Coordinator: Azer Bestavros

Five panelists were invited. Four were chosen as representatives of the four sessions of the workshop; the fifth was chosen as a representative of funding agencies (namely, NSF).

The panel discussion started with short remarks from the Panel Coordinator, Azer Bestavros, in which he introduced the panelists and framed the discussion around a set of topics, posing "teaser" questions to the panelists to start things off. The discussion that ensued touched on many of these issues and questions, with observations contributed by the panelists as well as by workshop participants.


The BU/NSF Workshop Organizing Committee


Created on: 2000.06.01