IAP Abstracts 2004


Networking


Virtual Landmarks for the Internet (Poster)
Liying Tang
Supervised By: Mark Crovella


Internet coordinate schemes have been proposed as a method for estimating minimum round trip time between hosts without direct measurement. In such a scheme, each host is assigned a set of coordinates, and Euclidean distance is used to form the desired estimate. Two key questions are: How accurate are coordinate schemes across the Internet as a whole? And: are coordinate assignment schemes fast enough, and scalable enough, for large scale use? In this paper we make contributions toward answering both those questions. Whereas the coordinate assignment problem has in the past been approached by nonlinear optimization, we develop a faster method based on dimensionality reduction of the Lipschitz embedding. We show that this method is reasonably accurate, even when applied to measurements spanning the Internet, and that it naturally leads to a scalable measurement strategy based on the notion of virtual landmarks.


Diagnosing Network-Wide Traffic Anomalies (Poster)
Anukool Lakhina
Supervised By: Mark Crovella


Anomalies are unusual and significant changes in a network's traffic levels, which can often span multiple links. Diagnosing anomalies is critical for both network operators and end users. It is a difficult problem because one must extract and interpret anomalous patterns from large amounts of high-dimensional, noisy data. We propose a general method to diagnose anomalies. This method is based on a separation of the high-dimensional space occupied by a set of network traffic measurements into disjoint subspaces corresponding to normal and anomalous network conditions. We show that this separation can be performed effectively using Principal Component Analysis. Using only simple traffic measurements from links, we study volume anomalies and show that the method can: (1) accurately detect when a volume anomaly is occurring; (2) correctly identify the underlying origin-destination (OD) flow which is the source of the anomaly; and (3) accurately estimate the amount of traffic involved in the anomalous OD flow. We evaluate the method's ability to diagnose (i.e., detect, identify, and quantify) both existing and synthetically injected volume anomalies in traffic from two backbone networks. Our method consistently diagnoses the largest volume anomalies, and does so with a very low false alarm rate.


A Distributed Approach to Monitor Network Flows on an IP Backbone  (Poster)
Manish Raj Sharma
Supervised By: John Byers


Internet Service Providers (ISPs) need to deploy monitoring systems to perform tasks such as network capacity planning, traffic engineering and fault diagnosis. Such systems are not expected to be ubiquitous or directly integrated into routers but are usually deployed at strategic locations. In this work, we address the problem of maximally monitoring a given number of flows with a fixed number of monitors. Our techinque is based on a message exchange model amongst monitors which maintain Bloom filter summaries of the monitored flows. The contribution of our methods is to ensure zero false positives at the cost of minimal duplication of monitoring effort for any flow. Ongoing work involves experimental validation of our methods on synthetic and real network topologies. Our focus is on minimizing communication cost, developing efficient load balancing techniques and minimizing convergence time.


Elastic TCP-based Tunnels Providing Soft Bandwidth Guarantees (Poster)
Mina Guirguis, Niky Riga, Gali Diamant, Yuting Zhang
Supervised By: Azer Betavros, Ibrahim Matta


The best-effort nature of the Internet poses a significant obstacle to the deployment of many applications that require guaranteed bandwidth. In this paper, we present a novel approach that enables two edge/border routers---which we call Internet Traffic Managers (ITM)---to use an adaptive number of TCP connections to set up a tunnel of desirable bandwidth between them. The number of TCP connections that comprise this tunnel is elastic in the sense that it increases/decreases in tandem with competing cross traffic to maintain a target bandwidth. An origin ITM would then schedule incoming packets from an application requiring guaranteed bandwidth over that elastic tunnel. Unlike many proposed solutions that aim to deliver soft QoS guarantees, our elastic-tunnel approach does not require any support from core routers (as with IntServ and DiffServ); it is scalable in the sense that core routers do not have to maintain per-flow state (as with IntServ); and it is readily deployable within a single ISP or across multiple ISPs. To evaluate our approach, we develop a flow-level control-theoretic model to study the transient behavior of established elastic TCP-based tunnels. The model captures the effect of cross-traffic connections on our bandwidth allocation policies. Through extensive simulations, we confirm the effectiveness of our approach in providing soft bandwidth guarantees. We also outline our kernel-level ITM prototype implementation.

Efficiently and Fairly Allocating Bandwidth at a Highly Congested Link (Poster)
Tao Wang
Supervised By: Ibrahim Matta, Azer Bestavros


We consider the problem of efficiently and fairly allocating bandwidth at a highly congested link to a diverse set of flows, including TCP flows with various Round Trip Times (RTT), non-TCP-friendly flows such as Constant-Bit-Rate (CBR) applications using UDP, misbehaving, or malicious flows. Though simple, a FIFO queue management is vulnerable. Fair Queueing (FQ) can guarantee max-min fairness but fails at efficiency. RED-PD exploits the history of RED's actions in preferentially dropping packets from higher-rate flows. Thus, RED-PD attempts to achieve fairness at low cost. By relying on RED's actions, RED-PD turns out not to be effective in dealing with non-adaptive flows in settings with a highly heterogeneous mix of flows. In this paper, we propose a new approach we call RED-NB (RED with No Bias). RED-NB does not rely on RED's actions. Rather it explicitly maintains its own history for the few high-rate flows. RED-NB then adaptively adjusts flow dropping probabilities to achieve max-min fairness. In addition, RED-NB helps RED itself at very high loads by tuning RED's dropping behavior to the flow characteristics (restricted in this paper to RTTs) to eliminate its bias against long-RTT TCP flows while still taking advantage of RED's features at low loads. Through extensive simulations, we confirm the fairness of RED-NB and show that it outperforms RED, RED-PD, and CHOKe in all scenarios..

Packet Loss Inference for TCP (Poster)
Nahur Fonseca
Supervised By: Mark Crovella


One of TCP's critical tasks is to determine which packets are lost in the network, as a basis for control actions (flow control and packet retransmission). Modern TCP implementations use two mechanisms: timeout, and fast retransmit. Detection via timeout is necessarily a time-consuming operation; fast retransmit, while much quicker, is only effective for a small fraction of packet losses. In this paper we consider the problem of packet loss detection in TCP more generally. We concentrate on the fact that TCP's control actions are necessarily triggered by *inference* of packet loss, rather than conclusive knowledge. This suggests that one might analyze TCP's packet loss detection in a standard inferencing framework based on probability of detection and probability of false alarm. This paper makes two contributions to that end: First, we study an example of more general packet loss inference, namely optimal Bayesian packet loss detection based on round trip time. We show that for long-lived flows, it is frequently possible to achieve high detection probability and low false alarm probability based on measured round trip time. Second, we construct an analytic performance model that incorporates general packet loss inference into TCP. We show that for realistic detection and false alarm probabilities (as are achievable via our Bayesian detector) and for moderate packet loss rates, the use of more general packet loss inference in TCP can improve throughput by as much as 25%.

ROMA: Reliable Overlay Multicast with Loosely Coupled TCP Connections (Poster)
Gu-In Kwon
Supervised By: John Byers


We consider the problem of architecting a reliable content delivery system across an overlay network using TCP connections as the transport primitive. We first argue that natural designs based on store-and-forward principles that tightly couple TCP connections at intermediate end-systems impose fundamental performance limitations, such as dragging down all transfer rates in the system to the rate of the slowest receiver. In contrast, the ROMA architecture we propose incorporates the use of loosely coupled TCP connections together with fast forward error correction techniques to deliver a scalable solution that better accommodates a set of heterogeneous receivers. The methods we develop establish chains of TCP connections, whose expected performance we analyze through equation-based methods. We validate our analytical findings and evaluate the performance of our ROMA architecture using a prototype implementation via extensive Internet experimentation across the PlanetLab distributed testbed.

Model-based Loss Inference by TCP over Heterogeneous Networks (Poster)
Dhiman Barman
Supervised By: Ibrahim Matta


The Transmission Control Protocol (TCP) has been the protocol of choice for many Internet applications requiring reliable connections. The design of TCP has been challenged by the extension of connections over wireless links. In this paper, we investigate a Bayesian approach to infer at the source host the reason of a packet loss, whether congestion or wireless transmission error. Our approach is ``mostly'' end-to-end since it requires only one {\m long-term average} quantity (namely, long-term average packet loss probability over the wireless segment) that may be best obtained with help from the network (e.g. wireless access agent). Specifically, we use Maximum Likelihood Ratio tests to evaluate TCP as a classifier of the type of packet loss. We study the effectiveness of {\m short- term} classification of packet errors (congestion vs. wireless), given stationary prior error probabilities and distributions of packet delays conditioned on the type of packet loss (measured over a longer time scale). Using our Bayesian-based approach and extensive simulations, we demonstrate that an efficient online error classifier can be built as long as congestion-induced losses and losses due to wireless transmission errors produce sufficiently different statistics. We introduce a simple queueing model to underline the conditional delay distributions arising from different kinds of packet losses over aheterogeneous wired/wireless path. To infer conditional delay distributions, we consider Hidden Markov Model (HMM) which explicitly considers discretized delay values observed by TCP as part of its state definition, in addition to an HMM which does not. We demonstrate how estimation accuracy is influenced by different proportions of congestion versus wireless losses and penalties on incorrect classification.

Exploiting the adaptive behavior of Internet Elements (Poster)
Mina Guirguis
Supervised By: Azer Bestavros and Ibrahim Matta


In this work, we expose an adversarial attack scheme that hinders an Internet element from reaching a quiescent operating point, through exploiting its transient behavior. We show that such attack would reduce the quality of service, while evading detection through consuming a low portion of that element's reduced capacity. Through exposing such vulnerabilities, more resilient mechanisms could be constructed. We present numerical and simulation results along with real Internet experiments.

Generalized API for Internet Traffic Managers (Poster)
Gali Diamant, Leonid Veytser, Mina Guirguis, Liang Guo, Yuting Zhang and Sean Chen
Supervised By: Ibrahim Matta and Azer Bestavros


Internet Traffic Managers (ITMs) are special machines placed at strategic places in the Internet. itmBench is an interface that allows users (e.g. network managers, service providers, or experimental researchers) to register different traffic control functionalities to run on one ITM or an overlay of ITMs. Thus {\em itmBench} offers a tool that is extensible and powerful yet easy to maintain. ITM traffic control applications could be developed either using a kernel API so they run in kernel space, or using a user-space API so they run in user space. We demonstrate the flexibility of {\em itmBench} by showing the implementation of both a kernel module that provides a differentiated network service, and a user-space module that provides an overlay routing service. Our itmBench Linux-based prototype is free software and can be obtained from http://www.cs.bu.edu/groups/itm/.

The Effect of Router Buffer Size on HighSpeed TCP Performance (Poster)
Georgios Smaragdakis and Dhiman Barman
Supervised By: Ibrahim Matta


We study the effect of the IP router buffer size on the throughput of Highspeed TCP (HSTCP). We are motivated by the fact that in high speed routers, the size of buffer is important as such a large buffer size might be a constraint. We first derive an analytical model for HighSpeed TCP and we show that for small buffer size equal to 10% of bandwidth-delay product, HSTCP can gain more then 90% of bottleneck capacity. We also show that the setting the buffer size equal to 20% can increase the utilazation of HSTCP up to 98%. On the contrary, setting the buffer size to less than 10% of the bandwidth-delay product can decrease the HSTCP throughput significantly. We study the performance effects using both DropTail and RED AQM which tends to remove bias against bursty flows. Analytical results obtained by fixed point approach are compared to those obtained by simulation.


Pitfalls of Data Aggregation in Sensor Networks: Bane of Synchronization  (Poster)
Vijay Erramilli
Supervised By: Ibrahim Matta


One of the unique traits of sensor networks is the ability to aggregate data of concast flows as intermediate nodes forward this data to the sink. While there are numerous advantages of data aggregation(namely reduced messages, conservation of energy etc.), in this poster we highlight some of the vulnerabilities which are introduced into the system by the order imposed by data aggregation and how one can exploit these vulnerabilities. We model data aggregation using Cellular Automata and show by means of simulation that the synchronization introduced in the system by aggregation forces the system to operate in a narrow region of stability and any small perturbation can lead to wide-scale instabilities and oscillations in the system.


A Randomized Solution to BGP Divergence (Poster)
Selma Yilmaz
Supervised By: Ibrahim Matta


The Border Gateway Protocol (BGP) is an interdomain routing protocol that allows each Autonomous System (AS) to define its own routing policies independently and use them to select the best routes. By means of policies, ASes are able to prevent some traffic from accessing their resources, or direct their traffic to a preferred route. However, this flexibility comes at the expense of a possibility of divergence behavior because of mutually conflicting policies. Since BGP is not guaranteed to converge even in the absence of network topology changes, it is not safe. In this paper, we propose a randomized approach to provide safety in BGP. The proposed algorithm dynamically detects policy conflicts, and tries to eliminate the conflict by changing the local preference of the paths involved. Both the detection and elimination of policy conflicts are performed locally, i.e. by using only local information. Randomization is introduced to prevent synchronous updates of the local preferences of the paths involved in the same conflict.

Vision and Graphics


Learning Euclidean Embeddings for Efficient Indexing  (Poster)
Vassilis Athitsos, Jonathan Alon, Panagiotis Papapetrou
Supervised By: Stan Sclaroff, George Kollios


An embedding method is presented, that can be used to speed up nearest neighbor retrieval in non-Euclidean spaces. Given a space X with a computationally expensive distance measure (like edge images with the chamfer distance, or time series with dynamic time warping), we construct an embedding that maps database and query objects into a Euclidean space, in which similarities can be rapidly measured using a weighted Manhattan distance. Embedding construction is formulated as a machine learning task, where AdaBoost is used to combine many simple, 1D embeddings into a multidimensional embedding that preserves a significant amount of the proximity structure in the original space. Performance is evaluated on a large database of hand images, and a video database of American Sign Language Signs. In both datasets, BoostMap significantly increases efficiency, with minimal losses in accuracy. Moreover, the experiments indicate that BoostMap compares favorably with existing embedding methods that have been employed in computer vision and database applications, i.e., FastMap and Bourgain embeddings.


Fast Segmentation of Pulmonary Tumors by Contour Propagation in 4DCT (Poster)
William Mullally
Supervised By: Margrit Betke


4D Computed Tomography provides time-resolved volumetric information about respiratory organ motion. To incorporate organ motion based on 4DCT data into radiotherapy treatment planning, target contours are needed for each of the typically 10 volumes in the dataset. For routine use, the increase in workload to contour 10 volumes is unacceptable. We propose a method for quickly segmenting lung tumors across 4DCT datasets. Our method provides a quick automatic propagation of expert knowledge marked on one volume in a 4DCT dataset onto the volumes at different respiratory phases. It is difficult to automatically determine the boundary between pulmonary tumors and adjacent tissues with similar densities such as the lung wall and vascular structure. However, given one hand-marked contour in the initial scan, our method can account for the respiration-induced motion and deformation of the tumor in sequential scans. Segmentation is achieved using a rigid registration of tumor contours to lung surfaces. Low-level morphological operators are applied to account for non-rigid deformations. Using our method, contours can be propagated within minutes between different respiratory states. We compare our approach to a non-rigid intensity based registration method and report promising segmentation results.


Automated Placement of Cameras with Optimal Cost in a Floorplan to Satisfy Task Specific Constraints (Poster)
Murat Erdem
Supervised By: Stan Sclaroff


In many multi-camera vision systems the effect of camera locations on the task-specific quality of service is ignored. Researchers in Computational Geometry have proposed elegant solutions for some sensor location problem classes. Unfortunately, these solutions utilize unrealistic assumptions about the cameras' capabilities that make these algorithms unsuitable for many real-world computer vision applications: unlimited field of view, infinite depth of field, and/or infinite servo precision and speed. In this work, the general camera placement problem is first defined with assumptions that are more consistent with the capabilities of real-world cameras. The region to be observed by cameras may be volumetric, static or dynamic, and may include holes that are caused, for instance, by columns or furniture in a room that can occlude potential camera views. A subclass of this general problem can be formulated in terms of planar regions that are typical of building floorplans. Given a floorplan to be observed, the problem is then to efficiently compute a camera layout such that certain task-specific constraints are met. A solution to this problem is obtained via binary optimization over a discrete problem space.


Simultaneous Localization and Recognition of Dynamic Hand Gestures (Poster)
Jonathan Alon, Vassilis Athitsos
Supervised By: Stan Sclaroff


A novel framework for the simultaneous localization and recognition of dynamic hand gestures is proposed. At the core of this framework is a dynamic space-time warping algorithm, a natural extension of Dynamic Time Warping (DTW), that aligns a pair of query and model gestures in both space and time. For every frame of the query sequence a skin detector generates multiple hand region candidates. Local distances are then computed for every candidate and model hand pairs based on feature similarity, and are stored in a three-dimensional table. Dynamic programming is then used to compute both a global matching score, which is used to recognize the query gesture, and a warping path, which indicates the most likely hand candidates, effectively localizing the gesturing hand in every query frame. The performance of the approach is evaluated on a dataset of hand signed digits gestured by people wearing short sleeves shirts, in front of a background containing other non-hand skin-colored objects. The algorithm simultaneously localizes the gesturing hand and the hand-signed digit.


Real-Time 4D Tumor Tracking and Modeling From Internal and External Fiducials In Fluoroscopy (Demo And Poster)
Johanna Brewer
Supervised By: Margrit Betke


Fluoroscopy is currently used in treatment planning for patients undergoing radiation therapy. Radiation oncologists would like to maximize the amount of dose the tumor receives, and minimize the amount delivered to the surrounding tissues. During treatment, patients breathe freely, and so the tumor location will not be fixed. This makes calculating the amount of dose delivered to the tumor difficult. We present and examine a method of tracking the 2D motion of internal markers (surgical clips) placed around the tumor. We establish a ground truth for comparison with the tracker, and examine other available tracking software. Using two orthogonal fluoroscopy videos, we present and examine a method for reconstructing the average and maximum 3D motion of the clips given sequential videos, and actual 3D motion of the clips given simultaneous videos. Using this 3D data in conjunction with a Computed Tomography (CT) scan, we can give an estimate of 3D tumor motion and deformation. If imaging is possible during treatment, this motion can be used for beam guided radiation, otherwise, it can be correlated to a set of external markers (beads) for use in respiratory gating.

An Improved Algorithm for Estimating Nonrigid 3D Scene Flow from Binocular Sequences (Poster)
Li, Rui
Supervised By: Stan Sclaroff


Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multiview reconstruction with motion estimation methods. This paper describes an alternative formulation for dense scene flow estimation that provides convincing results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Segmentation-based stereo and optical flow techniques are used to preserve discontinuities and prevents over-smoothing -- two problems commonly associated with common stereo and optical flow algorithms. Preliminary results of segmentation based stereo and flow computation demonstrates the effectiveness of this approach.


Hand localization in American sign language (Poster)
Quan Yuan
Supervised By: Stan Sclaroff


The purpose of this work is to locate/segment hands in sign language sequence. In grey level sign sequences a new feature of tranlation residues is proposed.For every two consecutive frames, the first frame is partitioned into blocks and then we try to find best match of each block in the next frame by translation. The block matching process is performed on each level of Gaussian Pyramid. Because hands are always under non-rigid motion, the blocks of hands usually can't find a good match in the second frame. The blocks with high matching residues are grouped into possible hand areas.

Motion Recognition Through Frames Matching (Poster)
Taipeng Tian
Supervised By: Stan Sclaroff


Video-based human computer interfaces are gaining popularity as the speed of processors and video card capabilities are increasing. The aim of this project is to design a motion recognition algorithm capable of understanding a pre-defined set of human posture commands through the use of computer vision techniques. One application of interest is the automated recognition of a flight director's command gestures on an airfield or an aircraft carrier, by an unmanned aerial vehicle (UAV). Human motion is modeled as a sequence of static poses, in a manner that is akin to the keyframes used in cartoon animation. As each frame from the video stream arrives, the algorithm categorizes that frame into one of the motion categories. The motion category that garners the most votes will be deemed the most likely motion observed.

Shape-based curve growing model and adaptive regularization for pulmonary fissure segmentation in CT (Demo And Poster)
Jingbin Wang
Supervised By: Margrit Betke


This paper presents a shape-based curve growing model for object recognition in the field of medical imaging. The proposed curve growing process, modelled by a Bayesian network, is simultaneously influenced by the image data and knowledge of a prior curve shape. A maximum a posterior (MAP) solution is derived by an energy-minimizing mechanism, which is implemented in an adaptive regularization framework. This "adaptiveness" characteristic models the interaction between image data and shape prior influences and reflects the related causal dependencies in the Bayesian network. The method effectively avoids over-smoothing, an effect existing in other regularization methods. Moreover, the proposed framework also address initialization and local minima problems. Robustness and performance of the proposed method are demonstrated by experiments of pulmonary fissure segmentation on computed tomography (CT) images.


Stochastic Mesh-Based Multiview Reconstruction (Poster)
John Isidoro
Supervised By: Stan Sclaroff


An iterative method for reconstructing a 3D polygonal mesh and color texture map from multiple calibrated views of an object is presented. In each iteration, the method first estimates a texture map given the current shape estimate. The texture map and its associated residual error image are obtained via reprojection of the multiple views into texture space. Next, the shape estimate is adjusted to minimize residual error in texture space. Shape adjustments are determined by a novel sampling-based algorithm; The surface is deformed towards a photometrically-consistent solution via a series of 1D epipolar searches for photometrically-consistent displacements at randomly selected surface points. Moreover, shape adjustments can be constrained such that the recovered model's silhouette matches those of the input images.



Periodicity detection in sign language video (Poster)
Ashwin Thangali
Supervised By: Stan Sclaroff, George Kollios


Periodic or repeated movements are frequently used to express various aspects in sign languages. Examples of aspects using repetitions include frequentative aspect, ex. "fall sick freqeuntly"; distributive aspect, ex. "distribute handouts in a class"; pluralization, ex. "a row of chairs". An automatic technique to identify repeated motions is hence very useful in a sign language recognition system. In this work we are interested in localizing repeated hand motions. Some of the significant challenges that we need to address are, the location of the hands typically drifts in space through a sign, we hence need to robustly locate hands in video frames. The durations of repeated signs i.e the number of periods spanned by a sign is very small (2-4). Also human motion is inherently noisy, motions in different periods are only approximately similar. We present some techniques that can be employed to detect periodicities with preliminary results.

EyeKeys: A Real-time Vision Interface Based on Gaze Detection from a Low-grade Video Camera (Demo And Poster)
John Magee
Supervised By: Margrit Betke


Over 100,000 people in the United States are so severely paralyzed that they only have the ability to control the muscles in their eyes. For these people, eye movements or blinks are the only way to communicate. Currently available interface systems are often intrusive, require special hardware, or use active infrared illumination. We present a system that runs on an average PC with video input from an inexpensive USB camera. The face is tracked using multi-scale template correlation. Symmetry between left and right eyes is exploited to detect if the computer user is looking at the camera, or off to the left or right side. The detected eye direction can then be used to control applications such as spelling programs or games. We developed the game "BlockEscape" to gather quantitative results to evaluate our interface system with test subjects. We compare our system to a mouse substitution interface.

Operating Systems


Efficient End-Host Architecture for High Performance Communication Using User-level Sandboxing (Poster)
Xin Qi, Gabriel Parmer, Jason Gloudon and Luis Hernandez
Supervised By: Richard West


Current low-level networking abstractions on modern operating systems are commonly implemented in the kernel to provide sufficient performance for general purpose applications. However, it is desirable for high performance applications to have more control over the networking subsystem to support optimizations for their specific needs. One approach is to allow networking services to be implemented at user-level. Unfortunately, this typically incurs costs due to scheduling overheads and unnecessary data copying via the kernel. In this paper, we describe a method to implement efficient application-specific network service extensions at user-level, that removes the cost of scheduling and provides protected access to lower-level system abstractions. We present a networking implementation that, with minor modifications to the Linux kernel, passes data between ``sandboxed'' extensions and the Ethernet device without copying or processing in the kernel. Using this mechanism, we put a customizable networking stack into a user-level sandbox and show how it can be used to efficiently process and forward data via proxies, or intermediate hosts, in the communication path of high performance data streams. Unlike other user-level networking implementations, our method makes no special hardware requirements to avoid unnecessary data copies. Results show that we achieve a substantial increase in throughput over comparable user-space methods using our networking stack implementation.


Adaptive Routing of QoS-constrained Media Streams over Scalable Overlay Topologies (Poster)
Gerald Fry
Supervised By: Richard West


Current research on Internet-based distributed systems emphasizes the scalability of overlay topologies for efficient search and retrieval of data items, as well as routing amongst peers. However, most existing approaches fail to address the transport of data across these logical networks in accordance with quality of service (QoS) constraints. Consequently, this paper investigates the use of scalable overlay topologies for routing real-time media streams between publishers and potentially many thousands of subscribers. Specifically, we analyze the costs of using k-ary n-cubes for QoS-constrained routing. Given a number of nodes in a distributed system, we calculate the optimal k-ary n-cube structure for minimizing the average distance between any pair of nodes. Using this structure, we describe a greedy algorithm that selects paths between nodes in accordance with the real-time delays along physical links. We show this method improves the routing latencies by as much as 40%, compared to approaches that do not consider physical link costs. We are in the process of developing a method for adaptive node placement in the overlay topology, based upon the locations of publishers, subscribers, physical link costs and per-subscriber QoS constraints. One such method for repositioning nodes in logical space is discussed, to improve the likelihood of meeting service requirements on data routed between publishers and subscribers. Future work will evaluate the benefits of such techniques more thoroughly.

Towards an Internet-wide Distributed System for Media Stream Processing and Delivery (Poster)
Rich West, Xin Qi, Gabriel Parmer, Jason Gloudon, Gerald Fry
Supervised By: Rich West


We focus on the construction of an Internet-wide distributed system for media stream processing and delivery. Applications targeted by this system include live webcasts, tele-medicine and interactive distance learning. The aim is to support numerous channels of real-time data streams, generated by a number of publishing nodes and delivered to hundreds of thousands of clients with specific QoS constraints. The novelty of our approach concerns the use of overlay topologies (now popular with peer-to-peer systems) to deliver QoS-constrained streams between publishers and potentially thousands of subscribers. Consequently, we discuss preliminary results from the study of k-ary n-cubes for data delivery, given that membership of the system, publishers, subscribers and QoS constraints may change at run-time. Additionally, we describe work we are doing on the construction of an efficient end-host architecture, as part of this Internet-wide system. This includes a user-level sandboxing mechanism, to predictably, efficiently and safely execute stream processing agents (SPAs), as data is transported along its path from source to destination. Techniques that support the safe execution of application-specific service extensions, while avoiding unnecessary data copying and heavyweight context-switching will be discussed.


Virtual Machine Friendly Framework for Resource Management (Poster)
Yuting Zhang, Mina Guirguis
Supervised By: Azer Bestavros, Richard West, Ibrahim Matta


Various operating system designs, including microkernels and those that are extensible, have attempted to provide support for services that are tailored to the specific needs of applications. Similarly, recent work on the use of efficient virtual machine monitors (such as Xen), has prompted the support of multiple virtual machines to isolate services for different applications. Consequently, this work describes a Friendly Virtual Machine (FVM) framework that allows multiple virtual machines, running on the same host, to share resources efficiently and fairly. Rather than implementing resource management complexity in the virtual machine monitor (VMM), our scheme relies on the self-adaptation of virtual machines based on dynamic changes in resource usage and availability. This is achieved by the use of monitoring and adaptation logic implemented in the virtual machines themselves. By moving resource control to the virtual machines, no assumptions are required about the underlying resources or the number of virtual machines competing for them. Hence, our scheme prevents virtual machines from operating in regions where resources are wasted or unfairly managed. It is completely transparent to both applications running within virtual machines, and also the host operating system. To emphasize the elegance and simplicity of our approach, we have implemented the FVM framework in User-Mode Linux, in less than 400 lines of code.

A Virtual Deadline Scheduler for Window-Constrained Service Guarantees (Poster)
Yuting Zhang, Xin Qi
Supervised By: Richard West


We presents a new approach to window-constrained scheduling, suitable for multimedia and weakly-hard real-time systems. We originally developed an algorithm, called Dynamic Window-Constrained Scheduling (DWCS), that attempts to guarantee no more than x out of y deadlines are missed for real-time jobs such as periodic CPU tasks, or delay-constrained packet streams. While DWCS is capable of generating a feasible window-constrained schedule that utilizes 100% of resources, it requires all jobs to have the same request periods (or intervals between successive service requests). We describe a new algorithm called Virtual Deadline Scheduling (VDS), that provides window-constrained service guarantees to jobs with potentially different request periods, while still maximizing resource utilization. VDS attempts to service m out of k job instances by their virtual deadlines, that may be some finite time after the corresponding real-time deadlines. Notwithstanding, VDS is capable of outperforming DWCS and similar algorithms, when servicing jobs with potentially different request periods. Additionally, VDS is able to limit the extent to which a fraction of all job instances are serviced late. Results from simulations show that VDS can provide better window-constrained service guarantees than other related algorithms, while still having as good or better delay bounds for all scheduled jobs. Finally, an implementation of VDS in the Linux kernel compares favorably against DWCS for a range of scheduling loads.

Locality-Aware Buffering Techniques for Approximate Join Processing Over Data Streams (Poster)
Li Feifei
Supervised By: George Kollios, Azer Bestavros


We investigate adaptive buffer management techniques for approximate evaluation of sliding window joins over multiple data streams. In many applications, data stream processing systems have limited resources or have to deal with very high speed data streams. In both cases, computing the exact results of joins between these streams may not be feasible, mainly because the buffers used to compute the joins contain much smaller number of tuples than the tuples contained in the sliding windows. Therefore, a stream buffer management policy is needed in that case. We show that the buffer replacement policy plays an important role on the quality of the produced results. Furthermore, we propose an adaptive and locality-aware buffering technique for managing these buffers. Our technique aims to exploit the temporal and spatial correlations that are prominent in many real life data streams. We note that our algorithm is readily applicable to multiple data streams and multiple joins and requires almost no additional system resources. We report results of an experimental study using both synthetic and real-world data sets, that validate that our approach produces better results than other existing techniques.


Stateful Real Time Scheduling (Poster)
Kanishka Gupta
Supervised By: Azer Bestavros, Ibrahim Matta


Classical Scheduling algorithms try to minimize metrics such as the schedule length, the sum of completion times, the weighted sum of completion times etc. Real time scheduling theory, on the other hand, has focussed primarily on meeting the deadlines of periodic tasks. The traditional metrics are typically ignored as it doesn't really matter how early a job completes as long as it meets its deadline. This is reasonable if tasks are assumed not to require any extra resource besides the CPU. The resource requirements of applications which require additional resources, however, is dependent on the order in which the application's jobs are executed in the schedule. The sequence of execution of the jobs becomes vital for resource constrained embedded systems, such as nodes in a wireless sensor network, where minimizing the energy consumption is widely considered to be one of the fundamental challenges. Real Time Scheduling, therefore, needs to be stateful i.e. the scheduler needs to be cognizant of the currently executing job in making its decision of which job to schedule next. The challenge lies in meeting the conflicting objectives of scheduling jobs in the preferred order while at the same time completing all the jobs by their deadlines. We present an initial approach which uses the slack of the system to produce a more resource friendly schedule compared to EDF and LLF.

Programming Languages


Programming Examples Needing Polymorphic Recursion (Poster)
Joseph J. Hallett
Supervised By: Assaf Kfoury


Inferring types for polymorphic recursive function definitions (abbreviated to polymorphic recursion) is a recurring topic on the mailing lists of popular typed programming languages. This is despite the fact that type inference for polymorphic recursion using forall-types has been proved undecidable. This report presents several programming examples involving polymorphic recursion and determines their typability under various type systems, including the Hindley-Milner system, an intersection-type system, and extensions of these two. The goal of this report is to show that many of these examples are typable using a system of intersection types as an alternative form of polymorphism. By accomplishing this, we hope to lay the foundation for future research into a decidable intersection-type inference algorithm. We do not provide a comprehensive survey of type systems appropriate for polymorphic recursion, with or without type annotations inserted in the source language. Rather, we focus on examples for which types may be inferred without type annotations, with an emphasis on systems of intersection-types. This talk presents work done in collaboration with Assaf Kfoury (Boston University). Our paper on this work can be found at: http://www.church-project.org/reports/.

Enforcing Static Access Control with Guarded/Asserting Types (Poster)
Sa Cui, Chiyan Chen
Supervised By: Hongwei Xi


Access control is a security mechanism to protect resources of the local system from untrusted agents. The Java Security Architecture (JSA) provides a dynamic mechanism to support access control by performing stack inspection at run-time. In contrast to JSA, there are also approaches that have been proposed to support access control statically, which can avoid potentially high run-time overhead of stack inspection. For instance, in the type system of lambda-sec, certain access control policies can be enforced through statically type checking. In this paper, we propose an approach to supporting static access control through the use of guarded and asserting types. The notion of guarded and asserting types can be traced back to the work on restricted form of dependent types (as is developed in DML) and guarded recursive datatypes. This notion has since been formalized in the framework Applied Type System (ATS). We show that guarded and asserting types can also be used in a natural manner to cope with the issue of static access control. In particular, we are to translate lambda-sec into a calculus that essentially extends simply typed lambda calculus with guarded and asserting types.


Combining Dependent Types with Hoare Logic (Poster)
Dengping Zhu
Supervised By: Hongwei Xi


A fundamental question in computer science is how to develop reliable software. The verification of the correctness of a program is of great importance for the construction of safe and reliable software. We propose a system which can statically capture some undesired program behaviours as well as type errors, and thus significantly improve software reliability. We achieve this by enrich the type system with dependent types and state types. Besides type errors, we show by examples that our system can detect a lot of common errors such as array out-of-bound, access control violation and deadlock at compile-time. On the other hand, program verification based on Floyd-Hoare logic has been studied for at least three decades but its actual use in practical software development is greatly limited. Part of the reason, as pointed out by J. C. Reynolds, is in that the development of approaches to specification such as Floyd-Hoare logic is separate from the development of other practical systems such as type systems. In this paper, we build a bridge to unify type theory with Floy-Hoare logic. We demonstrate that our system can readily encode Floyd-Hoare logic and even Seperation logic, which is an extension of Floyd-Hoare logic.


Distributed Programming through Typeful Code Representation (Poster)
Chiyan Chen, Rui Shi
Supervised By: Hongwei Xi


Distributed computation is popular in scientific computing tasks for the advantages of computation speedup and improvements on data availability and reliability. Although there have been many practices of distributed computation in real world applications, there yet lacks a well developed type-theoretical foundation of it. Several recent proposals devote efforts to address this problem through the use of modal logic frameworks. In this work we design and implement an alternative type-based approach to safe and secure distributed programming, where the key salient feature lies in the introduction of a form of concrete typeful representations for code at remote locations. We expect the techniques developed in this work can significantly facilitate the production of distributed programs that is more robust and reliable.

Cryptography


Intrusion-Resilient Secure Channels (Poster)
Robert McNerney and Scott Russell
Supervised By: Gene Itkis


Suppose that Alice lives in Alaska and her friend Bob lives in Boston. In order to communicate securely, they set up a secure channel using some encryption algorithms and agreed-upon secret keys. Unfortunately, some malicious party Eve might covertly obtain the secret key of Alice, Bob, or even both of them. In that case, from that point onward Eve will be able to decipher their supposedly private conversation. We propose a new primitive called an Intrusion-Resilient Secure Channel which automatically restores the privacy of messages after such a compromise by Eve. This ability to recover without repeating the potentially cumbersome initial key agreement process is desirable. Recovery is possible even if Eve observes all messages exchanged by Alice and Bob. We give a formal definition of a two-party intrusion-resilient channel and also provide a simple generic construction using ordinary, existing public key encryption algorithms. Additionally, we prove a bound on the security of this construction assuming that Eve is able to observe but not interfere with the communication between Alice and Bob. An intrusion-resilient secure channel can be a useful building block for constructing more robust, intrusion-resilient digital signature and encryption schemes, as well as some proactive cryptographic mechanisms.


Simple, Stateless Steganography (Poster)
Scott Russell
Supervised By: Leonid Reyzin


Steganography is the study of hiding the very presence of a secret message within a public communication channel. This is in contrast to the more widely studied cryptography which hides the content but not the presence of a message. Like encryption, steganography can be used for good (e.g., to thwart censorship by totalitarian regimes) or for bad (e.g., to facilitate covert communication of nefarious designs). While it has received a lot of publicity, particularly of late, a formal understanding of steganography has been elusive. We provide the first provably secure secret-key steganographic construction that is both black-box and stateless. Black-box means the sender need not have knowledge of the public communication channel beyond the ability to sample from it. The black-box property is important since in many settings it is unrealistic to assume detailed knowledge of the underlying channel distribution. Stateless means the sender and the recipient need not maintain synchronized state for multi-bit messages. This is important since maintaining synchronized state between the sender and the recipient in steganography is particularly difficult because any communication to resynchronize the state will likely alert the adversary. For channels of sufficient entropy, our construction is more efficient than previous black-box constructions. Moreover, it is the first construction to provide an explicit tradeoff between the number of samples needed by the steganographic encoder and the rate at which hidden message bits are transmitted.

Theory


Quantum Lower Bounds for Fanout (Poster)
Maosen Fang
Supervised By: Steve Homer


We prove several new lower bounds for constant depth quantum circuits. The main result is that parity (and hence fanout) requires log depth circuits, when the circuits are composed of single qubit and arbitrary size Toffoli gates, and when they use only constantly many ancillae. Under this constraint, this bound is close to optimal. In the case of a non-constant number a of ancillae, we give a tradeoff between a and the required depth, that results in a non-trivial lower bound for fanout when a=n^(1-o(1)).


Quantum Algorithms (Poster)
Debajyoti Bera
Supervised By: Steven Homer


A primary goal of the theory of quantum computing is to determine how quantum computers may offer computational speed up over classical computational models. Till date, there only a few results which give a polynomial time algorithm for a problem which has no known polynomial time quantum algorithm. Our current approach is to look at problems for which there are already efficient deterministic or randomized classical algorithms and try to see if there are better quantum algorithms for them. We mainly consider some typical graph theory problems and a few other closely related ones. We try to focus on, whenever possible, the query complexities of these problems and lower bounds.


Computational Power of Random Strings (Poster)
Benjamin Hescott
Supervised By: Steven Homer


Randomness has long been considered a fundamental resource in computation. Following the work of Allender, et al. we investigate the power of computations which use random strings. We consider the possibility of creating a new hierarchy in exponential space by applying a form of the Turing jump to the set of random strings. We begin by studying the link between Kolmogorov Complexity and current derandomization techniques, and use these to show the inclusion of well known deterministic complexity classes in this new hierarchy. This hierarchy is interesting in that it starts near PSPACE and jumps to NEXP, and while other jump hierarchies in exponential space have been shown to collapse, we consider the possibility that this hierarchy may separate, yielding an interesting hierarchy of complexity classes useful in classifying natural computational problems.

Optimal Aggregation Location of Grid Sensor Networks (Poster)
Kyle Burke
Supervised By: Shanghua Teng


Aggregating the results of a sensor network query is not always a simple task. The specific problem we examine is deciding where to perform the aggregate computation. The problem can be modelled by representing the network as a graph, and the query as a subset of the nodes. These nodes are given weights cooresponding to the amount of data collected there. Thus, our task then requires deciding to which node of the network-graph we should move all the data. Under reasonable assumptions, we have found a fast method by which to locate this ``aggregation node''. This procedure requires only that the graph be an m-by-n planar grid with unweighted edges and runs in O(s log(s)) time, where s is the number of nodes in the query. In fact, this method generalizes to grids of any dimension d, running in O(ds log(s)) time. If, however, we apply results from Johnson and Lindenstrauss, then we can find an algorithm running in O(s log(s)log(d)) time, but could have an error of O(log (d)). This algorithm runs by splitting the grid in half, summing the node weights on each half, then moving all the data to the greater of the two halves. This continues recursively until the graph has been reduced to a single point. We hypothesize that this solution can be extended to handle similar conditions. These could include graphs with weighted edges or those which are not quite grids.


Mixing in Majority Vote Probabilistic Cellular Automata  (Poster)
Katharine Mullen
Supervised By: Peter Gacs


The state at each site of a majority vote probabilistic cellular automata (PCA) is determined by a majority vote of the sites in its neighborhood, except in the case of an "error". An error causes the state of the site to be drawn from a noise distribution. We consider majority vote PCA's in which the noise distribution puts positive mass on each state in the state space, the probability an error occurs is positive, and the state space is binary. Such PCA's are a subset of a class of CA's called spin models, an example of which is the widely-studied Ising model of ferromagnetism. A long-standing open problem is whether the majority vote PCA's we consider are mixing in the discrete time two-dimensional (2D) case; i.e., whether discrete time 2D PCA's in this class converge to a unique invariant stationary distribution independent of initial configuration. It is well known that continuous time 2D PCA's in this class do not mix. We present here evidence via computer simulation that mixing does not occur in the discrete time 2D case; these empirical results suggest that a rigorous proof for this case is possible. Such a proof would imply that even when prone to "errors", 2D discrete time majority vote PCA's remember information regarding their initial configuration forever.

On Approximating Arbitrary Metrics by Tree Metrics and possible applications (Poster)
Dihan Cheng
Supervised By: Shanghua Teng


The poster is concerned with probabilistic approximation of metric spaces. We study probabilistic approximations of Eulicidean distance of arbitrary metric spaces by "hierarchically well separated tree" metric spaces. This has proved as a useful technique for simplifying the solutions to various problems. The best approximation factor is O(logn). This results have applications in a variety of areas including approximation algorithms, on-line algoriths and distrubuted computation such as the following problems: The k-median problem, The group Steiner tree problem, Buy-at-bulk network design, Vehicle Routing. We are now interesting in exploring further applications.


Geometric Separator for d-dimension ball graphs (Poster)
Kebin Wang
Supervised By: Shang-Hua Teng


We study graph partitioning problem on d-dimension ball graphs in a geometric way. Let S be a set of balls in d-dimension Euclidean space with ratio \delta and \lambda-precision. We prove that it can be partitioned into three sets S_i,S_o, S_c such that the intersection of S_i and S_o is empty, and for some constant \alpha, \beta, the volume of S_i and S_o is less than \beta portion of volume of S, and the volume of S_c is of size O(\sigma/\lambda,\mu(G)^{1-\frac{1}{d}}). We also provide a randomized algorithm to find such a partition in linear time.