Research on Web Servers

Under light to moderate workloads, a web server implementation is not pushed to its limits, and even in the presence of inefficiencies users experience acceptable performance. As the loads imposed on web servers keep growing and the tolerance for slow service decreases, it is increasingly important to locate sources of inefficiency, identify bottlenecks, and adopt implementation strategies that allow web servers to perform effectively under heavy load. One fundamental issue in the search for high performance is how to schedule the set of requests that are ready to be served at any given time.

An important factor is the distribution of file sizes for the requests received by a web server. This distribution has been shown to be heavy-tailed: most requests are for small files, whereas very few are for large files. Typically, fewer than 1% of the requested files account for half of the load experienced by the server. This property, known as the heavy-tail property, significantly affects both the performance of the web server and the effectiveness of the scheduling policy in place.
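
The heavy-tail property can be illustrated with a small sketch. It draws synthetic file sizes from a bounded Pareto distribution and measures the share of total bytes contributed by the largest 1% of files. The distribution family, its parameters (alpha = 1.1, sizes between 512 bytes and 100 MB), and the function names are assumptions chosen for illustration only; they are not measurements or code from this project.

```python
import random


def bounded_pareto(rng, alpha, low, high):
    """Draw one sample from a bounded Pareto distribution by inverting its CDF."""
    u = rng.random()  # uniform in [0, 1)
    ratio = (low / high) ** alpha
    return low / (1.0 - u * (1.0 - ratio)) ** (1.0 / alpha)


def top_fraction_load(sizes, fraction):
    """Return the share of total bytes contributed by the largest `fraction` of files."""
    if not sizes:
        raise ValueError("sizes must not be empty")
    ordered = sorted(sizes, reverse=True)
    count = max(1, int(len(ordered) * fraction))  # always include at least one file
    return sum(ordered[:count]) / sum(ordered)


if __name__ == "__main__":
    rng = random.Random(1999)  # fixed seed so repeated runs give the same numbers
    # Assumed parameters: shape 1.1, file sizes between 512 bytes and 100 MB.
    sizes = [bounded_pareto(rng, 1.1, 512.0, 100e6) for _ in range(100_000)]
    share = top_fraction_load(sizes, 0.01)
    print(f"largest 1% of files carry {share:.1%} of the bytes")
```

The exact share depends on the assumed parameters; the sketch is only meant to show how a small number of very large files can dominate the total load.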

Our goal is to study the effect of the distribution of file sizes on the performance of the different components of a web server architecture, and to study the performance improvements that can be achieved by scheduling policies that use knowledge of the sizes of the requests received at the server. The studies have both a theoretical component and an experimental component. The theoretical component deals with the creation and analysis of models of web server architectures in order to obtain theoretical performance bounds. For the experimental component, the QoS Networking Laboratory serves as a testing environment for implementing and experimenting with new web server architectures. Currently we are studying the effect of applying different scheduling policies at different components of a web server architecture, namely the CPU, disk, and network.
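
As a rough illustration of why size knowledge can help, the sketch below compares the mean response time of a single server that handles a batch of requests in arrival order with one that handles the smallest request first. The text above does not name a specific policy, so shortest-first is an assumption here, as are the simplifications that all requests arrive at time 0 and that service time equals request size. The function name is hypothetical.

```python
def mean_response_time(sizes):
    """Mean completion time when requests are served one at a time in the given order.

    Assumes every request is present at time 0 and takes `size` time units to serve.
    """
    if not sizes:
        raise ValueError("sizes must not be empty")
    clock = 0.0
    total = 0.0
    for size in sizes:
        clock += size   # this request finishes after all earlier ones plus itself
        total += clock  # accumulate its response time
    return total / len(sizes)


if __name__ == "__main__":
    arrivals = [100, 1, 1, 1]  # one large request queued ahead of three small ones
    in_order = mean_response_time(arrivals)
    shortest_first = mean_response_time(sorted(arrivals))
    print(f"arrival order:  {in_order}")
    print(f"shortest first: {shortest_first}")
```

For this batch the mean response time falls from 101.5 to 27.25 time units, because the three small requests no longer wait behind the large one. A realistic study would also have to account for requests arriving over time, preemption, and the cost imposed on the large requests.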

Last updated 8/22/99