When a web server becomes heavily loaded, system overhead increases because resources become scarce, and the system may begin to thrash. Recent studies have shown that for every system there is a maximum number of concurrent processes that marks the thin line between thrashing and full utilization of system resources. Today's web servers, such as Apache, restrict the number of processes running concurrently at any given moment to prevent this from occurring. However, it may be beneficial to insert a mechanism that regulates the access of HTTP requests to system resources before a request can be completed. By creating a more controlled environment, system overhead can be monitored to a greater degree, and thrashing can be prevented. Such a design can also be useful for scheduling purposes, as an alternative to the more conventional first-come-first-served policy for handling requests.
In this paper, I will examine the possibility of adding a number of admission control and scheduling protocols to increase efficiency and decrease latency on a typical web server.
Any number of approaches can be taken with the scheduling algorithm in this design. One way to implement it is to process requests in an order that maximizes the performance of the web server(s), according to which resources are currently available (e.g. cache, I/O). Another method that may improve performance is to direct each request to a process or server dedicated to a specific type of request (e.g. GIF images, HTML) or a specific range of file sizes. Finally, in a cluster of servers, the admission control mechanism can serve as a central point that forwards requests to other machines for load-balancing purposes. The Commonwealth Server Project at Boston University is currently researching this type of mechanism as part of its main goal.
One of the Commonwealth Server Project's ongoing research areas is networking, which includes routing, packet rewriting, and connection migration. In a cluster of web servers, load balancing becomes an important issue because we ideally want an even distribution of workload across the servers. We must also avoid overloading a single server, since performance is significantly degraded when this occurs. Consequently, research has focused on how to create as equal a load as possible on each machine within a cluster to maximize performance. One of the solutions presented by members of the Commonwealth Server Project is to re-route requests from a central machine to another which currently has the resources to handle the request. This design has an inherent problem - the central machine can become a bottleneck when the network load is high. In response to this, another design has been introduced: the concept of packet re-writing.
Packet re-writing employs the same notion of load balancing among a number of virtual servers; however, there are two main distinctions between the two approaches. First, in packet re-writing, several IP addresses are associated with one domain name. When a client wants to access "cs.bu.edu", for instance, the name resolves to a different IP address on successive lookups, provided that clients and intermediate name servers honor the TTL limit rather than reusing a cached copy of a previously returned IP address. This ensures that the several IP addresses are handed out in a round-robin fashion, distributing the load more evenly across a cluster of servers. Secondly, load balancing is enforced further by forwarding a request from a heavily loaded server to a less loaded machine to avoid overloading the original server. The newly assigned server then creates a connection with the client, processes the request, and sends back a response using the IP address of the original machine so as not to confuse the client's application.
Packet re-writing is clearly a logical solution to load balancing. However, in packet re-writing the client's request is merely forwarded to another machine, which is then responsible for establishing a connection with the client. With this approach, no information about the client's request is available until the second web server has established a connection. It may be useful to have that knowledge prior to forwarding a request, since we may wish to forward requests based on a certain property of the request (e.g. requested file size or file type). The notion of connection migration achieves exactly this. The first machine to receive a request from the client establishes a connection, acquires information about the request, and forwards the request to another machine based on its contents. The admission control and scheduling protocol for a single web server and the forwarding mechanism in a cluster of servers share the same goal - to provide a hook between the registration and service phases for forwarding or scheduling purposes.
The admission control and scheduling protocol can be implemented in conjunction with the load-balancing designs described above; its functionality could be adapted to serve as a forwarding mechanism among machines within a cluster. Although this is beyond the scope of this project, it may well be incorporated into the Commonwealth Server Project in the future.
I. Structure of the Apache source code
The structure of the code can be divided into the following sections:
1. The Server Core Files
2. Third-Party Modules
3. Utility Files
4. Site Directory Files
The server core files consist of the files that deal directly with the handling and servicing of incoming requests. The third-party modules provide code for the execution of directives. These modules can be compiled along with Apache or can be left out by commenting them out in the compilation configuration file. The ease of modification comes partly from the fact that one can write one's own module and incorporate it into Apache's original functionality. The utility files contain basic utility functions (e.g. buffered I/O routines, allocation of resources). Finally, the site directory contains the configuration file(s), the HTML files that reside on the server, and log files for record-keeping. The files relevant to this project are:
http_protocol.c -- all communication between the server and the client passes
through this file. It includes functions for dealing directly with the client,
such as reading the request, sending back acknowledgments, and error handling.
http_request.c -- includes functions that control the invocation of
functions for the various phases of handling a request.
http_core.c -- contains the core module structure, as well as its tables of
directives and handlers.
http_config.c -- contains functions for reading the configuration files
when running the httpd executable. Also responsible for dispatching control to the
handlers provided by the various modules during each phase of request handling.
http_log.c -- contains the error-logging functions.
http_main.c -- the main file for the parent and child processes. Contains
functions that deal specifically with how parent and child processes are
created, accept connections, and service requests. System startup,
restarts, and time-out handling are also included in this file.
alloc.c -- contains functions specifically for the allocation of resources
(e.g. memory, files, child processes).
httpd.h -- header file for the httpd daemon. Contains the request_rec,
conn_rec, and server_rec structure definitions.
scoreboard.h -- header file for the scoreboard. Contains the scoreboard
structure, which keeps track of the status and process IDs of all child processes.
httpd.conf -- main server configuration file. Determines much of the
server's start-up and run-time settings. Contains directives which oversee
the operation of the server as a whole, such as logging and resource management.
access.conf -- defines server settings that control which types of services
are allowed, and by whom they may be accessed.
II. The Apache API
The Apache application programming interface provides the gateway between
the server core and the modules, allowing a programmer's code to take part
in the handling of requests without modifying the core itself. This section
will explore many of the basic concepts behind the Apache API, and how
they are manifested in the code.
A handler is a piece of code built into Apache that performs certain
actions when a file with a particular MIME type or handler type is requested.
A handler can also be selected by filename extension or by file location,
which enables a file to be associated with both a file type and a handler.
Apache has a number of built-in handlers, and others can be assigned by using
the AddHandler directive. The basic steps involved in serving a request are:
1) Translate the URL into a filename
2) Check access authorization
3) Determine the MIME type of the file requested
4) Send the response back to the client
5) Log the request
Each of these steps is handled by examining a few pre-specified modules
for the existence of a handler and invoking it if one is found. A
handler returns one of the following integers: OK, if it successfully
handled its phase of the request; DECLINED, if it chose not to handle
it, in which case the core moves on to the next module; or an HTTP
error code if something went wrong.
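As a purely hypothetical illustration (the function name and the checks below are ours, not taken from the Apache sources), a minimal content handler has the following shape; it inspects fields of the request_rec structure described below and reports back to the core with one of the standard return values:

    /* Hypothetical content handler, for illustration only. */
    static int example_handler(request_rec *r)
    {
        /* Only handle GET/HEAD requests; let another module try anything else. */
        if (r->method_number != M_GET)
            return DECLINED;

        /* The core zeroes finfo.st_mode when the stat() on the file fails. */
        if (r->finfo.st_mode == 0)
            return NOT_FOUND;               /* an HTTP error code */

        /* ... generate and send the response body here ... */

        return OK;                          /* the request has been handled */
    }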
All handlers take one argument, the request_rec structure. This
structure stores all necessary information about a single request made
by a client to the server, for use by various functions in both the modules
and the core files. Generally, a single connection results in the generation
of one request_rec structure (information about a particular connection
is held in a conn_rec). An abridged excerpt of the structure definition
from httpd.h follows:
request_rec *next;      /* If we wind up getting redirected,
                         * pointer to the request we redirected to. */
request_rec *prev;      /* If this is an internal redirect,
                         * pointer to where we redirected *from*. */
request_rec *main;      /* If this is a sub_request (see request.h)
                         * pointer back to the main request. */

/* Info about the request itself... we begin with stuff that only
 * protocol.c should ever touch... */

char *the_request;      /* First line of request, so we can log it */
int assbackwards;       /* HTTP/0.9, "simple" request */
int proxyreq;           /* A proxy request */
int header_only;        /* HEAD request, as opposed to GET */
char *protocol;         /* Protocol, as given to us, or HTTP/0.9 */
int proto_num;          /* Number version of protocol; 1.1 = 1001 */
char *hostname;         /* Host, as set by full URI or Host: */
int hostlen;            /* Length of http://host:port in full URI */

char *status_line;      /* Status line, if set by script */
int status;             /* In any case */

/* Request method, two ways; also, protocol, etc.  Outside of protocol.c,
 * look, but don't touch. */

char *method;           /* GET, HEAD, POST, etc. */
int method_number;      /* M_GET, M_POST, etc. */

int sent_bodyct;        /* byte count in stream is for body */
long bytes_sent;        /* body byte count, for easy access */

/* MIME header environments, in and out. Also, an array containing
 * environment variables to be passed to subprocesses, so people can
 * write modules to add to that environment.
 * The difference between headers_out and err_headers_out is that the
 * latter are printed even on error, and persist across internal redirects
 * (so the headers printed for ErrorDocument handlers will have them).
 * The 'notes' table is for notes from one module to another, with no
 * other set purpose in mind... */

table *headers_in;
table *headers_out;
table *err_headers_out;
table *subprocess_env;
table *notes;

char *content_type;     /* Break these out --- we dispatch on 'em */
char *handler;          /* What we *really* dispatch on */

/* What object is being requested (either directly, or via include
 * or content-negotiation mapping). */

char *uri;              /* complete URI for a proxy req, or
                         * URL path for a non-proxy req */
char *args;             /* QUERY_ARGS, if any */
struct stat finfo;      /* ST_MODE set to zero if no such file */

/* Various other config info which may change with .htaccess files.
 * These are config vectors, with one void* pointer for each module
 * (the thing pointed to being the module's business). */

void *per_dir_config;   /* Options set in config files, etc. */
void *request_config;   /* Notes on *this* request */

/* A linked list of the configuration directives in the .htaccess files
 * accessed by this request.
 * N.B. always add to the head of the list, _never_ to the end.
 * That way, a sub request's list can (temporarily) point to a parent's list. */

const struct htaccess_result *htaccess;
Every invocation of the httpd executable results in a fully capable,
independent process that can handle all the basic functions of a web server.
Since any of these instances can handle a request for any of the configured
virtual or main hosts, we need to keep a record of the information associated
with each host. The server_rec structure serves this purpose; an abridged
excerpt follows:
/* Full locations of server config info */
/* Contact information */
short port;             /* for redirects, etc. */
/* Log files --- note that transfer log is now in the modules... */
/* Module-specific configuration for server, and defaults... */
int is_virtual;         /* true if this is the virtual server */
void *module_config;    /* Config vector containing pointers to
                         * modules' per-server config structures. */
void *lookup_defaults;  /* MIME type info, etc., before we start
                         * checking per-directory info. */
/* Transaction handling */
struct in_addr host_addr;   /* The bound address, for this server */
short host_port;        /* The bound port, for this server */
int timeout;            /* Timeout, in seconds, before we give up */
int keep_alive_timeout; /* Seconds we'll wait for another request */
int keep_alive;         /* Maximum requests per connection */
char *path;             /* Pathname for ServerPath */
int pathlen;            /* Length of path */
char *names;            /* Wildcarded names for HostAlias */
char *virthost;         /* The name given in <VirtualHost> */
Every time a connection is initiated between a client and the server, the core ensures that the right information about the connection is available to modules and other functions in order to process the request. If the server is configured not to allow persistent connections, a new conn_rec is created and filled in every time a new request is received; otherwise, the same conn_rec is kept until the keepalive limit is reached. This, and other information, is packaged in the conn_rec structure, excerpted below:
int child_num;          /* The number of the child handling conn_rec */
BUFF *client;           /* Connection to the guy */
int aborted;            /* Are we still talking? */

/* Who is the client? */

struct sockaddr_in local_addr;  /* local address */
struct sockaddr_in remote_addr; /* remote address */
char *remote_ip;        /* Client's IP address */
char *remote_host;      /* Client's DNS name, if known.
                         * NULL if DNS hasn't been checked,
                         * "" if it has and no address was found.
                         * N.B. Only access this through
                         * get_remote_host() */
char *remote_logname;   /* Only ever set if doing_rfc931
                         * N.B. Only access this through
                         * get_remote_logname() */
char *user;             /* If an authentication check was made,
                         * this gets set to the user name. We assume
                         * that there's only one user per connection(!) */
char *auth_type;        /* Ditto. */

int keepalive;          /* Are we using HTTP Keep-Alive? */
int keptalive;          /* Did we use HTTP Keep-Alive? */
int keepalives;         /* How many times have we used it? */
III. Apache Modules
Apache ships with a number of standard modules that can be used to perform extra functions beyond its basic capabilities. A module is a piece of code that implements a number of directives and can be compiled into Apache by uncommenting its name in the compilation configuration file. One can also customize Apache's functionality by writing one's own modules. Most of the Apache modules distributed with the source code are not required for the purposes of this project and will be left uncompiled to prevent unnecessary degradation in performance.
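As a sketch of how a custom module exposes a handler to the core (the names example_handler and example_handlers are hypothetical, and the surrounding module record is omitted because its exact layout varies between Apache versions), a module lists its handlers in a handler_rec table keyed by handler or content-type name:

    /* Hypothetical handler table for a custom module. */
    static const handler_rec example_handlers[] = {
        {"example-handler", example_handler},   /* name used with AddHandler */
        {NULL, NULL}                            /* end of table */
    };

The table is referenced from the module's record, which the core walks when dispatching each phase of a request.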
IV. Apache Directives
Apache has approximately 150 configuration directives that control its
startup and run-time settings. They are located in the configuration
files and are usually left at their default values unless otherwise
specified by the webmaster. Relevant directives include:
KeepAliveTimeout: sets the number of seconds to wait for the next request
on a persistent connection.
MaxClients: sets the maximum number of child servers, and hence the number
of requests that can be handled simultaneously.
MaxRequestsPerChild: sets the number of requests each child process can
serve; a child server will die after having served this many requests.
MaxSpareServers: sets the maximum number of IDLE child servers allowed;
excess idle children are killed off.
MinSpareServers: sets the minimum number of IDLE child servers to keep
available; new children are created if fewer are idle.
StartServers: sets the number of child servers created at startup.
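These directives are set in the configuration files as simple keyword/value pairs. The excerpt below is only an illustration; the values shown are typical defaults, not the settings used in the experiments described later.

    KeepAliveTimeout    15
    MaxClients          150
    MaxRequestsPerChild 30
    MinSpareServers     5
    MaxSpareServers     10
    StartServers        5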
Basic Code Sequence

The parent process (standalone_main):
- creates a socket, binds the local address to it, and listens on the
  socket for incoming connections.
- forks off the number of startup processes specified in the configuration file.
- enters a maintenance loop:
    - continually checks each child process to see whether it is dead or not.
    - if a child is dead, it removes it from the scoreboard in shared memory,
      writing "SERVER_DEAD" in the corresponding slot.
    - counts the number of idle servers. If it is less than MinSpareServers,
      it creates a new child server.
    - counts the number of running child processes. If it exceeds MaxClients,
      it will not create a new child server.
- stays in the loop until Apache is killed.
The child process (child_main):
- if MaxRequestsPerChild has been reached, it kills itself and exits.
- if not, it waits in a loop until a request is received, updating its
  status in the scoreboard to "SERVER_READY".
- once a request arrives, it seizes the mutex lock and accepts the
  connection, then releases the mutex lock for another child process to seize.
- updates the scoreboard to "SERVER_BUSY_READ".
- reads the request and fills in information about the request in the
  request_rec structure.
- updates the scoreboard to "SERVER_BUSY_WRITE".
- processes the request and sends a reply back to the client.
- then checks for persistent connections in a loop:
    - if the KeepAlive limit has not been reached, it updates the status
      in the scoreboard to "SERVER_BUSY_KEEPALIVE".
    - reads another request from the same connection and updates the
      scoreboard to "SERVER_BUSY_WRITE".
    - processes the request and sends a reply back to the client.
    - stays in this loop until the KeepAlive limit has been reached, then
      closes the connection, exits the loop, and waits for another request
      through a new connection.
- repeats the entire process until MaxRequestsPerChild is reached.
The number of child processes is controlled dynamically by the parent process, thereby continuously adjusting to the current load on the server:
- Each child server counts the number of idle servers while it is in its
  idle state. If this count exceeds MaxSpareServers, the child exits and dies.
- Each child server keeps track of the number of requests it has served. If
  it exceeds MaxRequestsPerChild, the child exits and dies.
- MaxClients limits the number of servers that can run at one time. The
  parent server will not create new children if this limit is reached.
An important feature of this implementation is the fact that only one child server can accept an incoming connection at any time. When idle, a child server waits for an incoming request. To accept a connection, it must first seize the accept mutex; if the mutex is unavailable, the child waits in a queue until its turn is reached. Once the mutex is seized, it accepts the connection and releases the mutex immediately thereafter. It then services the request and returns to the idle state. This loop repeats until MaxRequestsPerChild is reached.
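The sketch below illustrates this serialization in a simplified form. It uses a file lock around accept() and hypothetical names (child_loop, lock_fd, listen_fd); Apache itself chooses among several serialization mechanisms at compile time and performs considerably more bookkeeping.

    /* Simplified illustration of a child server's accept loop. */
    #include <sys/file.h>
    #include <sys/socket.h>
    #include <unistd.h>

    void child_loop(int listen_fd, int lock_fd, int max_requests)
    {
        int served = 0;

        while (max_requests == 0 || served < max_requests) {
            flock(lock_fd, LOCK_EX);            /* seize the accept mutex     */
            int conn_fd = accept(listen_fd, NULL, NULL);
            flock(lock_fd, LOCK_UN);            /* release it immediately     */

            if (conn_fd < 0)
                continue;                       /* transient error: try again */

            /* ... read the request, fill in the request_rec, send the reply ... */
            close(conn_fd);
            served++;
        }
        /* MaxRequestsPerChild reached: the child exits and the parent may
         * eventually create a replacement. */
    }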
The design of the modified web server incorporates the fundamentals
of admission control, while maintaining most of the features of the original
version of Apache in order to preserve its reliability and efficiency. In
this section, I will give a detailed overview of the changes made to the
original version of Apache.
I. Overview of Implementation
The idea behind the implementation of admission control is to divide the sequence of steps of serving a request into two phases - the registration and service phases. In order to do this, we introduced a mechanism to block processes immediately after they have read and parsed a request. This blocking mechanism is used to make a process wait until it is instructed to proceed with the actual transmission of the data. The manner in which a new process is selected to run is distributed among the child processes. Each child process, upon completion of handling a request, has the capacity to select a new process to begin handling its request. The parent process is only responsible for administrative duties such as server initialization and child process creation.
To add new functionality to the original version of Apache, we modified
the main module, "http_main.c", and added a new module, "admctrl.c".
The file "admctrl.c" contains the new functions relating to admission control
and scheduling: put_child_in_queue, get_next_child, mark_child_free,
setup/destroy_requests_table, setup_semaphores, semaphore_wait, and semaphore_signal.
The main module contains code for both the parent process and the child
processes; each section was modified to support the functions described above.

To keep information about processes and their requests during the child main loop, we created a table in shared memory for scheduling and administrative purposes. Within this shared structure, an abridged version of the request_rec structure holds the information about each request that a given scheduling policy needs in order to run, such as the size and identity of the requested file.
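As an illustration only (the field names below are ours; the actual structure is defined in admctrl.c), an entry in the shared request table might look like the following, holding just enough information for the scheduling policies described later:

    #include <sys/types.h>
    #include <time.h>

    /* Hypothetical layout of one entry in the shared request table. */
    typedef struct {
        int    in_use;        /* is this slot currently occupied?                 */
        int    waiting;       /* is the child blocked, waiting to be selected?    */
        pid_t  pid;           /* process id of the child handling the request     */
        off_t  file_size;     /* size of the requested file (Shortest-File-First) */
        char   uri[256];      /* requested document (Batch Processing)            */
        time_t arrival_time;  /* time the request was registered (FIFO)           */
    } request_entry;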
There are two main loops in Apache, one for the parent process (standalone_main)
and the other for the child processes (child_main). We modified the
standalone_main loop to remove Apache's standard adaptive technique of creating
and destroying child processes based on the number of active child processes.
To do this, the parameter MaxRequestsPerChild was set to 0, and MaxSpareServers
was set to 31, to avoid killing off child processes when the system load is low.
A new parameter, MAX_RUNNING, was added to configure the server to limit the
number of processes running concurrently at any given time. This parameter
must be set to a number less than 30, because each process requires a semaphore
for blocking purposes. Whenever a new request arrives, the admission
control algorithm checks this value and decides whether or not to block
the process, based on the number of servers currently active.
III. Code Sequence
The revised version of Apache relies heavily on semaphores, for two reasons. First, they are used to block processes immediately after the registration phase has been completed; at this moment, the process is placed in a queue, where it waits to be selected to run by the scheduling algorithm. Second, semaphores are used to protect critical sections in the code that access data shared by all child processes.
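The sketch below shows how such blocking primitives can be built on System V semaphores. It is a simplified illustration with our own parameter names, not a listing of the actual semaphore_wait and semaphore_signal in admctrl.c, which must also deal with error returns and interrupted system calls.

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>

    /* Block the caller on semaphore sem_num of the set sem_id. */
    static void semaphore_wait(int sem_id, int sem_num)
    {
        struct sembuf op = { (unsigned short) sem_num, -1, 0 };  /* decrement; blocks at zero */
        semop(sem_id, &op, 1);
    }

    /* Wake one process blocked on semaphore sem_num of the set sem_id. */
    static void semaphore_signal(int sem_id, int sem_num)
    {
        struct sembuf op = { (unsigned short) sem_num, +1, 0 };  /* increment; wakes a waiter */
        semop(sem_id, &op, 1);
    }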
The parent process in our version of Apache essentially handles the same duties as before, with a few minor exceptions. It is now responsible for the creation and destruction of the semaphores and data structures that were added to the code. It also updates data in shared memory that is needed to output statistics for a specific run.
The child process main loop is where most of the revisions were made.
The figure below breaks down the steps each child process takes from the
listen( ) system call until the request is processed and sent back to the client.
Each child process, upon creation, begins with server initialization and
updates its status to "SERVER_READY" in the scoreboard. The first
process enters the critical section, then waits for a connection to arrive
at the socket, locking around the accept() call to prevent two
or more processes from accidentally accepting a single connection simultaneously.
Once a connection arrives, the process releases the mutex, and the next child
server enters the critical section. After the accept mutex is released, the
server status is updated to "SERVER_BUSY_READ", at which time the process reads
the request and fills in the request_rec struct. Next, the server status
is updated to "SERVER_BUSY_WRITE", and the function put_child_in_queue()
is called. This function's main purpose is to determine whether
the number of running servers has exceeded MAX_RUNNING; if this condition
holds, it signals the process to wait in the queue before proceeding
with the request. Otherwise (i.e. there is no need to wait), the process
proceeds to serve the request. Upon completion, it calls mark_child_free(),
which updates the appropriate information in the request table and calls
get_next_child(). This function is responsible for the actual evaluation
of the scheduling policy, selecting the next process in the queue to begin
transmission of data in response to its request. Once the selection is made,
that process is awakened; the completing process then returns to the beginning
of the loop, marking its status as "SERVER_READY" in the scoreboard and
waiting for subsequent requests to arrive.
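The following sketch summarizes the control flow just described. It is a simplified illustration built on hypothetical helpers (enter_critical_section, leave_critical_section, select_by_policy, and the request_entry table sketched earlier), not the actual code in admctrl.c.

    /* Simplified illustration of the admission control path. */
    extern request_entry table[];    /* shared request table                  */
    extern int num_running;          /* shared count of transmitting children */
    extern int sem_id;               /* semaphore set created by the parent   */

    void put_child_in_queue(int my_slot)
    {
        int must_wait = 0;

        enter_critical_section();                /* protect the shared table   */
        if (num_running >= MAX_RUNNING) {
            table[my_slot].waiting = 1;          /* join the queue             */
            must_wait = 1;
        } else {
            num_running++;                       /* admitted immediately       */
        }
        leave_critical_section();

        if (must_wait)
            semaphore_wait(sem_id, my_slot);     /* block until selected       */
    }

    static void get_next_child(void)
    {
        int next = select_by_policy(table);      /* FIFO, SFF, or Batch        */
        if (next >= 0) {
            table[next].waiting = 0;
            num_running++;
            semaphore_signal(sem_id, next);      /* wake it to start sending   */
        }
    }

    void mark_child_free(int my_slot)
    {
        enter_critical_section();
        table[my_slot].in_use = 0;
        num_running--;
        get_next_child();                        /* apply the scheduling policy */
        leave_critical_section();
    }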
IV. Scheduling Policies
In the modified version of Apache, an additional queue is inserted immediately after a child process reads and parses a request and its arguments. This extra queue acts as a central point in which requests wait until they are signalled to begin transmission of data. The manner in which they are selected is determined by the scheduling policy. We conducted experiments using three scheduling policies: FIFO, Shortest-File-First, and Batch Processing.
FIFO essentially uses the same concept as the original version of Apache: requests are served in the order in which they arrive. Shortest-File-First gives precedence to processes requesting smaller files (e.g. HTML documents) over those requesting larger files (e.g. .wav and .gif files). This policy aims to achieve better overall performance by allowing the smaller, easier-to-serve requests to bypass the larger files in the queue, thus reducing the average response time. Batch Processing similarly gives precedence to processes requesting documents that were most recently served. The purpose of this policy is to optimize use of the cache by reducing the total number of cache misses, as well as the number of times it may be necessary to retrieve a file from main memory. The performance of these three policies is presented in the next section.
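As an illustration of how Shortest-File-First can be evaluated over the queue (again using the hypothetical request_entry table, not the actual implementation), the selection step reduces to a scan for the waiting request with the smallest file:

    /* Return the index of the waiting request with the smallest file, or -1. */
    static int select_shortest_file(request_entry *table, int n)
    {
        int i, best = -1;

        for (i = 0; i < n; i++) {
            if (table[i].in_use && table[i].waiting &&
                (best < 0 || table[i].file_size < table[best].file_size))
                best = i;
        }
        return best;
    }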
I. Testing Environment
To assess the performance of the web server implementing the admission control and scheduling policies, we created a simulated web environment using SURGE (Scalable URL Reference Generator), a URL request generator that produces a sequence of URL requests in accordance with distributions obtained from statistics of real-world Web activity for several different factors, including document popularity, document sizes, request sizes, temporal/spatial locality, and embedded document count. SURGE is a multiprocess, multithreaded application, enabling us to run several experiments using different parameters to simulate all levels of load on our server. Typically, we ran SURGE with 5 client subprocesses for 5 minutes on each of the three Dell client machines. The number of threads per client determined the load factor; this number varied from 10 (representing a light load) to 60 (a very heavy load). The systems we used for testing were Pentium II 233 MHz machines.
The testing setup was completed by compiling a version of Apache for each
scheduling policy (FIFO, Shortest-File-First, and Batch Processing), with
different values of the MAX_RUNNING parameter (MRP: 1, 2, 5, 10, 15, 20, and 25),
and running each server separately for the duration of every test.
To reduce discrepancies in the data, we repeated each test a minimum of 5 times.
II. Data Gathered
Data was gathered from three sources for the analysis
of our web server. SURGE provided us with statistics from the client
side, such as mean transfer delay and throughput. Internal measurements
were also taken within Apache, to gather statistics on the changes
we made within the server to implement scheduling and admission control.
Finally, WebMonitor provided us with statistics at the kernel level, such
as CPU utilization and time spent making system calls.
a) mean transfer delay vs. load
A good indication of whether a server performs well is the mean transfer delay measured from the client's point of view. We compiled the data for most of the servers running the different scheduling policies and compared the mean transfer delay with that of the original version of Apache.
b) throughput vs. load
Again, throughput is best for the original version of Apache,
since a certain amount of overhead is introduced when adding
admission control and scheduling. Still, Shortest-File-First running
with MRPs equal to 10 and 15 comes very close to the original version with
no admission control. There is also a significant performance hit
when the load reaches 30 (medium load), for no apparent reason, which subsides
as the load approaches 40. This performance degradation may be significantly
harmful to a single web server; however, when admission control and scheduling
are implemented in a distributed environment, the performance decline should
become negligible as the number of documents served increases.
c) mean waiting time vs. load
d) CPU User/System Time
From these graphs, it is apparent that when the load is equal to 30, CPU user time and system time hit a peak. The same is seen for the server without admission control and scheduling, most likely for the same reason, but at a different threshold (load = 40). Once the load increases further, the percentage of CPU used decreases. Also, CPU system utilization for Shortest-File-First with MRPs equal to 3, 5, and 15 is significantly lower than for the rest of the servers shown in the graph, possibly due to the cache optimization achieved by allowing the shorter, more popular files to be served first; page swaps are accounted to system time, not to the application.
e) comparison of SFF, BP, and FCFS
Socket migration centers on the same idea, but operates at the TCP level rather than the IP level, actually creating the connection before serving the request. It uses DPR (Distributed Packet Rewriting) to rewrite the source IP address when sending packets back to the client, so as not to confuse the client.
There are many ways one could envision implementing admission control for a distributed web server. The main advantage is that any machine can extract useful information about a request, such as the requested file type, file size, and URL, and re-route the request as needed to another machine that can better handle it. The individual machines could be organized so that each machine dedicates itself to a specific range of file sizes, file types, or set of files. When an incoming request arrives at a specific machine within the cluster, that machine will first parse the header of the request and then either serve the request itself or use socket migration to re-route the request to another machine. In this way, an optimal balance of load can theoretically be achieved.
Socket migration and DPR can also be implemented in conjunction with a task assignment policy, as mentioned above. The only difference is that a centralized point at which incoming requests arrive must be present in order to form a queue for scheduling purposes. For example, in a cluster of servers, a few machines could be dedicated to receiving incoming requests from clients. Another set of machines, invisible to the outside, could be dedicated to actually serving the requests, partitioned by file size. A request will arrive at any of the machines accepting incoming connections, be inserted in a queue, and be dispatched according to a scheduling policy such as Shortest-File-First or Batch Processing. When the request reaches the head of the queue, the machine will migrate the connection to the server handling the specific range of file sizes, which, in turn, will serve the request.
Created on: 1997.08.21 Updated on: 1998.05.26