Next: Implementation details
Up: Scalability of content-aware server
Previous: Design solutions for efficiency
Dispatching policies
In this section, we present the subset of the content-aware dispatching algorithms
that are available in our Web switch and are used for the experimental results.
The Client Aware Policy (CAP) [7]
is oriented toward Web sites that provide heterogeneous services with
different computational impacts on system resources.
The set of static and dynamic services provided by the Web site
is divided into classes, each of which stresses the system components
in a different way.
The CAP algorithm works as follows. The switch maintains one circular
round-robin pointer into the server list for each service class.
When a client request is received at the switch, the parser
module extracts the embedded URL and identifies the
associated service class. The request is then assigned in round-robin
fashion within that class, using the appropriate pointer.
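The per-class round-robin assignment can be sketched as follows; the server names and the set of service classes are illustrative assumptions, not those of the actual switch.

```python
# Minimal sketch of CAP's per-class round-robin assignment.
SERVERS = ["ws1", "ws2", "ws3"]

# One circular pointer per service class, advanced independently.
pointers = {c: 0 for c in ("disk", "cpu", "net")}

def cap_assign(service_class):
    """Assign the next server for this class in round-robin order."""
    i = pointers[service_class]
    pointers[service_class] = (i + 1) % len(SERVERS)
    return SERVERS[i]
```

Because each class keeps its own pointer, requests of one class are spread evenly across the servers regardless of how many requests the other classes generate.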
The basic observation behind CAP is that, when the Web site provides
heterogeneous services, each client request may stress a different
Web system resource. Although the Web switch cannot accurately estimate
the service time of a static or dynamic request, it can
determine the class of the request from the URL and estimate
its main impact on each Web system resource. A feasible
classification for CAP is to distinguish disk-bound, CPU-bound, and
network-bound services, but other choices are possible depending
on the content and services provided by the Web site.
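A disk/CPU/network-bound classifier of the kind mentioned above might be sketched as follows; the extension-to-class mapping is a hypothetical example, since the paper does not specify the exact classification rule.

```python
# Hypothetical URL classifier for a disk/CPU/network-bound partition.
# The extension lists are illustrative assumptions.
def classify(url):
    if url.endswith((".cgi", ".php", ".asp")):
        return "cpu"    # dynamic content generation stresses the CPU
    if url.endswith((".mpg", ".avi", ".zip")):
        return "net"    # large transfers are dominated by network bandwidth
    return "disk"       # static files mainly stress the disk/cache subsystem
```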
We also implemented Locality-Aware Request
Distribution (LARD) [2,22], which tends to maximize the cache hit rate for
static resources. When the Web switch receives an HTTP request,
the parser module extracts the URL and checks whether
the requested URL has already been handled by some Web server node.
If so, the request is forwarded to that node, unless
it is overloaded. To avoid potentially unfair assignments,
server load is estimated through a centralized load monitor that
counts the number of active connections for each request class
(static, dynamic).
A Web server is considered overloaded if its number of open
connections exceeds a given threshold. If the chosen server is
overloaded, or the URL has not yet been assigned to any Web server,
the least loaded node is chosen.
The rationale behind LARD is that, by assigning requests for the same Web
object to the same Web server node, the requested object is more likely to
be found in the disk cache of that node.
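The LARD assignment logic can be sketched as follows. The threshold value and data structures are illustrative assumptions, and for brevity the load monitor here counts total connections per server rather than per-class counts.

```python
# Simplified LARD sketch; THRESHOLD and the server names are illustrative.
THRESHOLD = 100                         # open connections before "overloaded"
SERVERS = ["ws1", "ws2", "ws3"]
assignment = {}                         # URL -> server chosen on first request
connections = {s: 0 for s in SERVERS}   # centralized load monitor

def least_loaded():
    return min(SERVERS, key=lambda s: connections[s])

def lard_dispatch(url):
    server = assignment.get(url)
    # New URL, or previously assigned server is overloaded:
    # fall back to the least loaded node and remember the new assignment.
    if server is None or connections[server] > THRESHOLD:
        server = least_loaded()
        assignment[url] = server
    connections[server] += 1            # monitor tracks active connections
    return server
```

Repeated requests for the same URL keep landing on the same node (preserving cache locality) until that node crosses the overload threshold, at which point the URL migrates to the least loaded server.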
Mauro Andreolini
2003-03-13