Engineering and Hosting Adaptive
Freshness-Sensitive Web Applications on Data Centers
Wen-Syan Li Oliver Po Wang-Pin Hsiung K. Selçuk
Candan Divyakant Agrawal
NEC Laboratories America, Inc.
10080 North Wolfe Road, Suite SW3-350, Cupertino, California 95014, USA
Email: wen@sv.nec-labs.com  Tel: 408-863-6008  Fax: 408-863-6099
Copyright is held by the author/owner(s).
WWW2003, May 20-24, 2003, Budapest, Hungary.
ACM 1-58113-680-3/03/0005.
Abstract:
Wide-area database replication technologies and the availability of content delivery networks allow Web applications to be hosted and served from powerful data centers. This form of application support requires a complete Web application suite to be distributed along with the database replicas. A major advantage of this approach is that dynamic content is served from locations closer to users, leading to reduced network latency and fast response times. However, this is achieved at the expense of overheads due to (a) invalidation of cached dynamic content in the edge caches and (b) synchronization of database replicas in the data center. These have adverse effects on the freshness of delivered content. In this paper, we propose a freshness-driven adaptive dynamic content caching technique, which monitors the system status and adjusts caching policies to provide content freshness guarantees. The proposed technique has been extensively evaluated to validate its effectiveness. The experimental results show that the freshness-driven adaptive dynamic content caching technique consistently provides good content freshness. Furthermore, even a Web site that already enables dynamic content caching can benefit from our solution, which improves content freshness up to 7 times, especially under heavy user request traffic and long network latency conditions. Our approach also provides better scalability and reduces response times by up to 70% in the experiments.
Categories and Subject Descriptors: H.4 [Information Systems]: Information Systems Applications; D.2 [Software]: Software Engineering; D.2.8 [Software Engineering]: Metrics - complexity measures, performance measures
General Terms: Performance, Reliability, Experimentation
Keywords: dynamic content, web acceleration, freshness, response time, network latency, database-driven web applications
Introduction
For many e-commerce applications, Web pages are created dynamically based on the current state of a business, such as product prices and inventory, stored in database systems. This characteristic requires e-commerce Web sites to deploy Web servers, application servers, and database management systems (DBMSs) to generate and serve user-requested content dynamically. When the Web server receives a request for dynamic content, it forwards the request to the application server along with its request parameters (typically included in the URL string). The Web server communicates with the application server using URL strings and cookie information, which is used for customization, and the application server communicates with the database using queries. When the application server receives such a request from the Web server, it may access the underlying databases to extract the relevant information needed to dynamically generate the requested page.

To improve the response time, one option is to build a high-performance Web site that improves network and server capacity by deploying state-of-the-art IT infrastructure. However, without the deployment of dynamic content caching solutions and content delivery network (CDN) services, dynamic content is generated on demand. In this case, all delivered Web pages are generated based on the current business state in the database; however, by the time users receive the content, the business state could already have changed due to network latency.

An alternative solution is to deploy network-wide caches so that a large fraction of requests can be served remotely rather than all of them being served from the origin Web site. This solution has the advantage of serving users via caches closer to them, reducing the traffic to the Web sites, reducing network latency, and providing faster response times. Many CDN services [1] provide Web acceleration services. A study in [2] shows that CDNs indeed have a significant performance impact. However, for many e-commerce applications, HTML pages are created dynamically based on the current state of a business, such as product prices and inventory, rather than static information. Therefore, content delivery by most CDNs is limited to the static portions of pages and media objects, rather than the full spectrum of dynamic content that constitutes the bulk of e-commerce Web sites.

Wide-area database replication technologies and the availability of data centers allow database copies to be distributed across the network. This requires a complete e-commerce Web site suite (i.e., Web servers, application servers, and DBMSs) to be distributed along with the database replicas. A major advantage of this approach is, like the caches, the possibility of serving dynamic content from a location close to the users, reducing network latency. However, this is achieved at the expense of the overhead caused by the need to invalidate dynamic content cached in the edge caches and to synchronize the database replicas in the data center.

How to architect Web sites and tune caching policies dynamically to serve fresh content with short response times is a complex problem. In this paper, we focus on the issue of how to maintain the best content freshness for a given set of user request rate, database update rate, and network latency parameters. We propose a freshness-driven adaptive dynamic content caching technique that monitors response time and invalidation cycle as feedback for dynamic adjustment of the caching policy. By considering the trade-off between invalidation cycle and response time, it maintains the best content freshness for the users.

The rest of this paper is organized as follows. In Section 2 we describe a typical data center architecture for hosting database-driven Web applications. In Section 3, we describe how to enable dynamic content caching for data center-hosted Web applications. In Section 4 we address some limitations of these architectures in supporting fresh dynamic content. In Section 5, we give an overview of our proposed solution for assuring content freshness. In Section 6 we describe the dynamic content invalidation schemes in the scope of NEC's CachePortal technology. In Section 7, we describe the dependency between request response time and invalidation cycles. In Section 8 we describe how to engineer adaptive freshness-sensitive Web applications on data centers that balance response time and invalidation cycles. We also present experimental results that evaluate the effectiveness of our proposed technique for ensuring content freshness as well as its benefit in accelerating request responses. In Section 9 we summarize related work and compare it with our approach. In Section 10 we give our concluding remarks.
Data Center Architecture for Hosting Web Applications
A typical data center architecture for hosting Web applications
requires a complete e-commerce Web site suite to be distributed
along with the database replicas. Figure 1 shows a configuration in which the WS/AS/DBMS suite is installed at network edges to serve non-transactional requests, which require access only to read-only database replicas. In order to distinguish between the asymmetric functionality of master and slave DBMSs, we refer to the mirror database in the data center as a data cache, or DB Cache.
DB Cache can be a lightweight DBMS without the transaction
management system and it may cache only a subset of the tables in
the master database. Updates to the database are handled using a
master/slave database configuration: all updates and transactions
are processed at the master database at the origin site. The scheme
for directing user requests to the closest server is the same as
what typical CDNs deploy. The interactions between these software
components for hosting Web applications on data centers are
summarized as follows:
- A request is directed to a nearby data center based on network
proximity.
- If the request is non-transactional and the DB Cache has all the required data, the request is processed at the data center and the result page is generated dynamically and returned to the user.
- Otherwise, the request is forwarded to the master database at the origin site.
- Changes to the database contents are periodically migrated from the master database to the DB Cache at the data centers.
Figure 1: Data Center-hosted Database-driven Web Site with Deployment of CachePortal
Either a pull- or a push-based method can be used to synchronize DB
Cache with the master database. A typical production environment
will employ a hybrid approach in which a complete refresh is done
at a coarser granularity (e.g., once in a day) and incremental
refresh is done at a finer granularity (e.g., once every hour). In most e-commerce Web applications, however, content freshness needs to be assured to a much higher standard, and asynchronous update propagation is used. Despite its popularity, this system architecture has the following drawbacks:
- all dynamic content pages have to be generated at the data
centers or the origin Web site as user requests arrive;
- only the network latency between the data centers and the origin Web site is reduced; the network latency between users and the data centers remains the same; and
- synchronization is pre-scheduled and the dynamic content pages
may be generated based on outdated database content between two
synchronizations.
Proposed System Architecture
In [3], we presented a theoretical framework for invalidating dynamic content. This work introduces two new architectural components:
- sniffer: sniffer components are installed at the Web application server (WAS) and the master database. The sniffer at the WAS is responsible for creating the mappings between URLs (identifying the pages requested) and the query statements issued to generate those pages. The sniffer at the master database is responsible for tracking the database content changes.
- invalidator: the invalidator is responsible for the following two tasks: (1) retrieving the database content change log from the sniffer at the master database and propagating the changes to the mirror database; and (2) performing invalidation checking for the cached pages at the edge caches based on the database content change log, the URL/database query mapping, and the content in the mirror database.
Note that the knowledge about dynamic content is distributed across multiple servers. In contrast to other approaches [4,5], which assume such mappings are provided by system designers, the construction of the mapping between the database content and the corresponding Web pages is automated. In this paper, we build on these results by developing a novel system architecture that accelerates data center-hosted Web applications through the deployment of dynamic content caching solutions. The proposed system architecture is shown in Figure 1. It is similar to the typical data center architecture except for the two new software modules. The interactions between the components in the data center-hosted Web site with edge caches are as follows:
- A request is directed to the edge cache closest to the user
based on the network proximity. If there is a cache hit, the
requested page is returned to the user. Otherwise, the request is
forwarded to the WAS in the closest data center.
- If the request is non-transaction and the DB Cache has all the
required data, the request is processed and the page is generated
dynamically and returned to the user.
- Otherwise, the request is forwarded to the master database at the origin site.
- Changes to the database contents are periodically propagated from the master database at the origin Web site to the DB Cache at the data center for synchronization and invalidation. In our implementation, the database update log is scanned every second and the new log entries are copied to the data center.
- The invalidator reads the unprocessed database update log and performs the invalidation checking and synchronization tasks. These tasks are done as an invalidation/synchronization cycle; after one cycle completes, the invalidator starts the next cycle immediately. Since log scanning and invalidation/synchronization are performed in parallel, the invalidator does not need to wait for the completion of log scanning. We visualize the relationship among log scanning, edge cache invalidation, and DB Cache synchronization in Figure 2, and sketch this pipeline after the list.
- The dynamic content pages generated are cached in the edge
caches.
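The following is a minimal sketch of this pipeline, under our own simplifying assumptions (a single replicated table, edge-cached pages keyed by a simple maker predicate, and Python stand-ins for the actual components); none of the names below are CachePortal APIs.

import queue
import threading
import time

update_log: queue.Queue = queue.Queue()        # entries shipped from the master
db_cache = {"Car": []}                         # mirror of the replicated table
cached_pages = {"/cars?maker=Honda": "Honda"}  # edge-cached URL -> predicate

def scan_log() -> None:
    """Ship new master-database update log entries every second."""
    for maker in ["Acura", "Honda", "Lexus"]:  # stand-in for the real log
        update_log.put({"table": "Car", "maker": maker})
        time.sleep(1)

def run_cycle() -> None:
    """One invalidation/synchronization cycle over the unprocessed log."""
    updates = []
    while not update_log.empty():              # drain unprocessed entries
        updates.append(update_log.get())
    for u in updates:                          # synchronize the DB Cache
        db_cache[u["table"]].append(u)
    stale = [url for url, maker in cached_pages.items()
             if any(u["maker"] == maker for u in updates)]
    for url in stale:                          # invalidate edge-cache pages
        print("invalidating", url)
        cached_pages.pop(url)

threading.Thread(target=scan_log, daemon=True).start()
for _ in range(3):                             # back-to-back cycles
    run_cycle()
    time.sleep(1.5)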
As shown in Figure 1, a data
center may deploy multiple edge caches depending on the user
request rates. Dynamic content pages in the edge caches may be
generated by the database content in the DB Cache or the master
database. The IP addresses of edge caches and URL strings of cached
pages are tracked by the invalidator so that invalidation can be applied when needed.
The parameters that have an impact on the resulting freshness of delivered contents are as follows:
- Response time at edge cache servers: This is the round trip time for requests that are served by an edge cache server (as a result of a cache hit). The response time from edge caches is expected to be extremely fast.
- Response time at the data center: This is the round trip time for requests that are served by a data center (as a result of an edge cache miss and a DB Cache hit). The response time from the data center is expected to be fast, but is impacted by the request rate at the data center. The network latency between the end users and the data center has limited impact on the response time, since that latency is low.
Figure 2: Visualization of the Relationship Among Log Scanning, Edge Cache Invalidation, and DB Cache Synchronization
- Response time at origin Web sites: This is the round trip time for requests that are served at the origin Web servers (as a result of an edge cache miss and a DB Cache miss).
- Invalidation time: This is the time required to process invalidation checks for all the pages in the edge caches.
- Synchronization time: This is the time required to propagate database updates from the master database log to the DB Cache in the data center.
- Invalidation cycle: This is the time required to process invalidation checks for all the pages in the cache servers plus the time required to propagate database updates. Note that the invalidation process requires the database update log and the synchronization of the master database and the DB Cache.
- Synchronization cycle: The synchronization time itself could be shorter than the invalidation time; however, the synchronization cycle is the invalidation time plus the synchronization time, since synchronization and invalidation are interleaved. Because the invalidation time and synchronization time vary very little from one cycle to the next, the lengths of the synchronization cycle and the invalidation cycle are almost the same (as shown in Figure 2), even though the invalidation time and the synchronization time themselves differ.
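In other words, writing T_inv for the invalidation time and T_sync for the synchronization time (notation ours), the two cycle lengths coincide:

\[ T_{\text{inv-cycle}} \;=\; T_{\text{sync-cycle}} \;=\; T_{\text{inv}} + T_{\text{sync}} \]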
A data center architecture with deployment of CachePortal has the
following advantages:
- serving cached dynamic content pages is much faster than
generating pages on demand;
- edge caches are deployed close to the end users; consequently,
the network latency between end users and data centers is
eliminated;
- since the bulk of the load is distributed to the edge caches, the WAS and the DB Cache at the data center have lighter loads and thus can generate requested pages faster; and
- freshness of pages cached in the edge caches and those
generated at the data center on demand is assured to be not older
than the invalidation/synchronization cycle.
Since the freshness that can be provided depends on the length of the invalidation/synchronization cycle, this parameter has to be carefully adjusted. In the next section, we discuss this challenge in greater detail.
Issues in Content Freshness
Figure 3: Response Time and Freshness of Delivered Content by Various Existing System Architectures
In Figure 3, we illustrate
response time and content freshness of three system architectures.
For Web sites that do not deploy any dynamic content caching
solution (Figure 3(a)), all pages
need to be dynamically generated at the origin Web site. The
content freshness such a system can assure is the response time at
the Web sites. For example, if the average response time at a Web site is 10 seconds, then even if every Web page is generated based on up-to-date information in the database, the assured content freshness (i.e., age) of this Web site is 10 seconds, since the database content may change after the Web page is generated. For data center-hosted applications, as described in Section 2, response time can be improved by caching database content at the data centers (benefiting from lower network latency). However, in this case, the database content must be synchronized in a timely manner so that pages are not generated based on outdated DB Cache content. The content freshness assured by this system architecture, then, is the maximum value among (1) the DB Cache synchronization cycle, (2) the response time at the data center, and (3) the response time at the master database. For example, suppose a DB Cache is set to be synchronized every 30 seconds (Figure 3(b)) and contents are served at both the data center and the origin Web site. Although the response times at the data center and the origin Web site could be as fast as less than 1 second and a few seconds, respectively, the DB Cache content may be out of synchronization for as long as 30 seconds. Therefore, the assured freshness of the delivered content is 30 seconds, even though the average response time at the data center could be as low as a few seconds. For a data center that deploys dynamic
content caching solutions, as shown in Figure 1, requested pages may be delivered
from the edge cache servers; or dynamically generated at the data
center or the origin Web site. The dynamic pages in the edge caches
need to be invalidated and the content in the DB Cache needs to be
synchronized periodically to ensure the freshness of delivered
content. In this architecture, the content freshness that can be
assured is the maximum of (1) the response time from
origin Web sites; (2) the response time from edge caches; (3) the
response time at the data center; (4) the edge cache invalidation
cycle; and (5) the DB Cache synchronization cycle. Note that since
the response time from the edge caches and data centers is much
lower than the response time from the origin Web site, the
invalidation cycles for the edge cache content, and the
synchronization cycle of DB Cache, the content freshness that can
be assured is the maximum of (1) the response time at origin Web
sites; (2) the invalidation cycle of edge cache content; and (3)
the synchronization cycle of DB Cache content. This is illustrated
in Figure 3(c). As we can see in Figure 3, the system architectures of a typical Web site and of data center-hosted Web applications are not suitable for applications that require assurance on the response time, and they make it difficult to estimate the TTL (Time To Live) of the delivered content.
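Under the reasoning above, the freshness F that each architecture can assure is summarized by the following expressions (notation ours; RT denotes the round trip response time):

\[
\begin{aligned}
F_{\text{no caching}} &= RT_{\text{origin}} \\
F_{\text{data center}} &= \max(T_{\text{sync-cycle}},\; RT_{\text{dc}},\; RT_{\text{origin}}) \\
F_{\text{edge caching}} &= \max(RT_{\text{origin}},\; T_{\text{inv-cycle}},\; T_{\text{sync-cycle}})
\end{aligned}
\]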
Solution Overview
Figure 4: Response Time and Freshness of Delivered Content with Deployment of Freshness-driven Adaptive Caching
In Figure 3(c), we show a system configuration that caches only a few pages and little database content. As a consequence, it is expected to have a short invalidation cycle and a short DB Cache synchronization time, at the expense of slow response times at the origin Web site when there is a cache miss at the edge caches and the DB Cache. On the other hand, the system configuration in
Figure 4(a) has a different
caching policy that caches a large number of Web pages at edge
caches and/or more database content at the DB Cache. This
configuration provides fast response time: when there is a cache
hit, the response time is naturally very fast; when there is a
cache miss, the response is still reasonable as the Web site has
lighter work load since most of the requests are served directly
from the edge cache servers or DB Caches. However, this configuration and its caching policy have potentially low content freshness, since it takes a longer time to complete the necessary invalidation checks for all the pages in the edge caches and to synchronize the DB Caches in the data centers. In this paper, we
propose a freshness-driven adaptive dynamic content
caching technique. The proposed technique aims at maintaining
the best content freshness that a system configuration can support
and assure. Our technique does not blindly attempt to maintain the lowest average response time or the best average content freshness. Instead, it dynamically tunes the caching policy: the hit rate is tuned to a point where the response time from the origin site and the invalidation cycle for the cached pages are equivalent, as shown in Figure 4(b). At this point, the freshness
of requested pages can be ensured to be at the highest level. For
the configuration in Figure 4(b),
the content freshness that can be assured is equal to the length of
the edge cache invalidation cycle (or the DB Cache
synchronization cycle). Given that the response time at the origin
Web site is lower, the proposed freshness-driven adaptive caching
would tune its caching policy by decreasing the number of cached
pages and database content so that the edge cache invalidation
cycle and DB Cache synchronization cycle get shorter accordingly.
The number of cached pages and database content are decreased until
the response time at the origin Web site is close to the
invalidation cycle (and synchronization cycle) as shown in Figure
4(b). At this equilibrium point,
the assured freshness of delivered content is optimal. In addition,
the response time at the Web site is also assured. Thus, in Figure
4(b), the freshness of dynamic
content and response time for the Web site are assured to be less
than 3 seconds. The equilibrium point in Figure 4(b), however, may change when the
values of the influential system parameters change. If network
latency, database update rates, and user request rates increase,
the response time at the origin Web site will increase accordingly
(Figure 4(c)). To reduce the
response time at the origin Web site, the freshness-driven adaptive
caching will increase the number of cached pages at the edge cache
and/or database content at the DB Cache. Consequently, the request rate at the origin Web site would be reduced (and so would the response time), at the expense of longer invalidation and synchronization cycles. Note that the equilibrium point in Figure 4(d) is higher than the equilibrium
point in Figure 4(b). The
proposed freshness-driven adaptive caching has various advantages.
First, it yields and assures the best content freshness among these
system architectures. Second, it also yields fast response times.
Third, it provides assurance for both freshness and response time
as follows:
- the system does not serve the content that is older than the
assured freshness; and
- the system does not serve the content slower than the
assured response time.
Invalidation Checking Process
In this section, we describe the invalidation schemes used in the
proposed system architecture. Assume that the database has the
following two tables: Car(maker, model, price) and
Mileage(model, EPA). Say that the following query,
Query1, has been issued to produce a Web page,
URL1:
select Car.maker, Car.model, Car.price, Mileage.EPA
from Car, Mileage
where Car.maker = "Toyota" and
Car.model = Mileage.model;
Now a new tuple (Toyota, Avalon, $25000) is inserted into the table Car. Since Query1 accesses two tables, we first check if the new tuple value can satisfy the condition associated with only the table Car stated in Query1. If it does not, we do not need to test the other condition: we know the new insert operation does not impact the query result of Query1, and consequently URL1 does not need to be invalidated or refreshed. If the newly inserted tuple
does satisfy the condition associated with the table
Car, we cannot determine whether or not the query
result of Query1 has been impacted unless we
check the rest of the condition associated with the table
Mileage. To check whether or not the condition
Car.model = Mileage.model can be satisfied, we need to access the table
Mileage. To check this condition, we need to issue
the following query, Query2, to the database:
select Mileage.model, Mileage.EPA
from Mileage
where "Avalon" = Mileage.model;
If the result of Query2 is non-empty, the query result for Query1 needs to be invalidated. The queries, such as Query2, that are issued to determine whether certain query results need to be invalidated are referred to as polling queries. Now assume that we observe the following two queries, Query3 and Query4, in the URL/database query mapping used to generate user-requested pages:
select maker, model, price
from Car where maker = "Honda";
select maker, model, price
from Car where maker = "Ford";
From these, we can derive a query type, Query_Type1, as:
select maker, model, price
from Car
where maker = $var;
Thus, multiple query instances can have the same bound query type, and multiple bound query types may have the same query type. We can create a temporary table Query_Type to represent the above two query instances as follows:
QUERY_ID QUERY_INSTANCE
---------- ------------------
Query3 Honda
Query4 Ford
---------- ------------------
Let us also assume that the following three tuples are inserted into the database:
(Acura, TL, $30000)
(Honda, Accord, $20000)
(Lexus, LS430, $54000)
A temporary table Delta is created to represent the above three tuples. We can consolidate a number of invalidation checks into a more compact form through transformation. For example, a single polling query, Query5, can be issued as follows:
select Query_Type.QUERY_ID
from Query_Type, Delta
where Delta.maker = Query_Type.QUERY_INSTANCE;
Query_Type.QUERY_ID gives the list of query results that need to be invalidated. In this example, Query3 will be invalidated. For details of the invalidation scheme, please see [6]. We summarize the important
characteristics of the consolidated invalidation checking schemes
as follows: (1) the invalidation cycle is mainly impacted by the
number of cached query types since it determines the number of
polling queries executed for invalidation checking; and (2)
database update rates and the number of query instances per query type have relatively lower impact on the invalidation cycle, since they only increase the query processing cost.
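The consolidated check can be reproduced end to end with a few lines of code. The sketch below uses an in-memory SQLite database purely for illustration (the choice of SQLite and the Python driver are ours; the scheme itself is DBMS-independent):

import sqlite3

# Rebuild the Query_Type and Delta tables from the example above and run
# the consolidated polling query (Query5) against them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Query_Type (QUERY_ID TEXT, QUERY_INSTANCE TEXT);
    INSERT INTO Query_Type VALUES ('Query3', 'Honda'), ('Query4', 'Ford');

    CREATE TABLE Delta (maker TEXT, model TEXT, price INTEGER);
    INSERT INTO Delta VALUES ('Acura', 'TL', 30000),
                             ('Honda', 'Accord', 20000),
                             ('Lexus', 'LS430', 54000);
""")

# One polling query checks every instance of the query type at once.
stale = conn.execute("""
    SELECT Query_Type.QUERY_ID
    FROM Query_Type, Delta
    WHERE Delta.maker = Query_Type.QUERY_INSTANCE
""").fetchall()

print(stale)   # [('Query3',)] -- only the page for 'Honda' is invalidated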
Dependency between Response Time and Invalidation Cycle
In this section, we examine the dependency between the response
time and the invalidation cycle length. We have conducted
experiments to verify this dependency. We first describe the
general experiment setup that consists of Web servers, application
servers, DBMS, and network infrastructure that are used in the
experiments.
We used two heterogeneous networks that are available at NEC's facility in Cupertino, California: one is used by the C&C
Research Laboratories (referred to as CCRL) and the other one is
used by cacheportal.com (referred to as CP). Users, edge
cache servers, Web server, application server, and DB Caches are
located in the CP network while the master databases are located in
the CCRL network. The average round trip time on the CCRL-CP
connections is around 250 ms while the round trip time within the
same network is negligible. To summarize, connectivity within the
same network is substantially better than that across the Internet
and there is notable network latency. BEA WebLogic 7.0 is used for
the WAS. Oracle 9i is used as the DBMS. The database contains 7 tables with 1,000,000 rows each. The database update rate is 600 rows per table per minute. Squid servers, modified for our purposes, are used as edge cache servers. The maximum number of pages that can be cached is 1,000,000. These pages are generated by queries that can be categorized into 1,000 query types; thus, on average, there are 1,000 pages (query instances) per query type. Among these 1,000 query types, 200 query types are non-cacheable (i.e., queries involving transactions, privacy, or security). All servers run on dedicated machines: single-CPU Pentium III 700 MHz PCs with 1 GB of memory, running Red Hat Linux 7.2.
Correlation between Number of Cached Query Types and Edge Cache
Hit Rates
The problem of cache replacement has been extensively studied. Many algorithms have been proposed for general-purpose caching, such as LRU and LFU, and some variations of these are designed specifically for cache replacement of Web pages. However, in the scope of dynamic caching for a Web site, the cache invalidation rate is an important factor, since a high invalidation rate will lead to a potentially high cache miss rate in the future. Another consideration is the overhead of the invalidation checking process: a well-tuned cache manager should cache only the small portion of pages needed to serve most of the requests.
Figure 5: Correlation between Number of Cached Query Types and Load Distribution
We have developed a cache replacement algorithm that takes into
consideration (1) user access patterns, (2) page invalidation
pattern, (3) temporal locality of the requests, and (4) response
time. Consequently, we are able to select only a small number of
query types (i.e., pages generated by these query types) to cache,
but maintain a high hit rate at edge caches. Figure 5 shows the correlation between the
number of selected cached query types and the request
distribution percentages in the edge caches, data centers, and
origin Web site. As we can see in the figure, when we choose to
cache 200 query types (25% of all cacheable query types), the cache hit rate
is close to 48%. However, when we increase cached query types from
500 to 800, the cache hit rate can only be improved by an
additional 5%. Note that user requests are distributed among edge
caches, data centers, and the origin Web site. Thus, when there is
a high edge cache hit rate, the load at both the data center and
the master database is reduced. The figure shows that approximately 75% of the user requests that miss the edge caches can be handled by the data center, while the master database handles the remaining 25% of the requests, including requests due to edge cache misses and data center misses, as well as all transactions. This 75%:25% ratio is independent of the edge cache hit rate.
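The exact scoring function of our replacement algorithm is not the focus of this paper; the sketch below shows one plausible way to combine the four factors above into a per-query-type benefit score (the weights, names, and numbers are our own illustrative choices, not the deployed algorithm):

from dataclasses import dataclass

@dataclass
class QueryTypeStats:
    name: str
    request_rate: float       # accesses per second (user access pattern)
    invalidation_rate: float  # invalidations per second (update pattern)
    recency: float            # temporal-locality score in [0, 1]
    gen_time: float           # seconds saved per edge-cache hit

def caching_benefit(qt: QueryTypeStats) -> float:
    """Favor query types that are requested often, invalidated rarely,
    recently hot, and expensive to generate on demand."""
    return qt.request_rate * qt.gen_time * qt.recency / (1.0 + qt.invalidation_rate)

def select_query_types(stats: list, budget: int) -> list:
    """Cache the `budget` most beneficial query types."""
    return sorted(stats, key=caching_benefit, reverse=True)[:budget]

# A hot, stable catalog page wins over a volatile inventory page.
stats = [QueryTypeStats("browse-by-maker", 50.0, 0.1, 0.9, 0.8),
         QueryTypeStats("inventory-count", 30.0, 5.0, 0.5, 0.6)]
print([qt.name for qt in select_query_types(stats, 1)])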
Effects of Request Rates and Cache Hit Rates on Request Response
Time
Figure 6: Effects of Request Rates and Number of Cached Query Types on Request Response Time at the Data Center
Figure 7: Effects of Request Rates and Number of Cached Query Types on Response Time at the Origin Web Site
The next experiment we conducted is to measure the effects of cache
hit rates (i.e., number of cached query types) and request rates on
the response time. Note that we do not present the effects of cache hit rates and request rates on the response time at the edge cache server, since the response time at the edge cache server is uniformly fast. In
Figures 6 and 7, we plot the response time at
the data center and the origin Web site with respect to various
request rates (i.e., from 20 to 120 requests per second) and
numbers of cached query types (i.e., from 100 to 500). The
experimental results indicate the following:
- the origin Web site is much more sensitive to the request rate than the data center, due to the network latency between the users and the origin Web site: the master database needs to keep more connections open for a longer period, and many requests are queued waiting for processing;
- the response time at the origin Web site is higher than that at
the data center. This is also due to the network latency;
- when the request rate reaches a certain threshold, the response
time increases sharply. We observe this effect in both figures.
This is because the requests start to accumulate after the request
rate reaches a certain level; and
- the response time is reduced when the number of cached query
types is increased. We observe such effects in both figures (at the
data center and the origin Web site).
Effects of Network Latency on Request Response Time at the Origin
Web Site
Figure 8: Effects of Request Rates and Network Latency on Request Response Time at the Origin Web Site
In earlier experiments, the network round trip latency between
users and the origin Web site was set around 250 ms. In this
experiment, to measure the effects of network latency, we fixed the
edge cache hit rate (i.e., the number of cached query types) but varied the request rate and the network latency between users and the origin Web site. The round trip time is set to 250 ms, 500
ms, 750 ms, 1000 ms, and 1250 ms. In Figure 8, we plot the response time at the
origin Web site with respect to a set of different user request
rates and network latency settings. The experimental results in the
figure indicate that request response time at the origin Web site
is very sensitive to the request rate and network latency when the
user request rates reach a certain level. When the request rate and
network latency reach a certain threshold, the response time
increases sharply. Again, this is because request accumulation
frequently occurs while the system experiences heavy request load
and long network latency. We also observe that the response time
increases at a faster rate with longer network latency; as we can
see from the figure, the slope of the plot for the network latency
of 1250 ms is much steeper than that for the network latency of 250
ms.
Freshness-driven Caching
In this section, we describe the proposed freshness-driven adaptive dynamic content caching technique, followed by a discussion and analysis of the experimental results.
To achieve the best assured dynamic content freshness, we need to
tune the caching policy so that the response time is close to the
invalidation cycle, at an equilibrium point. However, the response
time and invalidation cycle can only be improved at the expense of
each other. In Section 7, we found
that response time can be impacted by (1) network latency, (2) user
request rates, (3) database update rates, (4) invalidation cycle
(frequency), and (5) edge cache hit rates. Network latency, request
rates, and database update rates depend on the network
infrastructure and application characteristics and hence cannot be
controlled. However, we can tune the edge cache hit rate (to a
certain degree) by adjusting the number of cached query types, and consequently affect the response time. Figure 9 summarizes the effect of the
number of query types on response time, invalidation cycle, and
synchronization cycle. In this configuration, the database update
rate is 120 updates per second and user request rate is 105
requests per second. We vary the number of cached query types among 50, 100, 200, 300, 400, and 500. We plot the response time
at the edge cache, data center, and origin Web site. We also plot
the length of the invalidation and data center synchronization
cycle (note that the invalidation cycle and data center
synchronization cycle are identical since invalidation and
synchronization are performed together). When the number of cached
query types decreases from 500 to 100, the
invalidation/synchronization cycle and response time at the origin
Web site move to an equilibrium point. At the equilibrium point
(i.e., QT=100), the assured freshness is higher than the response times at the edge cache and the data center. Since the response times at the edge cache and data center are much lower than the response time at the origin Web site, the cycle length at this equilibrium point is the best freshness we can assure. This value will increase when the network latency between users and the origin Web site or the user request rate increases, and vice versa.
Figure 9: Effects of the Number of Cached Query Types on Response Time and Invalidation/Synchronization Cycle
We derive the following adaptive caching policy for maintaining
request response time and invalidation cycle close to an
equilibrium point:
- If the response time at the origin Web site is larger than the
length of the invalidation cycle, the response time can be reduced
by increasing the number of cached query types until the request
response time and invalidation cycle reach an equilibrium point.
Note that when we increase the number of cached query types, the
edge cache hit rate will increase. As a result, the request rates
at the data center and the origin Web site are reduced.
Consequently, the response time at both the data center and the
origin Web site is improved.
- Similarly, if the invalidation cycle is longer than the request response time, we can shorten the invalidation cycle by decreasing the number of cached query types until the request response time and invalidation cycle reach an equilibrium point.
In the current implementation, the adaptive caching policy is deployed at the edge server. The response time is measured at the cache server, assuming that the round trip time between users and the cache server (which functions as a user-side proxy) is negligible. A minimal sketch of this control loop follows.
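As an illustration, the control loop can be sketched as follows (the step size, tolerance, and function names are our own assumptions, not the deployed implementation):

STEP = 50          # query types added or removed per adjustment
TOLERANCE = 0.10   # times within 10% of each other count as balanced

def adjust_cached_query_types(num_cached: int,
                              origin_response_time: float,
                              invalidation_cycle: float,
                              max_types: int = 800) -> int:
    """Move the caching policy toward the equilibrium point where the
    origin response time equals the invalidation cycle length."""
    if origin_response_time > invalidation_cycle * (1 + TOLERANCE):
        # The origin is the bottleneck: cache more query types to raise
        # the edge hit rate and shed load from the origin Web site.
        return min(num_cached + STEP, max_types)
    if invalidation_cycle > origin_response_time * (1 + TOLERANCE):
        # Invalidation is the bottleneck: cache fewer query types so
        # fewer polling queries run and the cycle shrinks.
        return max(num_cached - STEP, 0)
    return num_cached  # at (or near) the equilibrium point

# Example: a 6-second origin response time against a 2-second cycle
# grows the cached set from 300 to 350 query types.
print(adjust_cached_query_types(300, 6.0, 2.0))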
Figure 10: Impact of Adaptive Caching on Content Freshness
Figure 11: Comparison of Assured Content Freshness for Four System Configurations
We conducted a series of experiments to evaluate the proposed
freshness-driven adaptive dynamic content caching technique. In
these experiments, we created setups by varying request rates and
database update rates (every 10 minutes) for the first 60 minutes
of the experiment. During the first 60 minutes, the network latency is stable at an average delay of 500 ms. After 60 minutes, the database
update rate and request rate are fixed while the network latency is
varied. We observed the effects of our freshness-driven adaptive
caching technique on maintaining the best freshness that can be
assured for a given system configuration and setup. In Figure 10, we plot the response time at
the origin Web site as a dotted line and invalidation cycle as a
solid line. The numbers next to the dotted line are the number of
cached query types. In this figure, we do not plot the response
time at the edge cache and the data center since the response times
are very low. The response times measured are the response time at
the origin Web site. A sudden change in the request rate and
network latency will cause temporary imbalance of response time (at
the origin Web site) and invalidation/synchronization cycle (at the
edge caches and the data center). As the imbalance is detected, the
freshness-driven adaptive caching technique makes necessary
adjustments to the number of cached query types, and this impacts
the cache hit rate and the response time. As time elapses, the
response time and invalidation cycle shift to a new equilibrium
point which supports the best content freshness that can be
assured. For example, when the database update rate suddenly changes from 40 updates per second to 120 updates per second at the 20th minute of the experiment, the number of cached query types is decreased to 100, where a new equilibrium point is reached; this compensates for the sharp increase in the invalidation/synchronization cycle caused by the additional 80 updates per second. The request response time
at the origin site can be adjusted fairly quickly since it is very
sensitive to the cache hit rate. We also observe in all our experiments that when we change the number of cached query types to compensate for a temporary imbalance of response time and invalidation cycle, both the response time at the origin Web site and the invalidation cycle move toward a new equilibrium point at the same time. The
observations in the experiments are consistent with our
experimental results in Section 7.
Next, we compare the content freshness that can be assured by four
system configurations as follows:
- a data center that does not deploy any dynamic content caching
solution;
- a data center with dynamic content caching, with the number of cached query types preset to 200;
- a data center with dynamic content caching, with the number of cached query types preset to 400; and
- a data center that deploys dynamic content caching and the
proposed freshness-driven adaptive caching technique.
In Figure 11 we plot the larger
of the response time and the invalidation/synchronization cycle
length at data centers of the four system configurations. This
value gives the content freshness that a given system configuration
can support for given request and database update rates. In the
middle of each period in the figure, we indicate the ranking of the four configurations in terms of the freshness they assure (the lower the better). The figure shows the huge benefit provided by deploying CachePortal. The three configurations with CachePortal (i.e., QT200, QT400, and Dyna) consistently provide much fresher content than the typical data center configuration. In particular, during the heavy traffic conditions in the periods of 10-20, 30-40, and 50-60 minutes and the long network latency condition in the period of 70-80 minutes, the system configuration with the freshness-driven adaptive caching supports content freshness up to 15 times better than that of the typical data center. As we can
see, at some times (i.e., the periods of 10-20, 30-40, 50-60, and 70-80 minutes) configuration 2 (QT200) performs better than configuration 3 (QT400), and the opposite is true at other times (i.e., the periods of 0-10, 20-30, and 40-50 minutes). However, the
adaptive caching can consistently tune the caching policy to
provide the best assured freshness feasible for a specific set of
conditions and system configuration. In our experiments, the
adaptive caching supports content freshness up to 10 times better
than even those systems that already deploy dynamic content caching
solutions. The experiments strongly show the effectiveness and
benefits of the proposed freshness-driven adaptive caching
technique in providing fresh dynamic content.
Figure 12: Response Time of Data Center-hosted Web Applications with and without CachePortal
Figure 13: Effects of Edge Cache Hit Rates (i.e., Number of Cached Query Types) on Weighted Average Response Time
The next experiment we conducted is to measure the performance
gain (in terms of the response time observed by the users) achieved
through our proposed approach. In this experiment, the baseline
setting is used (i.e., database update rate is 600 tuples per table
per minute; the network latency is 250 ms round trip; the number of
cached query types is 300; and the request rate is 100 requests per
second). We set up two system configurations: (1) typical data
center architecture; and (2) data center with deployment of the
CachePortal and freshness-driven adaptive dynamic content caching.
For the first system configuration, the request rates distributed
to the data center and the master database (i.e., origin Web site)
are 75 requests per second and 25 requests per second,
respectively. For the second system configuration, the request
rates distributed to the edge caches, the data center, and the master database (i.e., origin Web site) are 40, 45, and 15 requests per second, respectively. The characteristics of the request distribution are described in Section 7.2 and Figure 5.
Figure 12 shows the
response times measured at the edge caches, data centers, and
origin Web sites for two system configurations. These figures also
show the weighted average response time. As we can see, with
deployment of CachePortal, we can reduce the average response time
by almost 70%. When the system caches 300 query types, the edge
cache hit rate reaches 40%. In Figure 13, we show the average response
time of the second system configuration with various edge cache hit
rates (from 20% to 60%). We observe that when the edge cache hit
rate increases from 20% to 60%, the average response time can be
reduced by 70%. This is because delivering dynamic content from the
edge caches is much faster than generating content on demand. Thus, when the edge cache hit rate increases, the weighted average response time can be reduced significantly.
Figure 14: Effects of User Request Rate on Weighted Average Response Time
Figure 15: Effects of Network Latency on Weighted Average Response Time
The next experiment we conducted is to measure the effects of
the request rate on response time for these two system
configurations. In Figure 14, the request rate is increased
from 20 requests per second to 180 requests per second while the
other parameters remain the same. As we can see, when the request
load to the system increases ninefold (from 20 to 180 requests per second), the average response time of
the first system configuration increases 3.5 times, which indicates
that the data center architecture is reasonably scalable. On the
other hand, we see that the average response time by the data
center architecture with deployment of Cacheportal (i.e., the
second system configuration) increases at a slower rate.
We also measure the effects of network latency on the average
response time. We increase the network latency between the data
center and the origin Web site (the master database) from 250 ms to
1250 ms. We observe that the second system configuration is less sensitive to the network latency, as its average response time increases at a slower rate, as shown in Figure 15. This is because the data center architecture with deployment of CachePortal serves a higher
percentage of requests from the edge than the typical data center
architecture.
Figure 16: Effects of Update Rate on Invalidation Overhead
Figure 17: Effects of Cached Query Types on Invalidation Overhead
In Section 6, we have
analytically concluded that the invalidation cycle is mainly
impacted by the number of cached query types, and that database
update rates have relatively lower impact on the length of the
invalidation cycle. We have conducted experiments to validate our
analysis.
Figures 16 and 17 show the results for the
evaluation of the scalability of our proposed solutions. We first
look at the effects of the update rate on invalidation overhead. As
shown in Figure 16, the
update rate has an impact on the computation time of the polling queries used in invalidation, as well as on the synchronization time. Since the polling queries are submitted to the same machine and appropriate indexing schemes are used, the impact of an increase in the update rate is limited.
We also evaluate the effects of the number of cached query types
on the invalidation overhead. The number of cached query types
determines the number of polling queries that need to be executed;
however, the computation cost of each polling query remains the
same. Since the polling queries are submitted to the same machine,
the impact of the increase in cached query types is limited,
although the effect of the number of cached query types is more
notable than the effect of database update rate. This is shown in
Figure 17. We can see that
as we increase the number of cached query types, the reduction in response time is much larger than the increase in invalidation overhead.
Related Work
Applying caching solutions for Web applications and content distribution has received a lot of attention in the Web and database communities [7,8,9,10,11,12,3]. These works provide various solutions to accelerate content delivery as well as techniques to assure the freshness of the cached pages. Note that since Web content is delivered through the Internet, content freshness can only be assured rather than guaranteed.
WebCQ [13] is one of the
earliest prototype systems for detecting and delivering information
changes on the Web. However, the change detection is limited to
ordinary Web pages. Yagoub et al. [14] have proposed caching
strategies for data intensive Web sites. Their approach uses
materialization to eliminate dynamic generation of pages but does
not address the issue of view invalidation when the underlying data
is updated. Labrinidis and Roussopoulos [15] present an innovative approach to enable dynamic content caching by maintaining static mappings between database contents and Web pages, which therefore requires modifications to the underlying Web applications.
Dynamai [4] from Persistence Software is one of the first dynamic caching solutions available as a product. However, Dynamai relies on proprietary software for both the database and application server components; thus it cannot be easily incorporated into existing e-commerce frameworks. Levy et al. [5] at IBM Research have developed a scalable and highly available system for serving dynamic data over the Web. The IBM system was used at the 2000 Olympics to post sports event results on the Web in a timely manner. This system utilizes database triggers to generate update events and relies intimately on the semantics of the application to map database update events to the appropriate Web pages. Other related efforts include Heddaya and Mirdad [16], who propose a diffusion-based caching protocol that achieves load balancing; [17], which uses meta-information in the cache hierarchy to improve the hit ratio of the caches; [18], which evaluates the performance of traditional cache hierarchies and provides design principles for scalable cache systems; and [19], which highlights the fact that static client-to-server assignment may not perform well compared to dynamic server assignment or selection. SPREAD [20], a system for automated content distribution, is an architecture that uses a hybrid of client validation, server invalidation, and replication to maintain consistency across servers. Note
that the work in [20] focuses on
static content and describes techniques to synchronize static
content, which gets updated periodically, across Web servers.
Therefore, in a sense, the invalidation messages travel
horizontally across Web servers.
Concluding Remarks
In this paper, we propose a freshness-driven adaptive dynamic
content caching technique that maintains the best content
freshness that an application-hosting data center configuration can
support. The technique monitors the response time and the length of
the invalidation cycle and dynamically adjusts the caching policy
accordingly. By balancing invalidation cycle length and response time, our technique is able to maintain the best content freshness
that can be assured. The experiments validate the effectiveness of
our technique. Under heavy traffic, the freshness-driven adaptive
caching supports content freshness up to 20 times better than those
data center-hosted applications without dynamic content caching. It
also supports content freshness up to 7 times better than those
data center-hosted applications that deploy dynamic content caching
solutions. Furthermore, our approach also provides faster response
times and better scalability.
References
[1] Akamai Technology. http://www.akamai.com/.
[2] B. Krishnamurthy and C. E. Wills. Analyzing Factors that Influence End-to-End Web Performance. In Proceedings of the 9th World-Wide Web Conference, pages 17-32, Amsterdam, The Netherlands, May 2000.
[3] K. Selçuk Candan, Divyakant Agrawal, Wen-Syan Li, Oliver Po, and Wang-Pin Hsiung. View Invalidation for Dynamic Content Caching in Multitiered Architectures. In Proceedings of the 28th Very Large Data Bases Conference, Hong Kong, China, August 2002.
[4] Persistence Software, Inc. http://www.dynamai.com/.
[5] Eric Levy, Arun Iyengar, Junehwa Song, and Daniel Dias. Design and Performance of a Web Server Accelerator. In Proceedings of IEEE INFOCOM'99, New York, New York, March 1999.
[6] Wen-Syan Li, Wang-Pin Hsiung, Dmitri V. Kalashnikov, Radu Sion, Oliver Po, Divyakant Agrawal, and K. Selçuk Candan. Issues and Evaluations of Caching Solutions for Web Application Acceleration. In Proceedings of the 28th Very Large Data Bases Conference, Hong Kong, China, August 2002.
[7] Ben Smith, Anurag Acharya, Tao Yang, and Huican Zhu. Exploiting Result Equivalence in Caching Dynamic Web Content. In Proceedings of the USENIX Symposium on Internet Technologies and Systems, 1999.
[8] P. Deolasee, A. Katkar, A. Panchbudhe, K. Ramamritham, and P. Shenoy. Adaptive Push-Pull: Dissemination of Dynamic Web Data. In Proceedings of the 10th World-Wide Web Conference, Hong Kong, China, May 2001.
[9] C. Mohan. Caching Technologies for Web Applications. In Proceedings of the 2001 VLDB Conference, Roma, Italy, September 2001.
[10] Anoop Ninan, Purushottam Kulkarni, Prashant Shenoy, Krithi Ramamritham, and Renu Tewari. Cooperative Leases: Scalable Consistency Maintenance in Content Distribution Networks. In Proceedings of the 2002 World-Wide Web Conference, Honolulu, Hawaii, USA, May 2002.
[11] Anindya Datta, Kaushik Dutta, Helen M. Thomas, Debra E. VanderMeer, Suresha, and Krithi Ramamritham. Proxy-Based Acceleration of Dynamically Generated Content on the World Wide Web: An Approach and Implementation. In Proceedings of the 2002 ACM SIGMOD Conference, Madison, Wisconsin, USA, June 2002.
[12] Qiong Luo, Sailesh Krishnamurthy, C. Mohan, Hamid Pirahesh, Honguk Woo, Bruce G. Lindsay, and Jeffrey F. Naughton. Middle-tier Database Caching for e-Business. In Proceedings of the 2002 ACM SIGMOD Conference, Madison, Wisconsin, USA, June 2002.
[13] Ling Liu, Calton Pu, and Wei Tang. WebCQ: Detecting and Delivering Information Changes on the Web. In Proceedings of the International Conference on Information and Knowledge Management, Washington, D.C., November 2000.
[14] Khaled Yagoub, Daniela Florescu, Valérie Issarny, and Patrick Valduriez. Caching Strategies for Data-Intensive Web Sites. In Proceedings of the 26th VLDB Conference, Cairo, Egypt, 2000.
[15] A. Labrinidis and N. Roussopoulos. Self-Maintaining Web Pages - An Overview. In Proceedings of the 12th Australasian Database Conference (ADC), Queensland, Australia, January/February 2001.
[16] A. Heddaya and S. Mirdad. WebWave: Globally Load Balanced Fully Distributed Caching of Hot Published Documents. In Proceedings of the 1997 IEEE International Conference on Distributed Computing Systems, 1997.
[17] M. R. Korupolu and M. Dahlin. Coordinated Placement and Replacement for Large-Scale Distributed Caches. In Proceedings of the 1999 IEEE Workshop on Internet Applications, 1999.
[18] Renu Tewari, Michael Dahlin, Harrick M. Vin, and Jonathan S. Kay. Design Considerations for Distributed Caching on the Internet. In Proceedings of the 19th International Conference on Distributed Computing Systems, 1999.
[19] R. L. Carter and M. E. Crovella. On the Network Impact of Dynamic Server Selection. Computer Networks, 31(23-24):2529-2558, 1999.
[20] P. Rodriguez and S. Sibal. SPREAD: Scalable Platform for Reliable and Efficient Automated Distribution. In Proceedings of the 9th World-Wide Web Conference, pages 33-49, Amsterdam, The Netherlands, May 2000.