Utility of co-operating Web proxy caches

P. Krishnan and Binay Sugla

Network and Service Management Research, Bell Labs, Lucent Technologies,
Holmdel, NJ 07733-3030, U.S.A.

pk@research.bell-labs.com and sugla@research.bell-labs.com

Abstract
In this paper, we study the utility of co-operation among intranet proxy caches. We target our study to a corporate intranet environment with multiple Web proxies, each serving a set of hosts. Experiments with real data on our co-operation methods and architectures suggest that daily hit rate improvements due to co-operation can vary from 0.7% to 28.9%. We study the effect of community size on co-operation performance, and analyze the side-effects of co-operation, e.g., the extra intranet traffic and server load. We develop metrics to deduce information relevant to making decisions about the effectiveness of co-operation, and draw intuitions on when it makes sense for the caches to co-operate. Our work demonstrates a feasible set of metrics that proxies should measure to help in evaluating caching effectiveness and in configuring themselves.

Keywords
Proxy caching; Hierarchy; Co-operation; Intranet performance; Cache management

1. Introduction

In the context of Internet object caches, the benefits of hierarchical caching have been well documented [1]. Currently, many intranets deploy a number of stand-alone Web proxies, each servicing a set of hosts/Web browsers. Each browser has its own disk and memory cache. When an object requested by a user is not found in the browser cache, the request is sent to the proxy server. The proxy server first looks in its own cache for the object, and then contacts the Web server hosting the object, if needed. An issue that is particularly relevant in this context is whether it benefits Web proxies to co-operate. By co-operation, we mean that an object not found in the cache of one proxy (i.e., a miss in cache) is requested from another proxy within the intranet before a request is sent to the Web server hosting the object (or to the parent, in the case of a hierarchical cache system, e.g. [9]). Some intranets currently participate in hierarchical caching.

Co-operation between proxies is conceptually simple. It is based on the premise that it should be quicker and cheaper to find an object in another proxy's cache within an intranet than to access the Internet server hosting the object. However, co-operation raises important issues. Objects must be retrieved from another proxy as quickly as possible, and the methods used to co-operate and to manage this co-operation introduce complexity into the system; caches should co-operate only if there is a performance gain from doing so.

The question of how to reduce the time taken for servicing requests by a co-operating proxy has been studied by other researchers [1, 3, 4]; we do not tackle that issue here. The Harvest object cache [1] was the first system designed to operate in concert with other instances of itself. It used unicasting for achieving co-operation. Malpani et al. [4] have presented a multicasting-based technique to reduce the time it takes for communication between caching proxies. Gadde et al. [3] consider an approach of maintaining a centralized database of pages existing in various caches.

Our goal in this paper is orthogonal to the issue of getting to a page in a co-operating proxy's cache quickly. We are interested in experimentally analyzing the benefit provided by co-operation. To the best of our knowledge, there are no results that quantify the actual advantage gained by co-operation between intranet proxies. There are some simulation results presented in [3] describing benefits of shared caches. However, these results demonstrate the benefit of having proxy caches, not the effect of proxy caches co-operating with each other. Other works (e.g., in [10]) look at issues in hierarchical cache systems. From that point of view, we are evaluating intranets with Squid-like proxies [11] at the bottom of the hierarchy that co-operate to maximize benefit.

From a conceptual standpoint, performance improvements due to co-operation arise for reasons similar to those that make proxies popular in the first place. Apart from increasing the effective cache size, we can leverage what we call virtual prefetching: a user accessing object i through proxy X effectively prefetches the object for a user who will later access object i through another co-operating proxy Y.

1.1. Our work

In this paper, we address the performance benefits of co-operation between intranet caching proxies. In particular, we address the following issues:
  1. Does co-operation help? If yes, by how much? Why does it help?
  2. When should co-operation be used, and when should it not be used?
  3. What is the effect of co-operation on users and the network?
  4. What are the engineering parameters useful in making decisions about co-operation?
We present experimental results that quantify the improvement in cache hit rate (i.e., the ratio of the number of "units" served by the cache to the total number of units requested, where the units are typically bytes or requests) obtained by co-operation between intranet proxies, and the effect on proxy and network load. Our experiments are trace-driven simulations using real proxy logs. The proxy logs used in our experiments were obtained from three sites within Bell Labs. By using logs from three separate sites of the same corporation, we gain an interesting insight into the level of sharing between users in similar, but not identical, environments.
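
To make the two variants of the hit rate metric concrete, the following minimal sketch computes both the request hit rate and the byte hit rate from a stream of log entries. The (url, nbytes, was_hit) record layout is an assumption for illustration; real logs in the common logfile format would need to be parsed into this form first.

```python
# Minimal sketch (assumed record layout): compute request and byte hit
# rates from an iterable of (url, nbytes, was_hit) entries.

def hit_rates(entries):
    total_reqs = total_bytes = hit_reqs = hit_bytes = 0
    for url, nbytes, was_hit in entries:
        total_reqs += 1
        total_bytes += nbytes
        if was_hit:
            hit_reqs += 1
            hit_bytes += nbytes
    request_hit_rate = 100.0 * hit_reqs / total_reqs if total_reqs else 0.0
    byte_hit_rate = 100.0 * hit_bytes / total_bytes if total_bytes else 0.0
    return request_hit_rate, byte_hit_rate
```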

For our study, we concentrate on isolating the virtual prefetching aspect of the performance improvement and look at the effect of co-operation on each proxy separately. We observe that the byte hit rate improvement due to virtual prefetching can vary from as low as 0.7% to as high as 28.9%. We try to understand the factors that affect this improvement, and in particular, study the effect of community size. We quantify the overhead that co-operation imposes on network traffic and server load. We observe that, depending on the co-operation technique and architecture, at most about 7.5% of the extra requests created by proxies are satisfied, and server load can increase by 10–300%. This indicates that co-operation should be undertaken with care. We infer policies that proxies should follow when requesting data from other proxies or entertaining requests from other proxies. Another contribution of our work is to emphasize that proxies should monitor the correct variables and provide hooks for using these monitored variables in performance and configuration management. This is particularly important with more vendors now adopting the Internet Cache Protocol (ICP) [6] to provide inter-cache communication abilities.

In Section 2, we present the methodology for our study. We present our results in Section 3. Based on these results, in Section 4 we make inferences that will aid in deciding when and how intranet proxies should co-operate, and briefly describe our ongoing work. We conclude in Section 5.

2. Methodology

In this section we describe the assumptions we make and the characteristics of the data used.

2.1. Data

We collected proxy logs from three proxy servers within Bell Labs. These servers are referred to in this paper as B, M, and S. The proxies served different Bell Labs locations. The log files were in the common log format [8].

We considered 11 consecutive days of data from these proxy logs. A brief summary of some information about the logs is given in Table 1.

Loc.   # IP end-points   # Unique accessors   # Accesses   Cache needed
B      approx. 1900      471                  613834       2.2 Gbytes
M      approx. 1500      177                  142502       0.94 Gbytes
S      approx. 375       228                  91805        483 Mbytes

Table 1. Some parameters related to the three proxy servers and the access logs

The "#IP end-points" column specifies the number of hosts that are connected to the network served by the proxy server. (There may be no Web accesses from many of these end-points.)

The last three columns are derived from the logs. The "#Unique accessors" specifies the actual number of unique IP hosts that accessed the proxy server over the course of the 11 days. The "#Accesses" column specifies the number of URL requests received by the proxy. The "Cache needed" column specifies the amount of cache that would have accommodated all the requests for that proxy, with no replacement required in cache. That is, a cache with the space as specified in that column qualifies as an infinite cache for all practical purposes.

2.2. Experimental assumptions and technique

Given our interest in isolating the effect of virtual prefetching, combined with the observation that disk space is getting relatively cheap, we assumed an infinite cache for each location. (Notice from Table 1 that for our scenario, an infinite cache corresponds to 1–2 Gbytes.) Hence, an object that was brought into the cache was never replaced, except by a newer copy of itself. This has the advantage that effects due to the replacement policy do not dilute the results, and we are able to isolate the effect of virtual prefetching. We made certain other simplifying assumptions for our experiments. As in [7], we assumed that we had a "hit" if the URL and the number of bytes transferred matched what was in cache, and we disregarded the time required for transfer (which is not recorded in the common logfile format). It is possible that pages may change without a change in size, and our method will not detect such changes. The lack of transfer time information in the common logfile format also makes it difficult to estimate response time improvements.
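
A minimal sketch of this stand-alone, infinite-cache simulation, under the assumptions just stated: a request is a hit only if both the URL and the byte count match the cached copy, a size mismatch is treated as a changed page that refreshes the cache, and nothing is ever evicted. The (url, nbytes) record layout is assumed for illustration.

```python
# Sketch of the stand-alone infinite-cache simulation (assumed record
# layout): a hit requires both URL and size to match; nothing is evicted.

def simulate_infinite_cache(trace):
    """trace: iterable of (url, nbytes) pairs in time order."""
    cache = {}   # url -> size in bytes of the cached copy
    hits = 0
    for url, nbytes in trace:
        if cache.get(url) == nbytes:
            hits += 1             # URL and size match: count as a hit
        else:
            cache[url] = nbytes   # miss (or changed page): cache new copy
    return hits, len(cache)
```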

We studied two protocols for co-operation: symmetric and asymmetric. We say that proxy x is "bigger than" proxy y if proxy x supports more clients than proxy y; we denote this relationship by x > y. (We assume that the proxies are configured so that the number of hosts they serve is in proportion to their "capacity".) Based on Table 1, we see that B > M > S. (The symbols for the three proxies have been chosen to stand for "Big", "Medium", and "Small", respectively, for easier reference through the paper.) In a symmetric protocol, co-operation between proxy servers X and Y implies that every miss by proxy X is requested from proxy Y and vice versa. To achieve this, we parsed the logs of the co-operating proxies based on time of access, and assumed that the clocks on these machines were synchronized. We counted a hit if either of the two caches could have serviced the request. If an object was serviced from another proxy's cache, the local copy was updated. This update assumption does not affect our reported numbers for the hit rate improvements due to co-operation, but it does reduce our network load numbers. Some proxy servers [11] support the idea of not maintaining a local copy if the peer is easily accessible. With an asymmetric protocol, a miss by proxy X is requested from proxy Y only if Y > X.
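
The following hypothetical sketch captures the symmetric protocol as simulated here: the logs are merged by timestamp (clocks assumed synchronized), a local miss is tried against every peer's cache, and a peer hit also installs a local copy, per the update assumption above. The (ts, url, nbytes) record layout is an assumption; restricting peer lookups to larger proxies would give the asymmetric variant.

```python
import heapq

# Hypothetical sketch of symmetric co-operation over merged, time-ordered
# logs. logs: dict proxy_id -> list of (ts, url, nbytes), sorted by ts.

def simulate_symmetric(logs):
    caches = {p: {} for p in logs}    # per-proxy cache: url -> size
    hits = {p: 0 for p in logs}
    merged = heapq.merge(*[[(ts, p, url, nb) for ts, url, nb in log]
                           for p, log in logs.items()])
    for ts, p, url, nbytes in merged:
        if caches[p].get(url) == nbytes:
            hits[p] += 1                   # local hit
        elif any(caches[q].get(url) == nbytes for q in caches if q != p):
            hits[p] += 1                   # served from a peer's cache
            caches[p][url] = nbytes        # update assumption: local copy
        else:
            caches[p][url] = nbytes        # miss: fetch from origin server
    return hits
```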

We studied each proxy acting independently (i.e., B, M, and S, separately), all three pairs of co-operation (i.e., B+M, B+S, and M+S), and the case when all three proxies co-operated (i.e., B+M+S).

3. Results

While hit rate improvement is an important criterion, we also analyze the other issues (like extra traffic and load on the server) created due to co-operation. Section 3.1 deals with hit rate improvement and can be interpreted as the direct effect of co-operation on the end-user. Section 3.2 deals with trying to quantify the effect of co-operation on intranet load, and Section 3.3 quantifies the effect of co-operation on server load. In Section 4, we put together the observations to draw inferences on management of co-operation.

3.1. Hit rate improvement

One way to look at the problem is to consider the entire set of three proxies as one unit and measure the total hit rate for this system of three proxies with and without co-operation. Table 2 gives this information.

No co-op.   B+S     M+S     B+M     B+M+S
30.16       31.24   30.36   31.92   33.08

Table 2. Total hit rate taken over all the three proxies, assuming symmetric co-operation

However, a more informative assessment is to study the impact of co-operation on each proxy individually. The hit rate improvements for the three proxies when considering bytes transferred and number of requests are shown in Table 3.

Loc.   Co-operation with
       none    S       M       B       full
S      20.90   -       22.68   25.79   26.57
M      25.09   26.14   -       30.03   30.49
B      32.69   33.47   33.97   -       34.64
(a) Hit rates for number of bytes

Loc.   Co-operation with
       none    S       M       B       full
S      29.72   -       32.70   35.02   36.16
M      37.63   39.32   -       42.10   42.94
B      69.90   70.49   70.75   -       71.21
(b) Hit rates for number of requests

Table 3. Hit rates for the proxies in various combinations of co-operation. Full co-operation implies that a proxy co-operates with the other two proxies

The numbers reported in these tables are based on the entire 11 days of data. Recall that the hit rate for proxy p when asymmetrically co-operating with other proxies is equivalent to the hit rate obtained for p by symmetrically co-operating with just those q where q > p.

We observe that even with symmetric co-operation, the hit rate improvements are most pronounced for the smaller proxy (S), and virtually non-existent for the largest proxy (B). This is not too surprising, since the proxy serving a larger number of hosts is likely to have already seen the different URLs requested and is thus in a position to serve these pages from its local cache. It is interesting to compare the hit rate numbers without co-operation (i.e., those under the "none" column in Table 3) with comparable hit rate studies in [3, 7]. Our hit rate numbers are closer to the ones reported in [3] and lower than the ones reported in [7]. There are possibly two reasons for this. First, there seems to be a difference in locality of access between universities (as in the study in [7]) and corporations (as in the study in [3] and here). Second, access patterns have been changing over the past few years, with more dynamically generated objects in user requests; the study in [7] was performed with earlier data than ours.

Since it is well known that Web traffic varies with the day of the week and also within each day, we report hit rates at a finer granularity. In Fig. 1, we depict, for each proxy server and for each day, the hit rate computed over the requests made to that proxy server on that day, with and without co-operation.

[Fig. 1, panels (a)-(c): daily byte hit rate plots for proxies S, M, and B, respectively; plots not reproduced here. Panel (d) is tabulated below.]

Loc.   Daily hit rate improvement
       Max.    Min.    Avg.
S      28.95   2.87    7.69
M      14.15   0.84    4.94
B      2.77    0.75    1.77
(d) Hit rate improvements

Fig. 1. Daily byte hit rates for each of the three proxies for symmetric co-operation in different configurations. Day 1 was a Friday.

While the general observation still stands (the smaller proxy benefits most from co-operation), we make a couple of other interesting observations. It is well known that hit rates are better with warmer caches. In our case, since most of the significant hit rate improvements due to co-operation occur in the later days when the caches are warm, we might infer that co-operation with a warmer cache is also more beneficial. The hit rate improvement is greatest when co-operating with B, the cache with the largest set of users. The impact on S of co-operating with both B and M is virtually identical to that of co-operating with B alone. This suggests that indiscriminate co-operation is not a good policy, an issue we examine in more detail in subsequent subsections. The significant variation in the benefit from co-operation across days underlines the importance of online monitoring and management of co-operation.

3.2. Extra Intranet traffic

While hit rate is an important metric, co-operation also introduces extra traffic on the intranet. Our goal here is to get a handle on the amount of extra traffic generated for both symmetric and asymmetric co-operation, and to devise a metric that captures how much of this traffic is "useful".

We calculate the increase in traffic as the extra number of requests generated due to co-operation. By definition, the extra traffic created by proxy p is the number of misses by p when it is in stand-alone mode multiplied by the number of proxies that it queries. The extra traffic on the network is the sum over all proxies of the extra traffic created by each proxy. We define the percentage of useful traffic generated by proxy p as the ratio of the number of requests generated by p that are served by another proxy to the extra traffic created by proxy p. The percentage of useful traffic for the network is defined to be the ratio of the total increase in the number of hits (taken over all proxies) to the extra traffic on the network.
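
As an illustrative sketch, these definitions translate directly into code. All argument names below are assumptions: standalone_misses[p] is proxy p's miss count in stand-alone mode, peers_queried[p] is the number of proxies p queries on a miss, and remote_hits[p] is the number of p's co-operation requests served by another proxy (i.e., p's increase in hits).

```python
# Illustrative computation of the extra-traffic metrics defined above
# (all argument names are assumptions, not from the paper's software).

def useful_traffic(standalone_misses, peers_queried, remote_hits):
    # Extra traffic created by each proxy: misses times peers queried.
    extra = {p: standalone_misses[p] * peers_queried[p]
             for p in standalone_misses}
    # Percentage of useful traffic, per proxy.
    per_proxy = {p: 100.0 * remote_hits[p] / extra[p]
                 for p in extra if extra[p]}
    # Percentage of useful traffic for the network as a whole.
    total_extra = sum(extra.values())
    network = (100.0 * sum(remote_hits.values()) / total_extra
               if total_extra else 0.0)
    return per_proxy, network
```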

Loc.   Co-operation with
       S       M       B       full
S      -       4.24    7.55    4.59
M      2.72    -       7.17    4.26
B      1.96    2.83    -       2.18
(a) For each proxy

Protocol     B+S     M+S     B+M     B+M+S
Symmetric    3.40    3.36    4.24    3.18
Asymmetric   7.55    4.24    7.17    5.64
(b) For the network

Table 4. The useful extra traffic in different co-operation configurations, expressed as a percentage. Results are for the 11-day period.

In Table 4a we evaluate, for each proxy, the percentage of useful extra traffic generated by that proxy. To understand the table better, consider the 4.24% number corresponding to the (S, M) entry. It implies that if proxies S and M co-operate, then 4.24% of the requests created by proxy S for M (due to co-operation) result in hits. This gives a handle on whether S should co-operate with M or not. We observe that, relatively speaking, proxy B does not generate much useful co-operation traffic. However, when S and M co-operate with B, they create more useful extra traffic. The last column also suggests what we intuitively noted in Section 3.1: full co-operation in this case is not as useful as co-operating with B alone. We also observe that asymmetric co-operation is better than symmetric co-operation from the point of view of usefulness.

Table 4b illustrates, from the network's point of view, how much useful traffic is generated under both the symmetric and asymmetric protocols. We note that for the symmetric case, the numbers are uniformly low and there is no substantial difference between the various co-operating architectures. The asymmetric case is uniformly better than the symmetric case.

3.3. Effect on proxy load

From a system design point of view, we need to evaluate whether it makes sense for a proxy to co-operate rather than serve more hosts directly. There are two issues that we try to quantify here: the extra load on the server due to co-operation, and a measure of "fairness" in co-operation. Given a proxy p, we let nc denote the number of requests p serves without any co-operation, and wc denote the number of requests it serves with co-operation. (The dependence of nc and wc on proxy p is implicit.)

3.3.1. Extra load on the server

We compute the extra load on a server due to co-operation as the additional number of requests it receives in comparison to the total number of requests it receives without co-operation. That is, the extra load is (wc – nc)/nc. Table 5 gives the extra load on a server in various configurations, for symmetric co-operation.

Loc.   Configurations
       B+S       M+S       B+M       B+M+S
S      202.85%   97.58%    -         300.43%
M      -         44.92%    129.66%   174.58%
B      10.43%    -         14.48%    24.91%

Table 5. Extra load on the server using a symmetric protocol for various configurations

Notice that the number in the last column will be the sum of the numbers in the previous columns. The corresponding numbers for asymmetric co-operation are obtained from Table 5 by observing that there is no extra load on a smaller server in co-operating with a larger server.

We observe that the larger proxy B does not see an appreciable increase in its load due to co-operation. With symmetric co-operation, the smaller servers see a pronounced increase in their extra load. The numbers suggest that it makes sense to do asymmetric co-operation, if extra load were the consideration.

3.3.2. Fairness

Another issue that arises in any scenario involving sharing or distribution is "fairness". Fairness is an inherently tricky concept to quantify. Intuitively, we want to capture the idea that the extra work a server performs due to co-operation should not outweigh its gains from co-operation; in other words, we want to evaluate whether the co-operation is one-sided. This is important since fairness to a server really measures fairness to the users who access that server; users of overloaded servers feel the impact as degraded performance.

We develop a simple metric to make such an evaluation. We define ri(p) as the increase in the number of hits seen by proxy server p due to co-operation, relative to the extra number of requests it serves, i.e.,

ri(p) = Δ(number of hits for p) / (wc - nc)

The dependence of ri(p) on the set of co-operating proxies is implicit. Notice that, by definition, wc – nc excludes requests made directly to proxy p, and ri(p) can, therefore, be greater than 1.

A large value for ri(p) suggests that proxy p is involved in a "good" co-operation from its point of view. Various metrics can be defined to put together individual ri(p)'s to obtain a system-wide measure of fairness (e.g., a product function). We do not tackle system-wide fairness in this paper. Intuitively, a system-wide definition of fairness must include the relationship of the ri(p)'s to proxy p's hit rate without co-operation. Asymmetric co-operation implies that the smallest proxy in a co-operating set will have its ri value equal to infinity, and the largest proxy will have its ri value equal to 0.
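
A small sketch of the two load-related metrics from Sections 3.3.1 and 3.3.2, using the paper's nc/wc notation; delta_hits (the increase in p's hit count due to co-operation) is an assumed argument name.

```python
# Sketch of the extra-load and ri fairness metrics (argument names are
# assumptions, following the nc/wc notation in the text).

def extra_load_pct(nc, wc):
    return 100.0 * (wc - nc) / nc      # extra load, as a percentage (Table 5)

def ri(delta_hits, nc, wc):
    extra = wc - nc                    # requests received from peer proxies
    if extra == 0:
        # e.g. the smallest proxy under asymmetric co-operation serves
        # no peer requests, so its ri is infinite by convention.
        return float('inf')
    return delta_hits / extra
```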

Using the ri metric, Table 6 shows how useful it is for a server to co-operate.

Configuration   ri(p) for proxy p =
                S       M       B
B+S             0.026   -       0.057
M+S             0.031   0.038   -
B+M             -       0.034   0.059
B+M+S           0.021   0.030   0.053
(a) Symmetric co-operation

Configuration   ri(p) for proxy p =
                S       M       B
B+S             ∞       -       0
M+S             ∞       0       -
B+M             -       ∞       0
B+M+S           ∞       0.10    0
(b) Asymmetric co-operation

Table 6. Measure of fairness in co-operation.

We observe that with symmetric co-operation, the ri's are uniformly low for all servers. Not surprisingly, since the larger proxies serve more requests and encounter more misses (in absolute numbers), they can inundate a smaller proxy with requests. (This can also be inferred from the data on extra server load in Table 5.) The larger proxies, therefore, end up with larger values of ri. With asymmetric co-operation, there is no issue of fairness.

4. Observations and inferences

Our first main observation, based on the hit rate improvement numbers from Section 3.1, is that hit rates do improve with co-operation. Considering requests for a single day, the improvement can be as large as 28.9% and as low as 0.7%. The average hit rate improvements varied from 1.77% to 7.69%. This large variability suggests that a dynamic mechanism for managing the proxies and turning co-operation on and off would be useful. In turn, it implies that the proxies must be managed, and that a proxy manager should monitor appropriate information to keep track of the utility of co-operation.
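
As a hypothetical illustration of such a dynamic mechanism, the sketch below keeps co-operation with a peer enabled only while the monitored fraction of useful co-operation traffic stays above a threshold. The 2% cutoff and the window size are illustrative assumptions, not values prescribed by our study.

```python
from collections import deque

# Hypothetical on/off policy for co-operation with one peer, driven by
# the monitored percentage of useful co-operation traffic. The threshold
# and window size are illustrative assumptions.

class CooperationPolicy:
    def __init__(self, threshold_pct=2.0, window=10000):
        self.threshold_pct = threshold_pct
        self.outcomes = deque(maxlen=window)   # 1 if peer served request
        self.enabled = True

    def record(self, served_by_peer):
        self.outcomes.append(1 if served_by_peer else 0)
        if len(self.outcomes) == self.outcomes.maxlen:
            useful_pct = 100.0 * sum(self.outcomes) / len(self.outcomes)
            self.enabled = useful_pct >= self.threshold_pct
```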

The second observation is based on putting together the hit rate improvement information with the results on extra intranet traffic and load on the server. If the proxies are of different "sizes" (meaning, serving different number of hosts), an asymmetric protocol would be preferable to a symmetric protocol. At the very least, the larger proxies should be more choosy in redirecting their misses to other proxies. Proxies should also have policies on rejecting or ignoring requests from other proxies based on their monitoring of how useful they are in serving the other proxy's requests.

The third observation is that it might be unrealistic to expect co-operation to be fair. We have observed that an asymmetric protocol is inherently unfair (in the sense in which we have defined fairness in this paper), and that a symmetric protocol does not score particularly well on our measure of fairness either. We infer from the numbers in this paper, and intuitively, that when the proxies are not all equal, a symmetric protocol is more likely to drag everybody down.

The last observation is that even if co-operation can be implemented so that the response is fast, indiscriminate co-operation is not a good idea. Often, most of the benefit is obtained by co-operating with only one of many servers, and without care, the performance of the other servers will degrade unnecessarily.

Cache Management. One contribution of this work concerns the metrics that proxies (or proxy managers) should employ to keep track of their performance, with respect to adapting co-operation policies. Based on this study, our opinion is that post-processing logs to configure proxy caches is inadequate. While proxy managers can use offline analyses and the results of studies such as rate-of-change metrics [2] to set caching policies, online monitoring and management must be provided for effective use of proxies and their features. Our ongoing work on building the PROXYMAN proxy caching manager uses some of the intuitions provided by this study. For PROXYMAN, we have defined a management information base (MIB) for a proxy caching server, keeping in mind which monitored variables future applications will need to improve the hit rate of proxy caches. Our prototype builds a manager/agent pair capable of providing performance enhancements through intelligent monitoring.

The increasing importance of caching on the Web is well known [10]. We expect that in the future, proxies will have to be more adaptively configurable, incorporate new features (e.g., applet caching, host advertising services, etc.), and use techniques to reduce the cost of misses (by using delta encoding principles, for example [5]). Hit rate improvements directly impact network infrastructure costs and can be used for pricing the value of caching.

5. Conclusions

In this paper, we have analyzed the effect of co-operation between Web proxies using real-life proxy logs. We have studied symmetric and asymmetric co-operation and obtained results on hit rate improvements. We have observed that the byte hit rate improvement can vary widely (from 0.7% to 28.9%), implying the need for better management of co-operation. We have developed metrics to monitor co-operation, such as useful extra network traffic (which varies between 2% and 7.55%, depending on the co-operation method) and increased server load (which varies from 10% to 300%, depending on the proxy size and co-operation method). We have outlined what information caching proxies could monitor for effective cache management.

Acknowledgements

We would like to thank Ravi Narayan, Ralph Loura, Alex Podlecki, and Eric Grosse for help in access to the proxy logs used in this paper, and anonymous referees for valuable comments.

References

1
A. Chankhunthod, P.B. Danzig, C. Neerdaels, M.F. Schwartz, and K.J. Worrell, A hierarchical Internet object cache, in: Proceedings of the 1996 Usenix Technical Conference.

2
F. Douglis, A. Feldman, B. Krishnamurthy and J. Mogul, Rate of change and other metrics: a live study of the World Wide Web, in Proc. of the 1st Usenix Symposium on Internet Technologies and Systems (USITS), December 1997.

3
S. Gadde, M. Rabinovich, and J. Chase, Reduce, reuse, recycle: an approach to building large Internet caches, in: Proceedings of the 1997 Conference on Hot Topics in Operating Systems.

4
R. Malpani, J. Lorch and D. Berger, Making World Wide Web caching servers cooperate, in: Proc. of the 4th International World Wide Web Conference, December 1995.

5
J. Mogul, F. Douglis, A. Feldman and B. Krishnamurthy, Potential benefits of delta encoding and data compression for HTTP, in: SIGCOMM 97.

6
D. Wessels and K. Claffy, Internet Cache Protocol (ICP), Version 2, Network Working Group RFC 2186, September 1997.

7
S. Williams, M. Abrams, C.R. Stanridge, G. Abdulla, and E.A. Fox, Removal policies in network caches for World Wide Web documents, in: Proceedings of the 1996 SIGCOMM Conference.

8
Common Logfile Format, http://www.w3.org/Daemon/User/Config/Logging.html

9
NLANR: A distributed testbed for national information provisioning, http://ircache.nlanr.net

10
NLANR Web Caching Workshop, http://ircache.nlanr.net/Cache/Workshop97/

11
Squid Internet Object Cache, http://www.nlanr.net/Squid

Vitae


P. Krishnan (who is called "Krishnan" or "PK") is a Member of Technical Staff at Bell Labs, Lucent Technologies. His research interests include IP management, the development and analysis of algorithms, prefetching and caching, and mobile computing. He received his Ph.D. in Computer Science from Brown University, and his B.Tech in Computer Science from the Indian Institute of Technology, Delhi. He can be reached at pk@research.bell-labs.com.
Binay Sugla is currently Department Head, Network and Services Management Research Department at Bell Labs, Lucent Technologies. Presently, he is working on tools and techniques for IP management. His past work produced tools like the Network Flight Simulator, a real-time simulator for networks, and CAPER, a concurrent application programming environment. He can be reached at sugla@bell-labs.com.