
approach

We examine the WWW access log files of the NCSA Mosaic server for two days, 2-3 August 1994. Although NCSA is only one of thousands of web servers, and two days is only a brief snapshot of its usage, the high volume of requests it services makes it an ideal site for investigating caching strategies and methodologies that could benefit this server as well as many others.

The NCSA web server consists of four servers in a cluster. During this two-day period, the cluster received 837,046 requests and sent out approximately 14.3 gigabytes in total, although these requests were for only 8,949 unique documents (approximately 400 megabytes). The wide geographic diversity of query sources presents a strong case for deploying geographically distributed caching mechanisms to improve server and network efficiency.
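The scale of this redundancy can be checked with a few lines of arithmetic. The sketch below uses only the totals quoted above and assumes decimal units (1 gigabyte = 10^9 bytes, 1 megabyte = 10^6 bytes), since the unit convention is not stated; the variable names are ours.

```python
# Totals quoted in the text for 2-3 August 1994.
total_requests = 837_046
unique_documents = 8_949
bytes_sent = 14.3e9        # approx. 14.3 gigabytes (decimal units assumed)
unique_bytes = 400e6       # approx. 400 megabytes (decimal units assumed)

# Average number of requests per unique document.
requests_per_document = total_requests / unique_documents

# How many times over the unique content was shipped, by volume.
volume_ratio = bytes_sent / unique_bytes

# Average transfer size per request, in bytes.
bytes_per_request = bytes_sent / total_requests

print(f"requests per unique document: {requests_per_document:.1f}")
print(f"bytes sent / unique bytes:    {volume_ratio:.2f}")
print(f"average bytes per request:    {bytes_per_request:.0f}")
```

On these rough figures, each unique document was requested more than ninety times on average, and the cluster shipped its unique content roughly thirty-six times over by volume.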

We first aggregate the logs from all four servers and depict a time series of bandwidth and transaction demands for the cluster. We then break down the transaction load by geographic zone. We analyze the impact of caching the results of queries from the same zone, in terms of the reduction in transactions and bandwidth at the main server. We investigate multiple timeouts for cached documents, and discuss implications for cache maintainers regarding optimal timeouts and cache configurations.
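The per-zone caching analysis can be illustrated with a minimal sketch. It is not the code used for this study: it assumes one cache per geographic zone with unlimited capacity, a single fixed timeout after which a cached copy is refetched from the main server, and log records already reduced to (timestamp in seconds, zone, URL, size in bytes) tuples in time order. The function name, the record layout, the zone names, and the toy records are all hypothetical.

```python
from collections import defaultdict


def simulate_zone_caches(records, timeout):
    """Count requests and bytes that still reach the main server.

    records: time-ordered iterable of (timestamp_s, zone, url, size_bytes).
    timeout: seconds a cached copy stays valid (assumed policy).
    """
    # zone -> {url: time the document was last fetched from the main server}
    fetched_at = defaultdict(dict)
    origin_requests = 0
    origin_bytes = 0
    for timestamp, zone, url, size in records:
        last = fetched_at[zone].get(url)
        if last is not None and timestamp - last <= timeout:
            continue  # served by the zone cache, main server not contacted
        # Cache miss or expired copy: fetch from the main server and cache it.
        fetched_at[zone][url] = timestamp
        origin_requests += 1
        origin_bytes += size
    return origin_requests, origin_bytes


# Toy records for illustration only, not taken from the NCSA logs.
records = [
    (0, "us-west", "/a.html", 1000),
    (10, "us-west", "/a.html", 1000),
    (20, "europe", "/a.html", 1000),
    (4000, "us-west", "/a.html", 1000),
    (4010, "us-west", "/b.gif", 5000),
]

one_hour = simulate_zone_caches(records, timeout=3600)
one_day = simulate_zone_caches(records, timeout=86400)
print("1 hour timeout:", one_hour)
print("1 day timeout: ", one_day)
```

Running the same records through several timeout values, as in the last lines, gives the kind of comparison described above: a longer timeout lets more repeat requests be absorbed by the zone cache, at the cost of serving potentially staler documents.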

We also discuss future questions that caching inevitably poses, such as how to redirect queries initially destined for a central server to a preferred cache site. The choice of cache site may depend not only on geographic proximity, but also on the current load on nearby servers or network links. Such refinements in the web architecture will be essential to the stability of the network as the web continues to grow, and operational geographic analysis of queries to information resource servers will be fundamental to its effective evolution.





kc@
Thu Sep 15 22:53:05 PDT 1994