Several recent studies have pointed out that file I/O can be a major performance bottleneck for some large Web servers, and large I/O buffer caches often do not work effectively for such servers. This paper presents a novel, light-weight, temporary file system called TFS that can effectively improve I/O performance for large servers at minimal cost, making it a much more cost-effective scheme than the full caching policy. TFS is a user-level application that manages files on a raw disk and works in conjunction with a regular file system as an I/O accelerator. Since the entire system runs in user space, it is easy and inexpensive to implement and maintain, and it is highly portable. TFS uses a novel disk storage subsystem called the Cluster-structured Storage System (CSS) to manage files. CSS uses only large disk reads and writes and does not suffer from garbage-collection problems. Comprehensive trace-driven simulation experiments show that, for large Web servers, TFS achieves up to 160% better system throughput and reduces the I/O latency per URL operation by up to 77% compared with a traditional Unix Fast File System.
Various recent studies [1] have shown that disk I/O is one of the major performance bottlenecks for large Web servers. One problem with full caching (caching all Web files in a very large RAM) is that the solution becomes very expensive at high throughput levels, because the file set size often increases linearly with the server throughput.
Another important change in large-scale Web server traffic is that client browser caches and proxy servers filter out most access locality before requests reach the server, which makes the Least Recently Used (LRU) cache management policy less effective. The Greedy-Dual algorithm developed by Cao et al. is a better policy because it takes locality, object size, and network latency/cost into account [2], but it does not address how to speed up the read-intensive I/O workload that characterizes large-scale Web servers; a sketch of the policy follows below.
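To make the comparison concrete, the following is a minimal sketch of the GreedyDual-Size idea from Cao et al. [2]: each cached object carries a value H = L + cost/size, the object with the smallest H is evicted, and the global term L is raised to that minimum so recently touched objects age gracefully. The fixed-size table and the names gd_cache and gd_access are illustrative assumptions, not part of TFS or of the original implementation.

```c
#include <stddef.h>
#include <string.h>

#define GD_SLOTS 1024

struct gd_entry {
    char   key[128];    /* URL or file name                   */
    double cost;        /* fetch cost (e.g., network latency) */
    size_t size;        /* object size in bytes               */
    double h;           /* priority H = L + cost/size         */
    int    used;
};

/* A gd_cache must be zero-initialized and given a nonzero capacity. */
struct gd_cache {
    struct gd_entry slot[GD_SLOTS];
    size_t capacity;    /* total bytes the cache may hold */
    size_t in_use;      /* bytes currently cached         */
    double L;           /* global inflation value         */
};

static int gd_find(const struct gd_cache *c, const char *key)
{
    for (int i = 0; i < GD_SLOTS; i++)
        if (c->slot[i].used && strcmp(c->slot[i].key, key) == 0)
            return i;
    return -1;
}

static void gd_evict_min(struct gd_cache *c)
{
    int victim = -1;
    for (int i = 0; i < GD_SLOTS; i++)
        if (c->slot[i].used &&
            (victim < 0 || c->slot[i].h < c->slot[victim].h))
            victim = i;
    if (victim < 0)
        return;
    c->L = c->slot[victim].h;            /* inflate L to the evicted value */
    c->in_use -= c->slot[victim].size;
    c->slot[victim].used = 0;
}

/* Record an access; returns 1 on a cache hit, 0 on a miss. */
int gd_access(struct gd_cache *c, const char *key, size_t size, double cost)
{
    int i = gd_find(c, key);
    if (i >= 0) {                        /* hit: refresh H with current L  */
        c->slot[i].h = c->L + c->slot[i].cost / (double)c->slot[i].size;
        return 1;
    }
    if (size == 0 || size > c->capacity) /* object cannot be cached at all */
        return 0;
    while (c->in_use + size > c->capacity)
        gd_evict_min(c);                 /* make room for the new object   */
    for (i = 0; i < GD_SLOTS && c->slot[i].used; i++)
        ;
    if (i == GD_SLOTS)
        return 0;                        /* slot table full; skip caching  */
    strncpy(c->slot[i].key, key, sizeof c->slot[i].key - 1);
    c->slot[i].key[sizeof c->slot[i].key - 1] = '\0';
    c->slot[i].cost = cost;
    c->slot[i].size = size;
    c->slot[i].h    = c->L + cost / (double)size;
    c->slot[i].used = 1;
    c->in_use += size;
    return 0;
}
```

The sketch captures why the policy weighs size and cost when choosing victims, but it says nothing about how a miss is actually read from disk, which is exactly the gap TFS targets.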
Other recent techniques, such as content distribution networks (CDNs) and reverse proxy servers, aim to improve Web server performance through dynamic load balancing, but they do little to address the I/O problem.
For the above-stated reasons, we argue that, rather than caching a huge file data set in RAM, we should pay more attention to the efficiency of secondary storage management. We aim to design a cost-effective, high-performance I/O subsystem with only a modest I/O buffer cache (such as 256 MB) to replace a full caching scheme (which requires 4 GB of RAM or more) for large Web servers. The I/O performance of this new I/O subsystem should approach that of the full caching strategy.
Currently, most Web servers are designed to run on top of a general-purpose file system. Such a design has several inherent performance disadvantages: general-purpose file systems handle small files poorly, leading to very poor performance; few techniques have been studied to improve I/O read performance, even though read requests dominate file system traffic in Web server workloads; they lack effective prefetching, which could be very helpful given the good spatial locality of Web server workloads; and their meta-data overhead poses a serious problem.
In order to solve the above-mentioned problems, one could also develop a new file system optimized for Web servers to replace the file system of the host OS. Since it is too expensive to implement and maintain a new file system in the kernel, we propose to let Web servers use raw disks to manage their own data and meta-data, bypassing the regular file system (RFS). For reasons of compatibility, simplicity, and ease of implementation, we develop a Temporary File System (TFS) for large Web servers as an accelerator. TFS works in front of an RFS to service repeated file accesses, while the RFS still handles issues such as file creation, invalidation, and security. When a file is read for the first time, the system copies it from the RFS into TFS, which resides on a raw disk partition. All later accesses to this file are handed over to TFS until the file is modified or deleted; a sketch of this dispatch appears below. The entire process is completely transparent to users. The advantages of such a design are a simple yet efficient implementation and good portability.
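The following is a minimal, self-contained sketch of the copy-on-first-read dispatch just described, assuming a whole-file read path. A small in-memory table stands in for the TFS raw-disk partition purely for illustration, and names such as tfs_lookup, tfs_insert, and serve_file are hypothetical, not the actual TFS interface.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define TFS_SLOTS 256

struct tfs_entry { char path[256]; void *data; size_t len; int used; };
static struct tfs_entry tfs_table[TFS_SLOTS];   /* stand-in for raw-disk TFS */

static struct tfs_entry *tfs_lookup(const char *path)
{
    for (int i = 0; i < TFS_SLOTS; i++)
        if (tfs_table[i].used && strcmp(tfs_table[i].path, path) == 0)
            return &tfs_table[i];
    return NULL;
}

static void tfs_insert(const char *path, const void *buf, size_t len)
{
    for (int i = 0; i < TFS_SLOTS; i++) {
        if (!tfs_table[i].used) {
            tfs_table[i].data = malloc(len);
            if (tfs_table[i].data == NULL)
                return;                          /* caching is best-effort */
            memcpy(tfs_table[i].data, buf, len);
            strncpy(tfs_table[i].path, path, sizeof tfs_table[i].path - 1);
            tfs_table[i].path[sizeof tfs_table[i].path - 1] = '\0';
            tfs_table[i].len  = len;
            tfs_table[i].used = 1;
            return;
        }
    }
}

/* Called when the RFS modifies or deletes a file: the TFS copy is simply
 * dropped, so the RFS always remains authoritative. */
void tfs_invalidate(const char *path)
{
    struct tfs_entry *e = tfs_lookup(path);
    if (e != NULL) { free(e->data); e->used = 0; }
}

/* Serve a whole-file read: repeated accesses hit TFS; a first access goes
 * through the regular file system and a copy is staged into TFS. */
ssize_t serve_file(const char *path, void *buf, size_t n)
{
    struct tfs_entry *e = tfs_lookup(path);
    if (e != NULL) {                         /* repeated access: TFS hit   */
        size_t take = e->len < n ? e->len : n;
        memcpy(buf, e->data, take);
        return (ssize_t)take;
    }
    int fd = open(path, O_RDONLY);           /* first access: use the RFS  */
    if (fd < 0)
        return -1;
    ssize_t got = read(fd, buf, n);
    close(fd);
    if (got > 0)
        tfs_insert(path, buf, (size_t)got);  /* stage a copy for next time */
    return got;
}
```

In the real system the staged copy goes into CSS clusters on the raw disk rather than into memory, so the I/O buffer cache stays modest while repeated reads avoid the RFS meta-data path.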
The system consists of the following major components, which are discussed in detail in the rest of this section. Figure 1 illustrates the system architecture.
We used HTTP trace-driven simulation to evaluate TFS performance. The Unix FFS mounted in asynchronous mode was chosen as the baseline system. The FFS simulator is built on top of Greg Ganger's DiskSim. We chose a Quantum Atlas 10K 9.1 GB disk as the basic model (the largest model provided by DiskSim); it has a 40 MB/sec data transfer rate and a 4.7 ms seek time. The TFS system simulator consists of a TFS simulator and an RFS (FFS) simulator. To ensure fairness, the I/O buffer cache size of the baseline system (FFS) equals the sum of the RFS I/O buffer cache size and the TFS buffer cache size.
We obtained a very recent Web trace, UCBSep01, from the HTTP server of the Computer Science Department of the University of California at Berkeley. We also obtained two other small traces, World-Cup98 and ClarkNet, from the Internet Archive and expanded them by linear insertions.
We define URL latency as the sum of URL I/O latency and CPU processing time (including idle time). Figure 2 compares the average disk latency per URL request, calculated by dividing the total disk I/O latency by the total number of requests (a sketch of this calculation follows the figure). We can clearly see that TFS achieves dramatic performance improvements over FFS-async, reducing the I/O latency per URL request by 18-65%.
Figure 2: Average Disk I/O Latency per URL Request for TFS and FFS
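For clarity, the reported metric is exactly the total disk I/O latency accumulated over the trace replay divided by the number of URL requests. The short sketch below restates this with an illustrative array of per-request latencies; the values are hypothetical, not actual simulation output.

```c
#include <stdio.h>

/* Average disk I/O latency per URL request: total latency / request count. */
double avg_disk_latency_per_url(const double *latency_ms, long nreq)
{
    double total = 0.0;
    for (long i = 0; i < nreq; i++)
        total += latency_ms[i];             /* total disk I/O latency       */
    return nreq > 0 ? total / nreq : 0.0;   /* divide by number of requests */
}

int main(void)
{
    double sample[] = { 4.2, 0.0, 7.9, 3.1 };   /* hypothetical per-request values */
    printf("avg disk latency per URL: %.2f ms\n",
           avg_disk_latency_per_url(sample, 4));
    return 0;
}
```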
TFS shares some insights with its sister project, UCFS [3], a user-level customized file system that boosts I/O performance for Web proxy servers. The main difference is that TFS is optimized for Web servers, whose workloads are almost read-only, while UCFS targets proxy servers, which also see heavy write traffic. Kant and Won [1] examined the characteristics of the SPECWeb99 benchmark with respect to its file-caching properties and suggested addressing the efficiency of file cache management, I/O management, and the I/O subsystem components instead of attempting to cache huge portions of the file set.
This paper presents a novel, light-weight, custom temporary file system accelerator called TFS for large Web servers. Extensive simulations confirm that TFS can drastically improve large Web server performance.