Next: 2 Elastic Quota Usage Up: The Design and Implementation Previous: The Design and Implementation

1 Introduction

The increasing ubiquity of Internet access and the increasing number of online information sources are fueling a growing demand for storage capacity. Vast amounts of multimedia data are ever more widely available to users who continue to download more of it. As the storage demands of users and the storage capacity of installations increase, the complexity of managing storage effectively now dominates the total cost of ownership, which is five to ten times the original purchase cost [7]. This is particularly true in large multi-user server computing environments, where storage capacity should ideally be allocated in a manner that utilizes the resource effectively while multiplexing it among users to prevent monopolization by a single user. Although the cost per megabyte of disk space has been on the decline, any given installation will have a finite amount of disk space that needs to be partitioned and used effectively. While there is ongoing work in developing better mechanisms for partitioning resources such as processor cycles and network bandwidth [1,5,11,17,19,30], file system resource management has not received as much attention.

Quotas are perhaps the most common form of file system resource management. However, this simple fixed-size limit disk-allocation scheme has two fundamental problems, both of which arise because it does not effectively account for the variability of disk usage among users and across time. The first problem is that in a large heterogeneous setting, some users will use very little of their quota, whereas experienced users will find their quotas too constraining regardless of the limit. Assuming a large environment where having quotas is required due to administrative costs, a potentially substantial portion of the disk is allocated to users who will not use their allocation, and is thus wasted. The second problem is that users' disk usage is often highly variable, and much of this variability is caused by the creation of short-lived files; eighty percent of files have lifetimes of only a few seconds [8,18,29]. As a result, the vast majority of files created have no long-term impact on available disk capacity, yet the current quota system disallows such file operations once a quota limit has been reached, even when disk capacity is available. Separate storage for temporary files is often ineffective because its separate file name-space requires complex reconfiguration of applications to use it, and because it tends to require administrator intervention to keep it from filling up. The result is often frantic negotiation with a system administrator for additional space at the least opportune time, which wastes precious human resources from both the user's and the administrator's perspective.

Traditional file systems operate on the assumption that all data written is of equal importance. Quota systems then place the burden of removing unwanted files on the user. However, users often need to store both critical and non-critical data. Unfortunately, it is not uncommon for a user to forget the reasons for storing a file and, in an effort to make space for an unimportant file, to delete important ones. In a software development environment, an example of critical data would be source code, whereas non-critical data would be the various object files and other by-products that are generally less interesting to the developer. Deletion of the former would be devastating, whereas the latter are generally only of interest during compilation. This burden of file management only becomes more complex for the user as growing storage capacity gives users the ability to store, and the need to manage, many more files than previously possible.

To provide more flexible file system resource management, we introduce elastic quotas. Elastic quotas provide a mechanism for managing temporary storage that improves disk utilization in multi-user environments, allowing some users to utilize otherwise unused space, with a mechanism for reclaiming space on demand. Elastic quotas are based on the assumption that users occasionally need large amounts of disk space to store non-critical data, whereas the amount of essential data a user keeps is relatively constant over time. As a result, we introduce the idea of an elastic file, a file for storing non-critical data whose space can be reclaimed on demand. Disk space can be reclaimed in a number of ways, including removing the file, compressing the file, or moving the file to slower, less expensive tertiary storage and replacing the file with a link to the new storage location [26]. In this paper, we focus on an elastic space reclamation model based on removing files.

Elastic quotas allow hard limits analogous to quotas to be placed on a user's persistent file storage, whereas space for elastic files is limited only by the amount of free space on disk. Users often know in advance which files are non-critical, and it is to the users' benefit to take advantage of such knowledge before it is forgotten. Elastic quotas allow files to be marked as elastic or persistent at creation time, and later provide system-wide automatic space reclamation of elastic files as the disk fills up. Files can also be reclassified as elastic or persistent after creation using common file operations. Elastic files do not need to be stored in a designated location; instead, users can place such files anywhere in their normal file system directory structure. A system-wide daemon reclaims disk space consumed by elastic files in a manner that is flexible enough to account for different cleaning policies for each user.
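
The accounting rules described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the EQFS implementation: the class and method names (`ElasticDisk`, `create`, `set_elastic`, `reclaim`) are hypothetical, reclamation runs synchronously inside `create` rather than in a separate daemon, and the oldest-first cleaning order is an assumed policy chosen purely for illustration.

```python
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """A persistent write would exceed the user's hard limit."""


class DiskFull(Exception):
    """No free space remains, even after removing elastic files."""


@dataclass
class File:
    owner: str
    size: int
    elastic: bool
    created: int  # logical timestamp, used only by the toy cleaning policy


@dataclass
class ElasticDisk:
    capacity: int
    persistent_quota: dict  # user -> hard limit on persistent bytes
    files: dict = field(default_factory=dict)  # path -> File
    clock: int = 0

    def used(self):
        return sum(f.size for f in self.files.values())

    def persistent_usage(self, user):
        return sum(f.size for f in self.files.values()
                   if f.owner == user and not f.elastic)

    def create(self, path, owner, size, elastic=False):
        # Only persistent files are charged against the hard limit.
        if not elastic and (self.persistent_usage(owner) + size
                            > self.persistent_quota[owner]):
            raise QuotaExceeded(path)
        # Elastic files are bounded only by free space on the disk.
        shortfall = self.used() + size - self.capacity
        if shortfall > 0:
            self.reclaim(shortfall)
        if self.used() + size > self.capacity:
            raise DiskFull(path)
        self.clock += 1
        self.files[path] = File(owner, size, elastic, self.clock)

    def set_elastic(self, path, elastic):
        # Reclassifying a file as persistent charges it to the quota.
        f = self.files[path]
        if f.elastic and not elastic:
            if (self.persistent_usage(f.owner) + f.size
                    > self.persistent_quota[f.owner]):
                raise QuotaExceeded(path)
        f.elastic = elastic

    def reclaim(self, needed):
        # Assumed policy: remove the oldest elastic files first.
        freed = 0
        victims = sorted((p for p, f in self.files.items() if f.elastic),
                         key=lambda p: self.files[p].created)
        for path in victims:
            if freed >= needed:
                break
            freed += self.files.pop(path).size
        return freed
```

In this model a user with a 30-unit persistent quota can still create a 50-unit elastic file anywhere in the directory tree, but that file is the first to be removed when another write needs the space.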

Elastic quotas are particularly applicable to any situation where a large amount of space is required for data that is known in advance to be temporary. Examples include Web browser caches, decoded MIME attachments, and other replaceable data. For instance, files stored in Web browser disk caches can be declared elastic so that such caches no longer need to be limited in size; instead, cached elastic data will be automatically removed if disk space becomes scarce. Users then benefit from being able to employ larger disk caches with potentially higher cache hit rates and reduced Web access latency without any concern about such cached data reducing their usable persistent storage space.

We designed elastic quotas to be simple to implement and install, requiring no modification of existing file systems or operating systems. The main component of the elastic quota system is the Elastic Quota File System (EQFS). EQFS is a thin stackable file system layer that can be stacked on top of any existing file system exporting the Virtual File System (VFS) interface [14], such as UFS [16] or EXT2FS [4]. EQFS stores elastic and persistent files separately in the underlying file system and presents a unified view of these files to the user. It makes novel use of the user ID space to provide efficient per-user disk usage accounting of both persistent and elastic files using the existing quota framework in native file systems. A secondary component of the elastic quota system is the rubberd file system cleaner. Rubberd is a user-level program that cleans up elastic files when disk space becomes scarce. We have implemented a prototype elastic quota system in Sun's Solaris operating system and measured its performance on a variety of workloads. Our results on an untuned elastic quota system prototype show that our system provides its elastic functionality with low overhead compared to a commercial UFS implementation.
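
To make the idea of reusing the user ID space more concrete, the following sketch shows one way such a scheme could work. It is an assumption for illustration, not a description of EQFS: this section does not specify the mapping, so the "shadow" ID terminology, the 32-bit bitwise-complement mapping, and the helper names below are all hypothetical. The premise is that a native quota system charges blocks to whichever ID owns a file, so storing a user's elastic files under a second ID yields separate usage totals without any change to the quota code.

```python
UID_BITS = 32                     # assumed width of a user ID
UID_MASK = (1 << UID_BITS) - 1


def shadow_uid(uid):
    """Hypothetical mapping: the bitwise complement of a UID.

    Assumes real UIDs are below 2**31, so shadow IDs always have the top
    bit set and never collide with real ones. Applying it twice returns
    the original UID.
    """
    return ~uid & UID_MASK


def is_shadow(uid):
    return uid >> (UID_BITS - 1) == 1


def real_owner(stored_uid):
    """Recover the user a file belongs to from its on-disk owner ID."""
    return shadow_uid(stored_uid) if is_shadow(stored_uid) else stored_uid


def usage_by_user(files):
    """Sum usage per user from (stored_uid, size) pairs.

    Files owned by a shadow ID count as elastic; all others count as
    persistent, mirroring what per-ID quota accounting would report.
    """
    usage = {}
    for stored_uid, size in files:
        entry = usage.setdefault(real_owner(stored_uid),
                                 {"persistent": 0, "elastic": 0})
        entry["elastic" if is_shadow(stored_uid) else "persistent"] += size
    return usage


files = [(1000, 20), (shadow_uid(1000), 50), (1001, 5)]
print(usage_by_user(files))
```

Under these assumptions, a hard limit would be configured only for the real ID, while the second ID would be left unlimited so that elastic usage is tracked but not capped.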

This paper describes the design and implementation of elastic quotas and is organized as follows. Section 2 describes the system model of how elastic quotas are used. Section 3 describes the design of the Elastic Quota File System. Section 4 describes the rubberd file system cleaner. Section 5 presents measurements and performance results comparing an elastic quota prototype we implemented in Solaris 9 to the Solaris 9 UFS file system. Section 6 discusses related work. Finally, in Section 7 we present some concluding remarks and directions for future work.

Erez Zadok 2002-06-21