The File systems and Storage Laboratory (FSL) in Stony Brook University's Computer Science Department has extensive experience in storage research. In particular, we have experience in file system and storage benchmarking, file system tracing, and file system trace replaying. (For example, our tracing and replaying software, as well as a number of traces, were contributed to SNIA's IOTTA Working Group in 2007.)
In this project, we will begin by collecting a number of traces on our own file servers, which hold several terabytes of developed software (e.g., kernel sources), mail archives, documents, many multimedia files, and hundreds of large VMware virtual machine images---spanning several hundred users in our lab alone. We will collect traces using NFS traffic snooping as well as our own Tracefs software. We will also use existing traces, such as the Harvard NFS traces, and convert them to replayable formats. Next, we will update our trace-replaying software, Replayfs, so it can replay traces at various speeds, from the original pace to accelerated speeds.
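The core of speed-adjustable replay is scaling the recorded inter-arrival gaps between operations. The sketch below is illustrative only: the `(timestamp, operation)` record format, the `speedup` parameter, and the `issue` callback are assumptions for exposition, not Replayfs's actual interface.

```python
import time

def replay(records, speedup=1.0, issue=print):
    """Replay trace records, scaling inter-arrival gaps by 1/speedup.

    records: iterable of (timestamp, operation) pairs, timestamps in seconds.
    speedup: 1.0 replays at the original pace; 2.0 replays twice as fast.
    issue:   callback that performs (or simulates) the traced operation.
    """
    prev_ts = None
    for ts, op in records:
        if prev_ts is not None:
            gap = (ts - prev_ts) / speedup
            if gap > 0:
                time.sleep(gap)  # wait out the (scaled) recorded gap
        issue(op)
        prev_ts = ts
```

At `speedup` values high enough that the scaled gaps approach the issue overhead, replay becomes effectively as-fast-as-possible, which is useful for stress-style experiments.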
We will assemble an assortment of machines and storage hardware, using much of our existing server-class and workstation-class systems, as well as laptops. We will configure them with Linux because Linux supports many different file systems and provides a good common base OS for comparison. We will use assorted disk drives, from low-end PATA/SATA drives to high-end SCSI/SAS drives, of various speeds, sizes, and ages. We will carefully document every aspect of the hardware components for accuracy and reproducibility (e.g., we will record minute details such as disk firmware level and manufacturing date, as well as SMART data).
We will replay traces over different hardware and software configurations, and experiment with various aging strategies for disk drives. We will measure overall as well as instantaneous power consumption of the system. We will aim to answer the following questions, among others: What is the impact of disk types and sizes on power consumption? How does file system age affect power use? Which file systems tend to consume more or less power, and under which access patterns? Is there a way to tune or reformat a file system to consume less power? Are there any kernel or disk configuration parameters that can conserve power beyond typical APM/ACPI mechanisms? How do extra virtualization layers (LVM, RAID, virtual machines, etc.) affect power consumption? (Given enough time, we may also investigate the same questions on other operating systems, such as AIX, Windows, BSD, and Solaris.)
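The two power metrics are related: overall (mean) power is total energy divided by elapsed time, and total energy can be recovered from instantaneous power samples by numerical integration. A minimal sketch, assuming samples arrive as `(time_s, power_w)` pairs from some external meter (the sampling source itself is not modeled here):

```python
def energy_joules(samples):
    """Integrate (time_s, power_w) samples with the trapezoidal rule.

    Returns total energy in joules over the sampled interval.
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total

def average_power_watts(samples):
    """Overall (mean) power: total energy divided by elapsed time."""
    elapsed = samples[-1][0] - samples[0][0]
    return energy_joules(samples) / elapsed
```

For example, samples of 10 W at t=0 s, 10 W at t=1 s, and 14 W at t=2 s integrate to 22 J, for a mean of 11 W over the 2-second run.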
We plan to collect all the data, release it publicly, and publish at least one top-quality conference paper on this research.
The scope of the above project is one year. One possibility for continuing it in future years is to develop new file systems whose primary design goal is power conservation.