One of the most significant random-read IOp workloads will be out-of-RAM index lookups and insertions. Interest in this area is strong, driven by search, rapid file attribute indexing, deduplication, provenance, revision control, client-side cloud caching, and more. Both the number of indexes and the percentage of IOps they consume are therefore only going to increase.
We propose to conduct a comprehensive study of the performance of the most effective indexing technologies under mixed workloads across a multi-tier cache hierarchy that includes a client-side FLASH tier, a LAN-attached network storage tier, and a remote backup or cloud tier. We will explore the most influential and popular index technologies that operate directly above a FLASH SSD, and we will provide an instrumentation framework suitable for capturing and analyzing IOp-intensive workloads across a multi-tier storage architecture.
Our results will show which indexing technologies, under which settings, perform best in a multi-tier caching scenario; we will explore various workloads in which a network storage device is a component of the cache hierarchy. Our results will be useful for optimally configuring customer sites for indexing workloads, and for optimizing NetApp products that must already perform indexing across multiple tiers (e.g., deduplication). The results of a truly comprehensive survey of various indexing methods across several multi-tier storage hierarchies will also be useful to the entire storage community.