Managing Variability in the I/O Performance of Petascale Storage Systems
SESSION: System I/O Optimization
EVENT TYPE: Paper
TIME: 2:00PM - 2:30PM
SESSION CHAIR: Tevfik Kosar
AUTHOR(S): Jay Lofstead, Fang Zheng, Qing Liu, Scott Klasky, Ron Oldfield, Todd Kordenbrock, Karsten Schwan, Matthew Wolf
ABSTRACT: Significant challenges exist for achieving peak or even consistent levels of performance when using I/O systems at scale. They stem from sharing I/O system resources across the processes of single large-scale applications and/or multiple simultaneous programs, causing internal and external interference, which in turn causes substantial reductions in I/O performance. This paper presents measurements of interference effects for two different file systems at multiple supercomputing sites. These measurements motivate developing a 'managed' I/O approach using adaptive algorithms that vary the I/O system workload based on current levels and use areas. An implementation of these methods deployed for the shared, general scratch storage system on Oak Ridge National Laboratory machines achieves higher overall performance and less variability in both a typical usage environment and with artificially introduced levels of 'noise'. The latter serves to clearly delineate and illustrate potential problems arising from shared system usage and the advantages derived from actively managing it.