Enabling Parallel Filesystem Semantics In The Cloud

AUTHOR(S): Milo Polte, Esteban Molina-Estolano, John Bent, Garth Gibson, Carlos Maltzahn, Maya Gokhale, Scott Brandt

Although cloud filesystems provide appropriate performance and resilience for the highly parallel web applications designed for them, their APIs are often unsuitable for the I/O patterns of scientific applications. Scientific applications typically expect a POSIX-like interface and support for semantics such as concurrent writers and out-of-order writes, which cloud filesystems generally do not support. In this poster, we present an interposition layer technique that allows unmodified scientific applications to run on cloud filesystems. This interposition layer transparently decouples concurrent writers accessing a shared file into efficient, log-style writers that each maintain an individual data file, while still providing the application with a view of a single, flat file. These techniques enhance the semantics of a cloud filesystem with support for concurrent, non-sequential writing, enabling previously unsupported scientific applications. The technique is currently being implemented as a FUSE filesystem running on top of the HDFS cloud filesystem.
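The decoupling described above can be illustrated with a minimal in-memory sketch. It is not the poster's implementation and makes no FUSE or HDFS calls; the class name `LogStructuredFile`, its methods, and the writer identifiers are all hypothetical, chosen only for illustration. The sketch assumes each writer appends to its own log and that a shared index records where each logical extent landed, so that reads can reassemble a single flat view.

```python
class LogStructuredFile:
    """Hypothetical sketch: one append-only log per writer plus a shared index."""

    def __init__(self):
        self._logs = {}    # writer_id -> bytearray, only ever appended to
        self._index = []   # (logical_offset, length, writer_id, log_offset), in write order

    def write(self, writer_id, logical_offset, data):
        """Record a write at any logical offset as an append to the writer's own log."""
        log = self._logs.setdefault(writer_id, bytearray())
        log_offset = len(log)
        log.extend(data)
        self._index.append((logical_offset, len(data), writer_id, log_offset))

    def size(self):
        """Logical file size: the highest byte offset any writer has touched."""
        return max((off + n for off, n, _, _ in self._index), default=0)

    def read(self, offset, length):
        """Reassemble the flat view; later index entries overwrite earlier ones."""
        end = min(offset + length, self.size())
        if end <= offset:
            return b""
        out = bytearray(end - offset)  # unwritten holes read back as zero bytes
        for off, n, writer_id, log_off in self._index:
            lo, hi = max(off, offset), min(off + n, end)
            if lo < hi:  # this extent overlaps the requested range
                src = log_off + (lo - off)
                out[lo - offset:hi - offset] = self._logs[writer_id][src:src + (hi - lo)]
        return bytes(out)


f = LogStructuredFile()
f.write("rank1", 4, b"BBBB")   # out-of-order: the second half arrives first
f.write("rank0", 0, b"AAAA")   # a concurrent writer fills in the first half
flat_view = f.read(0, 8)
```

In this sketch no writer ever seeks or overwrites in its own log, which is the property that would let each log map onto an append-only file in the underlying store. Ordering by index position stands in for whatever ordering mechanism (for example, timestamps) a real implementation would need across independent writers.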

Milo Polte - Carnegie Mellon University

Esteban Molina-Estolano - University of California, Santa Cruz

John Bent - Los Alamos National Laboratory

Garth Gibson - Carnegie Mellon University

Carlos Maltzahn - University of California, Santa Cruz

Maya Gokhale - Lawrence Livermore National Laboratory

Scott Brandt - University of California, Santa Cruz

