AUTHOR(S): John Bresnahan, Kate Keahey, Tim Freeman, David LaBissoniere
ABSTRACT: Amazon's S3 protocol has emerged as the de facto interface for storage in the commercial cloud. However, it is closed source and unavailable to the numerous data centers actively used for science. Just as Amazon's S3 provides reliable storage access to commercial users, scientific data centers must provide their users with a similar level of service. Ideally, scientific data centers could allow the use of the same clients and protocols that have proven effective for Amazon's users. But can the S3 interface compare to the data cloud transfer services used in today's computational centers? Does it have the feature set needed to support the scientific community, and if not, can it be extended to include the missing features without loss of compatibility? Can it scale and distribute resources equally when presented with common scientific usage patterns?
We address these questions by experimenting with Cumulus, an open source implementation of the S3 REST API.