JAWS: Job-Aware Workload Scheduling for the Exploration of Turbulence Simulations
SESSION: Runtime Resource Allocation and Scheduling
EVENT TYPE: Paper
TIME: 3:30PM - 4:00PM
SESSION CHAIR: Joel Saltz
AUTHOR(S): Xiaodan Wang, Eric Perlman, Randal Burns, Tanu Malik, Tamas Budavari, Charles Meneveau, Alexander Szalay
ABSTRACT: We present JAWS, a job-aware, data-driven batch scheduler that improves query throughput for data-intensive scientific database clusters. As datasets reach petabyte scale, workloads that scan through vast amounts of data to extract features are gaining importance in the sciences. However, acute performance bottlenecks result when multiple queries execute simultaneously and compete for I/O resources. Our solution, JAWS, divides queries into I/O-friendly sub-queries for scheduling. It then identifies overlapping data requirements within the workload and executes sub-queries in batches to maximize data sharing and reduce redundant I/O. JAWS extends our previous work in three ways: it supports workflows in which queries exhibit data dependencies, it exploits workload knowledge to coordinate caching decisions, and it combats starvation through adaptive and incremental trade-offs between query throughput and response time. Instrumenting JAWS in the Turbulence Database Cluster yields a nearly three-fold improvement in query throughput when contention in the workload is high.
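
The batching idea in the abstract (split queries into sub-queries, find overlapping data requirements, read shared data once) can be illustrated with a minimal Python sketch. This is not the JAWS implementation: the function names, the representation of a query as a list of integer block ids, and the greedy "most-shared block first" ordering are all assumptions made for illustration. The sketch also ignores data dependencies between queries, caching, and starvation control, which the abstract lists as part of JAWS.

```python
from collections import defaultdict


def split_into_subqueries(queries):
    """Split each query into per-block sub-queries (hypothetical model).

    `queries` maps a query id to the data blocks it must scan.
    Returns a mapping from block id to the query ids waiting on that block.
    """
    pending = defaultdict(list)
    for query_id, blocks in queries.items():
        for block in sorted(set(blocks)):
            pending[block].append(query_id)
    return pending


def schedule_batches(queries):
    """Order blocks so the most widely shared ones are read first.

    Each batch is (block, [query ids]); the block is read once and every
    sub-query in the batch is evaluated against it. The greedy ordering is
    an assumption of this sketch, not the policy described in the paper.
    """
    pending = split_into_subqueries(queries)
    # Sort by descending sharing; break ties by block id for determinism.
    order = sorted(pending, key=lambda b: (-len(pending[b]), b))
    return [(block, pending[block]) for block in order]


def io_reads(queries):
    """Compare block reads without sharing and with batched sharing."""
    naive = sum(len(set(blocks)) for blocks in queries.values())
    batched = len(schedule_batches(queries))
    return naive, batched


# Three hypothetical queries whose scan ranges overlap.
queries = {"q1": [0, 1, 2], "q2": [1, 2, 3], "q3": [2, 3, 4]}

for block, waiting in schedule_batches(queries):
    print(f"read block {block} once for {waiting}")

naive, batched = io_reads(queries)
print(f"block reads: {naive} without sharing, {batched} with batching")
```

In this toy workload the three queries touch nine blocks in total but only five distinct ones, so batching removes four redundant reads. A purely throughput-driven ordering like this one can delay queries that need rarely shared blocks, which is the starvation problem the abstract says JAWS addresses through trade-offs between throughput and response time.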