Distributed Likelihoods Computation for Large Spatial Data
SESSION: Research Poster Reception
EVENT TYPE: Poster
TIME: 5:15PM - 7:00PM
AUTHOR(S): Wei Zhuo, P. Prabhat, Cari Kaufman, Chris Paciorek
ABSTRACT: We investigate the problem of fitting geospatial models to large spatial and climate datasets. Fitting such a model fundamentally depends on efficient computation of likelihoods. An exact solution for n observations requires computing the determinant and inverse of the n × n covariance matrix, which takes O(n^3) time for a dense matrix and becomes expensive for large n. We examine two modes of parallelization to overcome this cost: multi-threaded (within a single node) and distributed (across multiple nodes).
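For reference, the quantity being computed can be written out as follows. This is a sketch under an assumption the abstract does not state: a zero-mean Gaussian model with observation vector $y$ and covariance matrix $\Sigma(\theta)$. The symbols $y$, $\theta$, $\Sigma$ and $L$ are introduced here for illustration only.

```latex
\ell(\theta) = -\frac{n}{2}\log(2\pi)
               - \frac{1}{2}\log\lvert\Sigma(\theta)\rvert
               - \frac{1}{2}\, y^{\top}\Sigma(\theta)^{-1} y .

% With the Cholesky factorization \Sigma(\theta) = L L^{\top} (L lower triangular),
% neither the determinant nor the inverse has to be formed explicitly:
\log\lvert\Sigma(\theta)\rvert = 2\sum_{i=1}^{n}\log L_{ii},
\qquad
y^{\top}\Sigma(\theta)^{-1} y = \lVert z \rVert_2^{2}
\quad\text{where } L z = y .
```

Under this formulation the Cholesky factorization is the dominant cost, at roughly $n^3/3$ floating-point operations, while the triangular solve for $z$ costs only $O(n^2)$.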
On a single node, we used the multi-threaded Cholesky implementation within R/LAPACK/BLAS to achieve a significant performance gain over the single-threaded implementation.
For a cluster of compute nodes, we implemented a distributed Cholesky decomposition using Rmpi. The resulting implementation uses all available cores within a node, as well as multiple nodes of the cluster. Our preliminary results suggest that analyzing 32k spatially indexed observations would take only a few hours on a moderately sized cluster of compute nodes, instead of a week on a single core.
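The role the Cholesky factor plays in the likelihood can be illustrated with a small serial sketch. It is written in plain Python for readability and is not the authors' R/Rmpi code. It assumes a zero-mean Gaussian model, and the exponential covariance function and all function names (`cholesky`, `forward_solve`, `gaussian_loglik`, `exponential_cov`) are hypothetical choices made for this example, since the abstract does not specify the model.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L^T = a, for symmetric positive definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: remove the contribution of the columns already computed.
        s = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            L[i][j] = (a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L


def forward_solve(L, b):
    """Solve L z = b by forward substitution, for lower-triangular L."""
    z = []
    for i, row in enumerate(L):
        z.append((b[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z


def gaussian_loglik(y, cov):
    """Zero-mean Gaussian log-likelihood of y under covariance matrix cov."""
    n = len(y)
    L = cholesky(cov)                                       # O(n^3): the dominant cost
    z = forward_solve(L, y)                                 # O(n^2)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))  # log|cov|
    quad = sum(v * v for v in z)                            # y^T cov^{-1} y
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


def exponential_cov(points, variance=1.0, length_scale=1.0):
    """Hypothetical covariance model: variance * exp(-distance / length_scale)."""
    return [
        [variance * math.exp(-math.dist(p, q) / length_scale) for q in points]
        for p in points
    ]


if __name__ == "__main__":
    locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    observations = [0.3, -0.1, 0.5]
    print(gaussian_loglik(observations, exponential_cov(locations)))
```

In this sketch the factorization is the step that grows cubically with n, which is why it is the natural target for multi-threaded or distributed execution; the triangular solve and the log-determinant sum are comparatively cheap.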
Wei Zhuo - Georgia Institute of Technology
P. Prabhat - Lawrence Berkeley National Laboratory
Cari Kaufman - University of California, Berkeley
Chris Paciorek - University of California, Berkeley