Accelerating sparse Cholesky factorization on GPUs - Texas A&M University (TAMU) Scholar

abstract

© 2016 Elsevier B.V. Sparse factorization is a fundamental tool in scientific computing. As the major component of a sparse direct solver, it represents the dominant computational cost for many analyses. For factorizations that involve sufficient dense math, the substantial computational capability provided by GPUs (Graphics Processing Units) can help alleviate this cost. However, for many other cases, the prevalence of small or irregular dense math and the relatively slow communication between the host and device over the PCIe bus make it challenging to significantly accelerate sparse factorization using the GPU. In this paper we describe a left-looking supernodal Cholesky factorization algorithm that permits improved utilization of the GPU when factoring sparse matrices. The central idea is to stream subtrees of the elimination tree through the GPU and perform the factorization of each subtree entirely on the GPU. This avoids the majority of the PCIe communication without the need for a complex task scheduler. Importantly, within these subtrees, many independent, small, dense operations are batched to minimize kernel launch overhead, and many of these batched kernels are executed concurrently to maximize device utilization. Performance results for commonly studied matrices are presented along with suggested actions for further optimization.
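
The subtree-streaming idea in the abstract can be illustrated with a small host-side sketch. This is not the authors' implementation: it assumes a hypothetical elimination tree stored as a parent array in which every parent has a larger index than its children (root marked with -1), a hypothetical per-node storage estimate `weight`, and a hypothetical device memory `budget`. It picks the largest subtrees that fit in the budget, then groups the nodes of one subtree by height, since nodes of equal height are independent and could be handed to batched kernels together.

```python
from collections import defaultdict


def subtree_sizes(parent, weight):
    """Total weight of each node's subtree (assumes parent[i] > i, -1 for a root)."""
    size = list(weight)
    for node in range(len(parent)):
        p = parent[node]
        if p != -1:
            # All children of `node` were visited earlier, so size[node] is final.
            size[p] += size[node]
    return size


def pick_subtrees(parent, weight, budget):
    """Roots of the maximal subtrees whose total weight fits in `budget`."""
    size = subtree_sizes(parent, weight)
    roots = []
    for node in range(len(parent)):
        p = parent[node]
        if size[node] <= budget and (p == -1 or size[p] > budget):
            roots.append(node)
    return roots


def batches_by_level(parent, root):
    """Group the nodes of the subtree rooted at `root` by height above the leaves."""
    inside = [False] * (root + 1)
    inside[root] = True
    for node in range(root - 1, -1, -1):
        p = parent[node]
        inside[node] = p != -1 and p <= root and inside[p]
    height = [0] * (root + 1)
    for node in range(root):
        if inside[node]:
            p = parent[node]
            height[p] = max(height[p], height[node] + 1)
    levels = defaultdict(list)
    for node in range(root + 1):
        if inside[node]:
            levels[height[node]].append(node)
    return [levels[h] for h in sorted(levels)]


# Hypothetical 7-node tree: 0,1 -> 2; 3,4 -> 5; 2,5 -> 6 (root).
parent = [2, 2, 6, 5, 5, 6, -1]
weight = [1, 1, 2, 1, 1, 2, 4]
roots = pick_subtrees(parent, weight, budget=5)
batches = [batches_by_level(parent, r) for r in roots]
```

In this toy case the two subtrees rooted at nodes 2 and 5 each fit in the budget, while the root node 6 would be left for a separate pass. How the paper itself sizes subtrees and schedules batched kernels is not stated in the abstract.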

authors

Davis, Timothy

published proceedings

PARALLEL COMPUTING

author list (cited authors)

Rennich, S. C., Stosic, D., & Davis, T. A.

citation count

31

complete list of authors

Rennich, Steven C||Stosic, Darko||Davis, Timothy A

publication date

January 2016

publisher

Elsevier

published in

Parallel Computing Journal

keywords

Cholesky
Factorization
GPU
Parallel
Sparse

Digital Object Identifier (DOI)

10.1016/j.parco.2016.06.004

start page

140

end page

150

volume

59

URL

http://dx.doi.org/10.1016/j.parco.2016.06.004

Accelerating sparse Cholesky factorization on GPUs Academic Article
