Optimized sparse Cholesky factorization on hybrid multicore architectures

abstract

© 2018 Elsevier B.V. We present techniques for supernodal sparse Cholesky factorization on a hybrid multicore platform consisting of a multicore CPU and a GPU. The techniques are the subtree algorithm, pipelining, and multithreading. The subtree algorithm [15] minimizes PCIe transfers by storing an entire branch of the elimination tree (a tree data structure describing the workflow of the factorization) in GPU memory, and reduces the total kernel launch time by launching BLAS kernels in batches. The pipelining technique overlaps the execution of GPU kernels with PCIe data transfers. The multithreading technique [17] creates multiple threads for both the CPU and the GPU to exploit the concurrency of the elimination tree. Our experimental results on a platform consisting of an Intel multicore processor and an Nvidia GPU indicate a significant improvement in performance and energy over CHOLMOD (SuiteSparse 4.5.3), a sparse Cholesky factorization package, after these techniques are applied.
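
The elimination tree mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it is the standard textbook construction with path compression, and the function names `elimination_tree` and `children_of`, as well as the compressed-column input layout (upper triangle only), are assumptions made for this example. Nodes that sit in disjoint subtrees have no dependency on each other, which is the kind of concurrency the subtree and multithreading techniques rely on.

```python
def elimination_tree(n, col_ptr, row_idx):
    """Return parent[] of the elimination tree of an n-by-n symmetric matrix.

    The pattern is given in compressed-column form; only entries in the
    upper triangle (row < column) are used, others are ignored.
    parent[j] == -1 marks a root.
    """
    parent = [-1] * n
    ancestor = [-1] * n  # path-compressed shortcut towards the current root
    for k in range(n):
        for p in range(col_ptr[k], col_ptr[k + 1]):
            i = row_idx[p]
            # Walk from i up to the root of its current subtree,
            # redirecting every visited node to k.
            while i != -1 and i < k:
                next_i = ancestor[i]
                ancestor[i] = k
                if next_i == -1:
                    parent[i] = k  # i was a root, so k becomes its parent
                i = next_i
    return parent


def children_of(parent):
    """Invert parent[] into a list of child lists, one per node."""
    children = [[] for _ in parent]
    for node, par in enumerate(parent):
        if par != -1:
            children[par].append(node)
    return children


# Hypothetical 4-by-4 pattern: off-diagonal entries (0,2), (1,2), (2,3).
col_ptr = [0, 1, 2, 5, 7]
row_idx = [0, 1, 0, 1, 2, 2, 3]
parent = elimination_tree(4, col_ptr, row_idx)
children = children_of(parent)
```

In this small pattern, columns 0 and 1 are leaves under the same parent and do not depend on each other, so they could be factorized concurrently before column 2.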

authors

Davis, Timothy

published proceedings

Journal of Computational Science

altmetric score

0.5

author list (cited authors)

Tang, M., Gadou, M., Rennich, S., Davis, T. A., & Ranka, S.

citation count

3

complete list of authors

Tang, Meng||Gadou, Mohamed||Rennich, Steven||Davis, Timothy A||Ranka, Sanjay

publication date

May 2018

publisher

Elsevier Publisher

keywords

Cholesky Factorization
CUDA
GPU
Sparse Direct Methods
Sparse Matrices

Digital Object Identifier (DOI)

10.1016/j.jocs.2018.04.008

start page

246

end page

253

volume

26

URL

http://dx.doi.org/10.1016/j.jocs.2018.04.008
