Algorithm 980: Sparse QR Factorization on the GPU - Texas A&M University (TAMU) Scholar

abstract

Sparse matrix factorization involves a mix of regular and irregular computation, which is a particular challenge when trying to obtain high performance on the highly parallel general-purpose computing cores available on graphics processing units (GPUs). We present a sparse multifrontal QR factorization method that meets this challenge and is significantly faster than a highly optimized method on a multicore CPU. Our method factorizes many frontal matrices in parallel and keeps all the data transmitted between frontal matrices on the GPU. A novel bucket scheduler algorithm extends the communication-avoiding QR factorization for dense matrices by exploiting more parallelism and by exploiting the staircase form present in the frontal matrices of a sparse multifrontal method.
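
The communication-avoiding QR idea mentioned in the abstract can be illustrated with a minimal CPU sketch. This is not the paper's GPU bucket scheduler; it only shows the underlying two-level reduction for a dense tall matrix, under the assumption that only the triangular factor R is needed. The function names `qr_r` and `tsqr_r` and the `block_rows` parameter are hypothetical and chosen for illustration.

```python
import math


def qr_r(rows):
    """Return the upper-triangular (or trapezoidal) factor R of a dense
    matrix given as a list of rows, using Givens rotations. Q is not formed."""
    a = [[float(x) for x in r] for r in rows]
    m, n = len(a), len(a[0])
    for j in range(min(m, n)):
        for i in range(j + 1, m):
            if a[i][j] == 0.0:
                # Entry is already zero, so no rotation is needed. Matrices
                # with many leading zeros per row skip work at this point.
                continue
            r = math.hypot(a[j][j], a[i][j])
            c, s = a[j][j] / r, a[i][j] / r
            # Rotate rows j and i so that a[i][j] becomes zero.
            for k in range(j, n):
                top, bot = a[j][k], a[i][k]
                a[j][k] = c * top + s * bot
                a[i][k] = -s * top + c * bot
    return a[:min(m, n)]


def tsqr_r(rows, block_rows):
    """Two-level reduction: factor each row block independently, then
    factor the stacked per-block R factors to get R for the whole matrix."""
    blocks = [rows[i:i + block_rows] for i in range(0, len(rows), block_rows)]
    # Each block factorization is independent of the others, which is the
    # source of parallelism in this scheme.
    stacked = [r for blk in blocks for r in qr_r(blk)]
    return qr_r(stacked)
```

Because R is unique only up to the signs of its rows, a convenient check is that the product of R transposed with R matches the product of A transposed with A, rather than comparing R entry by entry against another factorization.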

authors

Davis, Timothy

published proceedings

ACM Transactions on Mathematical Software

author list (cited authors)

Yeralan, S. N., Davis, T. A., Sid-Lakhdar, W. M., & Ranka, S.

citation count

19

complete list of authors

Yeralan, Sencer Nuri||Davis, Timothy A||Sid-Lakhdar, Wissam M||Ranka, Sanjay

publication date

June 2017

publisher

Association for Computing Machinery (ACM)

published in

ACM Transactions on Mathematical Software

keywords

Algorithms
Experimentation
GPU
Least-squares Problems
Performance
QR Factorization
Sparse Matrices

Digital Object Identifier (DOI)

10.1145/3065870

start page

1

end page

29

volume

44

issue

2

URL

http://dx.doi.org/10.1145/3065870
