LightSpMV is a novel CUDA-compatible sparse matrix-vector multiplication (SpMV) algorithm using the standard compressed sparse row (CSR) storage format. We have evaluated LightSpMV on a variety of sparse matrices and compared it to the CSR-based SpMV subprograms in the leading CUSP and cuSPARSE libraries. The evaluation shows that on a single Tesla K40c GPU, LightSpMV outperforms both libraries, with speedups of up to 2.60 and 2.63 over CUSP, and up to 1.93 and 1.79 over cuSPARSE, for single and double precision, respectively.
Note: for the time being, users can refer to my example code for the graph PageRank algorithm to see how to use the LightSpMV template class (see files main.cu, LigthSpMVCore.h and PageRankLightSpMV.cu).
- Latest release (v1.0)
More details about the changes in this version are available at ChangeLog.
- Sparse matrices
The set of sparse matrices used in our publications.
- PageRank example code
I have implemented the graph PageRank algorithm using the following four SpMV implementations: LightSpMV, CUSP, cuSPARSE and ViennaCL. For the time being, users can read this simple example code to see how to embed these SpMV implementations into existing code.
- Referred poster
- Yongchao Liu, Jorge Gonzalez-Dominguez, Bertil Schmidt: "Faster compressed sparse row (CSR)-based sparse matrix-vector multiplication using CUDA". GPU Technology Conference (GTC 2015), USA
- Yongchao Liu and Bertil Schmidt: "LightSpMV: faster CSR-based sparse matrix-vector multiplication on CUDA-enabled GPUs". 26th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2015), 2015, pp. 82-89
- Yongchao Liu and Bertil Schmidt: "LightSpMV: faster CUDA-compatible sparse matrix-vector multiplication using compressed sparse rows". Journal of Signal Processing Systems, 2017, doi:10.1007/s11265-016-1216-4.
- -i <string> sparse matrix A file (in Matrix Market format)
- -x <string> vector X file (one element per line) [otherwise, set each element to 1.0]
- -y <string> vector Y file (one element per line) [otherwise, set each element to 0.0]
- -o <string> output file (one element per line) [otherwise, no output]
- -a <float> alpha value, default = 1
- -b <float> beta value, default = 1
- -f <int> formula used, default = 1
  - 0: y = Ax
  - 1: y = alpha * Ax + beta * y
- -r <int> select the routine to use, default = 1
  - 0: vector-based row dynamic distribution
  - 1: warp-based row dynamic distribution
- -d <int> double-precision floating point, default = 0
- -g <int> index of the single GPU used, default = 0
- -m <int> number of SpMV iterations, default = 1000
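The two formulas selected by -f can be checked against a plain CPU reference. The following is a minimal sketch in C++, not LightSpMV's GPU kernel: the `CsrMatrix` struct and the `spmv_reference` function are hypothetical names introduced only for illustration, and the defaults for alpha and beta mirror the option defaults listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR container (not LightSpMV's own data structure).
struct CsrMatrix {
    std::size_t num_rows;
    std::vector<std::size_t> row_ptr;  // num_rows + 1 offsets into col_idx/values
    std::vector<std::size_t> col_idx;  // column index of each nonzero
    std::vector<double> values;        // value of each nonzero
};

// Sequential reference for the two formulas:
//   formula 0: y = A * x
//   formula 1: y = alpha * A * x + beta * y
void spmv_reference(const CsrMatrix& A, const std::vector<double>& x,
                    std::vector<double>& y, int formula,
                    double alpha = 1.0, double beta = 1.0) {
    assert(A.row_ptr.size() == A.num_rows + 1);
    assert(y.size() == A.num_rows);
    for (std::size_t r = 0; r < A.num_rows; ++r) {
        // Dot product of row r with x, visiting only the stored nonzeros.
        double sum = 0.0;
        for (std::size_t k = A.row_ptr[r]; k < A.row_ptr[r + 1]; ++k) {
            sum += A.values[k] * x[A.col_idx[k]];
        }
        y[r] = (formula == 0) ? sum : alpha * sum + beta * y[r];
    }
}
```

Comparing the output of such a reference with a file written via -o is one way to sanity-check a run; because floating-point summation order differs between CPU and GPU, a small tolerance is advisable when comparing.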
- CUDA 6.5 toolkit
- CUDA-enabled GPUs with compute capability 3.0 or higher
Downloading and compiling
- Download the source code tarball
- Uncompress it using the "tar -zxvf" command
- Type "make" to compile the program
LightSpMV accepts sparse matrices stored in the Matrix Market file format and performs SpMV in memory using the standard CSR format.
- ./lightspmv -i matrix.mm
- ./lightspmv -i matrix.mm -m 1 -d 1
- ./lightspmv -i matrix.mm -m 1 -f 0 -o out.y
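Since the input is a Matrix Market file and the computation runs on CSR, the entries have to be converted from coordinate (row, column, value) form to CSR at some point. The sketch below shows one common way to do this with a counting pass and a prefix sum. It is an illustration under stated assumptions, not LightSpMV's actual loader: `CooEntry`, `CsrArrays` and `coo_to_csr` are hypothetical names, and the entries are assumed to have already been shifted from Matrix Market's 1-based indices to 0-based indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One nonzero in coordinate form, with 0-based indices (hypothetical type).
struct CooEntry {
    std::size_t row;
    std::size_t col;
    double value;
};

// The three CSR arrays (hypothetical type).
struct CsrArrays {
    std::vector<std::size_t> row_ptr;  // num_rows + 1 offsets
    std::vector<std::size_t> col_idx;  // column index per nonzero
    std::vector<double> values;        // value per nonzero
};

CsrArrays coo_to_csr(std::size_t num_rows, const std::vector<CooEntry>& entries) {
    CsrArrays csr;
    csr.row_ptr.assign(num_rows + 1, 0);
    csr.col_idx.resize(entries.size());
    csr.values.resize(entries.size());

    // Pass 1: count the nonzeros of each row, stored one slot to the right.
    for (const CooEntry& e : entries) {
        assert(e.row < num_rows);
        ++csr.row_ptr[e.row + 1];
    }
    // Prefix sum turns the counts into row start offsets.
    for (std::size_t r = 0; r < num_rows; ++r) {
        csr.row_ptr[r + 1] += csr.row_ptr[r];
    }
    // Pass 2: scatter each entry into the next free slot of its row.
    std::vector<std::size_t> next(csr.row_ptr.begin(), csr.row_ptr.end() - 1);
    for (const CooEntry& e : entries) {
        const std::size_t pos = next[e.row]++;
        csr.col_idx[pos] = e.col;
        csr.values[pos] = e.value;
    }
    return csr;
}
```

Within each row the entries keep the order in which they appear in the input, so column indices are sorted only if the input was. Duplicate coordinates are stored as separate nonzeros rather than being summed.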
If you have any questions or suggestions for improvement, please contact Liu Yongchao (Email: yliu860 (at) gatech (dot) edu).