Towards Singular Value Decomposition for Rank-Deficient Matrices: An Efficient and Accurate Algorithm on GPU Architectures
Singular Value Decomposition (SVD) is a fundamental tool in numerous scientific and engineering domains. Many high-performance libraries, such as LAPACK, MAGMA, and cuSOLVER, provide general, truncated, and randomized SVD routines. However, when the input is a low-rank matrix whose rank is not explicitly known, existing routines usually treat it as full-rank, which leads to suboptimal performance. In this paper, we propose an efficient SVD algorithm specifically for rank-deficient matrices based on a recently proposed rank-revealing QR factorization, termed QB factorization. To further enhance numerical stability and efficiency, we introduce a Householder QB factorization and a mixed-precision SVD algorithm, accompanied by a rigorous error analysis demonstrating correctness and stability. Experimental results show that our method achieves up to 6978.71x speedup over the general (full) SVD routine in cuSOLVER and is 9.99x faster than randomized SVD in FP32 precision. Moreover, our method exhibits higher numerical accuracy than cuSOLVER full SVD, achieving substantially smaller backward errors while maintaining stable and reliable singular values. Beyond synthetic benchmarks, we also demonstrate its effectiveness in an image compression application with higher efficiency.
Wed 4 FebDisplayed time zone: Hobart change
09:50 - 11:10 | Matrix and Linear Algebra AlgorithmsMain Conference at Pyrmont Chair(s): William S. Moses University of Illinois Urbana-Champaign | ||
09:50 20mTalk | Towards Singular Value Decomposition for Rank-Deficient Matrices: An Efficient and Accurate Algorithm on GPU Architectures Main Conference Lu Shi University of Electronic Science and Technology of China, WeiWei Xu Nanjing University of Information Science and Technology, Shaoshuai Zhang University of Electronic Science and Technology of China DOI | ||
10:10 20mTalk | A Diagonal Block Memory-Aware Polynomial Preconditioner for Linear and Eigenvalue Solvers Main Conference Xiaojian Yang National University of Defense Technology, Yuhui Ni National University of Defense Technology, Fan Yuan Xiangtan University, Shengguo Li National University of Defense Technology, Dezun Dong NUDT, xuchuanfu National University of Defense Technology, Haipeng Jia Jia, Jie Liu National University of Defense Technology DOI | ||
10:30 20mTalk | A Distributed Matrix-Block-Vector Multiplication in Presence of System Performance Variability Main Conference Yuchen Ma College of William & Mary, Bin Ren College of William & Mary, Andreas Stathopoulos College of William & Mary DOI | ||
10:50 20mTalk | Characterizing Matrix Multiplication Units across General Parallel Patterns in Scientific Computing Main Conference Yuechen Lu China University of Petroleum-Beijing, Hongwei Zeng , Marc Casas Barcelona Supercomputing Center, Weifeng Liu China University of Petroleum-Beijing DOI | ||