Towards Singular Value Decomposition for Rank-Deficient Matrices: An Efficient and Accurate Algorithm on GPU Architectures (PPoPP 2026 - Main Conference)

Sat 31 January - Wed 4 February 2026 Sydney, Australia

co-located with HPCA/CGO/PPoPP/CC 2026

Who

Lu Shi, WeiWei Xu, Shaoshuai Zhang

Track

PPoPP 2026 Main Conference

When

Wed 4 Feb 2026 09:50 - 10:10 at Pyrmont - Matrix and Linear Algebra Algorithms. Chair(s): William S. Moses

Abstract

Singular Value Decomposition (SVD) is a fundamental tool in numerous scientific and engineering domains. Many high-performance libraries, such as LAPACK, MAGMA, and cuSOLVER, provide general, truncated, and randomized SVD routines. However, when the input is a low-rank matrix whose rank is not explicitly known, existing routines usually treat it as full-rank, which leads to suboptimal performance. In this paper, we propose an efficient SVD algorithm specifically for rank-deficient matrices based on a recently proposed rank-revealing QR factorization, termed QB factorization. To further enhance numerical stability and efficiency, we introduce a Householder QB factorization and a mixed-precision SVD algorithm, accompanied by a rigorous error analysis demonstrating correctness and stability. Experimental results show that our method achieves up to 6978.71x speedup over the general (full) SVD routine in cuSOLVER and is 9.99x faster than randomized SVD in FP32 precision. Moreover, our method exhibits higher numerical accuracy than cuSOLVER full SVD, achieving substantially smaller backward errors while maintaining stable and reliable singular values. Beyond synthetic benchmarks, we also demonstrate its effectiveness in an image compression application with higher efficiency.
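
To make the idea of computing an SVD through a QB factorization concrete, the following is a minimal CPU sketch in NumPy. It is not the authors' algorithm: the paper describes a Householder QB factorization with mixed precision on GPUs, whereas this sketch swaps in a greedy column-pivoted Gram-Schmidt procedure in double precision. The function names `qb_factorize` and `low_rank_svd` and the tolerance `tol` are hypothetical and chosen only for illustration. The sketch builds an orthonormal basis Q for the column space until the residual is negligible, forms the small matrix B = QᵀA, and takes the SVD of B instead of A.

```python
import numpy as np


def qb_factorize(A, tol=1e-10):
    """Return Q (m x k, orthonormal columns) and B (k x n) with A ~= Q @ B.

    Illustrative stand-in for a rank-revealing QB factorization:
    greedy column-pivoted Gram-Schmidt, NOT the paper's Householder QB.
    The rank k is detected from the data rather than supplied by the caller.
    """
    A = np.asarray(A, dtype=np.float64)
    R = A.copy()                      # residual, deflated as the basis grows
    m, n = R.shape
    threshold = tol * np.linalg.norm(R)   # relative to the Frobenius norm of A
    basis = []
    for _ in range(min(m, n)):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))     # pivot: largest remaining column
        if norms[j] <= threshold:
            break                     # residual is negligible: rank found
        q = R[:, j] / norms[j]
        for p in basis:               # one reorthogonalization pass
            q -= (p @ q) * p
        q /= np.linalg.norm(q)
        basis.append(q)
        R -= np.outer(q, q @ R)       # remove the new direction from the residual
    Q = np.column_stack(basis) if basis else np.zeros((m, 0))
    B = Q.T @ A
    return Q, B


def low_rank_svd(A, tol=1e-10):
    """SVD of a numerically low-rank A via the small factor B (assumed sketch)."""
    Q, B = qb_factorize(A, tol)
    # B has only k rows, so this SVD is much cheaper than an SVD of A.
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ U_small, s, Vt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A 200 x 120 matrix built as a product of thin factors, so its rank is at most 5.
    A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 120))
    U, s, Vt = low_rank_svd(A)
    print("detected rank:", s.size)
    print("relative reconstruction error:",
          np.linalg.norm((U * s) @ Vt - A) / np.linalg.norm(A))
```

The cost saving comes from the shape of B: when the detected rank k is far smaller than the matrix dimensions, the dense SVD runs on a k x n matrix rather than on the full input. The speedups and accuracy figures quoted in the abstract refer to the authors' GPU implementation and should not be expected from this sketch.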

DOI

https://doi.org/10.1145/3774934.3786427

Lu Shi

University of Electronic Science and Technology of China

WeiWei Xu

Nanjing University of Information Science and Technology

Shaoshuai Zhang