PPoPP 2026
Sat 31 January - Wed 4 February 2026 Sydney, Australia
co-located with HPCA/CGO/PPoPP/CC 2026
Wed 4 Feb 2026 10:50 - 11:10 at Pyrmont - Matrix and Linear Algebra Algorithms Chair(s): William S. Moses

Matrix multiplication units (MMUs) in modern parallel processors enable efficient execution of tiled matrix multiplications at varying precisions. While their effectiveness in AI workloads has been well demonstrated, their utility in scientific computing lacks systematic analysis. In this work, we characterize MMUs across a broad range of scientific computing patterns by evaluating performance, power consumption, numerical precision, and memory access behavior. To support this analysis, we develop Cubie, a comprehensive benchmark suite comprising ten MMU-optimized kernels of key parallel patterns. We also categorize MMU utilization patterns into four quadrants and identify the MMU limitations that arise in scientific computing. Through detailed comparisons with vector units, we provide nine key observations on the behavior and implications of MMUs in general scientific workloads, offering valuable insights for architecture, algorithm, and application researchers.

Wed 4 Feb

Displayed time zone: Hobart

09:50 - 11:10
Matrix and Linear Algebra Algorithms (Main Conference) at Pyrmont
Chair(s): William S. Moses University of Illinois Urbana-Champaign
09:50 (20m) Talk
Towards Singular Value Decomposition for Rank-Deficient Matrices: An Efficient and Accurate Algorithm on GPU Architectures
Main Conference
Lu Shi University of Electronic Science and Technology of China, WeiWei Xu Nanjing University of Information Science and Technology, Shaoshuai Zhang University of Electronic Science and Technology of China
10:10 (20m) Talk
A Diagonal Block Memory-Aware Polynomial Preconditioner for Linear and Eigenvalue Solvers
Main Conference
Xiaojian Yang National University of Defense Technology, Yuhui Ni National University of Defense Technology, Fan Yuan Xiangtan University, Shengguo Li National University of Defense Technology, Dezun Dong National University of Defense Technology, Chuanfu Xu National University of Defense Technology, Haipeng Jia, Jie Liu National University of Defense Technology
10:30 (20m) Talk
A Distributed Matrix-Block-Vector Multiplication in Presence of System Performance Variability
Main Conference
Yuchen Ma College of William & Mary, Bin Ren College of William & Mary, Andreas Stathopoulos College of William & Mary
10:50 (20m) Talk
Characterizing Matrix Multiplication Units across General Parallel Patterns in Scientific Computing
Main Conference
Yuechen Lu China University of Petroleum-Beijing, Hongwei Zeng, Marc Casas Barcelona Supercomputing Center, Weifeng Liu China University of Petroleum-Beijing