ASM-SpMM: Unleashing the Potential of Arm SME for Sparse Matrix Multiplication Acceleration
Sparse Matrix–Matrix Multiplication (SpMM) is a core kernel in scientific computing, data analytics, and artificial intelligence, supporting applications such as linear solvers and Graph Neural Networks (GNNs). The Scalable Matrix Extension (SME) in Armv9 introduces dedicated matrix acceleration for Arm CPUs, but exploiting its full potential for SpMM requires architecture-aware optimizations that address irregular sparsity and hardware constraints.
We present ASM-SpMM, a high-performance SpMM library co-designed with Arm SME. ASM-SpMM combines a memory-efficient compression format, an SME-aware prefetching kernel optimized for outer-product execution, a hybrid matrix–vector execution strategy, and work-stealing-based dynamic load balancing across heterogeneous cores. Experiments on emerging Armv9 platforms demonstrate up to 7.9× speedup over state-of-the-art SpMM libraries across diverse matrices. A GNN inference case study further shows that ASM-SpMM significantly improves end-to-end performance over widely used GNN frameworks, highlighting the effectiveness of SME-aware SpMM optimization on Arm CPUs.
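To make the term "outer-product execution" concrete, the following is a minimal scalar sketch of SpMM written as a sum of rank-1 updates, the formulation that SME's outer-product instructions accumulate into tile registers. It is not the ASM-SpMM kernel: it assumes a plain CSC layout for the sparse matrix rather than the paper's compression format, it uses no SME instructions, prefetching, or load balancing, and the function name `spmm_outer_product` and its parameters are hypothetical.

```python
def spmm_outer_product(n_rows, col_ptr, row_idx, values, B):
    """Compute C = A @ B for sparse A (CSC arrays, assumed layout) and dense B.

    Column k of A and row k of B form one rank-1 (outer-product) update of C.
    """
    n_cols_b = len(B[0]) if B else 0
    C = [[0.0] * n_cols_b for _ in range(n_rows)]
    for k in range(len(col_ptr) - 1):
        b_row = B[k]  # row k of the dense matrix
        # Visit only the stored nonzeros of column k of A.
        for p in range(col_ptr[k], col_ptr[k + 1]):
            i = row_idx[p]
            a_ik = values[p]
            c_row = C[i]
            # Scaled copy of B's row k added to row i of C.
            for j in range(n_cols_b):
                c_row[j] += a_ik * b_row[j]
    return C


# A = [[1, 0, 2],
#      [0, 3, 0]] stored column by column.
col_ptr = [0, 1, 2, 3]
row_idx = [0, 1, 0]
values = [1.0, 3.0, 2.0]
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
C = spmm_outer_product(2, col_ptr, row_idx, values, B)
```

Each nonzero of A touches one full row of B and one full row of C. A hardware outer-product unit performs this kind of update on a tile of C at once, which is why irregular sparsity makes it difficult to keep such a unit fully occupied.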
Tue 3 Feb (displayed time zone: Hobart)
09:50 - 11:10 | Stencil and Sparse Matrix Computation | Main Conference at Pyrmont | Chair(s): Shoaib Kamil (Adobe Research)
09:50 | 20m | Talk | SPIDER: Unleashing Sparse Tensor Cores for Stencil Computation via Strided Swapping | Main Conference | Qiqi Gu (Shanghai Jiao Tong University), Chenpeng Wu (Shanghai Jiao Tong University), Heng Shi, Jianguo Yao (Shanghai Jiao Tong University; Shanghai Enflame Technology)
10:10 | 20m | Talk | ASM-SpMM: Unleashing the Potential of Arm SME for Sparse Matrix Multiplication Acceleration | Main Conference | Jiazhi Jiang (Sun Yat-sen University), Xijia Yao (Sun Yat-sen University), Jiayu Chen (Sun Yat-sen University), Jinhui Wei (Sun Yat-sen University), Dan Huang, Yutong Lu (Sun Yat-sen University)
10:30 | 20m | Talk | Exploiting Efficient Mapping and Pipelined Execution for Accelerating SpMV on Tensor Cores | Main Conference | Kaige Zhang (Beihang University), Hailong Yang (Beihang University), Xin You (Beihang University), Tianyu Feng (Beihang University), Yufan Xu (Independent Researcher), Zhongzhi Luan (Beihang University), Yi Liu (Beihang University), Depei Qian (Beihang University)
10:50 | 20m | Talk | VDHA: Vector-Driven Hash Aggregation for Sparse Matrix-Sparse Vector Multiplication on GPUs | Main Conference | Yuchen Li (Tsinghua University), Zhe Pan (Tsinghua University), Peng Qu (Tsinghua University), Youhui Zhang (Tsinghua University)