TAC: Cache-Based System for Accelerating Billion-Scale GNN Training on Multi-GPU Platform
Graph neural networks (GNNs) have found increasingly widespread real-world applications. Under the mainstream mini-batch training paradigm, several cache-based GNN training acceleration systems have been proposed, exploiting the fact that the same vertex may be sampled many times. However, on ultra-large-scale graphs, especially those with power-law degree distributions, these systems struggle to fully exploit the distribution characteristics of the cached data, which limits training performance. To this end, we propose TAC, a GNN training acceleration system that fully exploits the distribution characteristics of cached data to optimize both data transfer and computational efficiency. First, we design a data-affinity optimization algorithm that significantly improves the locality of cache accesses. Second, we propose a sparsity-aware adaptive sparse matrix operator that dynamically selects the optimal compute mode based on where the data resides. Third, we build a fine-grained training pipeline that maximizes system parallelism by overlapping sampling with computation. Experimental results show that TAC significantly outperforms existing state-of-the-art cache-based acceleration systems on multiple benchmark datasets, achieving higher training efficiency.
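The adaptive operator idea in the abstract can be illustrated with a minimal sketch. The function name, the COO-triplet interface, and the density threshold below are all hypothetical choices for illustration, not the paper's actual implementation: the routine inspects the density of an adjacency block and dispatches to either a dense matmul or a sparse accumulation path.

```python
import numpy as np

def spmm_adaptive(adj_rows, adj_cols, adj_vals, shape, features,
                  density_threshold=0.2):
    """Multiply an adjacency block (given as COO triplets, no duplicate
    entries) by a feature matrix, choosing the compute mode from the
    block's density -- a stand-in for TAC's location/sparsity-aware
    operator selection.
    """
    n_rows, n_cols = shape
    density = len(adj_vals) / (n_rows * n_cols)
    if density >= density_threshold:
        # Dense mode: materialize the block and use a dense matmul,
        # which is typically faster for dense, cache-resident blocks.
        dense = np.zeros(shape, dtype=features.dtype)
        dense[adj_rows, adj_cols] = adj_vals
        return dense @ features
    # Sparse mode: accumulate contributions directly from the triplets,
    # avoiding materializing a mostly-zero matrix.
    out = np.zeros((n_rows, features.shape[1]), dtype=features.dtype)
    np.add.at(out, adj_rows, adj_vals[:, None] * features[adj_cols])
    return out
```

Both branches compute the same product; a real system would pick the threshold (and the kernels) per device from profiling rather than a fixed constant.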
Tue 3 Feb (displayed time zone: Hobart)
15:50 - 17:10 | Graphs and Graph Neural Networks (Main Conference) at Pyrmont | Chair(s): Ali Jannesari (Iowa State University)
15:50 (20m) Talk | ElasGNN: An Elastic Training Framework for Distributed GNN Training | Main Conference | Siqi Wang, Hailong Yang, Pengbo Wang, Hongliang Cao (Beihang University); Yufan Xu (Independent Researcher); Xuezhu Wang, Zhongzhi Luan, Yi Liu, Depei Qian (Beihang University) | DOI
16:10 (20m) Talk | APERTURE: Algorithm-System Co-optimization for Temporal Graph Network Inference | Main Conference | Yiqing Wang, Hailong Yang, Enze Yu, Qingxiao Sun, Kejie Ma, Kaige Zhang, Chenhao Xie, Depei Qian (Beihang University) | DOI
16:30 (20m) Talk | TAC: Cache-Based System for Accelerating Billion-Scale GNN Training on Multi-GPU Platform | Main Conference | Zhiqiang Liang; Hongyu Gao; Fang Liu, Jue Wang (Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences); Xingguo Shi, Juyu Gu (University of Chinese Academy of Sciences); Peng Di (Ant Group & UNSW); San Li, Lei Tang, Chunbao Zhou, Lian Zhao, Yangang Wang, Xuebin Chi (University of Chinese Academy of Sciences) | DOI
16:50 (20m) Talk | DTMiner: A Data-Centric System for Efficient Temporal Motif Mining | Main Conference | Yinbo Hou, Hao Qi (Huazhong University of Science and Technology); Ligang He (University of Warwick); Jin Zhao (Huazhong University of Science and Technology); Yu Zhang (School of Computer Science and Technology, Huazhong University of Science and Technology); Hui Yu (Hong Kong University of Science and Technology); Longlong Lin (Southwest University); Lin Gu, Wenbin Jiang, Xiaofei Liao, Hai Jin (Huazhong University of Science and Technology) | DOI