PPoPP 2026
Sat 31 January - Wed 4 February 2026 Sydney, Australia
co-located with HPCA/CGO/PPoPP/CC 2026
Tue 3 Feb 2026 14:30 - 14:50 at Pyrmont - Parallel Algorithms Chair(s): Kenjiro Taura

Automatic Differentiation (AD) computes the derivatives of numerical programs by systematically applying the chain rule, and it plays a critical role in domains such as machine learning, simulation, and control systems. However, parallelizing differentiated programs remains a significant challenge because of the conflict between tapes (a data structure for storing intermediate variables) and summations. First, the differentiation process inherently introduces inter-thread summation patterns, which require prohibitively expensive atomic operations. Second, traditional tape designs tightly couple data retrieval with the program's control flow, which prevents the code restructuring needed to eliminate these costly dependencies.
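
The inter-thread summation pattern can be illustrated with a small sketch. The example below is not taken from ParDiff; the 1-D stencil and the function names `forward`, `backward_scatter`, and `backward_gather` are assumptions chosen for illustration. In the forward pass each output reads several inputs, so in reverse mode each input gradient receives contributions from several outputs. A loop over outputs therefore scatters into shared gradient entries and would need atomic additions if run in parallel, while a re-indexed loop over inputs lets each iteration own one gradient entry.

```python
def forward(x, w):
    """y[i] = sum_j w[j] * x[i + j]: each output reads len(w) inputs."""
    n_out = len(x) - len(w) + 1
    return [sum(w[j] * x[i + j] for j in range(len(w))) for i in range(n_out)]


def backward_scatter(dy, w, n_in):
    """Reverse pass in the forward loop order.

    Different values of i write to the same dx[i + j], so running the
    outer loop in parallel would require atomic adds on dx.
    """
    dx = [0.0] * n_in
    for i in range(len(dy)):
        for j in range(len(w)):
            dx[i + j] += w[j] * dy[i]
    return dx


def backward_gather(dy, w, n_in):
    """Same gradient, re-indexed so iteration k is the only writer of dx[k].

    The outer loop has no cross-iteration writes, so it could run in
    parallel without atomics.
    """
    dx = [0.0] * n_in
    for k in range(n_in):
        acc = 0.0
        for j in range(len(w)):
            i = k - j
            if 0 <= i < len(dy):
                acc += w[j] * dy[i]
        dx[k] = acc
    return dx
```

Both reverse passes return the same gradient; only the loop structure, and hence the synchronization a parallel version would need, differs.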

To address these challenges, we present ParDiff, a novel AD system with a direct-indexed tape design that enables summation-aware loop transformations and a variety of parallel schemes for differentiated programs. The result is a higher degree of parallelization, less synchronization, and reduced inter-thread data movement. We conduct comprehensive experiments on both multi-core CPUs and GPUs. Results show that ParDiff delivers up to 483.21× speedup (geometric mean: 30.88×) over the state-of-the-art fully-AD system, Enzyme. It also achieves speedups of 2.05× and 2.06× over PyTorch on CPU and GPU, respectively. The source code is publicly available at https://github.com/roastduck/FreeTensor.
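
To make the tape distinction concrete, here is a minimal sketch under stated assumptions. It is not ParDiff's or FreeTensor's API; the function y[i] = (x[i]^2)^2 and all names are hypothetical. A stack-style tape must be popped in exactly the reverse of the forward order, which fixes the order of the reverse loop. A tape indexed directly by the loop variable can be read in any order, so the reverse loop is free to be reordered or split across threads.

```python
def forward_stack(x):
    """Forward pass that pushes the intermediate t = x*x onto a stack tape."""
    tape, y = [], []
    for xi in x:
        t = xi * xi
        tape.append(t)          # push in forward order
        y.append(t * t)
    return y, tape


def backward_stack(x, dy, tape):
    """Reverse pass: pops must mirror the pushes, so i must run backwards."""
    tape = list(tape)           # copy so the caller's tape is not consumed
    dx = [0.0] * len(x)
    for i in reversed(range(len(x))):
        t = tape.pop()
        dx[i] = dy[i] * 2.0 * t * 2.0 * x[i]   # d(t*t)/dt * dt/dx
    return dx


def forward_indexed(x):
    """Forward pass that stores the intermediate at tape[i]."""
    tape = [0.0] * len(x)
    y = [0.0] * len(x)
    for i, xi in enumerate(x):
        tape[i] = xi * xi
        y[i] = tape[i] * tape[i]
    return y, tape


def backward_indexed(x, dy, tape, order):
    """Reverse pass that visits iterations in any caller-chosen order."""
    dx = [0.0] * len(x)
    for i in order:
        dx[i] = dy[i] * 2.0 * tape[i] * 2.0 * x[i]
    return dx
```

For example, `backward_indexed(x, dy, tape, order=[1, 2, 0])` yields the same gradient as `backward_stack`, whereas the stack version gives wrong results if its loop order is changed.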

Tue 3 Feb

14:10 - 15:30
Parallel Algorithms (Main Conference) at Pyrmont
Chair(s): Kenjiro Taura The University of Tokyo
14:10
20m
Talk
Pipelonk: Accelerating End-to-End Zero-Knowledge Proof Generation on GPUs for PLONK-Based Protocols
Main Conference
Zhiyuan Zhang Shandong University, Yanxin Cai Shandong University, Wenhao Yin Shandong University, Xueyu Wu The University of Hong Kong, Yi Wang Shenzhen University, Lei Ju Shandong University, Zhuoran Ji Shandong University
14:30
20m
Talk
ParDiff: Efficiently Parallelizing Reverse-Mode Automatic Differentiation with Direct Indexing
Main Conference
Shuhong Huang Tsinghua University, Shizhi Tang Qingcheng.AI, Yuan Wen University of Aberdeen, Huanqi Cao Tsinghua University, Ruibai Tang Tsinghua University, Yidong Chen, Jiping Yu Tsinghua University, Yang Li Lenovo Research, Chao Jiang Lenovo Research, Limin Xiao Lenovo Research, Jidong Zhai Tsinghua University
14:50
20m
Talk
Faster and Cheaper: Pushing the Sequence Alignment Throughput with Commercial CPUs
Main Conference
Zhonghai Zhang Institute of Computing Technology, Chinese Academy of Sciences / University of Chinese Academy of Sciences, Yewen Li The Hong Kong University of Science and Technology, Ke Meng Chinese Academy of Sciences, Chunming Zhang Institute of Computing Technology, Chinese Academy of Sciences, Guangming Tan University of Chinese Academy of Sciences
15:10
20m
Talk
PIM-zd-tree: A Fast Space-Partitioning Index Leveraging Processing-in-Memory
Main Conference
Yiwei Zhao Carnegie Mellon University, Hongbo Kang Tsinghua University, Ziyang Men University of California, Riverside, Yan Gu University of California, Riverside, Guy E. Blelloch Carnegie Mellon University, Laxman Dhulipala University of Maryland, College Park, Charles McGuffey Reed College, Phil Gibbons Carnegie Mellon University