Dynamic Detection of Inefficient Data Mapping Patterns in Heterogeneous OpenMP Applications
With the growing prevalence of heterogeneous computing, CPUs are increasingly being paired with accelerators to achieve new levels of performance and energy efficiency. However, data movement between devices remains a significant bottleneck, complicating application development. Existing performance tools require considerable programmer intervention to diagnose and locate data transfer inefficiencies. To address this, we propose dynamic analysis techniques to detect and profile inefficient data transfer and allocation patterns in heterogeneous applications. We implemented these techniques in OMPDataPerf, which provides detailed traces of problematic data mappings, source code attribution, and assessments of optimization potential in heterogeneous OpenMP applications. OMPDataPerf uses the OpenMP Tools Interface (OMPT) and incurs only a 5% geometric-mean runtime overhead.
Mon 2 Feb | Displayed time zone: Hobart
15:50 - 17:10 | GPU and Heterogeneous Computing | Main Conference at Pyrmont | Chair(s): Frank Mueller (North Carolina State University, USA)
15:50 | 20m | Talk | PRISM: An Efficient GPU-Based Lossy Compression Framework for Progressive Data Retrieval with Multi-Level Interpolation | Best Paper Nominee | Main Conference | Bing Lu (Institute of Computing Technology of Chinese Academy of Sciences); Zedong Liu (University of Chinese Academy of Sciences); Hairui Zhao (Jilin University); Dejun Luo (University of Chinese Academy of Sciences); Wenjing Huang (University of Chinese Academy of Sciences); Yida Gu (University of Chinese Academy of Sciences); Jinyang Liu (University of Houston); Guangming Tan (University of Chinese Academy of Sciences); Dingwen Tao (Institute of Computing Technology, Chinese Academy of Sciences)
16:10 | 20m | Talk | Dynamic Detection of Inefficient Data Mapping Patterns in Heterogeneous OpenMP Applications | Main Conference | Luke Marzen (Iowa State University); Junhyung Shim (Iowa State University); Ali Jannesari (Iowa State University)
16:30 | 20m | Talk | Root-Down Exposure for Maximal Clique Enumeration on GPUs | Main Conference
16:50 | 20m | Talk | ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities | Main Conference | Weile Luo (The Hong Kong University of Science and Technology, Guangzhou); Yuhan Chen (The Hong Kong University of Science and Technology, Guangzhou); Xiangrui Yu (The Hong Kong University of Science and Technology, Guangzhou); Qiang Wang (Harbin Institute of Technology, Shenzhen); Ruibo Fan (The Hong Kong University of Science and Technology, Guangzhou); Hongyuan Liu (Stevens Institute of Technology); Xiaowen Chu (The Hong Kong University of Science and Technology, Guangzhou)