HPCA/CGO/PPoPP/CC 2026
Sat 31 January - Wed 4 February 2026 Sydney, Australia

This program is tentative and subject to change.


Sat 31 Jan

Displayed time zone: Hobart

07:45 - 16:00
08:45 - 10:30
CACHP
PPoPP Workshops and Tutorials at Bondi
Chair(s): Jose Nelson Amaral University of Alberta, Bruce Hoppe Massachusetts Institute of Technology, Yihan Sun University of California, Riverside

Website with schedule: https://fastcode.org/events/coevolution-workshop/

08:45 - 10:30
Opening and Keynote Talk
CC Main Conference at Coogee
Chair(s): Uday Bondhugula Indian Institute of Science
09:00
15m
Day opening
Opening note from program chairs
CC Main Conference
Uday Bondhugula Indian Institute of Science
09:15
75m
Keynote
Building Compilers for AI Accelerators: Lessons from Real Hardware
CC Main Conference
K: Nicholas Smith Tenstorrent
10:30 - 11:00
10:30
30m
Coffee break
Break
Catering

11:00 - 12:45
CACHP
PPoPP Workshops and Tutorials at Bondi
Chair(s): Jose Nelson Amaral University of Alberta, Bruce Hoppe Massachusetts Institute of Technology, Yihan Sun University of California, Riverside

Website with schedule: https://fastcode.org/events/coevolution-workshop/

11:00 - 12:45
Optimizations
CC Main Conference at Coogee
Chair(s): Martin Kong The Ohio State University
11:00
26m
Talk
GraalMHC: ML-Based Method-Hotness Classification for Binary-Size Reduction in Optimizing Compilers
CC Main Conference
Milan Cugurovic Oracle and University of Belgrade, Aleksandar Prokopec Oracle Labs, Boris Spasojevic Oracle Labs, Zurich, Switzerland, Vojin Jovanovic Oracle Labs, Milena Vujosevic Janicic University of Belgrade and Oracle
11:26
26m
Talk
It’s about Time - Temporal Abstractions for Asynchronous GPU Tensor Computations
CC Main Conference
11:52
26m
Talk
Optimizing Sparse Tensor Compilation for Sparse Output
CC Main Conference
Shideh Hashemian University of Edinburgh, Michael F. P. O'Boyle University of Edinburgh, Amir Shaikhha University of Edinburgh
12:18
26m
Talk
RIFS: Run-time Invariant Function Specialization
CC Main Conference
Saba Jamilan University of California, Santa Cruz, Snehasish Kumar Google LLC, Heiner Litz UC Santa Cruz
12:45 - 13:45
12:45
60m
Lunch
Lunch
Catering

13:45 - 15:30
Optimizations for safety and more
CC Main Conference at Coogee
Chair(s): V Krishna Nandivada IIT Madras
13:45
26m
Talk
DiTOX: Fault Detection and Localization in the ONNX Optimizer
CC Main Conference
Nikolaos Louloudakis The University of Edinburgh, Ajitha Rajan The University of Edinburgh
14:11
26m
Talk
SSMR: Statically Detecting Speculation Safe Memory Regions to Mitigate Transient Execution Attacks
CC Main Conference
Ange-Thierry Ishimwe University of Colorado Boulder, Sam Mcdiarmid-sterling University of Colorado Boulder, Zack McKevitt University of Colorado Boulder, Tamara Silbergleit Lehman University of Colorado Boulder
14:37
26m
Talk
CHEHAB: Automatic Compiler Code Optimization for Fully Homomorphic Encryption
CC Main Conference
Riyadh Baghdadi New York University Abu Dhabi, Abdessamed Seddiki New York University Abu Dhabi and Ecole Superieure d'Informatique, Arab Mohammed New York University Abu Dhabi and Ecole Superieure d'Informatique, Zakaria Hebbal Ecole nationale Supérieure d'Informatique, Aimad Chabounia Ecole Superieure d'Informatique; New York University Abu Dhabi, Eduardo Chielle New York University Abu Dhabi, Michail Maniatakos New York University Abu Dhabi, MENACER Djamel Eddine Ecole Superieure d'Informatique, Karima Benatchba Ecole Nationale Supérieure d'Informatique, Challal Yacine University of Doha for Science and Technology
15:03
26m
Talk
Parallel and Customizable Equality Saturation
CC Main Conference
Jonathan Van der Cruysse McGill University, Abd-El-Aziz Zayed McGill University, Mai Jacob Peng McGill University, Christophe Dubach McGill University
15:30 - 16:00
15:30
30m
Coffee break
Break
Catering

16:00 - 17:45
Code generation and tuning
CC Main Conference at Coogee
Chair(s): Ari Rasch University of Muenster
16:00
26m
Talk
Accelerating Sparse Algebra with Program Synthesis
CC Main Conference
José Wesley De Souza Magalhães University of Edinburgh, Shideh Hashemian University of Edinburgh, Alexander Brauckmann University of Edinburgh, Jackson Woodruff University of Edinburgh, Elizabeth Polgreen University of Edinburgh, Michael F. P. O'Boyle University of Edinburgh
16:26
26m
Talk
Schedgehammer: Auto-Tuning Compiler Optimizations Beyond Numerical Parameters
CC Main Conference
Johannes Lenfers University of Münster, Martin Lücke AMD, Sven Spehr University of Münster, Justus Dieckmann University of Münster, Johannes Jansen University of Münster, Sergei Gorlatch University of Muenster
16:52
26m
Talk
TinyGen: Portable and Compact Code Generation for Tiny Machine Learning
CC Main Conference
Gaeun Ko Kyung Hee University, Seonyeong Heo Kyung Hee University
17:18
26m
Talk
CPerfSmith - A Randomized C Program Generator for Performance-Oriented Compiler Testing
CC Main Conference
Boda Yashwanth Indian Institute of Technology Roorkee, Chunduri Abhijit Indian Institute of Technology Roorkee, Ruchi Kumari Indian Institute of Technology Roorkee, Awanish Pandey IIT Roorkee

Sun 1 Feb

07:45 - 19:00
08:45 - 10:30
Panel + Tools
CC Main Conference at Coogee
Chair(s): Martin Kong Brookhaven National Laboratory
08:45
20m
Talk
Inside VOLT: Designing of an Open-Source GPU Compiler (Tool)
CC Main Conference
Shinnung Jeong Georgia Institute of Technology, Chihyo Ahn Georgia Tech, Huanzhi Pu Georgia Institute of Technology, Jisheng Zhao Georgia Institute of Technology, Hyesoon Kim Georgia Institute of Technology, Blaise Tine University of California, Los Angeles
09:05
20m
Talk
Nsight Python: A Python-First Profiling Toolkit for Seamless GPU Kernel Analysis (Tool)
CC Main Conference
09:30
60m
Panel
Panel: The role of compilers in the era of AI chips and programming frameworks
CC Main Conference
P: Ayal Zaks Mobileye, P: Albert Cohen Google DeepMind, P: Nicholas Smith Tenstorrent, P: Uday Bondhugula Indian Institute of Science
08:45 - 10:30
ScaleDNN
PPoPP Workshops and Tutorials at Curl Curl
Chair(s): Dhabaleswar K. Panda Ohio State University, Nawras Alnaasan Ohio State University

Website with schedule: https://nowlab.cse.ohio-state.edu/tutorials/hidl_PPoPP26/

10:30 - 11:00
10:30
30m
Coffee break
Break
Catering

11:00 - 12:45
Analysis
CC Main Conference at Coogee
Chair(s): Ajitha Rajan The University of Edinburgh
11:00
26m
Talk
HORIZON: Estimating Alias Analysis Precision Bounds and Their Impact on Performance
CC Main Conference
Khushboo Chitre IIIT Delhi, Piyus Kedia IIIT Delhi, Rahul Purandare University of Nebraska-Lincoln
11:26
26m
Talk
Type Deduction Analysis: Reconstructing Transparent Pointer Types in LLVM-IR
CC Main Conference
Niccolò Nicolosi Politecnico di Milano, Gabriele Magnani Politecnico di Milano, Emilio Corigliano Politecnico di Milano, Davide Baroffio Politecnico di Milano, Federico Reghenzani Politecnico di Milano, Giovanni Agosta Politecnico di Milano, Italy
11:52
26m
Talk
Compact Representation and Interleaved Solving for Scalable Constraint-Based Points-to Analysis
CC Main Conference
Ramya Kasaraneni IIT Madras, V Krishna Nandivada IIT Madras
12:18
26m
Talk
Practical MHP Analysis for Java
CC Main Conference
Samuel Moses IIT Madras, V Krishna Nandivada IIT Madras
11:00 - 12:45
ScaleDNN
PPoPP Workshops and Tutorials at Curl Curl
Chair(s): Dhabaleswar K. Panda Ohio State University, Nawras Alnaasan Ohio State University

Website with schedule: https://nowlab.cse.ohio-state.edu/tutorials/hidl_PPoPP26/

12:45 - 13:45
12:45
60m
Lunch
Lunch
Catering

13:45 - 15:30
DiffPP
PPoPP Workshops and Tutorials at Bondi
Chair(s): Paul Hovland Argonne National Laboratory, Jan Hueckelheim Argonne National Laboratory

Website with schedule: https://diffprog-ppopp.github.io/

13:45 - 15:30
DDRP
PPoPP Workshops and Tutorials at Bungan
Chair(s): Umang Mathur National University of Singapore, Andreas Pavlogiannis Aarhus University

More information at https://sites.google.com/view/race-prediction-tutorial.

15:30 - 16:00
15:30
30m
Coffee break
Break
Catering

16:00 - 17:45
DiffPP
PPoPP Workshops and Tutorials at Bondi
Chair(s): Paul Hovland Argonne National Laboratory, Jan Hueckelheim Argonne National Laboratory

Website with schedule: https://diffprog-ppopp.github.io/

16:00 - 17:45
DDRP
PPoPP Workshops and Tutorials at Bungan
Chair(s): Umang Mathur National University of Singapore, Andreas Pavlogiannis Aarhus University

More information at https://sites.google.com/view/race-prediction-tutorial.

18:00 - 20:00
Welcome Reception
Catering at Parkside Ballroom

All attendees registered for the main conference are invited to attend the welcome reception from 18:00 on Sunday evening, where there will be great food and drink and an opportunity to engage with the vibrant HPCA/CGO/PPoPP/CC community.

18:00
2h
Social Event
Welcome Reception
Catering

18:00 - 20:00
18:00
2h
Poster
Tensor Abstraction Enabling Explicit Layout Optimization in Homomorphic Encryption
CGO Student Research Competition
Seongho Kim Yonsei University, Hanjun Kim Yonsei University
18:00
2h
Poster
UniCon: Unified Controllers for the Quantum Computers
CGO Student Research Competition
Ercüment Kaya Technical University of München and Leibniz Supercomputing Centre, Hossam Ahmed Technical University of München and Leibniz Supercomputing Centre, Martin Schulz Technical University of Munich
18:00
2h
Poster
MDH-DSL: Reduction-Aware Data Parallelism via Multi-Dimensional Homomorphisms
CGO Student Research Competition
Richard Schulze University of Muenster, Sergei Gorlatch University of Muenster
18:00
2h
Poster
Effective Tiling for the Snitch Cluster
CGO Student Research Competition
Emily Sillars University of Murcia, Spain, Alexandra Jimborean University of Murcia
18:00
2h
Poster
Automated Adversarial Test Generation for Debugging Neural Compiler Optimizations
CGO Student Research Competition
Vasu Jindal Columbia University
18:00
2h
Poster
Unlocking Vectorization Scope: Extensible Vectorization via Unified Dependence Semantics
CGO Student Research Competition
Shihan Fang Shanghai Jiao Tong University, Wenxin Zheng Shanghai Jiao Tong University
18:00
2h
Poster
Unifying Medium Sparse Processing Frameworks
CGO Student Research Competition
Meisam Tarabkhah University of Edinburgh, Amir Shaikhha University of Edinburgh
18:00
2h
Poster
Bridging Linalg Dialect with Gemmini Backend
CGO Student Research Competition
Jaemin Kim Yonsei University, Hanjun Kim Yonsei University
18:00
2h
Poster
Leveraging Alias Analysis Without Porting
CGO Student Research Competition
Ravikiran Ravindranath Reddy University of Murcia, Alberto Ros University of Murcia, Alexandra Jimborean University of Murcia

Mon 2 Feb

07:45 - 16:00
08:30 - 08:45
Welcome
Plenary Keynotes at Pyrmont
Chair(s): Steve Blackburn Google and Australian National University, Tony Hosking Australian National University, Shuaiwen Leon Song Together AI and University of Sydney

The conference will formally open with a Welcome to Country from a Traditional Owner of the Eora Nation where the ICC is located. Following that, the General Chairs will welcome you.

08:30
15m
Day opening
Welcome
Plenary Keynotes
Steve Blackburn Google and Australian National University, Tony Hosking Australian National University, Shuaiwen Leon Song Together AI and University of Sydney
08:45 - 09:45
2025 ACM/IEEE-CS Ken Kennedy Award
Plenary Keynotes at Pyrmont
Chair(s): Steve Blackburn Google and Australian National University
08:45
60m
Keynote
Compiler 2.0: Building the Next Generation Compilers with Machine Learning
Plenary Keynotes
Saman Amarasinghe Massachusetts Institute of Technology
09:50 - 11:10
Compiling for ML 1
CGO Main Conference at Bronte
Chair(s): Albert Cohen Google DeepMind
09:50
20m
Talk
Enabling Spill-Free Compilation via Affine-Based Live Range Reduction Optimization
CGO Main Conference
Pre-print
10:10
20m
Talk
GRANII: Selection and Ordering of Primitives in GRAph Neural Networks using Input Inspection
CGO Main Conference
Damitha Lenadora University of Illinois at Urbana-Champaign, Vimarsh Sathia University of Illinois Urbana Champaign, Gerasimos Gerogiannis University of Illinois at Urbana-Champaign, Serif Yesil NVIDIA, Josep Torrellas University of Illinois at Urbana-Champaign, Charith Mendis University of Illinois at Urbana-Champaign
Pre-print
10:30
20m
Talk
Fast Autoscheduling for Sparse ML Frameworks
CGO Main Conference
Bobby Yan Stanford University, Alexander J Root Stanford University, Trevor Gale Stanford University, David Broman KTH Royal Institute of Technology, Fredrik Kjolstad Stanford University
Pre-print
10:50
20m
Talk
Eliminating Redundancy: Ultra-compact Code Generation for Programmable Dataflow Accelerators
CGO Main Conference
Prasanth Chatarasi IBM Research, Alex Gatea IBM, Bardia Mahjour IBM, Jintao Zhang Unaffiliated, Alberto Mannari IBM, Chris Bowler IBM, Shubham Jain IBM Research, Masoud Ataei Jaliseh IBM, Nicole Khoun IBM, Kamlesh Kumar Unaffiliated, Viji Srinivasan IBM Research, Swagath Venkataramani IBM Research
Pre-print
09:50 - 11:10
Cache Coherence and Chiplet Interconnects
HPCA Main Conference at Collaroy
Chair(s): Alberto Ros University of Murcia
09:50
20m
Talk
$C^3$: CXL Coherence Controllers for Heterogeneous Architectures
HPCA Main Conference
Anatole Lefort Technical University of Munich (TUM), David Schall Technical University of Munich, Nicolò Carpentieri Technical University of Munich, Julian Pritzi Technical University of Munich, Soham Chakraborty TU Delft, Nicolai Oswald NVIDIA, Pramod Bhatotia TU Munich
Pre-print
10:10
20m
Talk
Cohet: A CXL-Driven Coherent Heterogeneous Computing Framework with Hardware-Calibrated Full-System Simulation
HPCA Main Conference
Yanjing Wang National University of Defense Technology, Lizhou Wu National University of Defense Technology, Sunfeng Gao National University of Defense Technology, Yibo Tang National University of Defense Technology, Junhui Luo National University of Defense Technology, Zicong Wang National University of Defense Technology, Yang Ou National University of Defense Technology, Dezun Dong NUDT, Nong Xiao National University of Defense Technology & Sun Yat-sen University, Mingche Lai National University of Defense Technology
10:30
20m
Talk
PhasedStore: Supporting High-performance Write-through Cache-coherence Protocols under TSO
HPCA Main Conference
Burak Ocalan University of Illinois Urbana-Champaign, Chloe Alverti University of Illinois at Urbana-Champaign, Shashwat Jaiswal University of Illinois Urbana-Champaign, USA, Antonis Psistakis University of Illinois Urbana-Champaign, David Koufaty Unaffiliated, Suyash Mahar UC San Diego, Steven Swanson University of California San Diego, Josep Torrellas University of Illinois at Urbana-Champaign
10:50
20m
Talk
Deadlock-Free Bridge Module for Inter-Chiplet Communication in Open Chiplet Ecosystem
HPCA Main Conference
Zhiqiang Chen National University of Defense Technology, Wenwen Fu National University of Defense Technology, Yongwen Wang National University of Defense Technology, Hongwei Zhou National University of Defense Technology
09:50 - 11:10
Best Paper Candidates
HPCA Main Conference at Coogee
Chair(s): Moinuddin K. Qureshi Georgia Tech
09:50
20m
Talk
Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models
HPCA Main Conference
Chiyue Wei Duke University, Cong Guo Duke University, Junyao Zhang Duke University, Haoxuan Shan Duke University, Yifan Xu Duke University, Ziyue Zhang Duke University, Yudong Liu Duke University, Qinsi Wang Duke University, Changchun Zhou Duke University, Hai "Helen" Li Duke University, Yiran Chen Duke University
10:10
20m
Talk
LoCaLUT: Harnessing Capacity–Computation Tradeoffs for LUT-Based Inference in DRAM-PIM
HPCA Main Conference
Junguk Hong Seoul National University, Changmin Shin Seoul National University, Sukjin Kim Seoul National University, Si Ung Noh Seoul National University, Taehee Kwon Seoul National University, Seongyeon Park Seoul National University, Hanjun Kim Yonsei University, Youngsok Kim Yonsei University, Jinho Lee Seoul National University
10:30
20m
Talk
RPU - A Reasoning Processing Unit
HPCA Main Conference
Matthew Adiletta Harvard University, David Brooks Harvard University, Gu-Yeon Wei Harvard University
10:50
20m
Talk
PinDrop: Breaking the Silence on SDCs in a Large-Scale Fleet
HPCA Main Conference
Peter W. Deutsch Massachusetts Institute of Technology/Meta, Harish D. Dixit Meta, Gautham Vunnam Meta, Carl Moran Meta, Eleanor Ozer Meta, Sriram Sankar Meta
09:50 - 11:10
Homomorphic Encryption Acceleration
HPCA Main Conference at Cronulla
Chair(s): Jung Ho Ahn Seoul National University
09:50
20m
Talk
UniFHE: Faster Accelerator for FHE with Diverse Algebraic Structure and Balanced Memory System
HPCA Main Conference
Qingyun Niu Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS and School of Cyber Security, University of Chinese Academy of Sciences, Lutan Zhao State Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS, Ming Cai Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS and School of Cyber Security, University of Chinese Academy of Sciences, kai li Institute of Information Engineering, CAS, Dan Meng Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Rui Hou Institute of Information Engineering, CAS
10:10
20m
Talk
Leveraging ASIC AI Chips for Homomorphic Encryption
HPCA Main Conference
Jianming Tong Georgia Institute of Technology, Tianhao Huang MIT, Leo de Castro MIT, Anirudh Itagi Georgia Institute of Technology, Jingtian Dang Georgia Tech, Anupam Golder Georgia Institute of Technology, Asra Ali Google, Jevin Jiang Google, Jeremy Kun Google, Arvind Massachusetts Institute of Technology, G. Edward Suh Cornell University, USA, Tushar Krishna Georgia Institute of Technology
Pre-print
10:30
20m
Talk
CROPHE: Cross-Operator Dataflow Optimization for Fully Homomorphic Encryption Accelerators
HPCA Main Conference
Xinhua Chen Fudan University, Jiangbin Dong Xi'an Jiaotong University, Hongren Zheng Tsinghua University, Tian Tang Tsinghua University, Mingyu Gao Tsinghua University
10:50
20m
Talk
Peregrine: Accelerating TFHE Bootstrapping on GPUs via Multi-Level External Product Co-Design
HPCA Main Conference
Haoqi He State Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, Chinese Academy of Sciences and School of Cyber Security, University of Chinese Academy of Sciences, Zhiwei Wang State Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS, Lutan Zhao State Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS, Dian Jiao State Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS, Dan Meng Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Rui Hou Institute of Information Engineering, CAS
09:50 - 11:10
Concurrency Control
PPoPP Main Conference at Pyrmont
Chair(s): Madan Musuvathi Microsoft Research
09:50
20m
Talk
Binary Compatible Critical Section Delegation (Best Paper Award)
PPoPP Main Conference
Junyao Zhang, Zhuo Wang Alibaba Group, Zhe Zhou Fudan University
DOI
10:10
20m
Talk
Hapax Locks: Scalable Value-Based Mutual Exclusion
PPoPP Main Conference
Dave Dice Independent, Alex Kogan Oracle Labs
DOI
10:30
20m
Talk
Fixing Non-blocking Data Structures for Better Compatibility with Memory Reclamation Schemes
PPoPP Main Conference
Md Amit Hasan Arovi Pennsylvania State University, Ruslan Nikolaev Pennsylvania State University
DOI
10:50
20m
Talk
Multiverse: Transactional Memory with Dynamic Multiversioning
PPoPP Main Conference
Gaetano Coccimiglio University of Waterloo, Trevor Brown University of Waterloo, Srivatsan Ravi University of Southern California
DOI
11:10 - 11:30
11:10
20m
Coffee break
Break
Catering

11:30 - 12:50
Security
CGO Main Conference at Balmoral
Chair(s): Michael Franz University of California, Irvine
11:30
20m
Talk
PriTran: Privacy-Preserving Inference for Transformer-Based Language Models under Fully Homomorphic Encryption
CGO Main Conference
Yuechen Mu UNSW, Guangli Li Institute of Computing Technology, Chinese Academy of Sciences, Shiping Chen Data61 at CSIRO, Australia / UNSW, Australia, Jingling Xue University of New South Wales
Pre-print
11:50
20m
Talk
FHEFusion: Enabling Operator Fusion in FHE Compilers for Depth-Efficient DNN Inference
CGO Main Conference
Tianxiang Sui Ant Group, Jianxin Lai Ant Group, Long Li Ant Group, Peng Yuan Ant Group, Yan Liu Ant Group, Qing Zhu Ant Group, Xiaojing Zhang Ant Group, Linjie Xiao Ant Group, Mingzhe Zhang Ant Group, Jingling Xue University of New South Wales
Pre-print Media Attached
12:10
20m
Talk
Towards Path-Aware Coverage-Guided Fuzzing
CGO Main Conference
Giacomo Priamo Sapienza University of Rome, Daniele Cono D'Elia Sapienza University of Rome, Mathias Payer EPFL, Leonardo Querzoni Sapienza University Rome
Pre-print Media Attached
12:30
20m
Talk
SecSwift, a Compiler-Based Framework for Software Countermeasures in Cybersecurity
CGO Main Conference
François de Ferrière STMICROELECTRONICS, Yves Janin STMICROELECTRONICS, Sirine Mechmech Grenoble INP
Pre-print
11:30 - 12:50
Abstractions
CGO Main Conference at Bronte
Chair(s): Antonino Tumeo Pacific Northwest National Laboratory
11:30
20m
Talk
Partial-Evaluation Templates: Accelerating Partial Evaluation with Pre-compiled Templates
CGO Main Conference
Florian Huemer JKU Linz, Aleksandar Prokopec Oracle Labs, David Leopoldseder Oracle Labs, Raphael Mosaner Oracle Labs, Hanspeter Mössenböck JKU Linz
Pre-print
11:50
20m
Talk
Pyls: Enabling Python Hardware Synthesis with Dynamic Polymorphism via LCRS Encoding
CGO Main Conference
Bolei Tong Wuhan University, Yongyan Fang Wuhan University, Wang Chaorui Wuhan University, Qingan Li Wuhan University, China, Jingling Xue University of New South Wales, YUAN Mengting School of Computer Science, Wuhan University, Wuhan, China
Pre-print
12:10
20m
Talk
SkeleShare: Algorithmic Skeletons and Equality Saturation for Hardware Resource Sharing
CGO Main Conference
Jonathan Van der Cruysse McGill University, Tzung-Han Juang McGill University, Shakiba Bolbolian Khah McGill University, Christophe Dubach McGill University
Pre-print Media Attached
12:30
20m
Talk
Ember: A Compiler for Embedding Operations on Decoupled Access-Execute Architectures
CGO Main Conference
Marco Siracusa Barcelona Supercomputing Center; Universitat Politècnica de Catalunya, Olivia Hsu Stanford University, Víctor Soria-Pardos Barcelona Supercomputing Center, Joshua Randall Arm, Arnaud Grasset Arm, Eric Biscondi Arm, Douglas J. Joseph Arm, Randy Allen Barcelona Supercomputing Center, Fredrik Kjolstad Stanford University, Miquel Moreto Technical University of Catalonia, Adrià Armejach Sanosa Barcelona Supercomputing Center & Universitat Politècnica de Catalunya
Pre-print Media Attached
11:30 - 12:50
DRAM Security and Reliability
HPCA Main Conference at Collaroy
Chair(s): Saugata Ghose University of Illinois Urbana-Champaign
11:30
20m
Talk
MIRZA: Efficiently Mitigating Rowhammer with Randomization and ALERT
HPCA Main Conference
Hritvik Taneja Georgia Tech, Ali Hajiabadi ETH Zurich, Michele Marazzi ABB Research, Kaveh Razavi ETH Zürich, Moinuddin K. Qureshi Georgia Tech
11:50
20m
Talk
SALT: Track-and-Mitigate Subarrays, Not Rows, for Blast-Radius-Free Rowhammer Defense
HPCA Main Conference
12:10
20m
Talk
ReScue: Reliable and Secure CXL Memory
HPCA Main Conference
Chihun Song UIUC, Austin Antony Cruz UIUC, Michael Jaemin Kim Meta, Minbok Wi Seoul National University, Gaohan Ye UIUC, Kyungsan Kim Samsung Electronics, Sangyeol Lee Samsung Electronics, Jung Ho Ahn Seoul National University, Nam Sung Kim UIUC
12:30
20m
Talk
Secret Caching Sauce for High-Performance Secure Memory
HPCA Main Conference
Xu Jiang Huazhong University of Science and Technology, Xueliang Wei Huazhong University of Science and Technology, YiFei Qu Huazhong University of Science and Technology, Dan Feng Huazhong University of Science and Technology, China, Yulai Xie Huazhong University of Science and Technology, Wei Tong Huazhong University of Science and Technology, China
11:30 - 12:50
Near-Data Processing and Storage
HPCA Main Conference at Coogee
Chair(s): Jisung Park POSTECH (Pohang University of Science and Technology)
11:30
20m
Talk
PIMphony: Overcoming Bandwidth and Capacity Inefficiency in PIM-based Long-Context LLM Inference System
HPCA Main Conference
hyucksung kwon Hanyang University, Kyungmo Koo Hanyang University, Janghyeon Kim Hanyang University, Woongkyu Lee Hanyang University, Minjae Lee Hanyang University, Gyeonggeun Jung KAIST, Hyungdeok Lee Solution Advanced Technology, SK hynix, Yousub Jung Solution Advanced Technology, SK hynix, Jaehan Park Solution Advanced Technology, SK hynix, Yosub Song Solution Advanced Technology, SK hynix, Byeongsu Yang Solution Advanced Technology, SK hynix, Haerang Choi Solution Advanced Technology, SK hynix, Guhyun Kim Solution Advanced Technology, SK hynix, Jongsoon Won Solution Advanced Technology, SK hynix, Woojae Shin Solution Advanced Technology, SK hynix, Changhyun Kim Solution Advanced Technology, SK hynix, Shin Gyeongcheol Solution Advanced Technology, SK hynix, Yongkee Kwon Tenstorrent, Ilkon Kim Solution Advanced Technology, SK hynix, Euicheol Lim SK hynix, John Kim KAIST, Jungwook Choi Hanyang University
11:50
20m
Talk
Adaptive Draft Sequence Length: Enhancing Speculative Decoding Throughput on PIM-Enabled Systems
HPCA Main Conference
Runze Wang Huazhong University of Science and Technology, Qinggang Wang Huazhong University of Science and Technology, Haifeng Liu Huazhong University of Science and Technology, Long Zheng Huazhong University of Science and Technology, XIAOFEI LIAO Huazhong University of Science and Technology, Hai Jin Huazhong University of Science and Technology, Jingling Xue University of New South Wales
12:10
20m
Talk
Conduit: Programmer-Transparent Near-Data Processing Using Multiple Compute-Capable Resources in SSDs
HPCA Main Conference
Rakesh Nadig ETH Zurich, Vamanan Arulchelvan ETH Zurich, Mayank Kabra ETH Zurich, Harshita Gupta ETH Zurich, Rahul Bera ETH Zurich, Nika Mansouri Ghiasi ETH Zurich, Nanditha Rao ETH Zurich, Qingcai Jiang ETH Zurich, Andreas Kosmas Kakolyris ETH Zurich, Yu Liang ETH Zurich, Mohammad Sadrosadati ETH Zürich, Onur Mutlu ETH Zurich
12:30
20m
Talk
N-DIPPER: A Distributed Inter-die Peak Power Management Network for NAND Systems
HPCA Main Conference
Jinwoo Park KAIST, John Kim KAIST
11:30 - 12:50
Quantum Computing Architecture
HPCA Main Conference at Cronulla
Chair(s): Frank Mueller NCSU
11:30
20m
Talk
Toward Scalable Gate-Level Parallelism on Trapped-Ion Processors with Racetrack Electrodes
HPCA Main Conference
Enhyeok Jang Yonsei University, Hyungseok Kim Yonsei University, Yongju Lee Yonsei University, Jaewon Kwon Yonsei University, Yipeng Huang Rutgers University, Won Woo Ro Yonsei University
11:50
20m
Talk
Cyclone: Designing Efficient and Highly Parallel QCCD Architectural Codesigns for Fault Tolerant Quantum Memory
HPCA Main Conference
Sahil Khan Duke University, Abhinav Anand Duke University, Kenneth R. Brown Duke University, Jonathan M. Baker University of Texas, Austin
12:10
20m
Talk
d'ArQ: A QOC Framework with Causality-Aware Grouping and Basis Selection
HPCA Main Conference
Changheon Lee Yonsei University, Hyungseok Kim Yonsei University, Seungwoo Choi Yonsei University, Youngmin Kim Yonsei University, Won Woo Ro Yonsei University
12:30
20m
Talk
Pinball: A Cryogenic Predecoder for Quantum Error Correction Decoding Under Circuit-Level Noise
HPCA Main Conference
Alexander Knapen University of Michigan, Guanchen Tao University of Michigan, Jacob Mack University of Michigan, Tomas Bruno University of Michigan, Mehdi Saligane Brown University, Dennis Sylvester University of Michigan, Qirui Zhang University of Michigan, Gokul Subramanian Ravi University of Michigan
11:30 - 12:50
Scheduling and Load Balancing
PPoPP Main Conference at Pyrmont
Chair(s): V Krishna Nandivada IIT Madras
11:30
20m
Talk
Rethinking Thread Scheduling under Oversubscription: A User-Space Framework for Coordinating Multi-runtime and Multi-process Workloads (Best Paper Nominee)
PPoPP Main Conference
Aleix Roca Barcelona Supercomputing Center, Vicenç Beltran Barcelona Supercomputing Center
DOI
11:50
20m
Talk
Waste-Efficient Work Stealing
PPoPP Main Conference
Kyle Singer Massachusetts Institute of Technology, Kunal Agrawal Washington University in St. Louis, TB Schardl Massachusetts Institute of Technology
DOI
12:10
20m
Talk
DiggerBees: Depth First Search Leveraging Hierarchical Block-Level Stealing on GPUs
PPoPP Main Conference
Yuyao Niu Barcelona Supercomputing Center, Yuechen Lu China University of Petroleum-Beijing, Weifeng Liu China University of Petroleum-Beijing, Marc Casas Barcelona Supercomputing Center
DOI
12:30
20m
Talk
PANA: A Fine-Grained Runtime-Adaptive Load Balancing for Parallel SpMV on Multicore CPUs
PPoPP Main Conference
Haodong Bian Tsinghua University, Youhui Zhang Tsinghua University, Xiang Fei Tsinghua University, Jianqiang Huang Qinghai University, Xiaoying Wang Qinghai University
DOI
12:50 - 14:10
12:50
80m
Lunch
Lunch
Catering

14:10 - 15:30
Memory
CGO Main Conference at Balmoral
Chair(s): Christophe Guillon STMicroelectronics
14:10
20m
Talk
Flow-Graph-Aware Tiling and Rescheduling for Memory-Efficient On-Device Inference
CGO Main Conference
Yeonoh Jeong Yonsei University, Taehyeong Park Yonsei University, Yongjun Park Yonsei University
Pre-print
14:30
20m
Talk
VFlatten: Selective Value-Object Flattening using Hybrid Static and Dynamic Analysis
CGO Main Conference
Arjun H. Kumar IIT Mandi, Bhavya Hirani SVNIT, Surat, Hang Shao IBM, Tobi Ajila IBM, Vijay Sundaresan IBM Canada, Daryl Maier IBM Canada, Manas Thakur IIT Bombay
Pre-print Media Attached
14:50
20m
Talk
FRUGAL: Pushing GPU Applications beyond Memory Limits
CGO Main Conference
Lingqi Zhang RIKEN RCCS, Tengfei Wang Google Cloud, Jiajun Huang University of California, Riverside, Chen Zhuang Tokyo Institute of Technology, Riken Center for Computational Science, Ivan Ivanov Institute of Science Tokyo, Peng Chen RIKEN RCCS, Toshio Endo, Mohamed Wahib RIKEN Center for Computational Science
Pre-print
15:10
20m
Talk
Automatic Data Enumeration for Fast Collections
CGO Main Conference
Tommy McMichen Northwestern University, Simone Campanoni Google / Northwestern University
Pre-print Media Attached
14:10 - 15:30
DSLs
CGO Main Conference at Bronte
Chair(s): Olivia Hsu Stanford University
14:10
20m
Talk
FORTE: Online DataFrame Query Optimizer
CGO Main Conference
Yoonho Choi POSTECH, Kyoungtae Lee Seoul National University, Minji Kim Ewha Womans University, Hyungsoo Jung Seoul National University, Hyojin Sung Seoul National University
Pre-print
14:30
20m
Talk
LEGO: A Layout Expression Language for Code Generation of Hierarchical Mapping
CGO Main Conference
Amir Mohammad Tavakkoli University of Utah, Cosmin E. Oancea University of Copenhagen, Denmark, Mary Hall University of Utah
Pre-print Media Attached
14:50
20m
Talk
Pushing Tensor Accelerators beyond MatMul in a User-Schedulable Language
CGO Main Conference
Yihong Zhang University of Washington, Derek Gerstmann Adobe, Andrew Adams Adobe Research, Maaz Bin Safeer Ahmad University of Washington, Seattle
Pre-print Media Attached
15:10
20m
Talk
Tawa: Automatic Warp Specialization for Modern GPUs with Asynchronous References
CGO Main Conference
Hongzheng Chen Cornell University, Bin Fan Nvidia, Alexander Collins NVIDIA, Bastian Hagedorn NVIDIA, Evghenii Gaburov NVIDIA, Masahiro Masuda NVIDIA, Matthew Brookhart NVIDIA, Chris Sullivan NVIDIA, Jason Knight NVIDIA, Zhiru Zhang Cornell University, USA, Vinod Grover NVIDIA
Pre-print
14:10 - 15:30
Memory System Reliability
HPCA Main Conference at Collaroy
Chair(s): Haiyu Mao King's College London
14:10
20m
Talk
Predicting DRAM Failures at Scale: A Two-Stage Approach for Heterogeneous Systems
HPCA Main Conference
Chenglin Wang Xiamen University, Shouxin Wang Xiamen University, Shuyue Zhou Xiamen University, Ronglong Wu Xiamen University, Zhirong Shen Xiamen University, Lu Tang Xiamen University, Yiming Zhang Xiamen University, Jialiang Yu Huawei, Min Zhou Huawei
14:30
20m
Talk
MemSOS: OS-Guided Selective Memory Mirroring
HPCA Main Conference
Junghoon Kim Seoul National University & Samsung Electronics, Jongheon Jeong Seoul National University, Seokwon Moon Seoul National University, Seong Hoon Seo Seoul National University, Yeonhong Park Seoul National University, Jinkyu Jeong Yonsei University, Nam Sung Kim UIUC, Jae W. Lee Seoul National University
14:50
20m
Talk
ASPA: Reassigning DDR5 Parity Bandwidth
HPCA Main Conference
Fan Li University of Central Florida, Qiufeng Li George Washington University, Yanan Guo University of Rochester, Weidong Cao George Washington University, Xin Xin University of Central Florida
15:10
20m
Talk
HR-DCIM: High-Reliability Floating-Point Digital CIM Architecture with Unified Low-Cost Iterative Error Correction
HPCA Main Conference
Zhen He Tsinghua University, Yiqi Wang Tsinghua University, Zhiheng Yue Tsinghua University, Zihan Wu Tsinghua University, Huiming Han Tsinghua University, Shaojun Wei Tsinghua University, Yang Hu Tsinghua University, Fengbin Tu The Hong Kong University of Science and Technology, Shouyi Yin Tsinghua University
14:10 - 15:30
LLM Inference Serving Systems
HPCA Main Conference at Coogee
Chair(s): Jian Li Chinese Academy of Meteorological Sciences
14:10
20m
Talk
Towards Resource-Efficient Serverless LLM Inference with SLINFER
HPCA Main Conference
Chuhao Xu Shanghai Jiao Tong University, Zijun Li Shanghai Jiao Tong University, Quan Chen Shanghai Jiao Tong University, China, Han Zhao Shanghai Jiao Tong University, Xueyan Tang Nanyang Technological University, Minyi Guo Shanghai Jiao Tong University
14:30
20m
Talk
ELORA: Efficient LoRA and KV Cache Management for Multi-LoRA LLM Serving
HPCA Main Conference
Jiuchen Shi Shanghai Jiao Tong University & The Hong Kong Polytechnic University, Hang Zhang Shanghai Jiao Tong University, Yixiao Wang Shanghai Jiao Tong University, Quan Chen Shanghai Jiao Tong University, China, Yizhou Shan Huawei Cloud, Kaihua Fu Hong Kong University of Science and Technology, Wei Wang Hong Kong University of Science and Technology, Minyi Guo Shanghai Jiao Tong University
14:50
20m
Talk
PASCAL: A Phase-Aware Scheduling Algorithm for Serving Reasoning-based Large Language Models
HPCA Main Conference
15:10
20m
Talk
The Cost of Dynamic Reasoning: Demystifying AI Agents and Test-Time Scaling from an AI Infrastructure Perspective
HPCA Main Conference
Jiin Kim KAIST, Byeongjun Shin KAIST, Jinha Chung KAIST, Minsoo Rhu KAIST
14:10 - 15:30
Quantum Compilation and Simulation
HPCA Main Conference at Cronulla
Chair(s): Gokul Subramanian Ravi University of Michigan
14:10
20m
Talk
CLINE: Improving Control Flow Compilation of Quantum Programs with Control Line Encoding
HPCA Main Conference
Anbang Wu Shanghai Jiao Tong University, Liqiang Lu Zhejiang University, Jianwei Yin Zhejiang University, Jingwen Leng Shanghai Jiao Tong University, Minyi Guo Shanghai Jiao Tong University
14:30
20m
Talk
Fully Parallelized BP Decoding for Quantum LDPC Codes Can Outperform BP-OSD
HPCA Main Conference
Ming Wang North Carolina State University, Ang Li Pacific Northwest National Laboratory, Frank Mueller North Carolina State University, USA
14:50
20m
Talk
DC-MBQC: A Distributed Quantum Compilation Framework for Measurement-Based Quantum Computing
HPCA Main Conference
Yecheng Xue Peking University, Rui Yang Peking University, Zhiding Liang The Chinese University of Hong Kong, Tongyang Li Peking University
15:10
20m
Talk
TraceQ: Trace-Based Reconstruction of Quantum Circuit Dataflow in Surface-Code Fault-Tolerant Quantum Computing
HPCA Main Conference
Theodoros Trochatos Yale University, Christopher Kang University of Chicago, Andrew Wang Cornell University, Frederic T. Chong University of Chicago, Jakub Szefer Northwestern University
14:10 - 15:30
Concurrent Data Structures
PPoPP Main Conference at Pyrmont
Chair(s): Calin Cascaval Google DeepMind
14:10
20m
Talk
UFO Trees: Practical and Provably-Efficient Parallel Batch-Dynamic Trees (Best Paper Nominee)
PPoPP Main Conference
Quinten De Man University of Maryland, Atharva Sharma University of Maryland, Kishen N Gowda University of Maryland, Laxman Dhulipala University of Maryland, College Park
DOI
14:30
20m
Talk
Sharded Elimination and Combining for Highly-Efficient Concurrent Stacks
PPoPP Main Conference
Ajay Singh FORTH ICS, Nikos Metaxakis, Panagiota Fatourou FORTH ICS and University of Crete, Greece
DOI
14:50
20m
Talk
Concurrent Balanced Augmented Trees
PPoPP Main Conference
Evan Wrench University of British Columbia, Ajay Singh FORTH ICS, Younghun Roh Massachusetts Institute of Technology, Panagiota Fatourou University of Crete & FORTH, Siddhartha Jayanti Google Research, Eric Ruppert York University, Yuanhao Wei University of British Columbia
DOI
15:10
20m
Talk
Parallel Dynamic Spatial Indexes
PPoPP Main Conference
Ziyang Men University of California, Riverside, Bo Huang University of California, Riverside, Yan Gu University of California, Riverside, Yihan Sun University of California, Riverside
DOI
15:30 - 15:50
15:30
20m
Coffee break
Break
Catering

15:50 - 17:10
Quantum / HLS
CGO Main Conference at Balmoral
Chair(s): Aaron Smith
15:50
20m
Talk
Dependence-Driven, Scalable Quantum Circuit Mapping with Affine Abstractions
CGO Main Conference
Marouane Benbetka École Nationale Supérieure d’Informatique, Merwan BEKKAR École Nationale Supérieure d’Informatique, Riyadh Baghdadi New York University Abu Dhabi, Martin Kong Ohio State University
Pre-print Media Attached
16:10
20m
Talk
Space-Time Optimisations for Early Fault-Tolerant Quantum Computation
CGO Main Conference
Sanaa Sharma University of Cambridge, Prakash Murali University of Cambridge
Pre-print Media Attached
16:30
20m
Talk
OpenQudit: Extensible and Accelerated Numerical Quantum Compilation via a JIT-Compiled DSL
CGO Main Conference
Ed Younis Lawrence Berkeley National Laboratory
Pre-print Media Attached
16:50
20m
Talk
Selene: Cross-Level Barrier-Free Pipelining for Irregular Nested Loops in High-Level Synthesis
CGO Main Conference
Sungwoo Yun Yonsei University, Seonyoung Cheon Yonsei University, Dongkwan Kim Yonsei University, Heelim Choi Yonsei University, Kunmo Jeong Yonsei University, Chan Lee Yonsei University, Yongwoo Lee DGIST, Hanjun Kim Yonsei University
Pre-print
15:50 - 17:10
Parallelization / Vectorization
CGO Main Conference at Bronte
Chair(s): V Krishna Nandivada IIT Madras
15:50
20m
Talk
Enabling Automatic Compiler-Driven Vectorization of Transformers
CGO Main Conference
Shreya Alladi University of Murcia, Alberto Ros University of Murcia, Alexandra Jimborean University of Murcia
Pre-print Media Attached
16:10
20m
Talk
Unlocking Python Multithreading Capabilities using OpenMP-Based Programming with OMP4Py
CGO Main Conference
César Piñeiro University of Santiago de Compostela, Juan C. Pichel University of Santiago de Compostela
Pre-print Media Attached
16:30
20m
Talk
The Parallel-Semantics Program Dependence Graph for Parallel Optimization
CGO Main Conference
Yian Su Northwestern University, Brian Homerding Northwestern University, Haocheng Gao Northwestern University, Federico Sossai Northwestern University, Yebin Chon Princeton University, David I. August Princeton University, Simone Campanoni Google / Northwestern University
Pre-print Media Attached
16:50
20m
Talk
From Threads to Tiles: T2T, a Compiler for CUDA-to-NPU Translation via 2D Vectorization
CGO Main Conference
Shuaijiang Li Institute of Computing Technology at Chinese Academy of Sciences, Jiacheng Zhao Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences; Zhongguancun Laboratory, Ying Liu Institute of Computing Technology, Chinese Academy of Sciences, Shuoming Zhang Institute of Computing Technology at Chinese Academy of Sciences, Lei Chen University of Chinese Academy of Sciences, Yijin Li Institute of Computing Technology at Chinese Academy of Sciences, Yangyu Zhang Institute of Computing Technology, Chinese Academy of Sciences, lizhicheng Institute of Computing Technology at Chinese Academy of Sciences, Runyu Zhou Institute of Computing Technology at Chinese Academy of Sciences, Xiyu Shi Institute of Computing Technology at Chinese Academy of Sciences, Chunwei Xia University of Leeds, Yuan Wen University of Aberdeen, Xiaobing Feng ICT CAS, Huimin Cui Institute of Computing Technology, Chinese Academy of Sciences
Pre-print
15:50 - 17:10
Processing-in-Memory Architectures
HPCA Main Conference at Collaroy
Chair(s): Byeongho Kim Samsung Electronics
15:50
20m
Talk
The Memory Processing Unit: A Generalized Interface for End-to-End In-Memory Execution
HPCA Main Conference
Minh S. Q. Truong Carnegie Mellon University, Yiqiu Sun University of Illinois Urbana-Champaign, Dawei Xiong University of Illinois Urbana-Champaign, Amol Shah University of Illinois Urbana-Champaign, Alex Glass Carnegie Mellon University, Abraham Farrell University of Illinois Urbana-Champaign, James A. Bain Carnegie Mellon University, L. Richard Carley Carnegie Mellon University, Saugata Ghose University of Illinois Urbana-Champaign
Link to publication
16:10
20m
Talk
CoCoTree: A Computation-Capable Architecture for Collective Communication in Scalable PIM
HPCA Main Conference
Shunchen Shi Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Qijia Yang Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Fan Yang Institute of Computing Technology, Chinese Academy of Sciences, Yu Huang Huazhong University of Science and Technology, Youwei Zhuo Peking University, Zhichun Li Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Ninghui Sun State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Xueqi Li State Key Lab of Processors, Institute of Computing Technology, CAS
16:30
20m
Talk
PIM-malloc: A Fast and Scalable Dynamic Memory Allocator for Processing-In-Memory (PIM) Architectures
HPCA Main Conference
Dongjae Lee KAIST, Bongjoon Hyun Samsung, Youngjin Kwon KAIST, Minsoo Rhu KAIST
16:50
20m
Talk
Count2Multiply: Reliable In-Memory High-Radix Counting
HPCA Main Conference
Joao Paulo Cardoso de Lima TU Dresden, ScaDS.AI, Benjamin F. Morris III Duke University, Asif Ali Khan TU Dresden, Germany, Jeronimo Castrillon TU Dresden, Germany, Alex Jones Syracuse University
15:50 - 17:10
Efficient LLM Inference Techniques
HPCA Main Conference at Coogee
Chair(s): Jovan Stojkovic University of Illinois at Urbana-Champaign
15:50
20m
Talk
PADE: A Predictor-Free Sparse Attention Accelerator via Unified Execution and Stage Fusion
HPCA Main Conference
Huizheng Wang Tsinghua University, Hongbin Wang Tsinghua University, Zichuan Wang Tsinghua University, Zhiheng Yue Tsinghua University, Yang Wang Tsinghua University, Chao Li Shanghai Jiao Tong University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University
16:10
20m
Talk
AQPIM: Breaking the PIM Capacity Wall for LLMs with In-Memory Activation Quantization
HPCA Main Conference
Kosuke Matsushima Institute of Science Tokyo, Yasuyuki Okoshi Institute of Science Tokyo, Masato Motomura Institute of Science Tokyo, Daichi Fujiki Institute of Science Tokyo
16:30
20m
Talk
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache
HPCA Main Conference
Dayou Du University of Edinburgh, Shijie Cao Microsoft Research, Jianyi Cheng University of Edinburgh, UK, Luo Mai University of Edinburgh, Ting Cao Institute for AI Industry Research (AIR), Tsinghua University, Mao Yang Microsoft Research
16:50
20m
Talk
GyRot: Leveraging Hidden Synergy between Rotation and Fine-grained Group Quantization for Low-bit LLM Inference
HPCA Main Conference
15:50 - 17:10
3D Graphics and Rendering Acceleration
HPCA Main Conference at Cronulla
Chair(s): Yunho Oh Korea University
15:50
20m
Talk
GRTX: Efficient Ray Tracing for 3D Gaussian-Based Rendering
HPCA Main Conference
Junseo Lee Seoul National University, Sangyun Jeon Seoul National University, Jungi Lee Seoul National University, Junyong Park Seoul National University, Jaewoong Sim Seoul National University
16:10
20m
Talk
Splatonic: Architecture Support for 3D Gaussian Splatting SLAM via Sparse Processing
HPCA Main Conference
Xiaotong Huang Shanghai Jiao Tong University, He Zhu Shanghai Jiao Tong University, Tianrui Ma Institute of Computing Technology, Chinese Academy of Sciences, Yuxiang Xiong Shanghai Jiao Tong University, Fangxin Liu Shanghai Jiao Tong University, Zhezhi He Shanghai Jiao Tong University, Yiming Gan Institute of Computing Technology, Chinese Academy of Sciences, Zihan Liu Shanghai Jiao Tong University, Jingwen Leng Shanghai Jiao Tong University, Yu Feng Shanghai Jiao Tong University, Minyi Guo Shanghai Jiao Tong University
16:30
20m
Talk
FractalCloud: A Fractal-Inspired Architecture for Efficient Large-Scale Point Cloud Processing
HPCA Main Conference
Yuzhe Fu Duke University, Changchun Zhou Duke University, Hancheng Ye Duke University, Bowen Duan Duke University, Qiyu Huang Yale University, Chiyue Wei Duke University, Cong Guo Duke University, Hai "Helen" Li Duke University, Yiran Chen Duke University
16:50
20m
Talk
ORANGE: Exploring Ockham's Razor for Neural Rendering by Accelerating 3DGS on NPUs with GEMM-Friendly Blending and Balanced Workloads
HPCA Main Conference
Haomin Li Shanghai Jiao Tong University, Yue Liang Shanghai Jiao Tong University, Fangxin Liu Shanghai Jiao Tong University, Bowen Zhu Shanghai Jiao Tong University, Zongwu Wang Shanghai Jiao Tong University, Yu Feng Shanghai Jiao Tong University, Liqiang Lu Zhejiang University, Li Jiang Shanghai Jiaotong University, Haibing Guan Shanghai Jiao Tong University
15:50 - 17:10
GPU and Heterogeneous Computing
PPoPP Main Conference at Pyrmont
Chair(s): Frank Mueller North Carolina State University, USA
15:50
20m
Talk
PRISM: An Efficient GPU-Based Lossy Compression Framework for Progressive Data Retrieval with Multi-Level Interpolation (Best Paper Nominee)
PPoPP Main Conference
Bing Lu Institute of Computing Technology of Chinese Academy of Sciences, Zedong Liu University of Chinese Academy of Sciences, Hairui Zhao Jilin University, Dejun Luo University of Chinese Academy of Sciences, Wenjing Huang University of Chinese Academy of Sciences, Yida Gu University of Chinese Academy of Sciences, Jinyang Liu University of Houston, Guangming Tan University of Chinese Academy of Sciences, Dingwen Tao Institute of Computing Technology, Chinese Academy of Sciences
DOI
16:10
20m
Talk
Dynamic Detection of Inefficient Data Mapping Patterns in Heterogeneous OpenMP Applications
PPoPP Main Conference
Luke Marzen Iowa State University, Junhyung Shim Iowa State University, Ali Jannesari Iowa State University
DOI
16:30
20m
Talk
Root-Down Exposure for Maximal Clique Enumeration on GPUs
PPoPP Main Conference
Zhe Pan Tsinghua University, Peng Qu Tsinghua University, Youhui Zhang Tsinghua University
DOI
16:50
20m
Talk
ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities
PPoPP Main Conference
Weile Luo The Hong Kong University of Science and Technology, Guangzhou, Yuhan Chen The Hong Kong University of Science and Technology, Guangzhou, Xiangrui Yu The Hong Kong University of Science and Technology, Guangzhou, Qiang Wang Harbin Institute of Technology, Shenzhen, Ruibo Fan The Hong Kong University of Science and Technology, Guangzhou, Hongyuan Liu Stevens Institute of Technology, Xiaowen Chu The Hong Kong University of Science and Technology, Guangzhou
DOI
17:30 - 19:00
Business Meeting
CGO Main Conference at Bronte
Chair(s): Steve Blackburn Google and Australian National University, Albert Cohen Google DeepMind, Timothy M. Jones University of Cambridge
17:30
90m
Meeting
CGO Business Meeting
CGO Main Conference

17:30 - 19:00
Business Meeting
HPCA Main Conference at Coogee
17:30
90m
Meeting
HPCA Business Meeting
HPCA Main Conference

17:30 - 19:00
Business Meeting
PPoPP Main Conference at Cronulla
Chair(s): Tony Hosking Australian National University, Madan Musuvathi Microsoft Research, Kenjiro Taura The University of Tokyo
17:30
90m
Meeting
PPoPP Business Meeting
PPoPP Main Conference

Tue 3 Feb

08:15 - 16:00
08:45 - 09:45
Plenary Keynote
Plenary Keynotes at Pyrmont
Chair(s): Tony Hosking Australian National University
08:45
60m
Keynote
Oracle Parfait – Scaling Vulnerability Detection from Enterprise Systems to Cloud-Scale Systems and Beyond
Plenary Keynotes
Cristina Cifuentes Oracle Software Assurance
09:50 - 11:10
Binary / JIT
CGO Main Conference at Balmoral
Chair(s): Alexandra Jimborean University of Murcia
09:50
20m
Talk
Binary Diffing via Library Signatures
CGO Main Conference
Andrei Rimsa CEFET-MG, Anderson Faustino da Silva State University of Maringá, Camilo Santana Melgaço Federal University of Minas Gerais, Fernando Magno Quintão Pereira Federal University of Minas Gerais
Pre-print Media Attached
10:10
20m
Talk
BIT: Empowering Binary Analysis through the LLVM Toolchain
CGO Main Conference
Puzhuo Liu Ant Group & Tsinghua University, Peng Di Ant Group & UNSW, Jingling Xue University of New South Wales, Yu Jiang Tsinghua University
Pre-print
10:30
20m
Talk
Dr.avx: A Dynamic Compilation System for Seamlessly Executing Hardware-Unsupported Vectorization Instructions
CGO Main Conference
Yue Tang East China Normal University, Mianzhi Wu East China Normal University, Yufeng Li East China Normal University, Haoyu Liao East China Normal University, Jianmei Guo East China Normal University, Bo Huang East China Normal University
Pre-print Media Attached
10:50
20m
Talk
Practical: Are Abstract-Interpreter Baseline JITs Worth It? An Empirical Evaluation through Metacompilation
CGO Main Conference
Nahuel Palumbo Université Lille, CNRS, Centrale Lille, Inria, UMR 9189 - CRIStAL, Guillermo Polito Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, Stéphane Ducasse Inria; University of Lille; CNRS; Centrale Lille; CRIStAL, Pablo Tesone Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, Pharo Consortium
Pre-print
09:50 - 11:10
Code Generation
CGO Main Conference at Bronte
Chair(s): Fredrik Kjolstad Stanford University
09:50
20m
Talk
TPDE: A Fast Adaptable Compiler Back-End Framework
CGO Main Conference
Tobias Schwarz TU Munich, Tobias Kamm TU Munich, Alexis Engelke TU Munich
Pre-print Media Attached
10:10
20m
Talk
Synthesizing Instruction Selection Back-Ends from ISA Specifications Made Practical
CGO Main Conference
Florian Drescher Technical University of Munich, Alexis Engelke TU Munich
Pre-print
10:30
20m
Talk
SparseX: Synergizing GPU Libraries for Sparse Matrix Multiplication on Heterogeneous Processors
CGO Main Conference
Ruifeng Zhang North Carolina State University, Xiangwei Wang North Carolina State University, Ang Li Pacific Northwest National Laboratory, Xipeng Shen North Carolina State University
Pre-print Media Attached
10:50
20m
Talk
Compilation of Generalized Matrix Chains with Symbolic Sizes
CGO Main Conference
Francisco López Umeå University, Lars Karlsson Umeå University, Paolo Bientinesi Umeå University
Pre-print Media Attached
09:50 - 11:10
CPU Microarchitecture Optimization
HPCA Main Conference at Collaroy
Chair(s): Daniel A. Jiménez
09:50
20m
Talk
The Last-Level Branch Predictor Revisited
HPCA Main Conference
David Schall Technical University of Munich, Mária Ďuračková University of Edinburgh, Boris Grot University of Edinburgh, UK
10:10
20m
Talk
Tempranillo: Non-Speculative Early Register Release
HPCA Main Conference
Carlos Escuin Computing Systems Lab, Huawei Technologies Switzerland AG, Paolo Salvatore Galfano Computing Systems Laboratory, Zurich Research Center, Huawei Technologies, Switzerland, Davide Basilio Bartolini Computing Systems Laboratory, Zurich Research Center, Huawei Technologies, Switzerland, Leeor Peled Boole Labs, Tel-Aviv Research Center, Huawei Technologies, Israel, Mehdi Alipour Computing Systems Laboratory, Zurich Research Center, Huawei Technologies, Switzerland
10:30
20m
Talk
SMTcheck: Accurate SMT Interference Prediction to Improve Scheduling Efficiency in Datacenters
HPCA Main Conference
Sanghyun Kim Sungkyunkwan University, Jinhyeok Oh Sungkyunkwan University, Taehun Kim Sungkyunkwan University, Gyutae Kim Sungkyunkwan University, Youngsok Kim Yonsei University, Jaehyun Hwang Sungkyunkwan University, Joonsung Kim Sungkyunkwan University
10:50
20m
Talk
I-POP: Ignite Positive Prefetchers
HPCA Main Conference
Yiquan Lin Zhejiang University and Alibaba Group, Wenhai Lin Alibaba Group, Yiquan Chen Alibaba Group, Jiexiong Xu Zhejiang University and Alibaba Group, Shishun Cai Alibaba Group, Jiarong Ye Zhejiang University, Zonghui Wang Zhejiang University, Wenzhi Chen Zhejiang University
09:50 - 11:10
Wafer-Scale Systems for Large Models
HPCA Main Conference at Coogee
Chair(s): Hyesoon Kim Georgia Institute of Technology
09:50
20m
Talk
WATOS: Efficient LLM Training Strategies and Architecture Co-exploration for Wafer-scale Chip
HPCA Main Conference
Huizheng Wang Tsinghua University, Zichuan Wang Tsinghua University, Hongbin Wang Tsinghua University, Jingxiang Hou Tsinghua University, Taiquan Wei Tsinghua University, Chao Li Shanghai Jiao Tong University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University
10:10
20m
Talk
FACE: Fully PD Overlapped Scheduling and Multi-Level Architecture Co-Exploration on Wafer
HPCA Main Conference
Zheng Xu Tsinghua University, Dehao Kong Tsinghua University, Jiaxin Liu Tsinghua University, Dingcheng Jiang Tsinghua University, Xu Dai Shanghai Artificial Intelligence Laboratory, Jinyi Deng Tsinghua University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University
10:30
20m
Talk
TEMP: A Memory Efficient Physical-aware Tensor Partition-Mapping Framework on Wafer-scale Chips
HPCA Main Conference
Huizheng Wang Tsinghua University, Taiquan Wei Tsinghua University, Zichuan Wang Tsinghua University, Dingcheng Jiang Tsinghua University, Qize Yang Tsinghua University, Jiaxin Liu Tsinghua University, Jingxiang Hou Tsinghua University, Chao Li Shanghai Jiao Tong University, Jinyi Deng Tsinghua University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University
10:50
20m
Talk
MoEntwine: Unleashing the Potential of Wafer-scale Chips for Large-scale Expert Parallel Inference
HPCA Main Conference
Xinru Tang Tsinghua University, Jingxiang Hou Tsinghua University, Dingcheng Jiang Tsinghua University, Taiquan Wei Tsinghua University, Jiaxin Liu Tsinghua University, Jinyi Deng Tsinghua University, Huizheng Wang Tsinghua University, Qize Yang Tsinghua University, Haoran Shang Tsinghua University, Chao Li Shanghai Jiao Tong University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University
09:50 - 11:10
Best of CAL
HPCA Best of CAL at Cronulla
Chair(s): Sudhanva Gurumurthi AMD
09:50
26m
Talk
QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture
HPCA Best of CAL
Shvetank Prakash Harvard, Andrew Cheng Harvard, Jason Yik Harvard, Arya Tschand Harvard, Radhika Ghosal Harvard, Ikechukwu Uchendu Harvard, Jessica Quaye Harvard, Jeffrey Ma Harvard, Shreyas Grampurohit IIT Bombay, Sofia Giannuzzi Harvard, Arnav Balyan Independent, Fin Amin North Carolina State University, Aadya Pipersenia IIT Bombay, Yash Choudhary IIT Bombay, Ankita Nayak Qualcomm AI Research, Amir Yazdanbakhsh Google Research, Brain Team, Vijay Janapa Reddi Harvard University
Link to publication DOI
10:16
26m
Talk
The Architectural Sustainability Indicator
HPCA Best of CAL
Jaime Roelandts Ghent University, Ajeya Naithani TU Eindhoven, Lieven Eeckhout Ghent University, Belgium
Link to publication DOI
10:43
26m
Talk
Per-Row Activation Counting on Real Hardware: Demystifying Performance Overheads
HPCA Best of CAL
Jumin Kim Seoul National University, Seungmin Baek Seoul National University, Minbok Wi Seoul National University, Hwayong Nam Seoul National University, Michael Jaemin Kim Meta, Sukhan Lee Samsung Electronics, Kyomin Sohn Samsung, Jung Ho Ahn Seoul National University
Link to publication DOI
09:50 - 11:10
Stencil and Sparse Matrix Computation
PPoPP Main Conference at Pyrmont
Chair(s): Shoaib Kamil Adobe Research
09:50
20m
Talk
SPIDER: Unleashing Sparse Tensor Cores for Stencil Computation via Strided Swapping
PPoPP Main Conference
Qiqi Gu Shanghai Jiao Tong University, Chenpeng Wu Shanghai Jiao Tong University, Heng Shi, Jianguo Yao Shanghai Jiao Tong University; Shanghai Enflame Technology
DOI
10:10
20m
Talk
ASM-SpMM: Unleashing the Potential of Arm SME for Sparse Matrix Multiplication Acceleration
PPoPP Main Conference
Jiazhi Jiang Sun Yat-sen University, Xijia Yao Sun Yat-sen University, Jiayu Chen Sun Yat-sen University, jinhui wei Sun Yat-sen University, Dan Huang, Yutong Lu Sun Yat-sen University
DOI
10:30
20m
Talk
Exploiting Efficient Mapping and Pipelined Execution for Accelerating SpMV on Tensor Cores
PPoPP Main Conference
Kaige Zhang Beihang University, Hailong Yang Beihang University, Xin You Beihang University, Tianyu Feng Beihang University, Yufan Xu Independent Researcher, Zhongzhi Luan Beihang University, Yi Liu Beihang University, Depei Qian Beihang University
DOI
10:50
20m
Talk
VDHA: Vector-Driven Hash Aggregation for Sparse Matrix-Sparse Vector Multiplication on GPUs
PPoPP Main Conference
Yuchen Li Tsinghua University, Zhe Pan Tsinghua University, Peng Qu Tsinghua University, Youhui Zhang Tsinghua University
DOI
11:10 - 11:30
11:10
20m
Coffee break
Break
Catering

11:30 - 12:50
Mixed Precision and Quantization
PPoPP Main Conference at Balmoral
Chair(s): Dingwen Tao Institute of Computing Technology, Chinese Academy of Sciences
11:30
20m
Talk
RoMeo: Mitigating Dual-dimensional Outliers with Rotated Mixed Precision Quantization
PPoPP Main Conference
Qihao Zhang Tsinghua University, MingLiang Tang Tsinghua University, Mingshu Zhai Tsinghua University, Kinman Lei Tsinghua University, Jidong Zhai Tsinghua University
DOI
11:50
20m
Talk
High-Throughput Non-Uniformly Quantized 3-bit LLM Inference
PPoPP Main Conference
YuAng Chen Chinese University of Hong Kong, Wenqi Zeng Hong Kong University of Science and Technology, Jeffrey Xu Yu Chinese University of Hong Kong
DOI
12:10
20m
Talk
JanusQuant: Accurate and Efficient 2-bit KV Cache Quantization for Long-Context Inference
PPoPP Main Conference
Chengyu Sun Wuhan University, Yaqi Xia Wuhan University, Hulin Wang, Donglin Yang Nvidia Corporation, Xiaobo Zhou University of Macau, Dazhao Cheng Wuhan University
DOI
12:30
20m
Talk
HierCut: Enabling 16-bit Format Mixed Precision for Molecular Dynamics through Hierarchical Cutoff
PPoPP Main Conference
zeyu song Tsinghua University, Lin Gan Tsinghua University, Xiaohui Duan Shandong University, Jiayu Fu Tsinghua University, Zhengrui Li Tsinghua University, Yinuo Wang Tsinghua University, Guangzhao Li Chinese Academy of Sciences, Guangwen Yang Tsinghua University
DOI
11:30 - 12:50
Profiling / Instrumentation
CGO Main Conference at Bronte
Chair(s): Mircea Trofin Google
11:30
20m
Talk
TRACE4J: A Lightweight, Flexible, and Insightful Performance Tracing Tool for Java
CGO Main Conference
Haide He UC Merced, Pengfei Su University of California, Merced
Pre-print Media Attached
11:50
20m
Talk
Proton: Towards Multi-level, Adaptive Profiling for Triton
CGO Main Conference
Keren Zhou George Mason University, Tianle Zhong University of Virginia, Hao Wu George Mason University, Jihyeong Lee George Mason University, Yue Guan University of California at San Diego, Yufei Ding University of California at Santa Barbara, Corbin Robeck Meta, Yuanwei Fang Meta, Jeff Niu OpenAI, Philippe Tillet OpenAI
Pre-print Media Attached
12:10
20m
Talk
On the Precision of Dynamic Program Fingerprints Based on Performance Counters
CGO Main Conference
Anderson Faustino da Silva State University of Maringá, Sergio Queiroz de Medeiros Universidade Federal do Rio Grande do Norte, Marcelo Borges Nogueira Federal University of Rio Grande do Norte, Jeronimo Castrillon TU Dresden, Germany, Fernando Magno Quintão Pereira Federal University of Minas Gerais
Pre-print Media Attached
12:30
20m
Talk
PASTA: A Modular Program Analysis Tool Framework for Accelerators
CGO Main Conference
Mao Lin University of California Merced, Hyeran Jeon University of California, Merced, Keren Zhou George Mason University
Pre-print Media Attached
11:30 - 12:50
Caching and Prefetching
HPCA Main Conference at Collaroy
Chair(s): David Schall Technical University of Munich
11:30
20m
Talk
Athena: Synergizing Data Prefetching and Off-Chip Prediction via Online Reinforcement Learning
HPCA Main Conference
Zhenrong Lang ETH Zürich, Rahul Bera ETH Zurich, Caroline Hengartner ETH Zürich, Konstantinos Kanellopoulos ETH Zurich, Rakesh Kumar NTNU, Mohammad Sadrosadati ETH Zürich, Onur Mutlu ETH Zurich
11:50
20m
Talk
Streamlined On-Chip Temporal Prefetching
HPCA Main Conference
Quang Duong The University of Texas at Austin, Calvin Lin The University of Texas at Austin
12:10
20m
Talk
Intermittence-Aware Cache Compression
HPCA Main Conference
Gan Fang Purdue University, Jianping Zeng Arizona State University, Yuchen Zhou Purdue University, Changhee Jung Purdue University, USA
12:30
20m
Talk
TENET-v2: Applying Relation-centric Notation to Model and Optimize Data Swizzle in the Cache of Modern NPU
HPCA Main Conference
Hanyu Zhang Zhejiang University, Fangxu Guo Zhejiang University, Liqiang Lu Zhejiang University, Long Wang Huawei Technologies, Yunfei Du Huawei Technologies, Zhe Wang Huawei Technologies, Jinghan Zhang Huawei Technologies, Jie Zhang Peking University, Chenli Xue Zhejiang University, Chengpeng Wu Zhejiang University, Ziyi Zhang Zhejiang University, Yun Liang Peking University, Size Zheng Tsinghua University, Jianwei Yin Zhejiang University
11:30 - 12:50
Visual and Multimodal Acceleration
HPCA Main Conference at Coogee
Chair(s): Yu Feng Shanghai Jiao Tong University
11:30
20m
Talk
V-Rex: Real-Time Streaming Video LLM Acceleration via Dynamic KV Cache Retrieval
HPCA Main Conference
11:50
20m
Talk
SFD: Towards Segment Fusion Dataflow for Spatial Accelerators
HPCA Main Conference
Fuyu Wang Sun Yat-sen University, Minghua Shen Sun Yat-sen University, Yufei Ding UCSD, Nong Xiao National University of Defense Technology & Sun Yat-sen University, Yutong Lu Sun Yat-sen University
12:10
20m
Talk
VAR-Turbo: Unlocking the Potential of Visual Autoregressive Models through Dual Redundancy
HPCA Main Conference
Xujiang Xiang The Hong Kong University of Science and Technology, Fengbin Tu The Hong Kong University of Science and Technology
12:30
20m
Talk
Cambricon-GS: An Accelerator for 3D Gaussian Splatting Training with Gaussian-Pixel Hybrid Parallelism
HPCA Main Conference
Rui Wen Institute of Computing Technology, Chinese Academy of Sciences, Zhifei Yue University of Science and Technology of China, Tianbo Liu University of Science and Technology of China, Xinkai Song Institute of Computing Technology, Chinese Academy of Sciences, Jin Li Institute of Computing Technology, Chinese Academy of Sciences, Di Huang Chinese Academy of Sciences, Institute of Computing Technology, Jiaming Guo Institute of Computing Technology, Chinese Academy of Sciences, Xing Hu Institute of Computing Technology, Chinese Academy of Sciences, zidong du Institute of Computing Technology, Chinese Academy of Sciences, Qi Guo Chinese Academy of Sciences, Tianshi Chen Cambricon Technologies
11:30 - 12:50
Zero-Knowledge and Private Information Retrieval
HPCA Main Conference at Cronulla
Chair(s): Hanjun Kim POSTECH
11:30
20m
Talk
zkPHIRE: A Programmable Accelerator for ZKPs over HIgh-degRee, Expressive Gates
HPCA Main Conference
Alhad Daftardar New York University, Jianqiao Cambridge Mo New York University, Joey Ah-kiow New York University, Benedikt Bünz New York University, Siddharth Garg New York University, Brandon Reagen New York University
11:50
20m
Talk
Conflux: A High-Performance Keyword Private Retrieval System for Dynamic Datasets
HPCA Main Conference
Zehao Chen Shandong University, Zhaoyan Shen Shandong University, Qian Wei Shandong University, Hang Lu Institute of Computing Technology, Chinese Academy of Sciences, Lei Ju Shandong University
12:10
20m
Talk
An Efficient and Scalable Hardware Architecture for Number Theoretic Transform on FPGA with Design Automation
HPCA Main Conference
Yilan Zhu Ant Group, Geng Yang Ant Group, Xingyu Tian Simon Fraser University, Dilshan Kumarathunga Simon Fraser University, Liang Kong Ant Group, Xianglong Deng UCAS, Shengyu Fan UCAS, Guang Fan Ant Group, Guiming Shi Tsinghua University, Lei Chen University of Chinese Academy of Sciences, Bo Zhang Ant Group, Yisong Chang Ant Group, Shoumeng Yan Ant Group, Zhenman Fang Simon Fraser University, Mingzhe Zhang Ant Group
12:30
20m
Talk
IVE: An Accelerator for Single-Server Private Information Retrieval Using a Versatile Processing Element
HPCA Main Conference
Sangpyo Kim Seoul National University, Hyesung Ji Seoul National University, Jongmin Kim Seoul National University, Jaiyoung Park Seoul National University, Wonseok Choi Seoul National University, Jung Ho Ahn Seoul National University
11:30 - 12:50
Cluster and Cloud Computing
PPoPP Main Conference at Pyrmont
Chair(s): Ruslan Nikolaev Pennsylvania State University
11:30
20m
Talk
Cacheman: A Comprehensive Last-Level Cache Management System for Multi-tenant Clouds
PPoPP Main Conference
Xiaokang Hu Alibaba Cloud Computing, Yuchao Cao Alibaba Cloud Computing, Naixuan Guan Alibaba Cloud Computing, Yifan Wu Alibaba Cloud Computing, Xishi Qiu Alibaba Cloud Computing, Shengdong Dai Alibaba Cloud Computing, Ben Luo Alibaba Cloud Computing, Sanchuan Cheng Alibaba Cloud Computing, Fudong Qiu Alibaba Cloud Computing, Yibin Shen Alibaba Cloud, Jiesheng Wu Alibaba Cloud Computing
11:50
20m
Talk
zBuffer: Zero-Copy and Metadata-Free Serialization for Fast RPC with Scatter-Gather Reflection
PPoPP Main Conference
Xiangyu Liu Xiamen University, Huiba Li Alibaba, Shun Gai Alibaba, Youmin Chen Shanghai Jiao Tong University, Yiming Zhang Xiamen University
12:10
20m
Talk
Scaling GPU-to-CPU Migration for Efficient Distributed Execution on CPU Clusters
PPoPP Main Conference
Ruobing Han Georgia Institute of Technology, Hyesoon Kim Georgia Institute of Technology
12:30
20m
Talk
Trojan Horse: Aggregate-and-Batch for Scaling Up Sparse Direct Solvers on GPU Clusters (Best Paper Nominee)
PPoPP Main Conference
Yida Li China University of Petroleum-Beijing, Siwei Zhang China University of Petroleum-Beijing, Yiduo Niu China University of Petroleum-Beijing, Yang Du China University of Petroleum-Beijing, Qingxiao Sun China University of Petroleum-Beijing, Zhou Jin China University of Petroleum-Beijing, Weifeng Liu China University of Petroleum-Beijing
12:50 - 14:10
HPCA Awards Lunch
Catering at Parkside Ballroom
12:50
80m
Awards
HPCA Awards Lunch
Catering

12:50 - 14:10
12:50
80m
Lunch
Lunch
Catering

14:10 - 15:30
Distributed Training
PPoPP Main Conference at Balmoral
Chair(s): Bo Fang University of Texas at Arlington
14:10
20m
Talk
COCCL: A Collective Communication Library Supporting Easy Integration and Configuration of Customized Compression for Scalable LLM Training
PPoPP Main Conference
Xingchen Liu University of Chinese Academy of Sciences, Haoran Kong Chinese University of Hong Kong, Shenzhen, Hairui Zhao Jilin University, Shengkai Lyu University of Chinese Academy of Sciences, Zheng Wei University of Chinese Academy of Sciences, Man Liu University of Chinese Academy of Sciences, Xingjian Tian University of Chinese Academy of Sciences, Liyang Zhao University of Chinese Academy of Sciences, Zhuohan Chen University of Chinese Academy of Sciences, Fakang Wang Ant Group, Zizhong Chen Chinese University of Hong Kong, Shenzhen, Zhan Wang University of Chinese Academy of Sciences, Guangming Tan University of Chinese Academy of Sciences, Dingwen Tao Institute of Computing Technology, Chinese Academy of Sciences
14:30
20m
Talk
Elastor: Elastic and Efficient Model Partitioning and Checkpointing for Fault-Tolerant Distributed Training
PPoPP Main Conference
Xuanyu Wang Peking University, Fangcheng FU Shanghai Jiao Tong University, Haoyang Li Peking University, Hao Ge Peking University, Sheng Lin Peking University, Jiawen Niu Peking University, Bin Cui Peking University
14:50
20m
Talk
HelixPipe: Efficient Distributed Training of Long Sequence Transformers with Attention Parallel Pipeline Parallelism
PPoPP Main Conference
Geng Zhang National University of Singapore, Shenggan Cheng National University of Singapore, Xuanlei Zhao National University of Singapore, Ziming Liu, Yang You National University of Singapore
15:10
20m
Talk
CCL-D: A High-Precision Diagnostic System for Slow and Hang Anomalies in Large-Scale Model Training (Best Paper Nominee)
PPoPP Main Conference
Yida Gu University of Chinese Academy of Sciences, Fakang Wang Ant Group, Jianhao Fu Ant Group, Zhenhang Sun Ant Group, Qianyu Zhang Ant Group, Hairui Zhao Jilin University, Xingchen Liu University of Chinese Academy of Sciences, Yang Tian Ant Group, Wenjing Huang University of Chinese Academy of Sciences, Zedong Liu University of Chinese Academy of Sciences, Yifan Chen Ant Group, Jinwu Yang University of Chinese Academy of Sciences, Yueyuan Zhou University of Chinese Academy of Sciences, Qian Zhao Ant Group, Haoxu Li University of Chinese Academy of Sciences, Tao Wang Ant Group, Feng Yu Ant Group, Zhan Wang University of Chinese Academy of Sciences, Guangming Tan University of Chinese Academy of Sciences, Dingwen Tao Institute of Computing Technology, Chinese Academy of Sciences
14:10 - 15:30
Analysis
CGO Main Conference at Bronte
Chair(s): Jose Nelson Amaral University of Alberta
14:10
20m
Talk
PIP: Making Andersen’s Points-to Analysis Sound and Practical for Incomplete C Programs
CGO Main Conference
Håvard Rognebakke Krogstie NTNU, Helge Bahmann Independent Researcher, Magnus Själander Norwegian University of Science and Technology (NTNU), Nico Reissmann Independent Researcher
14:30
20m
Talk
Thinking Fast and Correct: Automated Rewriting of Numerical Code through Compiler Augmentation
CGO Main Conference
Siyuan Brant Qian University of Illinois at Urbana-Champaign, Vimarsh Sathia University of Illinois Urbana Champaign, Ivan Ivanov Institute of Science Tokyo, Jan Hueckelheim Argonne National Laboratory, Paul Hovland Argonne National Laboratory, William S. Moses University of Illinois Urbana-Champaign
14:50
20m
Talk
PolyUFC: Polyhedral Compilation Meets Roofline Analysis for Uncore Frequency Capping
CGO Main Conference
Nilesh Rajendra Shah Indian Institute of Technology Hyderabad, India, M V V S Manoj Kumar IIT Hyderabad, Dhairya Baxi IIT Hyderabad, Ramakrishna Upadrasta IIT Hyderabad
15:10
20m
Talk
Accelerating App Recompilation across Android System Updates by Code Reusing
CGO Main Conference
Hongtao Wu Wuhan University, Yu Chen Wuhan University, Mengfei Xie Wuhan University, Futeng Yang Guangdong OPPO Mobile Telecommunications, Jun Yan Guangdong OPPO Mobile Telecommunications, Jiang Ma OPPO Electronics Corp., Jianming Fu Wuhan University, Jason Xue MBZUAI, Qingan Li Wuhan University, China
14:10 - 15:30
Memory Systems for Scalable Computing
HPCA Main Conference at Collaroy
Chair(s): Alexandros Daglis Georgia Tech
14:10
20m
Talk
BARD: Reducing Write Latency of DDR5 Memory by Exploiting Bank-Parallelism
HPCA Main Conference
Suhas Vittal Georgia Tech, Moinuddin K. Qureshi Georgia Tech
14:30
20m
Talk
RoMe: Row Granularity Access Memory System for Large Language Models
HPCA Main Conference
Hwayong Nam Seoul National University, Seungmin Baek Seoul National University, Jumin Kim Seoul National University, Michael Jaemin Kim Meta, Jung Ho Ahn Seoul National University
14:50
20m
Talk
HDPAT: Hierarchical Distributed Page Address Translation for Wafer-Scale GPUs
HPCA Main Conference
Daoxuan Xu William & Mary, Ying Li William & Mary, Yuwei Sun UIUC, Jie Ren William & Mary, Yifan Sun William & Mary
15:10
20m
Talk
Pulse: Fine-Grained Hierarchical Hashing Index for Disaggregated Memory
HPCA Main Conference
Guangyang Deng Xiamen University, Zixiang Yu Xiamen University, Zhirong Shen Xiamen University, Qiangsheng Su Xiamen University, Jiwu Shu Xiamen University
14:10 - 15:30
LLM Systems and Microarchitecture Tools
HPCA Main Conference at Coogee
Chair(s): Josep Torrellas
14:10
20m
Talk
LILo: Harnessing the On-chip Accelerators in Intel CPUs for Compressed LLM Inference Acceleration
HPCA Main Conference
Hyungyo Kim UIUC, Qirong Xia UIUC, Jinghan Huang UIUC, Nachuan Wang UIUC, Jung Ho Ahn Seoul National University, Younjoo Lee Seoul National University, Wajdi K Feghali Intel, Ren Wang Intel Labs, Nam Sung Kim UIUC
14:30
20m
Talk
ReThermal: Co-Design of Thermal-Aware Static and Dynamic Scheduling for LLM Training on Liquid-Cooled Wafer-Scale Chips
HPCA Main Conference
Chengran Li Tsinghua University, Huizheng Wang Tsinghua University, Jiaxin Liu Tsinghua University, Jingyao Liu Tsinghua University, Zhiheng Yue Tsinghua University, Xia Li Shanghai AI Lab, Shenfei Jiang Shanghai AI Lab, Jinyi Deng Tsinghua University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University
14:50
20m
Talk
TraceRTL: Agile Performance Evaluation for Microarchitecture Exploration
HPCA Main Conference
Zifei Zhang SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Yinan Xu SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Sa Wang SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Dan Tang SKLP, Institute of Computing Technology, Chinese Academy of Sciences; Beijing Institute of Open Source Chip, Yungang Bao State Key Lab of Processors, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences
15:10
20m
Talk
Nugget: Portable Program Snippets
HPCA Main Conference
Zhantong Qiu University of California, Davis, Mahyar Samani University of California, Davis, Jason Lowe-Power University of California, Davis & Google
14:10 - 15:30
Emerging Compute Paradigms
HPCA Main Conference at Cronulla
Chair(s): Calin Cascaval Google DeepMind
14:10
20m
Talk
BASES: Enabling Energy-Efficient and Error-Resilient Analog CIM Acceleration via Reformation of Coding Bases
HPCA Main Conference
Hongrui Guo Institute of Computing Technology, Chinese Academy of Sciences, Tianrui Ma Institute of Computing Technology, Chinese Academy of Sciences, Zidong Du Institute of Computing Technology, Chinese Academy of Sciences, Mo Zou Institute of Computing Technology, Chinese Academy of Sciences, Yifan Hao ICT, Chinese Academy of Sciences, Yongwei Zhao Institute of Computing Technology, Chinese Academy of Sciences, Rui Zhang Chinese Academy of Sciences, Wei Li Institute of Software Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xing Hu Institute of Computing Technology, Chinese Academy of Sciences, Zhiwei Xu Institute of Computing Technology of the Chinese Academy of Sciences, China, Qi Guo Chinese Academy of Sciences, Tianshi Chen Cambricon Technologies
14:30
20m
Talk
A PN-Free Digital SAT Accelerator Using Crossbar Architecture and Frequency-Controlled Counters
HPCA Main Conference
Zhezheng Ren University of Waterloo, Chenao Yuan University of Waterloo, Yuke Zhang University of Toronto, Shiyu Su University of Waterloo
14:50
20m
Talk
ESTroM: Element-Flow Architecture For Processing Sparse Tractable Probabilistic Models
HPCA Main Conference
Anjunyi Fan Peking University, Xuejie Liu Peking University, Anji Liu University of California, Los Angeles, Qiuping Wu Peking University, Jiaqi Yang Peking University, Yuchao Qin Peking University, Guy Van den Broeck University of California at Los Angeles, Yitao Liang Peking University, Bonan Yan Peking University
15:10
20m
Talk
GustavSNN: Unleashing the Power of Gustavson's Algorithm on SNN Acceleration with Column-Parallel Tick-Batch Dataflow
HPCA Main Conference
Sangwoo Hwang Korea University, Donghun Lee Korea University, Jahyun Koo DGIST, Jaeha Kung Korea University
14:10 - 15:30
Parallel Algorithms
PPoPP Main Conference at Pyrmont
Chair(s): Kenjiro Taura The University of Tokyo
14:10
20m
Talk
Pipelonk: Accelerating End-to-End Zero-Knowledge Proof Generation on GPUs for PLONK-Based Protocols
PPoPP Main Conference
Zhiyuan Zhang Shandong University, Yanxin Cai Shandong University, Wenhao Yin Shandong University, Xueyu Wu The University of Hong Kong, Yi Wang Shenzhen University, Lei Ju Shandong University, Zhuoran Ji Shandong University
14:30
20m
Talk
ParDiff: Efficiently Parallelizing Reverse-Mode Automatic Differentiation with Direct Indexing
PPoPP Main Conference
Shuhong Huang Tsinghua University, Shizhi Tang Qingcheng.AI, Yuan Wen University of Aberdeen, Huanqi Cao Tsinghua University, Ruibai Tang Tsinghua University, Yidong Chen, Jiping Yu Tsinghua University, Yang Li Lenovo Research, Chao Jiang Lenovo Research, Limin Xiao Lenovo Research, Jidong Zhai Tsinghua University
14:50
20m
Talk
Faster and Cheaper: Pushing the Sequence Alignment Throughput with Commercial CPUs
PPoPP Main Conference
Zhonghai Zhang Institute of Computing Technology, Chinese Academy of Sciences / University of Chinese Academy of Sciences, Yewen Li The Hong Kong University of Science and Technology, Ke Meng Chinese Academy of Sciences, Chunming Zhang Institute of Computing Technology, Chinese Academy of Sciences, Guangming Tan University of Chinese Academy of Sciences
15:10
20m
Talk
PIM-zd-tree: A Fast Space-Partitioning Index Leveraging Processing-in-Memory
PPoPP Main Conference
Yiwei Zhao Carnegie Mellon University, Hongbo Kang Tsinghua University, Ziyang Men University of California, Riverside, Yan Gu University of California, Riverside, Guy E. Blelloch Carnegie Mellon University, Laxman Dhulipala University of Maryland, College Park, Charles McGuffey Reed College, Phil Gibbons Carnegie Mellon University
15:30 - 15:50
15:30
20m
Coffee break
Break
Catering

15:50 - 17:10
ML Inference
PPoPP Main Conference at Balmoral
Chair(s): Hailong Yang Beihang University
15:50
20m
Talk
BEEMS: Boosting Machine Vision Efficiency via Computation Graph-Based Memory Smoothing
PPoPP Main Conference
Hanjing Shen Shanghai Jiao Tong University, Fangxin Liu Shanghai Jiao Tong University, Jian Liu Beijing University of Aeronautics and Astronautics, Li Jiang Shanghai Jiaotong University, Haibing Guan Shanghai Jiao Tong University
16:10
20m
Talk
Laser: Unlocking Layer-Level Scheduling for Efficient Multi-SLO LLM Serving
PPoPP Main Conference
Jianxiong Liao Sun Yat-sen University, Quanxing Dong Sun Yat-sen University, Yunkai Liang Sun Yat-sen University, Zhi Zhou Sun Yat-sen University, Xu Chen Sun Yat-sen University
16:30
20m
Talk
MixFusion: A Patch-Level Parallel Serving System for Mixed-Resolution Diffusion Models
PPoPP Main Conference
Desen Sun University of Waterloo, Zepeng Zhao Carnegie Mellon University, Yuke Wang Rice University
16:50
20m
Talk
ChituDiffusion: A Data-Characteristic-Aware Serving System for Diffusion Models
PPoPP Main Conference
Chengzhang Wu Tsinghua University, Liyan Zheng Tsinghua University, Haojie Wang Tsinghua University, Kezhao Huang Tsinghua University, Zixuan Ma Tsinghua University, Dong Dong, Jidong Zhai Tsinghua University
15:50 - 17:10
Compiling for ML 2
CGO Main Conference at Bronte
Chair(s): Fabrice Rastello University Grenoble Alpes - Inria - CNRS - Grenoble INP - LIG
15:50
20m
Talk
QIGen: A Kernel Generator for Inference on Nonuniformly Quantized Large Language Models
CGO Main Conference
Tommaso Pegolotti ETH Zürich, Dan Alistarh IST Austria, Markus Püschel ETH Zurich
16:10
20m
Talk
DyPARS: Dynamic-Shape DNN Optimization via Pareto-Aware MCTS for Graph Variants
CGO Main Conference
Hao Qian University of New South Wales, Guangli Li Institute of Computing Technology, Chinese Academy of Sciences, Qiuchu Yu Institute of Computing Technology at Chinese Academy of Sciences, Xueying Wang Beijing University of Posts and Telecommunications, Jingling Xue University of New South Wales
16:30
20m
Talk
Compiler-Runtime Co-operative Chain of Verification for LLM-Based Code Optimization
CGO Main Conference
Hyunho Kwon Yonsei University, Sanggyu Shin SAIT, Ju Min Lee Yonsei University, Hoyun Youm Yonsei University, Seungbin Song SAIT, Seongho Kim Yonsei University, Hanwoong Jung Samsung Advanced Institute of Technology, Seungwon Lee Samsung Advanced Institute of Technology, Hanjun Kim Yonsei University
16:50
20m
Talk
Hexcute: A Compiler Framework for Automating Layout Synthesis in GPU Programs
CGO Main Conference
Xiao Zhang University of Toronto; NVIDIA, Yaoyao Ding University of Toronto; Vector Institute; NVIDIA, Bolin Sun University of Toronto; NVIDIA, Yang Hu NVIDIA, Tatiana Shpeisman Google, Gennady Pekhimenko University of Toronto / Vector Institute
15:50 - 17:10
Accelerator Design and Modeling
HPCA Main Conference at Collaroy
Chair(s): Leeor Peled Huawei
15:50
20m
Talk
NPUWattch: ML-based Power, Area, and Timing Modeling for Neural Accelerators
HPCA Main Conference
Sehyeon Kim Yonsei University, Minkwan Kim Yonsei University, Chanho Park Yonsei University, Hanmok Park Kyungpook National University, Seonghoon Kim Kyungpook National University, Taigon Song Kyungpook National University, William Song Yonsei University
16:10
20m
Talk
Area Bloating and the Future of Specialization
HPCA Main Conference
Qixuan Yu Princeton University, David Wentzlaff Princeton University
16:30
20m
Talk
Advancing Full-stack Acceleration for Schrödinger-Style Quantum Simulation
HPCA Main Conference
Shuang Liang Imperial College London, Yuncheng Lu Imperial College London, Ce Guo Imperial College London, Paul H J Kelly Imperial College London, Wayne Luk Imperial College London, Hongxiang Fan Imperial College London
16:50
20m
Talk
COMET: Communication and Memory Co-Design for Fine-Grained AI Inference in MCM Accelerators
HPCA Main Conference
Taishu Sheng College of Computer Science and Technology, National University of Defense Technology, Guangyu Sun Peking University, Dezun Dong NUDT
15:50 - 17:10
Distributed and Multi-GPU Training
HPCA Main Conference at Coogee
Chair(s): J. Nelson Amaral
15:50
20m
Talk
Compression-Aware Gradient Splitting for Collective Communications in Distributed Training
HPCA Main Conference
Pranati Majhi Texas A&M University, Sabuj Laskar Texas A&M University, Abdullah Muzahid Texas A&M University, Eun Jung Kim
16:10
20m
Talk
SCALE: Tackling Communication Bottlenecks in Confidential Multi-GPU ML
HPCA Main Conference
Joongun Park Georgia Tech, Yongqin Wang University of Southern California, Huan Xu Georgia Institute of Technology, Hanjiang Wu Georgia Institute of Technology, Mengyuan Li USC, Tushar Krishna Georgia Institute of Technology
16:30
20m
Talk
AutoHAAP: Automated Heterogeneity-Aware Asymmetric Partitioning for LLM Training
HPCA Main Conference
Yuanyuan Wang Zhejiang Lab, Nana Tang Zhejiang Lab, Yuyang Wang Zhejiang Lab, Shu Pan Zhejiang Lab, Dingding Yu Zhejiang Lab, Zeyue Wang Zhejiang Lab, Mou Sun Zhejiang Lab, Kejie Fu Zhejiang Lab, Fangyu Wang Zhejiang Lab, Yunchuan Chen Zhejiang Lab, Ning Sun Zhejiang Lab, Fei Yang Zhejiang Lab
16:50
20m
Talk
Towards Compute-Aware In-Switch Computing for LLMs Tensor-Parallelism on Multi-GPU Systems
HPCA Main Conference
Chen Zhang Shanghai Jiao Tong University, Qijun Zhang Shanghai Jiao Tong University, Zhuoshan Zhou Shanghai Jiao Tong University, Yijia Diao Shanghai Jiao Tong University, Haibo Wang Huawei, Zhe Zhou Huawei, Zhipeng Tu Huawei, Zhiyao Li Huawei, Guangyu Sun Peking University, Zhuoran Song Shanghai Jiao Tong University, Zhigang Ji Shanghai Jiao Tong University, Jingwen Leng Shanghai Jiao Tong University, Minyi Guo Shanghai Jiao Tong University
15:50 - 17:10
Domain Specific Accelerators
HPCA Main Conference at Cronulla
Chair(s): Jaewoong Sim Seoul National University
15:50
20m
Talk
Uni-STC: Unified Sparse Tensor Core
HPCA Main Conference
Haocheng Lian China University of Petroleum-Beijing, Qiyue Zhang China University of Petroleum-Beijing, Xinran Zhao China University of Petroleum-Beijing, Meichen Dong China University of Petroleum-Beijing, Yijie Nie China University of Petroleum-Beijing, Zhengyi Zhao China University of Petroleum-Beijing, Junzhong Shen National University of Defense Technology, Wei Guo National University of Defense Technology, Chun Huang National University of Defense Technology, Bingcai Sui National University of Defense Technology, Weifeng Liu China University of Petroleum-Beijing
16:10
20m
Talk
AUM: Unleashing the Efficiency Potential of Shared Processors with Accelerator Units for LLM Serving
HPCA Main Conference
Xinkai Wang Shanghai Jiao Tong University, Chao Li Shanghai Jiao Tong University, Yiming Zhuansun Shanghai Jiao Tong University, Jinyang Guo Shanghai Jiao Tong University, Xiaofeng Hou Shanghai Jiao Tong University, Jing Wang Shanghai Jiao Tong University, Luping Wang Alibaba Group, Weigao Chen Alibaba Group, Cheng Huang Alibaba Group, Guodong Yang Alibaba Group, Liping Zhang Alibaba Group, Minyi Guo Shanghai Jiao Tong University
16:30
20m
Talk
DRACO: A Hardware-Efficient Robot Rigid Body Dynamics Accelerator with Precision-Aware Quantization Framework
HPCA Main Conference
Xingyu Liu The Hong Kong University of Science and Technology, Jiawei Liang The Hong Kong University of Science and Technology, Yipu Zhang The Hong Kong University of Science and Technology, Linfeng Du The Hong Kong University of Science and Technology, Chaofang Ma The Hong Kong University of Science and Technology, Hui Yu Hong Kong University of Science and Technology, Xu Jiang University of Electronic Science and Technology of China, Wei Zhang The Hong Kong University of Science and Technology
16:50
20m
Talk
REASON: Accelerating Probabilistic Logical Reasoning for Neuro-Symbolic Cognitive Intelligence
HPCA Main Conference
Zishen Wan Georgia Institute of Technology, Che-Kai Liu Georgia Institute of Technology, Jiayi Qian Georgia Institute of Technology, Hanchen Yang Georgia Institute of Technology, Arijit Raychowdhury Georgia Institute of Technology, Tushar Krishna Georgia Institute of Technology
15:50 - 17:10
Graphs and Graph Neural Networks
PPoPP Main Conference at Pyrmont
Chair(s): Ali Jannesari Iowa State University
15:50
20m
Talk
ElasGNN: An Elastic Training Framework for Distributed GNN Training
PPoPP Main Conference
Siqi Wang Beihang University, Hailong Yang Beihang University, Pengbo Wang Beihang University, Hongliang Cao Beihang University, Yufan Xu Independent Researcher, Xuezhu Wang Beihang University, Zhongzhi Luan Beihang University, Yi Liu Beihang University, Depei Qian Beihang University
16:10
20m
Talk
APERTURE: Algorithm-System Co-optimization for Temporal Graph Network Inference
PPoPP Main Conference
Yiqing Wang Beihang University, Hailong Yang Beihang University, Enze Yu Beihang University, Qingxiao Sun Beihang University, Kejie Ma Beihang University, Kaige Zhang Beihang University, Chenhao Xie Beihang University, Depei Qian Beihang University
16:30
20m
Talk
TAC: Cache-Based System for Accelerating Billion-Scale GNN Training on Multi-GPU Platform
PPoPP Main Conference
Zhiqiang Liang, Hongyu Gao, Fang Liu Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Jue Wang Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xingguo Shi University of Chinese Academy of Sciences, Juyu Gu University of Chinese Academy of Sciences, Peng Di Ant Group & UNSW, San Li University of Chinese Academy of Sciences, Lei Tang University of Chinese Academy of Sciences, Chunbao Zhou University of Chinese Academy of Sciences, Lian Zhao University of Chinese Academy of Sciences, Yangang Wang University of Chinese Academy of Sciences, Xuebin Chi University of Chinese Academy of Sciences
16:50
20m
Talk
DTMiner: A Data-Centric System for Efficient Temporal Motif Mining
PPoPP Main Conference
Hou Yinbo Huazhong University of Science and Technology, Hao Qi Huazhong University of Science and Technology, Ligang He University of Warwick, Jin Zhao Huazhong University of Science and Technology, Yu Zhang School of Computer Science and Technology, Huazhong University of Science and Technology, Hui Yu Hong Kong University of Science and Technology, Longlong Lin Southwest University, Lin Gu Huazhong University of Science and Technology, Wenbin Jiang Huazhong University of Science and Technology, Xiaofei Liao Huazhong University of Science and Technology, Hai Jin Huazhong University of Science and Technology
17:15 - 18:15
Industry Track
HPCA Industry Track at Coogee
Chair(s): Pradip Bose IBM
17:15
20m
Industry talk
Enterprise Class On-Chip Accelerator Integration
HPCA Industry Track
17:35
20m
Industry talk
Characterizing Cloud-Native LLM Inference at ByteDance and Exposing Optimization Challenges and Opportunities for Future AI Accelerators
HPCA Industry Track
Jingwei Cai ByteDance Seed, Dehao Kong, Huang Hantao ByteDance Seed, Zishan Jiang ByteDance Seed, Zixuan Ma ByteDance Seed, Qingyu Guo ByteDance Seed, Zhenxing Zhang ByteDance Seed, Guiming Shi Tsinghua University, Mingyu Gao Tsinghua University, Kaisheng Ma Tsinghua University, Minghui Yu ByteDance Seed
17:55
20m
Industry talk
eGPU: Production-Scale Elastic Sharing over 10,000 GPUs
HPCA Industry Track
Xiaochuan Tang Alibaba Group, Hao Qi, Jianbo Dong Alibaba Group, Yinghao Yu Alibaba Group, Zhennan Xue Alibaba Group, Zhengyu Zhang Alibaba Group, Daocheng Ying Alibaba Group, Zheng Cao Alibaba Group, Xiaoyi Lu UC Merced
17:15 - 18:15
Genomics and Bioinformatics
HPCA Main Conference at Cronulla
Chair(s): Abdulaziz Tabbakh King Fahd University of Petroleum and Minerals
17:15
20m
Talk
GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping
HPCA Main Conference
Julien Eudine Huawei Technologies Switzerland AG, Chu Li Huawei Zurich Research Center, Zhuo Cheng Huawei Zurich Research Center, Renzo Andri Huawei Technologies Switzerland AG, Onur Mutlu ETH Zurich, Can Firtina ETH Zurich and UMD, Mohammad Sadrosadati ETH Zürich, Nika Mansouri Ghiasi ETH Zurich, Konstantina Koliogeorgi ETH Zurich, Anirban Nag Huawei Zurich Research Center, Arash Tavakkol Huawei Zurich Research Center, Haiyu Mao King's College London, Shai Bergman Huawei Zurich Research Center, Ji Zhang Huawei Zurich Research Center
17:35
20m
Talk
SAGe: A Lightweight Algorithm-Architecture Co-Design for Mitigating the Data Preparation Bottleneck in Large-Scale Genome Sequence Analysis
HPCA Main Conference
Nika Mansouri Ghiasi ETH Zurich, Talu Güloglu ETH Zurich, Harun Mustafa ETH Zurich and Johns Hopkins University, Can Firtina ETH Zurich and UMD, Konstantina Koliogeorgi ETH Zurich, Konstantinos Kanellopoulos ETH Zurich, Haiyu Mao King's College London, Rakesh Nadig ETH Zurich, Mohammad Sadrosadati ETH Zürich, Jisung Park POSTECH (Pohang University of Science and Technology), Onur Mutlu ETH Zurich
17:55
20m
Talk
NP-CAM: Efficient and Scalable DNA Classification using a NoC-Partitioned CAM Architecture
HPCA Main Conference
Benjamin F. Morris III Duke University, Tergel Molom-Ochir Duke University, Changchun Zhou Duke University, Yiran Chen Duke University, Alex Jones Syracuse University, Hai "Helen" Li Duke University
17:15 - 18:15
Optimizing Transformers
PPoPP Main Conference at Pyrmont
Chair(s): Shaoshuai Zhang University of Electronic Science and Technology of China
17:15
20m
Talk
FlashAttention-T: Towards Fully Tensorized Attention by Exploiting Tensor-Vector Parallelism
PPoPP Main Conference
Jianxing Xu University of Science and Technology of China, Yuanbo Wen, Jun Bi Chinese Academy of Sciences, Ruibai Xu University of Science and Technology of China, Guanglin Xu Chinese Academy of Sciences, Rui Zhang Chinese Academy of Sciences, Wei Li Chinese Academy of Sciences, Ling Li Institute of Software, Chinese Academy of Sciences, Tianshi Chen Cambricon Technologies, Qi Guo Chinese Academy of Sciences, Yunji Chen Chinese Academy of Sciences
17:35
20m
Talk
Accelerating Sparse Transformer Inference on GPU
PPoPP Main Conference
Wenhao Dai China University of Petroleum-Beijing, Haodong Deng China University of Petroleum, Mengfei Rong China University of Petroleum, Xinyu Yang Beihang University, Hongyu Liu Baidu Inc., Fangxin Liu Shanghai Jiao Tong University, Hailong Yang Beihang University, Qianwen Cao China University of Petroleum, Qingxiao Sun Beihang University
17:55
20m
Talk
MetaAttention: A Unified and Performant Attention Framework Across Hardware Backends
PPoPP Main Conference
Feiyang Chen Shanghai Jiao Tong University, Yu Cheng Peking University, Lei Wang Peking University, Yuqing Xia Microsoft Research, Ziming Miao Microsoft Research, Lingxiao Ma Microsoft Research, Fan Yang Microsoft Research Asia, Jilong Xue Microsoft Research, Zhi Yang Peking University, Mao Yang Microsoft Research, Xingda Wei Shanghai Jiao Tong University, Haibo Chen Shanghai Jiao Tong University
18:30 - 21:30
18:30
3h
Social Event
Excursion
Catering

Wed 4 Feb

08:15 - 10:00
08:30 - 08:45
08:30
15m
Day opening
Didgeridoo Performance
Plenary Keynotes

09:50 - 11:10
Tensor Optimization
CGO Main Conference at Bronte
Chair(s): Bastian Hagedorn NVIDIA
09:50
20m
Talk
Multidirectional Propagation of Sparsity Information across Tensor Slices
CGO Main Conference
Kaio Henrique Andrade Ananias Universidade Federal de Minas Gerais, Danila Seliayeu University of Alberta, Jose Nelson Amaral University of Alberta, Fernando Magno Quintão Pereira Federal University of Minas Gerais
10:10
20m
Talk
Synthesizing Specialized Sparse Tensor Accelerators for FPGAs via High-Level Functional Abstractions
CGO Main Conference
Hamza Javed McGill University, Canada, Christophe Dubach McGill University
10:30
20m
Talk
Progressive Low-Precision Approximation of Tensor Operators on GPUs: Enabling Greater Trade-Offs between Performance and Accuracy
CGO Main Conference
Fan Luo Institute of Computing Technology at Chinese Academy of Sciences, Guangli Li Institute of Computing Technology, Chinese Academy of Sciences, Zhaoyang Hao Institute of Computing Technology at Chinese Academy of Sciences, Xueying Wang Beijing University of Posts and Telecommunications, Xiaobing Feng ICT CAS, Huimin Cui Institute of Computing Technology, Chinese Academy of Sciences, Jingling Xue University of New South Wales
10:50
20m
Talk
Tensor Program Superoptimization through Cost-Guided Symbolic Program Synthesis
CGO Main Conference
Alexander Brauckmann University of Edinburgh, Aarsh Chaube University of Edinburgh, José Wesley De Souza Magalhães University of Edinburgh, Elizabeth Polgreen University of Edinburgh, Michael F. P. O'Boyle University of Edinburgh
09:50 - 11:10
Hardware Security and Side-Channel Defenses
HPCA Main Conference at Collaroy
Chair(s): Georgios Vavouliotis Huawei Zurich Research Center, Switzerland
09:50
20m
Talk
DSASSASSIN: Cross-VM Side-Channel Attacks by Exploiting Intel Data Streaming Accelerator
HPCA Main Conference
Ben Chen The Hong Kong University of Science and Technology (Guangzhou), Kunlin Li The Hong Kong University of Science and Technology (Guangzhou), Shuwen Deng Tsinghua University, Dongsheng Wang Tsinghua University, Yun Chen The Hong Kong University of Science and Technology (Guangzhou)
10:10
20m
Talk
SSBleed: Non-speculative Side-channel Attacks via Speculative Store Bypass on Armv9 CPUs
HPCA Main Conference
Chang Liu Tsinghua University, Hongpei Zheng Tsinghua University, Xin Zhang Peking University, Dapeng Ju Tsinghua University, Dongsheng Wang Tsinghua University, Yinqian Zhang Southern University of Science and Technology, Trevor E. Carlson National University of Singapore
10:30
20m
Talk
Protean: A Programmable Spectre Defense
HPCA Main Conference
Nicholas Mosier Stanford University, Hamed Nemati KTH Royal Institute of Technology, John C. Mitchell Stanford University, Caroline Trippel Stanford University
10:50
20m
Talk
HERO-Sign: Hierarchical Tuning and Efficient Compiler-Time GPU Optimizations for SPHINCS+ Signature Generation
HPCA Main Conference
Yaoyun Zhou University of California, Merced, Qian Wang University of California, Merced (UC Merced)
09:50 - 11:10
Graph Neural Networks and Retrieval Systems
HPCA Main Conference at Coogee
Chair(s): Amir Yazdanbakhsh Google Research, Brain Team
09:50
20m
Talk
VeloxGNN: Accelerating Out-of-Core based GNN Training with Low Data Migration and High Accuracy via Delayed Gradient Propagation
HPCA Main Conference
Yi Li University of Texas at Dallas, Tsun-Yu Yang Center for Computational Evolutionary Intelligence, Electrical & Computer Engineering, Duke University, Zhaoyan Shen Shandong University, Ming-Chang Yang The Chinese University of Hong Kong (CUHK), Bingzhe Li University of Texas at Dallas
10:10
20m
Talk
AutoGNN: End-to-End Hardware-Driven Graph Preprocessing for Enhanced GNN Performance
HPCA Main Conference
Seungkwan Kang KAIST, Seungjun Lee KAIST, Donghyun Gouk Panmnesia, Miryeong Kwon Panmnesia, Hyunkyu Choi Panmnesia, Junhyeok Jang Panmnesia, Sangwon Lee Panmnesia, Huiwon Choi KAIST, Jie Zhang Peking University, Wonil Choi Hanyang University, Mahmut Taylan Kandemir Pennsylvania State University, Myoungsoo Jung KAIST
10:30
20m
Talk
Scaling Graph Neural Network Training via Geometric Optimization
HPCA Main Conference
Fangzhou Ye University of Central Florida, Lingxiang Yin University of Central Florida, Hao Zheng University of Central Florida
10:50
20m
Talk
VectorLiteRAG: Latency-Aware and Fine-Grained Resource Partitioning for Efficient RAG
HPCA Main Conference
Junkyum Kim Georgia Institute of Technology, Divya Mahajan Georgia Institute of Technology
09:50 - 11:10
GPU Kernel Optimization and Resource Sharing
HPCA Main Conference at Cronulla
Chair(s): Hyojin Sung Seoul National University
09:50
20m
Talk
μShare: Non-Intrusive Kernel Co-Locating on NVIDIA GPUs
HPCA Main Conference
Wenhao Huang Tianjin University, Zhaolin Duan Tianjin University, Laiping Zhao Tianjin University, Yuhao Zhang Tianjin University, Yanjie Wang Tianjin University, Yiming Li Tianjin University, Yihan Wang Tianjin University, Yichi Chen Tianjin University, Zhihang Tang Tianjin University, Kang Chen Tsinghua University, Deze Zeng China University of Geosciences, Wenxin Li Tianjin University, Keqiu Li Tianjin University
10:10
20m
Talk
FlashFuser: Expanding the Scale of Kernel Fusion for Compute-Intensive operators via Inter-Core Connection
HPCA Main Conference
huang ziyu Shanghai Jiao Tong University, Yangjie Zhou National University of Singapore, Zihan Liu Shanghai Jiao Tong University, Xinhao Luo Shanghai Jiao Tong University, Yijia Diao Shanghai Jiao Tong University, Minyi Guo Shanghai Jiao Tong University, Jidong Zhai Tsinghua University, Yu Feng Shanghai Jiao Tong University, Chen Zhang Shanghai Jiao Tong University, Anbang Wu Shanghai Jiao Tong University, Jingwen Leng Shanghai Jiao Tong University
10:30
20m
Talk
Swift: High-Performance Sparse-Dense Matrix Multiplication on GPUs
HPCA Main Conference
Jinyu Hu Hunan University, Huizhang Luo Hunan University, Hong Jiang UT Arlington, Marc Casas Barcelona Supercomputing Center, Kenli Li National Supercomputing Center in Changsha, Hunan University, Chubo Liu Hunan University
10:50
20m
Talk
QuCo: Efficient and Flexible Hardware-Driven Automatic Configuration of Tile Transfers in GPUs
HPCA Main Conference
Nicolas Meseguer University of Murcia, daoxuan xu William & Mary, Yifan Sun William & Mary, Michael Pellauer Nvidia, José L. Abellán University of Murcia, Manuel E. Acacio Universidad de Murcia (UMU)
09:50 - 11:10
Matrix and Linear Algebra Algorithms
PPoPP Main Conference at Pyrmont
Chair(s): Tony Hosking Australian National University
09:50
20m
Talk
Towards Singular Value Decomposition for Rank-Deficient Matrices: An Efficient and Accurate Algorithm on GPU Architectures
PPoPP Main Conference
Lu Shi University of Electronic Science and Technology of China, WeiWei Xu Nanjing University of Information Science and Technology, Shaoshuai Zhang University of Electronic Science and Technology of China
DOI
10:10
20m
Talk
A Diagonal Block Memory-Aware Polynomial Preconditioner for Linear and Eigenvalue Solvers
PPoPP Main Conference
Xiaojian Yang National University of Defense Technology, Yuhui Ni National University of Defense Technology, Fan Yuan Xiangtan University, Shengguo Li National University of Defense Technology, Dezun Dong NUDT, xuchuanfu National University of Defense Technology, Haipeng Jia Jia, Jie Liu National University of Defense Technology
DOI
10:30
20m
Talk
A Distributed Matrix-Block-Vector Multiplication in Presence of System Performance Variability
PPoPP Main Conference
Yuchen Ma College of William & Mary, Bin Ren College of William & Mary, Andreas Stathopoulos College of William & Mary
DOI
10:50
20m
Talk
Characterizing Matrix Multiplication Units across General Parallel Patterns in Scientific Computing
PPoPP Main Conference
Yuechen Lu China University of Petroleum-Beijing, Hongwei Zeng, Marc Casas Barcelona Supercomputing Center, Weifeng Liu China University of Petroleum-Beijing
DOI
11:10 - 11:30
11:10
20m
Coffee break
Break
Catering

11:30 - 12:50
Optimization
CGO Main Conference at Bronte
Chair(s): Teresa Johnson Google
11:30
20m
Talk
A Reinforcement Learning Environment for Automatic Code Optimization in the MLIR Compiler
CGO Main Conference
Mohammed Tirichine New York University Abu Dhabi; École Nationale Supérieure d’Informatique, Nassim Ameur NYU Abu Dhabi; École Nationale Supérieure d’Informatique, Nazim Bendib NYU Abu Dhabi; École Nationale Supérieure d’Informatique, Iheb Nassim Aouadj NYU Abu Dhabi, Djad Bouchama NYU Abu Dhabi; University of Science and Technology Houari Boumediene, Rafik Bouloudene NYU Abu Dhabi; University of Science and Technology Houari Boumediene, Riyadh Baghdadi New York University Abu Dhabi
Pre-print Media Attached
11:50
20m
Talk
Towards Threading the Needle of Debuggable Optimized Binaries
CGO Main Conference
Cristian Assaiante Sapienza University of Rome, Simone Di Biasio Sapienza University of Rome, Snehasish Kumar Google LLC, Giuseppe Antonio Di Luna Sapienza University of Rome, Daniele Cono D'Elia Sapienza University of Rome, Leonardo Querzoni Sapienza University of Rome
Pre-print Media Attached
12:10
20m
Talk
Compiler-Assisted Instruction Fusion
CGO Main Conference
Ravikiran Ravindranath Reddy University of Murcia, Sawan Singh AMD, Arthur Perais CNRS, Alberto Ros University of Murcia, Alexandra Jimborean University of Murcia
Pre-print
12:30
20m
Talk
LLM-VeriOpt: Verification-Guided Reinforcement Learning for LLM-Based Compiler Optimization
CGO Main Conference
Xiangxin Fang Queen Mary University of London; University of Edinburgh, Jiaqin Kang Queen Mary University of London, Rodrigo C. O. Rocha University of Edinburgh, Sam Ainsworth University of Edinburgh, Lev Mukhanov IMEC (Cambridge); Queen Mary University of London
Pre-print Media Attached
11:30 - 12:50
FPGA, SmartNIC, and Reconfigurable Computing
HPCA Main Conference at Collaroy
Chair(s): Jinho Lee Seoul National University
11:30
20m
Talk
RidgeWalker: Perfectly Pipelined Graph Random Walks on FPGAs
HPCA Main Conference
Hongshi Tan National University of Singapore, Yao CHEN, Xinyu Chen Hong Kong University of Science and Technology, Qizhen Zhang University of Toronto, Cheng Chen ByteDance, China, Weng-Fai Wong National University of Singapore, Bingsheng He National University of Singapore
11:50
20m
Talk
DP-HLS: A High-Level Synthesis Framework for Accelerating Dynamic Programming Algorithms in Bioinformatics
HPCA Main Conference
Anshu Gupta UC San Diego, Yingqi Cao UC San Diego, Jason Liang UC San Diego, Yatish Turakhia UC San Diego
12:10
20m
Talk
Sassy: SmartNIC-Assisted Notification Delivery for μs-scale RDMA Workloads
HPCA Main Conference
Hamed Seyedroudbari Georgia Tech, Alexandros Daglis Georgia Tech
12:30
20m
Talk
TurboFuzz: FPGA Accelerated Hardware Fuzzing for Processor Agile Verification
HPCA Main Conference
Yang Zhong Institute of Computing, Chinese Academy of Sciences, Haoran Wu University of Cambridge, Xueqi Li State Key Lab of Processors, Institute of Computing Technology, CAS, Sa Wang SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, David Boland The University of Sydney, Yungang Bao State Key Lab of Processors, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences, Kan Shi Institute of Computing, Chinese Academy of Sciences
11:30 - 12:50
Efficient Serving and Resource Management
HPCA Main Conference at Coogee
Chair(s): Mohammad A. Islam University of Texas at Arlington
11:30
20m
Talk
Near-Zero-Overhead Freshness for Recommendation Systems via Inference-Side Model Updates
HPCA Main Conference
Wenjun Yu Hong Kong Baptist University, Sitian Chen Hong Kong Baptist University, Amelie Chi Zhou Hong Kong Baptist University, Cheng Chen ByteDance, China
11:50
20m
Talk
AccelFlow: Orchestrating an On-Package Ensemble of Fine-Grained Accelerators for Microservices
HPCA Main Conference
Jovan Stojkovic University of Illinois at Urbana-Champaign, Abraham Farrell University of Illinois Urbana-Champaign, Zhangxiaowen Gong Intel, Christopher J. Hughes Intel, Josep Torrellas University of Illinois at Urbana-Champaign
12:10
20m
Talk
SpotCC: Facilitating Coded Computation for Prediction Serving Systems on Spot Instances
HPCA Main Conference
Lin Wang, Yuchong Hu Huazhong University of Science and Technology, Ziling Duan Huazhong University of Science and Technology, Mingqi Li Huazhong University of Science and Technology, Chenxuan Yao Huazhong University of Science and Technology, feifanliu Huazhong University of Science and Technology, Xiaolu Li Huazhong University of Science and Technology, Leihua Qin Huazhong University of Science and Technology, Dan Feng Huazhong University of Science and Technology, China
12:30
20m
Talk
LowCarb: Carbon-Aware Scheduling of Serverless Functions
HPCA Main Conference
Rohan Basu Roy University of Utah, Devesh Tiwari Northeastern University
11:30 - 12:50
GPU Memory Management and Multi-Chiplet Systems
HPCA Main Conference at Cronulla
Chair(s): EJ Kim Texas A&M University
11:30
20m
Talk
Exploration of LLM Workload Reliability based on di/dt effects and Voltage Droops
HPCA Main Conference
Zhixing Jiang University of Texas at Austin, Justin Garrigus University of Texas at Austin, Allison Seigler University of Texas at Austin, Ethan Syed University of Texas at Austin, Yan-Lun Huang University of Texas at Austin, Mehdi Sadi Advanced Micro Devices, Tawfik Rahal-Arabi Advanced Micro Devices, Lizy John University of Texas, Austin
11:50
20m
Talk
ARIADNE: Adaptive UVM Management for Efficient GPU Memory Oversubscription
HPCA Main Conference
Hyunkyun Shin Yonsei University, Seongtae Bang DGIST, Hyungwon Park DGIST, Daehoon Kim Yonsei University
12:10
20m
Talk
LRM-GPU: Alleviating Synchronization Overhead for Multi-Chiplet GPU Architecture
HPCA Main Conference
Baiqing Zhong Sun Yat-Sen University, Zhirong Ye Sun Yat-Sen University, Xiaojie Li Sun Yat-Sen University, Peilin Wang Sun Yat-Sen University, Haiqiu Huang Sun Yat-Sen University, Zhaolin Li Tsinghua University, Zhiyi Yu Sun Yat-sen University, Mingyu Wang Sun Yat-Sen University
12:30
20m
Talk
LEGO: Supporting LLM-enhanced Games with One Gaming GPU
HPCA Main Conference
Han Zhao Shanghai Jiao Tong University, Weihao Cui Shanghai Jiao Tong University, Zeshen Zhang Tongji University, Wenhao Zhang Shanghai Jiao Tong University, Jiangtong Li Tongji University, Quan Chen Shanghai Jiao Tong University, China, Youmin Chen Shanghai Jiao Tong University, Pu Pang Shanghai Jiao Tong University, Zijun Li Shanghai Jiao Tong University, Zhenhua Han The University of Hong Kong, Yuqing Yang Microsoft Research, Minyi Guo Shanghai Jiao Tong University
12:50 - 13:20