Search all - HPCA/CGO/PPoPP/CC 2026

Events (44 results)

ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities

Main Conference When: Mon 2 Feb 2026 16:50 - 17:10 People: Weile Luo, Yuhan Chen, Xiangrui Yu, Qiang Wang, Ruibo Fan, Hongyuan Liu, Xiaowen Chu

… All-Pairs Shortest Path (APSP), a fundamental problem in graph analytics, can be solved efficiently by reducing the computational workload through vertex …% and up to 34.7% of peak min-plus OPs across all tested graphs. …

Welcome Reception

Catering When: Sun 1 Feb 2026 18:00 - 20:00

… All attendees registered for the main conference are invited to attend the welcome reception from 18:00 on Sunday evening, where there will be great food and drink and an opportunity to engage with the vibrant HPCA/CGO/PPoPP/CC …

Oracle Parfait – Scaling Vulnerability Detection from Enterprise Systems to Cloud-Scale Systems and Beyond

Plenary Keynotes When: Tue 3 Feb 2026 08:45 - 09:45 People: Cristina Cifuentes

… to a DevSecOps model where security gets integrated at all levels of the software process …

Compiler 2.0: Building the Next Generation Compilers with Machine Learning

Plenary Keynotes When: Mon 2 Feb 2026 08:45 - 09:45 People: Saman Amarasinghe

… , complex vector instructions, and specialized accelerators have all pushed more …

Practical MHP Analysis for Java

Main Conference When: Sun 1 Feb 2026 12:18 - 12:45 People: Samuel Moses, V Krishna Nandivada

… time on all the tested benchmarks in DaCapo and Renaissance benchmarks (geomean …

Accelerating Sparse Algebra with Program Synthesis

Main Conference When: Sat 31 Jan 2026 16:00 - 16:26 People: José Wesley De Souza Magalhães, Shideh Hashemian, Alexander Brauckmann, Jackson Woodruff, Elizabeth Polgreen, Michael F. P. O'Boyle

… , and GPT 4.o respectively. All lifted programs are completely correct compared …

CHEHAB: Automatic Compiler Code Optimization for Fully Homomorphic Encryption

Main Conference When: Sat 31 Jan 2026 14:37 - 15:03 People: Riyadh Baghdadi, Abdessamed Seddiki, Arab Mohammed, Zakaria Hebbal, Aimad Chabounia, Eduardo Chielle, Michail Maniatakos, MENACER Djamel Eddine, Karima Benatchba, Challal Yacine

… cryptographic expertise. Programmers may not be aware of all possible optimizations …

DiTOX: Fault Detection and Localization in the ONNX Optimizer

Main Conference When: Sat 31 Jan 2026 13:45 - 14:11 People: Nikolaos Louloudakis, Ajitha Rajan

… optimization passes as well as the optimizer more broadly. All findings were reported …

Sharded Elimination and Combining for Highly-Efficient Concurrent Stacks

Main Conference When: Mon 2 Feb 2026 14:30 - 14:50 People: Ajay Singh, Nikos Metaxakis, Panagiota Fatourou

… all existing concurrent stacks. The proposed implementation is based on a novel … that the proposed stack implementation outperforms all existing concurrent stacks by up to 2X …

PANA: A Fine-Grained Runtime-Adaptive Load Balancing for Parallel SpMV on Multicore CPUs

Main Conference When: Mon 2 Feb 2026 12:30 - 12:50 People: Haodong Bian, Youhui Zhang, Xiang Fei, Jianqiang Huang, Xiaoying Wang

… SuiteSparse matrices, PANA consistently outperforms all baseline methods (CAMLB, CSR5 …

zBuffer: Zero-Copy and Metadata-Free Serialization for Fast RPC with Scatter-Gather Reflection

Main Conference When: Tue 3 Feb 2026 11:50 - 12:10 People: Xiangyu Liu, Huiba Li, Shun Gai, Youmin Chen, Yiming Zhang

… , we design a fast RPC system (called zRPC) which eliminates all RPC memory copy …

Waste-Efficient Work Stealing

Main Conference When: Mon 2 Feb 2026 11:50 - 12:10 People: Kyle Singer, Kunal Agrawal, TB Schardl

… Although randomized work stealing is effective at automatically load-balancing task-parallel programs, it can waste computational resources when scheduling programs that lack sufficient parallelism to use all available threads …

DTMiner: A Data-Centric System for Efficient Temporal Motif Mining

Main Conference When: Tue 3 Feb 2026 16:50 - 17:10 People: hou yinbo, Hao Qi, Ligang He, Jin Zhao, Yu Zhang, Hui Yu, Longlong Lin, Lin Gu, Wenbin Jiang, XIAOFEI LIAO, Hai Jin

… order and then triggers all relevant tasks to explore only these loaded data …

Pipelonk: Accelerating End-to-End Zero-Knowledge Proof Generation on GPUs for PLONK-Based Protocols

Main Conference When: Tue 3 Feb 2026 14:10 - 14:30 People: Zhiyuan Zhang, Yanxin Cai, Wenhao Yin, Xueyu Wu, Yi Wang, Lei Ju, Zhuoran Ji

… introduces a segmentable operator library that offloads all operations, including …

Ember: A Compiler for Embedding Operations on Decoupled Access-Execute Architectures

Main Conference When: Mon 2 Feb 2026 12:30 - 12:50 People: Marco Siracusa, Olivia Hsu, Víctor Soria-Pardos, Joshua Randall, Arnaud Grasset, Eric Biscondi, Douglas J. Joseph, Randy Allen, Fredrik Kjolstad, Miquel Moreto, Adrià Armejach Sanosa

… to automatically compile all of these embedding operations to DAE architectures. Conversely … implement all optimizations to match the performance of hand-written code …

Pyls: Enabling Python Hardware Synthesis with Dynamic Polymorphism via LCRS Encoding

Main Conference When: Mon 2 Feb 2026 11:50 - 12:10 People: Bolei Tong, Yongyan Fang, Wang Chaorui, Qingan Li, Jingling Xue, YUAN Mengting

… is representing all Python objects as LCRS trees, enabling uniform hardware handling …

Compilation of Generalized Matrix Chains with Symbolic Sizes

Main Conference When: Tue 3 Feb 2026 10:50 - 11:10 People: Francisco López, Lars Karlsson, Paolo Bientinesi

… calls is optimal for all possible combinations of matrix sizes. We design … results that guarantee that the cost is within a constant factor from optimal for all …

GRANII: Selection and Ordering of Primitives in GRAph Neural Networks using Input Inspection

Main Conference When: Mon 2 Feb 2026 10:10 - 10:30 People: Damitha Lenadora, Vimarsh Sathia, Gerasimos Gerogiannis, Serif Yesil, Josep Torrellas, Charith Mendis

… compilation stage that enumerates all valid re-associations leading to different sparse …

SparseX: Synergizing GPU Libraries for Sparse Matrix Multiplication on Heterogeneous Processors

Main Conference When: Tue 3 Feb 2026 10:30 - 10:50 People: Ruifeng Zhang, Xiangwei Wang, Ang Li, Xipeng Shen

… all matrices and scenarios. Based on the empirical observations, this work …

BIT: Empowering Binary Analysis through the LLVM Toolchain

Main Conference When: Tue 3 Feb 2026 10:10 - 10:30 People: Puzhuo Liu, Peng Di, Jingling Xue, Yu Jiang

… ; in reanalysis, BIT can complete all tasks and is consistent with the advanced work …

PriTran: Privacy-Preserving Inference for Transformer-Based Language Models under Fully Homomorphic Encryption

Main Conference When: Mon 2 Feb 2026 11:30 - 11:50 People: Yuechen Mu, Guangli Li, Shiping Chen, Jingling Xue

… ) matrix multiplications (MMs) across all BERT models by reducing costly …

Dr.avx: A Dynamic Compilation System for Seamlessly Executing Hardware-Unsupported Vectorization Instructions

Main Conference When: Tue 3 Feb 2026 10:30 - 10:50 People: Yue Tang, Mianzhi Wu, Yufeng Li, Haoyu Liao, Jianmei Guo, Bo Huang

… challenges emerging across all major ISAs. …

Partial-Evaluation Templates: Accelerating Partial Evaluation with Pre-compiled Templates

Main Conference When: Mon 2 Feb 2026 11:30 - 11:50 People: Florian Huemer, Aleksandar Prokopec, David Leopoldseder, Raphael Mosaner, Hanspeter Mössenböck

… for nearly all opcodes in GraalWasm, which reduced partial-evaluation time by up …

Tensor Abstraction Enabling Explicit Layout Optimization in Homomorphic Encryption

Student Research Competition When: Sun 1 Feb 2026 18:00 - 20:00 People: Seongho Kim, Hanjun Kim

… , multiplication, and cyclic rotation – where all slots must be uniformly processed in each …

Synthesizing Instruction Selection Back-Ends from ISA Specifications Made Practical

Main Conference When: Tue 3 Feb 2026 10:10 - 10:30 People: Florian Drescher, Alexis Engelke

… to ensure completeness in all other cases. Combined with search bounds derived from …

Practical: Are Abstract-Interpreter Baseline JITs Worth It? An Empirical Evaluation through Metacompilation

Main Conference When: Tue 3 Feb 2026 10:50 - 11:10 People: Nahuel Palumbo, Guillermo Polito, Stéphane Ducasse, Pablo Tesone

… , although they share the same technique, all these implementations vary …

MoEntwine: Unleashing the Potential of Wafer-scale Chips for Large-scale Expert Parallel Inference

Main Conference When: Tue 3 Feb 2026 10:50 - 11:10 People: Xinru Tang, Jingxiang Hou, Dingcheng Jiang, Taiquan Wei, Jiaxin Liu, Jinyi Deng, Huizheng Wang, Qize Yang, Haoran Shang, Chao Li, Yang Hu, Shouyi Yin

… parallelism (EP) to alleviate memory bottleneck, which introduces all-to-all … GPU clusters, high-overhead cross-node communication makes all-to-all expensive … provide a unified high-performance network connecting all devices, presenting …

Enterprise Class On-Chip Accelerator Integration

Industry Track When: Tue 3 Feb 2026 17:15 - 17:35 People: Deanna Berger, Alper Buyuktosunoglu, Craig Walters, Robert Sonnelitter, Hailey Nicholson, Ashraf ElSharif, Yamil Rivera, Avery Francois, Cedric Lichtenau, Jason Kohl

… with a sustained processor utilization of over 90% under all workload conditions … with an integrated multi-tier unified cache hierarchy all within one chip. The processor chip design leverages a unique approach to ensure all elements work in unison …

I-POP: Ignite Positive Prefetchers

Main Conference When: Tue 3 Feb 2026 10:50 - 11:10 People: Yiquan Lin, Wenhai Lin, Yiquan Chen, Jiexiong Xu, Shishun Cai, Jiarong Ye, Zonghui Wang, Wenzhi Chen

… prefetchers for issuing requests, but they all face limitations. Specifically, existing … prefetcher’s PE, and the Control Engine, which dynamically manages all …

Protean: A Programmable Spectre Defense

Main Conference When: Wed 4 Feb 2026 10:30 - 10:50 People: Nicholas Mosier, Hamed Nemati, John C. Mitchell, Caroline Trippel

… We present the Protean Spectre defense—the first to be altogether comprehensive, covering all side-channels and speculation; programmer-transparent, requiring no source modifications; and programmable, tailoring its hardware protections …

GustavSNN: Unleashing the Power of Gustavson's Algorithm on SNN Acceleration with Column-Parallel Tick-Batch Dataflow

Main Conference When: Tue 3 Feb 2026 15:10 - 15:30 People: Sangwoo Hwang, Donghun Lee, Jahyun Koo, Jaeha Kung

… this by employing tick-batch techniques, which process all timesteps within a layer before …

GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping

Main Conference When: Tue 3 Feb 2026 17:15 - 17:35 People: Julien Eudine, Chu Li, Zhuo Cheng, Renzo Andri, Onur Mutlu, Can Firtina, Mohammad Sadrosadati, Nika Mansouri Ghiasi, Konstantina Koliogeorgi, Anirban Nag, Arash Tavakkol, Haiyu Mao, Shai Bergman, Ji Zhang

… CPU-based and 1.41× compared to hardware-based read mappers, all while …

VectorLiteRAG: Latency-Aware and Fine-Grained Resource Partitioning for Efficient RAG

Main Conference When: Wed 4 Feb 2026 10:50 - 11:10 People: Junkyum Kim, Divya Mahajan

… consistently expands the SLO-compliant request rate range across all tested …

QuCo: Efficient and Flexible Hardware-Driven Automatic Configuration of Tile Transfers in GPUs

Main Conference When: Wed 4 Feb 2026 10:50 - 11:10 People: Nicolas Meseguer, daoxuan xu, Yifan Sun, Michael Pellauer, José L. Abellán, Manuel E. Acacio

… , and synchronization primitives, all of which are hardware-specific and workload …

Splatonic: Architecture Support for 3D Gaussian Splatting SLAM via Sparse Processing

Main Conference When: Mon 2 Feb 2026 16:10 - 16:30 People: Xiaotong Huang, He Zhu, Tianrui Ma, Yuxiang Xiong, Fangxin Liu, Zhezhi He, Yiming Gan, Zihan Liu, Jingwen Leng, Yu Feng, Minyi Guo

… and 241.1× energy savings over state-of-the-art accelerators, all …

LEGO: Supporting LLM-enhanced Games with One Gaming GPU

Main Conference When: Wed 4 Feb 2026 12:30 - 12:50 People: Han Zhao, Weihao Cui, Zeshen Zhang, Wenhao Zhang, Jiangtong Li, Quan Chen, Youmin Chen, Pu Pang, Zijun Li, Zhenhua Han, Yuqing Yang, Minyi Guo

… . Evaluations on an Nvidia RTX 4090 show that LEGO meets latency targets in all …

FractalCloud: A Fractal-Inspired Architecture for Efficient Large-Scale Point Cloud Processing

Main Conference When: Mon 2 Feb 2026 16:30 - 16:50 People: Yuzhe Fu, Changchun Zhou, Hancheng Ye, Bowen Duan, Qiyu Huang, Chiyue Wei, Cong Guo, Hai "Helen" Li, Yiran Chen

… ) block-parallel point operations that decompose and parallelize all point …

Cyclone: Designing Efficient and Highly Parallel QCCD Architectural Codesigns for Fault Tolerant Quantum Memory

Main Conference When: Mon 2 Feb 2026 11:50 - 12:10 People: Sahil Khan, Abhinav Anand, Kenneth R. Brown, Jonathan M. Baker

… Modular trapped-ion quantum computing hardware, known as Quantum Charge Coupled Devices (QCCDs), requires shuttling operations in order to maintain effective all-to-all connectivity. Each module or trap can perform only one operation …

BARD: Reducing Write Latency of DDR5 Memory by Exploiting Bank-Parallelism

Main Conference When: Tue 3 Feb 2026 14:10 - 14:30 People: Suhas Vittal, Moinuddin K. Qureshi

… policy that works well across all the workloads. We develop a hybrid policy (BARD …

WATOS: Efficient LLM Training Strategies and Architecture Co-exploration for Wafer-scale Chip

Main Conference When: Tue 3 Feb 2026 09:50 - 10:10 People: Huizheng Wang, Zichuan Wang, Hongbin Wang, Jingxiang Hou, Taiquan Wei, Chao Li, Yang Hu, Shouyi Yin

… , existing approaches all fall short in addressing these challenges. To bridge …

SALT: Track-and-Mitigate Subarrays, Not Rows, for Blast-Radius-Free Rowhammer Defense

Main Conference When: Mon 2 Feb 2026 11:50 - 12:10 People: Moinuddin K. Qureshi

… to the subarray before all rows are guaranteed to be refreshed, thus providing Blast …

MIRZA: Efficiently Mitigating Rowhammer with Randomization and ALERT

Main Conference When: Mon 2 Feb 2026 11:30 - 11:50 People: Hritvik Taneja, Ali Hajiabadi, Michele Marazzi, Kaveh Razavi, Moinuddin K. Qureshi

… In-DRAM Rowhammer mitigation requires three resources: space (to track aggressor rows), time (to perform mitigation), and energy (to refresh victim rows). An ideal in-DRAM mitigation must minimize all three overheads. Recent …

Cohet: A CXL-Driven Coherent Heterogeneous Computing Framework with Hardware-Calibrated Full-System Simulation

Main Conference When: Mon 2 Feb 2026 10:10 - 10:30 People: Yanjing Wang, Lizhou Wu, Sunfeng Gao, Yibo Tang, Junhui Luo, Zicong Wang, Yang Ou, Dezun Dong, Nong Xiao, Mingche Lai

… of modeling all CXL sub-protocols and device types. CXLSim has been rigorously …

Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models

Main Conference When: Mon 2 Feb 2026 09:50 - 10:10 People: Chiyue Wei, Cong Guo, Junyao Zhang, Haoxuan Shan, Yifan Xu, Ziyue Zhang, Yudong Liu, Qinsi Wang, Changchun Zhou, Hai "Helen" Li, Yiran Chen

… -level redundancy removal via motion-aware matching. All concentration steps …
