Search events for 'all'
ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities
Main Conference When: Mon 2 Feb 2026 16:50 - 17:10 People: Weile Luo, Yuhan Chen, Xiangrui Yu, Qiang Wang, Ruibo Fan, Hongyuan Liu, Xiaowen Chu
… All-Pairs Shortest Path (APSP), a fundamental problem in graph analytics, can be solved efficiently by reducing the computational workload through vertex …% and up to 34.7% of peak min-plus OPs across all tested graphs. …
Welcome Reception
Catering When: Sun 1 Feb 2026 18:00 - 20:00
… All attendees registered for the main conference are invited to attend the welcome reception from 18:00 on Sunday evening, where there will be great food and drink and an opportunity to engage with the vibrant HPCA/CGO/PPoPP/CC …
Oracle Parfait – Scaling Vulnerability Detection from Enterprise Systems to Cloud-Scale Systems and Beyond
Plenary Keynotes When: Tue 3 Feb 2026 08:45 - 09:45 People: Cristina Cifuentes
… to a DevSecOps model where security gets integrated at all levels of the software process …
Compiler 2.0: Building the Next Generation Compilers with Machine Learning
Plenary Keynotes When: Mon 2 Feb 2026 08:45 - 09:45 People: Saman Amarasinghe
… , complex vector instructions, and specialized accelerators have all pushed more …
Accelerating Sparse Algebra with Program Synthesis
Main Conference When: Sat 31 Jan 2026 16:00 - 16:26 People: José Wesley De Souza Magalhães, Shideh Hashemian, Alexander Brauckmann, Jackson Woodruff, Elizabeth Polgreen, Michael F. P. O'Boyle
… , and GPT 4.o respectively. All lifted programs are completely correct compared …
Practical MHP Analysis for Java
Main Conference When: Sun 1 Feb 2026 12:18 - 12:45 People: Samuel Moses, V Krishna Nandivada
… time on all the tested benchmarks in DaCapo and Renaissance benchmarks (geomean …
CHEHAB: Automatic Compiler Code Optimization for Fully Homomorphic Encryption
Main Conference When: Sat 31 Jan 2026 14:37 - 15:03 People: Riyadh Baghdadi, Abdessamed Seddiki, Arab Mohammed, Zakaria Hebbal, Aimad Chabounia, Eduardo Chielle, Michail Maniatakos, MENACER Djamel Eddine, Karima Benatchba, Challal Yacine
… cryptographic expertise. Programmers may not be aware of all possible optimizations …
DiTOX: Fault Detection and Localization in the ONNX Optimizer
Main Conference When: Sat 31 Jan 2026 13:45 - 14:11 People: Nikolaos Louloudakis, Ajitha Rajan
… optimization passes as well as the optimizer more broadly. All findings were reported …
Sharded Elimination and Combining for Highly-Efficient Concurrent Stacks
Main Conference When: Mon 2 Feb 2026 14:30 - 14:50 People: Ajay Singh, Nikos Metaxakis, Panagiota Fatourou
… all existing concurrent stacks. The proposed implementation is based on a novel … that the proposed stack implementation outperforms all existing concurrent stacks by up to 2X …
PANA: A Fine-Grained Runtime-Adaptive Load Balancing for Parallel SpMV on Multicore CPUs
Main Conference When: Mon 2 Feb 2026 12:30 - 12:50 People: Haodong Bian, Youhui Zhang, Xiang Fei, Jianqiang Huang, Xiaoying Wang
… SuiteSparse matrices, PANA consistently outperforms all baseline methods (CAMLB, CSR5 …
Waste-Efficient Work Stealing
Main Conference When: Mon 2 Feb 2026 11:50 - 12:10 People: Kyle Singer, Kunal Agrawal, TB Schardl
… Although randomized work stealing is effective at automatically load-balancing task-parallel programs, it can waste computational resources when scheduling programs that lack sufficient parallelism to use all available threads …
zBuffer: Zero-Copy and Metadata-Free Serialization for Fast RPC with Scatter-Gather Reflection
Main Conference When: Tue 3 Feb 2026 11:50 - 12:10 People: Xiangyu Liu, Huiba Li, Shun Gai, Youmin Chen, Yiming Zhang
… , we design a fast RPC system (called zRPC) which eliminates all RPC memory copy …
Pipelonk: Accelerating End-to-End Zero-Knowledge Proof Generation on GPUs for PLONK-Based Protocols
Main Conference When: Tue 3 Feb 2026 14:10 - 14:30 People: Zhiyuan Zhang, Yanxin Cai, Wenhao Yin, Xueyu Wu, Yi Wang, Lei Ju, Zhuoran Ji
… introduces a segmentable operator library that offloads all operations, including …
DTMiner: A Data-Centric System for Efficient Temporal Motif Mining
Main Conference When: Tue 3 Feb 2026 16:50 - 17:10 People: hou yinbo, Hao Qi, Ligang He, Jin Zhao, Yu Zhang, Hui Yu, Longlong Lin, Lin Gu, Wenbin Jiang, XIAOFEI LIAO, Hai Jin
… order and then triggers all relevant tasks to explore only these loaded data …
Ember: A Compiler for Embedding Operations on Decoupled Access-Execute Architectures
Main Conference When: Mon 2 Feb 2026 12:30 - 12:50 People: Marco Siracusa, Olivia Hsu, Víctor Soria-Pardos, Joshua Randall, Arnaud Grasset, Eric Biscondi, Douglas J. Joseph, Randy Allen, Fredrik Kjolstad, Miquel Moreto, Adrià Armejach Sanosa
… to automatically compile all of these embedding operations to DAE architectures. Conversely … implement all optimizations to match the performance of hand-written code …
Pyls: Enabling Python Hardware Synthesis with Dynamic Polymorphism via LCRS Encoding
Main Conference When: Mon 2 Feb 2026 11:50 - 12:10 People: Bolei Tong, Yongyan Fang, Wang Chaorui, Qingan Li, Jingling Xue, YUAN Mengting
… is representing all Python objects as LCRS trees, enabling uniform hardware handling …
Compilation of Generalized Matrix Chains with Symbolic Sizes
Main Conference When: Tue 3 Feb 2026 10:50 - 11:10 People: Francisco López, Lars Karlsson, Paolo Bientinesi
… calls is optimal for all possible combinations of matrix sizes. We design … results that guarantee that the cost is within a constant factor from optimal for all …
SparseX: Synergizing GPU Libraries for Sparse Matrix Multiplication on Heterogeneous Processors
Main Conference When: Tue 3 Feb 2026 10:30 - 10:50 People: Ruifeng Zhang, Xiangwei Wang, Ang Li, Xipeng Shen
… all matrices and scenarios. Based on the empirical observations, this work …
GRANII: Selection and Ordering of Primitives in GRAph Neural Networks using Input Inspection
Main Conference When: Mon 2 Feb 2026 10:10 - 10:30 People: Damitha Lenadora, Vimarsh Sathia, Gerasimos Gerogiannis, Serif Yesil, Josep Torrellas, Charith Mendis
… compilation stage that enumerates all valid re-associations leading to different sparse …
PriTran: Privacy-Preserving Inference for Transformer-Based Language Models under Fully Homomorphic Encryption
Main Conference When: Mon 2 Feb 2026 11:30 - 11:50 People: Yuechen Mu, Guangli Li, Shiping Chen, Jingling Xue
… ) matrix multiplications (MMs) across all BERT models by reducing costly …
BIT: Empowering Binary Analysis through the LLVM Toolchain
Main Conference When: Tue 3 Feb 2026 10:10 - 10:30 People: Puzhuo Liu, Peng Di, Jingling Xue, Yu Jiang
… ; in reanalysis, BIT can complete all tasks and is consistent with the advanced work …
Synthesizing Instruction Selection Back-Ends from ISA Specifications Made Practical
Main Conference When: Tue 3 Feb 2026 10:10 - 10:30 People: Florian Drescher, Alexis Engelke
… to ensure completeness in all other cases. Combined with search bounds derived from …
Tensor Abstraction Enabling Explicit Layout Optimization in Homomorphic Encryption
Student Research Competition When: Sun 1 Feb 2026 18:00 - 20:00 People: Seongho Kim, Hanjun Kim
… , multiplication, and cyclic rotation – where all slots must be uniformly processed in each …
Partial-Evaluation Templates: Accelerating Partial Evaluation with Pre-compiled Templates
Main Conference When: Mon 2 Feb 2026 11:30 - 11:50 People: Florian Huemer, Aleksandar Prokopec, David Leopoldseder, Raphael Mosaner, Hanspeter Mössenböck
… for nearly all opcodes in GraalWasm, which reduced partial-evaluation time by up …
Dr.avx: A Dynamic Compilation System for Seamlessly Executing Hardware-Unsupported Vectorization Instructions
Main Conference When: Tue 3 Feb 2026 10:30 - 10:50 People: Yue Tang, Mianzhi Wu, Yufeng Li, Haoyu Liao, Jianmei Guo, Bo Huang
… challenges emerging across all major ISAs. …
Practical: Are Abstract-Interpreter Baseline JITs Worth It? An Empirical Evaluation through Metacompilation
Main Conference When: Tue 3 Feb 2026 10:50 - 11:10 People: Nahuel Palumbo, Guillermo Polito, Stéphane Ducasse, Pablo Tesone
… , although they share the same technique, all these implementations vary …
MoEntwine: Unleashing the Potential of Wafer-scale Chips for Large-scale Expert Parallel Inference
Main Conference When: Tue 3 Feb 2026 10:50 - 11:10 People: Xinru Tang, Jingxiang Hou, Dingcheng Jiang, Taiquan Wei, Jiaxin Liu, Jinyi Deng, Huizheng Wang, Qize Yang, Haoran Shang, Chao Li, Yang Hu, Shouyi Yin
… parallelism (EP) to alleviate memory bottleneck, which introduces all-to-all … GPU clusters, high-overhead cross-node communication makes all-to-all expensive … provide a unified high-performance network connecting all devices, presenting …
Enterprise Class On-Chip Accelerator Integration
Industry Track When: Tue 3 Feb 2026 17:15 - 17:35 People: Deanna Berger, Alper Buyuktosunoglu, Craig Walters, Robert Sonnelitter, Hailey Nicholson, Ashraf ElSharif, Yamil Rivera, Avery Francois, Cedric Lichtenau, Jason Kohl
… with a sustained processor utilization of over 90% under all workload conditions … with an integrated multi-tier unified cache hierarchy all within one chip. The processor chip design leverages a unique approach to ensure all elements work in unison …
I-POP: Ignite Positive Prefetchers
Main Conference When: Tue 3 Feb 2026 10:50 - 11:10 People: Yiquan Lin, Wenhai Lin, Yiquan Chen, Jiexiong Xu, Shishun Cai, Jiarong Ye, Zonghui Wang, Wenzhi Chen
… prefetchers for issuing requests, but they all face limitations. Specifically, existing … prefetcher’s PE, and the Control Engine, which dynamically manages all …
FractalCloud: A Fractal-Inspired Architecture for Efficient Large-Scale Point Cloud Processing
Main Conference When: Mon 2 Feb 2026 16:30 - 16:50 People: Yuzhe Fu, Changchun Zhou, Hancheng Ye, Bowen Duan, Qiyu Huang, Chiyue Wei, Cong Guo, Hai "Helen" Li, Yiran Chen
… ) block-parallel point operations that decompose and parallelize all point …
Protean: A Programmable Spectre Defense
Main Conference When: Wed 4 Feb 2026 10:30 - 10:50 People: Nicholas Mosier, Hamed Nemati, John C. Mitchell, Caroline Trippel
… We present the Protean Spectre defense—the first to be altogether comprehensive, covering all side-channels and speculation; programmer-transparent, requiring no source modifications; and programmable, tailoring its hardware protections …
GustavSNN: Unleashing the Power of Gustavson's Algorithm on SNN Acceleration with Column-Parallel Tick-Batch Dataflow
Main Conference When: Tue 3 Feb 2026 15:10 - 15:30 People: Sangwoo Hwang, Donghun Lee, Jahyun Koo, Jaeha Kung
… this by employing tick-batch techniques, which process all timesteps within a layer before …
GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping
Main Conference When: Tue 3 Feb 2026 17:15 - 17:35 People: Julien Eudine, Chu Li, Zhuo Cheng, Renzo Andri, Onur Mutlu, Can Firtina, Mohammad Sadrosadati, Nika Mansouri Ghiasi, Konstantina Koliogeorgi, Anirban Nag, Arash Tavakkol, Haiyu Mao, Shai Bergman, Ji Zhang
… CPU-based and 1.41× compared to hardware-based read mappers, all while …
VectorLiteRAG: Latency-Aware and Fine-Grained Resource Partitioning for Efficient RAG
Main Conference When: Wed 4 Feb 2026 10:50 - 11:10 People: Junkyum Kim, Divya Mahajan
… consistently expands the SLO-compliant request rate range across all tested …
LEGO: Supporting LLM-enhanced Games with One Gaming GPU
Main Conference When: Wed 4 Feb 2026 12:30 - 12:50 People: Han Zhao, Weihao Cui, Zeshen Zhang, Wenhao Zhang, Jiangtong Li, Quan Chen, Youmin Chen, Pu Pang, Zijun Li, Zhenhua Han, Yuqing Yang, Minyi Guo
… . Evaluations on an Nvidia RTX 4090 show that LEGO meets latency targets in all …
QuCo: Efficient and Flexible Hardware-Driven Automatic Configuration of Tile Transfers in GPUs
Main Conference When: Wed 4 Feb 2026 10:50 - 11:10 People: Nicolas Meseguer, daoxuan xu, Yifan Sun, Michael Pellauer, José L. Abellán, Manuel E. Acacio
… , and synchronization primitives, all of which are hardware-specific and workload …
Splatonic: Architecture Support for 3D Gaussian Splatting SLAM via Sparse Processing
Main Conference When: Mon 2 Feb 2026 16:10 - 16:30 People: Xiaotong Huang, He Zhu, Tianrui Ma, Yuxiang Xiong, Fangxin Liu, Zhezhi He, Yiming Gan, Zihan Liu, Jingwen Leng, Yu Feng, Minyi Guo
… and 241.1× energy savings over state-of-the-art accelerators, all …
Cyclone: Designing Efficient and Highly Parallel QCCD Architectural Codesigns for Fault Tolerant Quantum Memory
Main Conference When: Mon 2 Feb 2026 11:50 - 12:10 People: Sahil Khan, Abhinav Anand, Kenneth R. Brown, Jonathan M. Baker
… Modular trapped-ion quantum computing hardware, known as Quantum Charge Coupled Devices (QCCDs), requires shuttling operations in order to maintain effective all-to-all connectivity. Each module or trap can perform only one operation …
BARD: Reducing Write Latency of DDR5 Memory by Exploiting Bank-Parallelism
Main Conference When: Tue 3 Feb 2026 14:10 - 14:30 People: Suhas Vittal, Moinuddin K. Qureshi
… policy that works well across all the workloads. We develop a hybrid policy (BARD …
Cohet: A CXL-Driven Coherent Heterogeneous Computing Framework with Hardware-Calibrated Full-System Simulation
Main Conference When: Mon 2 Feb 2026 10:10 - 10:30 People: Yanjing Wang, Lizhou Wu, Sunfeng Gao, Yibo Tang, Junhui Luo, Zicong Wang, Yang Ou, Dezun Dong, Nong Xiao, Mingche Lai
… of modeling all CXL sub-protocols and device types. CXLSim has been rigorously …
WATOS: Efficient LLM Training Strategies and Architecture Co-exploration for Wafer-scale Chip
Main Conference When: Tue 3 Feb 2026 09:50 - 10:10 People: Huizheng Wang, Zichuan Wang, Hongbin Wang, Jingxiang Hou, Taiquan Wei, Chao Li, Yang Hu, Shouyi Yin
… , existing approaches all fall short in addressing these challenges. To bridge …
SALT: Track-and-Mitigate Subarrays, Not Rows, for Blast-Radius-Free Rowhammer Defense
Main Conference When: Mon 2 Feb 2026 11:50 - 12:10 People: Moinuddin K. Qureshi
… to the subarray before all rows are guaranteed to be refreshed, thus providing Blast …
Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models
Main Conference When: Mon 2 Feb 2026 09:50 - 10:10 People: Chiyue Wei, Cong Guo, Junyao Zhang, Haoxuan Shan, Yifan Xu, Ziyue Zhang, Yudong Liu, Qinsi Wang, Changchun Zhou, Hai "Helen" Li, Yiran Chen
… -level redundancy removal via motion-aware matching. All concentration steps …
MIRZA: Efficiently Mitigating Rowhammer with Randomization and ALERT
Main Conference When: Mon 2 Feb 2026 11:30 - 11:50 People: Hritvik Taneja, Ali Hajiabadi, Michele Marazzi, Kaveh Razavi, Moinuddin K. Qureshi
… In-DRAM Rowhammer mitigation requires three resources: space (to track aggressor rows), time (to perform mitigation), and energy (to refresh victim rows). An ideal in-DRAM mitigation must minimize all three overheads. Recent …