Publications

Disclaimer: This page contains links to personal archived articles published by IEEE, ACM, and other publishers. It also contains links to presentations, videos and other materials. Copyright and all rights therein are retained by authors or by other copyright holders. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the copyright holder. The following materials are based upon work supported by various funding sources.

(Google Scholar) (DBLP with DOI Links) (ORCID)

2023

(MICRO 2023) [PDF]
Ying Li, Yifan Sun, Adwait Jog
Path Forward Beyond Simulators: Fast and Accurate GPU Execution Time Prediction for DNN Workloads,
In the Proceedings of 56th International Symposium on Microarchitecture (MICRO), Toronto, Canada, October 2023
(Acceptance rate: 101/424 ≈ 24%)

(ISCA 2023) [PDF]
Rishabh Jain, Scott Cheng, Vishwas Kalagi, Vrushabh Sanghavi, Samvit Kaul, Meena Arunachalam, Kiwan Maeng, Adwait Jog, Anand Sivasubramaniam, Mahmut T. Kandemir, and Chita R. Das
Optimizing CPU Performance for Recommendation Systems At-Scale,
In the Proceedings of ACM International Symposium on Computer Architecture (ISCA), FCRC, Orlando, FL, June 2023
(Acceptance rate: 79/372 ≈ 21%)

(SIGMETRICS 2023) [PDF]
Hongyuan Liu, Sreepathi Pai, Adwait Jog
Asynchronous Automata Processing on GPUs,
In the Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS Journal) at SIGMETRICS conference, FCRC, Orlando, FL, June 2023

(ISPASS 2023, Poster) [PDF]
Ying Li, Yifan Sun, Adwait Jog
A Regression-based Model for End-to-End Latency Prediction for DNN Execution on GPUs,
In the Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Raleigh, NC, April 2023

2022

(ISVLSI 2022) [PDF]
Khoa Ho, Hui Zhao, Adwait Jog, Saraju Mohanty
Improving GPU Throughput through Parallel Execution Using Tensor Cores and CUDA Cores,
In the Proceedings of IEEE Annual Symposium on VLSI (ISVLSI), Pafos, Cyprus, July 2022

(Ph.D. Dissertation, 2022) [PDF]
Hongyuan Liu
Techniques for Accelerating Large-Scale Automata Processing
Committee Members: Adwait Jog (Advisor and Chair), Pradeep Kumar, Zhenming Liu, Weizhen Mao, Zhijia Zhao

2021

(Ph.D. Dissertation, 2021) [PDF]
Mohamed Assem Ibrahim
Rethinking Cache Hierarchy and Interconnect Design for Next-generation GPUs
(Winner of the Distinguished Dissertation Award) (details)
Committee Members: Adwait Jog (Advisor and Chair), Dmitry Evtyushkin, Bin Ren, Andreas Stathopoulos, Asit Mishra

(Ph.D. Dissertation, 2021) [PDF]
Gurunath Kadam
Low-Overhead Techniques for Secure and Reliable GPU Computing
Committee Members: Adwait Jog (Advisor and Chair), Dmitry Evtyushkin, Bin Ren, Evgenia Smirni, Ashutosh Pattnaik

(Cluster 2021) [PDF] [Talk (PPTX)]
Hongyuan Liu, Bogdan Nicolae, Sheng Di, Franck Cappello, Adwait Jog
Accelerating DNN Architecture Search at Scale using Selective Weight Transfer,
In the Proceedings of 23rd IEEE Cluster 2021 International Conference, Virtual Event, Sept 2021
(Acceptance rate: 48/163 ≈ 29%)

(DSN 2021) [PDF] [YouTube Video]
Gurunath Kadam, Evgenia Smirni, Adwait Jog
Data-centric Reliability Management in GPUs,
In the Proceedings of 51st IEEE International Conference on Dependable Systems and Networks (DSN), Virtual Event, June 2021
(Acceptance rate: 48/295 ≈ 16%)

(SIGMETRICS 2021) [PDF] [YouTube Full Video]
Lishan Yang, Bin Nie, Adwait Jog, Evgenia Smirni
SUGAR: Speeding Up GPGPU Application Resilience Estimation with Input Sizing,
In the Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS Journal) at SIGMETRICS conference, Virtual Event, June 2021
(Acceptance rate: ≈ 12%)

(ICSE 2021) [PDF] [YouTube Full Video]
Lishan Yang, Bin Nie, Adwait Jog, Evgenia Smirni
Enabling Software Resilience in GPGPU Applications via Partial Thread Protection,
In the Proceedings of 43rd International Conference on Software Engineering (ICSE), Virtual Event, May 2021
(Acceptance rate: 138/615 ≈ 22%)

(HPCA 2021) [PDF] [YouTube Full Video]
Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert, Gabriel H. Loh, Adwait Jog
Analyzing and Leveraging Decoupled L1 Caches in GPUs,
In the Proceedings of 27th IEEE Symposium on High Performance Computer Architecture (HPCA), Virtual Event, Feb 2021
(Acceptance rate: 63/258 ≈ 24%)

(TC 2021) [PDF]
Lishan Yang, Bin Nie, Adwait Jog, Evgenia Smirni
Practical Resilience Analysis of GPGPU Applications in the Presence of Single- and Multi-bit Faults,
In IEEE Transactions on Computers (TC), Jan 2021

2020

(Ph.D. Dissertation, 2020) [PDF]
Haonan Wang
Design and Analysis of Memory Management Techniques for Next-Generation GPUs
Committee Members: Adwait Jog (Advisor and Chair), Xu Liu, Bin Ren, Evgenia Smirni, Onur Kayiran

(PACT 2020) [PDF] [Talk (PPTX)] [YouTube Full Video]
Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert, Gabriel H. Loh, Adwait Jog
Analyzing and Leveraging Shared L1 Caches in GPUs,
In the Proceedings of 29th International Conference on Parallel Architectures and Compilation Techniques (PACT), Virtual Event, October 2020
(Acceptance rate: 35/137 ≈ 25%)

(ASPLOS 2020) [PDF] [PPTX] [Artifact] [YouTube Full Video]
Hongyuan Liu, Sreepathi Pai, Adwait Jog
Why GPUs are Slow at Executing NFAs and How to Make them Faster,
In the Proceedings of 25th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Virtual Event, March 2020
(Acceptance rate: 86/479 ≈ 18%)

(HPCA 2020) [PDF] [PPTX] [YouTube Trailer]
Gurunath Kadam, Danfeng Zhang, Adwait Jog
BCoal: Bucketing-based Memory Coalescing for Efficient and Secure GPUs,
In the Proceedings of 26th IEEE Symposium on High Performance Computer Architecture (HPCA), San Diego, CA, Feb 2020
(Acceptance rate: 48/248 ≈ 19%)

(CCGrid 2020) [PDF]
Bin Nie, Adwait Jog, Evgenia Smirni
Characterizing Accuracy-Aware Resilience of GPGPU Applications,
In the Proceedings of 20th IEEE International Symposium on Cluster, Cloud and Internet Computing, Melbourne, Victoria, Australia
(Acceptance rate: 66/234 ≈ 28%)

2019

(PACT 2019) [PDF] [Talk (PPTX)]
Mohamed Assem Ibrahim, Hongyuan Liu, Onur Kayiran, Adwait Jog
Analyzing and Leveraging Remote-core Bandwidth for Enhanced Performance in GPUs,
In the Proceedings of 28th International Conference on Parallel Architectures and Compilation Techniques (PACT), Seattle, WA, September 2019
(Acceptance rate: 26/126 ≈ 21%)

(DSN 2019) [PDF] [Talk (PPTX)]
Haonan Wang, Adwait Jog
Exploiting Latency and Error Tolerance of GPGPU Applications for an Energy-efficient DRAM,
In the Proceedings of 49th IEEE International Conference on Dependable Systems and Networks (DSN), Portland, OR, June 2019
(Acceptance rate: 54/252 ≈ 21%)

(ICS 2019) [PDF] [Talk (PPTX)]
Haonan Wang, Mohamed Assem Ibrahim, Sparsh Mittal, Adwait Jog
Address-Stride Assisted Approximate Value Prediction in GPUs,
In the Proceedings of 33rd ACM International Conference on Supercomputing (ICS), Phoenix, Arizona, June 2019
(Acceptance rate: 45/193 ≈ 23%)

(ISCA 2019) [PDF] [YouTube Trailer]
Ashutosh Pattnaik, Xulong Tang, Onur Kayiran, Adwait Jog, Asit Mishra, Mahmut T. Kandemir, Anand Sivasubramaniam, Chita R. Das
Opportunistic Computing in GPU Architectures,
In the Proceedings of 46th International Symposium on Computer Architecture (ISCA), Phoenix, Arizona, June 2019
(Acceptance rate: 62/365 ≈ 17%)

(SIGMETRICS 2019) [PDF]
Xulong Tang, Ashutosh Pattnaik, Onur Kayiran, Adwait Jog, Mahmut Taylan Kandemir, Chita R. Das
Quantifying Data Locality in Dynamic Parallelism in GPUs,
In the Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS Journal) at SIGMETRICS conference, Phoenix, Arizona, June 2019
(Acceptance rate: ≈ 20%)

2018

(MICRO 2018) [PDF] [Talk (PPTX)] [Poster (PDF)] [2-page Summary (PDF)] [YouTube Trailer]
Hongyuan Liu, Mohamed Assem Ibrahim, Onur Kayiran, Sreepathi Pai, Adwait Jog
Architectural Support for Efficient Large-Scale Automata Processing,
In the Proceedings of 51st International Symposium on Microarchitecture (MICRO), Fukuoka, Japan, October 2018
(Acceptance rate: 74/351 ≈ 21%)

(MICRO 2018) [PDF] [Talk (PPTX)] [Poster (PDF)] [YouTube Trailer]
Bin Nie, Lishan Yang, Adwait Jog, Evgenia Smirni
Fault Site Pruning for Practical Reliability Analysis of GPGPU Applications,
In the Proceedings of 51st International Symposium on Microarchitecture (MICRO), Fukuoka, Japan, October 2018
(Acceptance rate: 74/351 ≈ 21%)

(ASPLOS 2018) [PDF] [Talk (PPTX)]
Rachata Ausavarungnirun, Vance Miller, Joshua Landgraf, Saugata Ghose, Adwait Jog, Jayneel Gandhi, Christopher J. Rossbach, Onur Mutlu
MASK: Redesigning the GPU Memory Hierarchy to Support Multi-Application Concurrency,
In the Proceedings of 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Williamsburg, VA, March 2018
(Acceptance rate: 56/319 ≈ 17%)

(HPCA 2018) [PDF] [Talk (PPTX)] [YouTube Trailer]
Haonan Wang, Fan Luo, Mohamed Assem Ibrahim, Onur Kayiran, Adwait Jog
Efficient and Fair Multi-programming in GPUs via Effective Bandwidth Management,
In the Proceedings of 24th IEEE Symposium on High Performance Computer Architecture (HPCA), Vienna, Austria, Feb 2018
(Acceptance rate: 54/260 ≈ 20%)

(HPCA 2018) [PDF] [Talk (PPTX)] [YouTube Trailer]
Gurunath Kadam, Danfeng Zhang, Adwait Jog
RCoal: Mitigating GPU Timing Attack via Subwarp-based Randomized Coalescing Techniques,
In the Proceedings of 24th IEEE Symposium on High Performance Computer Architecture (HPCA), Vienna, Austria, Feb 2018
(Acceptance rate: 54/260 ≈ 20%)

2017

(ISVLSI 2017) [PDF]
Sparsh Mittal, Rajendra Bishnoi, Fabian Oboril, Haonan Wang, Mehdi Tahoori, Adwait Jog, Jeffrey Vetter
Architecting SOT-RAM Based GPU Register File,
In the Proceedings of IEEE Annual Symposium on VLSI (ISVLSI), Bochum, Germany, July 2017
(Acceptance rate: 67/212 ≈ 32%)

(AIM@PACT 2017) [PDF]
Hengyu Zhao, Colin Weinshenker, Mohamed Assem Ibrahim, Adwait Jog, Jishen Zhao
Layer-wise Performance Bottleneck Analysis of Deep Neural Networks,
In The 1st International Workshop on Architectures for Intelligent Machine, Portland, Oregon, September 2017

(HPCA 2017) [PDF]
Xulong Tang, Ashutosh Pattnaik, Huaipan Jiang, Onur Kayiran, Adwait Jog, Sreepathi Pai, Mohamed Assem Ibrahim, Mahmut Kandemir, Chita R. Das
Controlled Kernel Launch for Dynamic Parallelism in GPUs,
In the Proceedings of 23rd IEEE Symposium on High Performance Computer Architecture (HPCA), Austin, TX, Feb 2017
(Acceptance rate: 50/224 ≈ 22%)

(VLSID 2017) [PDF] [Talk (PPTX)]
Sparsh Mittal, Haonan Wang, Adwait Jog, Jeffrey Vetter
Design and Analysis of Soft-Error Resilience Mechanisms for GPU Register File,
In the Proceedings of 30th International Conference on VLSI Design and 16th International Conference on Embedded Systems, Hyderabad, India, Jan 2017
(Acceptance rate: 71/292 ≈ 24%)

2016

(MICRO 2016) [PDF] [Talk (PPTX)]
Nandita Vijaykumar, Kevin Hsieh, Gennady Pekhimenko, Samira Khan, Ashish Shrestha, Saugata Ghose, Adwait Jog, Phillip B. Gibbons, Onur Mutlu
Zorua: A Holistic Approach to Resource Virtualization in GPUs,
In the Proceedings of 49th International Symposium on Microarchitecture (MICRO), Taipei, Taiwan, October 2016
(Acceptance rate: 61/288 ≈ 21%)

(IISWC 2016) [PDF] [Talk (PPTX)]
Robert Risque, Adwait Jog
Characterization of Quantum Workloads on SIMD Architectures,
In the Proceedings of International Symposium on Workload Characterization, Providence, RI, September 2016
(Acceptance rate: 21/71 ≈ 29%)

(PACT 2016) [PDF] [Talk (PPTX)]
Ashutosh Pattnaik, Xulong Tang, Adwait Jog, Onur Kayiran, Asit K. Mishra, Mahmut T. Kandemir, Onur Mutlu, Chita R. Das
Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities,
In the Proceedings of 25th International Conference on Parallel Architectures and Compilation Techniques (PACT), Haifa, Israel, September 2016
(Acceptance rate: 31/139 ≈ 22%)

(PACT 2016) [PDF] [Talk (PPTX)]
Onur Kayiran, Adwait Jog, Ashutosh Pattnaik, Rachata Ausavarungnirun, Xulong Tang, Mahmut T. Kandemir, Gabriel H. Loh, Onur Mutlu, Chita R. Das
μC-States: Fine-grained GPU Datapath Power Management,
In the Proceedings of 25th International Conference on Parallel Architectures and Compilation Techniques (PACT), Haifa, Israel, September 2016
(Acceptance rate: 31/139 ≈ 22%)

(SIGMETRICS 2016) [PDF] [Talk (PPTX)]
Adwait Jog, Onur Kayiran, Ashutosh Pattnaik, Mahmut T. Kandemir, Onur Mutlu, Ravishankar Iyer, Chita R. Das
Exploiting Core Criticality for Enhanced GPU Performance,
In the Proceedings of 42nd ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), Antibes Juan-Les-Pins, France, June 2016
(Acceptance rate: 28/208 ≈ 13%)

(Book Chapter)
Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Saugata Ghose, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita Das, Mahmut Kandemir, Todd C. Mowry, Onur Mutlu
A Framework for Accelerating Bottlenecks in GPU Execution with Assist Warps,
Book Chapter in Advances in GPU Research and Practice, Elsevier, to be published in 2016. Elsevier Book Link (Chapter 15)

2015

(MEMSYS 2015) [PDF] [Talk (PPTX)] [Github]
Adwait Jog, Onur Kayiran, Tuba Kesten, Ashutosh Pattnaik, Evgeny Bolotin, Niladrish Chatterjee, Stephen W. Keckler, Mahmut T. Kandemir, Chita R. Das
Anatomy of GPU Memory System for Multi-Application Execution,
In the Proceedings of 1st International Symposium on Memory Systems (MEMSYS), Washington, DC, Oct 2015

(ISCA 2015) [PDF] [Talk (PPTX)] [Lightning]
Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita R. Das, Mahmut Kandemir, Todd Mowry, Onur Mutlu
A Case for Core-Assisted Bottleneck Acceleration in GPUs: Enabling Flexible Data Compression with Assist Warps,
In the Proceedings of 42nd International Symposium on Computer Architecture (ISCA), Portland, OR, June 2015
(Acceptance rate: 58/305 ≈ 19%)

(US Patent App, 2015)
Evgeny Bolotin, Zvika Guz, Adwait Jog, Stephen W. Keckler, Mike Parker
Approach to Adaptive Allocation of Shared Resources in Computer Systems,
United States Patent Application US20150163324A1

(Ph.D. Dissertation, 2015)
Design and Analysis of Scheduling Techniques for Throughput Processors
Committee Members: Chita Das: Advisor (Penn State), Mahmut Kandemir (Penn State), Yuan Xie (Penn State/UCSB), Ken Jenkins (Penn State), Onur Mutlu (CMU), Ravi Iyer (Intel)

2014

(MICRO 2014) [PDF] [Talk (PPTX)] [Poster] [Lightning]
Onur Kayiran, Nachiappan CN, Adwait Jog, Rachata Ausavarungnirun, Mahmut Kandemir, Gabriel Loh, Onur Mutlu, Chita R. Das
Managing GPU Concurrency in Heterogeneous Architectures,
In the Proceedings of 47th International Symposium on Microarchitecture (MICRO), Cambridge, UK, December 2014
(Acceptance rate: 53/273 ≈ 19%)

(PACT 2014) [PDF]
Wei Ding, Mahmut Kandemir, Diana Guttman, Adwait Jog, Chita Das, Praveen Yedlapalli
Trading Cache Hit Rate for Memory Performance,
In the Proceedings of 23rd International Conference on Parallel Architectures and Compilation Techniques (PACT), Edmonton, Alberta, Canada, August 2014
(Acceptance rate: 37/144 ≈ 25%)

(GPGPU@ASPLOS 2014) [PDF] [Talk (PPTX)] [ACM DOI]
Adwait Jog, Evgeny Bolotin, Zvika Guz, Mike Parker, Stephen W. Keckler, Mahmut Kandemir, Chita R. Das
Application-aware Memory System for Fair and Efficient Execution of Concurrent GPGPU Applications,
In the Proceedings of 7th Workshop on General Purpose Computing using GPUs (GPGPU7), co-located with 19th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Salt Lake City, UT, March 2014
(Acceptance rate: 12/27 ≈ 44%)

2013

(PACT 2013) [PDF] [Talk (PPTX)]
Best Paper Nomination: One of the four papers nominated for the Best Paper Award.
Onur Kayiran, Adwait Jog, Mahmut T. Kandemir, Chita R. Das
Neither More Nor Less: Optimizing Thread-level Parallelism for GPGPUs,
In the Proceedings of 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT), Edinburgh, Scotland, September 2013
(Acceptance rate: 36/208 ≈ 17%)

(ISCA 2013) [PDF] [Talk (PPTX)]
Adwait Jog, Onur Kayiran, Asit K. Mishra, Mahmut T. Kandemir, Onur Mutlu, Ravi Iyer, Chita R. Das
Orchestrated Scheduling and Prefetching for GPGPUs,
In the Proceedings of 40th International Symposium on Computer Architecture (ISCA), Tel Aviv, Israel, June 2013
(Acceptance rate: 56/288 ≈ 19%)

(ASPLOS 2013) [PDF] [2-page-summary (PDF)] [Talk (PPTX)]
Adwait Jog, Onur Kayiran, Nachiappan CN, Asit K. Mishra, Mahmut T. Kandemir, Onur Mutlu, Ravishankar Iyer, Chita R. Das
OWL: Cooperative Thread Array Aware Scheduling Techniques for Improving GPGPU performance,
In the Proceedings of 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Houston, TX, March 2013
(Acceptance rate: 44/191 ≈ 23%)

2012

(DAC 2012) [PDF] [Talk (PPTX)] [Poster]
Adwait Jog, Asit K. Mishra, Cong Xu, Yuan Xie, N. Vijaykrishnan, Ravishankar Iyer, Chita R. Das
Cache Revive: Architecting Volatile STT-RAM Caches for Enhanced Performance in CMPs,
In the Proceedings of 49th Design Automation Conference (DAC), San Francisco, CA, June 2012
(Acceptance rate: 168/741 ≈ 22%)

(Tech Report 2012)
Onur Kayiran, Adwait Jog, Mahmut T. Kandemir, Chita R. Das
Neither More Nor Less: Optimizing Thread-level Parallelism for GPGPUs,
TR-CSE-2012-006, CSE-Penn State Tech Report, Sept 2012

2011

(Tech Report 2011)
Adwait Jog, Asit K. Mishra, Cong Xu, Yuan Xie, N. Vijaykrishnan, Ravishankar Iyer, Chita R. Das
Cache Revive: Architecting Volatile STT-RAM Caches for Enhanced Performance in CMPs,
TR-CSE-2011-010, CSE-Penn State Tech Report, June 2011

© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.