GPU Architectures: From Basic to Advanced Concepts

General Information

Dates: July 9th, 10th, 12th, 13th (four lectures). No class on July 11.
Time and Location: Slot 3, 14:30-15:45 hrs (75 minutes).
Instructor: Adwait Jog (Personal Website)

Lecture Material

Lecture 1 (July 9th): Intro to GPUs and Basics of CUDA programming (slides)
Lecture 2 (July 10th): Basics of GPU architecture (slides)
Lecture 3 (July 12th): GPU Performance/Energy Bottlenecks and some Mitigation Techniques (slides)
Lecture 4 (July 13th): Emerging GPU Security Concerns and some Mitigation Techniques (slides)

Background and Additional Reading Material

(HPCA 2018) [PDF] [Talk (PPTX)] [YouTube Trailer]
Haonan Wang, Fan Luo, Mohamed Ibrahim, Onur Kayiran, Adwait Jog
Efficient and Fair Multi-programming in GPUs via Effective Bandwidth Management,
In the Proceedings of 24th IEEE Symposium on High Performance Computer Architecture (HPCA), Vienna, Austria, Feb 2018
(Acceptance rate: 54/260 ≈ 21%)

(HPCA 2018) [PDF] [Talk (PPTX)] [YouTube Trailer]
Gurunath Kadam, Danfeng Zhang, Adwait Jog
RCoal: Mitigating GPU Timing Attack via Subwarp-based Randomized Coalescing Techniques,
In the Proceedings of 24th IEEE Symposium on High Performance Computer Architecture (HPCA), Vienna, Austria, Feb 2018
(Acceptance rate: 54/260 ≈ 21%)

(MEMSYS 2015) [PDF] [Talk (PPTX)] [Github]
Adwait Jog, Onur Kayiran, Tuba Kesten, Ashutosh Pattnaik, Evgeny Bolotin, Niladrish Chatterjee, Stephen W. Keckler, Mahmut T. Kandemir, Chita R. Das,
Anatomy of GPU Memory System for Multi-Application Execution,
In the Proceedings of 1st International Symposium on Memory Systems (MEMSYS), Washington, DC, Oct 2015

(ISCA 2015) [PDF] [Talk (PPTX)] [Lightning]
Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita Das, Mahmut Kandemir, Todd Mowry, Onur Mutlu,
A Case for Core-Assisted Bottleneck Acceleration in GPUs: Enabling Flexible Data Compression with Assist Warps,
In the Proceedings of 42nd International Symposium on Computer Architecture (ISCA), Portland, OR, June 2015
(Acceptance rate: 58/305 ≈ 19%)

(GPGPU@ASPLOS 2014) [PDF] [Talk (PPTX)] [ACM DOI]
Adwait Jog, Evgeny Bolotin, Zvika Guz, Mike Parker, Stephen W. Keckler, Mahmut Kandemir, Chita Das,
Application-aware Memory System for Fair and Efficient Execution of Concurrent GPGPU Applications,
In the Proceedings of 7th Workshop on General Purpose Computing using GPUs (GPGPU7), co-located with 19th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Salt Lake City, UT, March 2014
(Acceptance rate: 12/27 ≈ 44%)

(PACT 2013) [PDF] [Talk (PPTX)]
Best Paper Nomination: One of the four papers nominated for the Best Paper Award.
Onur Kayiran, Adwait Jog, Mahmut T. Kandemir, Chita R. Das,
Neither More Nor Less: Optimizing Thread-level Parallelism for GPGPUs,
In the Proceedings of 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT), Edinburgh, Scotland, September 2013
(Acceptance rate: 36/208 ≈ 17%)

(ASPLOS 2013) [PDF] [2-page-summary (PDF)] [Talk (PPTX)]
Adwait Jog, Onur Kayiran, Nachiappan CN, Asit K. Mishra, Mahmut T. Kandemir, Onur Mutlu, Ravishankar Iyer, Chita R. Das,
OWL: Cooperative Thread Array Aware Scheduling Techniques for Improving GPGPU Performance,
In the Proceedings of 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Houston, TX, March 2013
(Acceptance rate: 44/191 ≈ 23%)

Credits

The lecture slides include material from the presentation slides of several research papers (see above) that I co-authored. Credit goes to my Ph.D. students and collaborators. Several slides are adopted from Tor Aamodt's ACACES 2015 course on GPU architecture and from other open-source materials. Thanks to all.