Introduction to GPU programming with CUDA

Instructor: Dr. Juan Carlos Zuniga-Anaya (University of Saskatchewan)

Title: Introduction to GPU programming with CUDA

This is a general introduction to the GP-GPU programming model and the use of CUDA and C to implement parallel computations in modern NVIDIA-GPU devices. The course also provides a series of examples and practical exercises where the attendees will be able to implement the concepts and techniques learned.

Target audience: anyone interested in learning CUDA from C

Course plan:

Overview of GPU computing
- what is a GPU?
- GPU architecture
- GPU programming approaches
CUDA programming model
- data parallelism
- thread hierarchy
- memory model
Synchronicity in CUDA
- task timeline (kernels, transfers, CPU computations)
- concurrency
- streams and events (synchronization)
Profiling and optimization of CUDA kernels
- instrumenting code, and the NVIDIA profiler
- occupancy (memory latency, stalls)
- branching, iterations, loops

Note that contents won’t be given in strict order, but rather “as needed” during the solution of examples and exercises, combining presentation of concepts with hands-on experiences.

Duration: 6 hours

Level: intermediate

Prerequisites: Some programming background is required. In particular, basic knowledge of C or C++ is necessary to follow the examples and exercises.

Setup:

Cluster reservation: ideally 1 GPU per student (this way we can fit 4 students per node on Cedar, only 2 per node on Graham)
Laptop software: SSH client (built-in on Mac/Linux, on Windows MobaXTerm or PuTTY)