Introduction to GPU Programming (gpuprog | 21 hours)
Requirements
- An understanding of the C/C++ language and of parallel programming concepts
- Basic knowledge of computer architecture and memory hierarchy
- Experience with command-line tools and code editors
Audience
- Developers who wish to learn the basics of GPU programming and the main frameworks and tools for developing GPU applications
- Developers who wish to write portable and scalable code that can run on different platforms and devices
- Programmers who wish to explore the benefits and challenges of GPU programming and optimization
GPU programming is a technique that leverages the parallel processing power of GPUs to accelerate applications that require high-performance computing, such as artificial intelligence, gaming, graphics, and scientific computing. There are several frameworks and tools that enable GPU programming, each with its own advantages and disadvantages. Some of the most popular ones are OpenCL, CUDA, ROCm, and HIP.
This instructor-led, live training (online or onsite) is aimed at beginner-level to intermediate-level developers who wish to learn the basics of GPU programming and the main frameworks and tools for developing GPU applications.
By the end of this training, participants will be able to:
- Understand the difference between CPU and GPU computing and the benefits and challenges of GPU programming.
- Choose the right framework and tool for their GPU application.
- Create a basic GPU program that performs vector addition using one or more of the frameworks and tools (a serial CPU baseline for this exercise follows this list).
- Use the respective APIs, languages, and libraries to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads.
- Use the respective memory spaces, such as global, local, constant, and private, to optimize data transfers and memory accesses.
- Use the respective execution models, such as work-items, work-groups, threads, blocks, and grids, to control the parallelism.
- Debug and test GPU programs using tools such as CodeXL, CUDA-GDB, CUDA-MEMCHECK, and NVIDIA Nsight.
- Optimize GPU programs using techniques such as coalescing, caching, prefetching, and profiling.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange it.
Introduction
- What is GPU programming?
- Why use GPU programming?
- What are the challenges and trade-offs of GPU programming?
- What are the frameworks and tools for GPU programming?
- Choosing the right framework and tool for your application
OpenCL
- What is OpenCL?
- What are the advantages and disadvantages of OpenCL?
- Setting up the development environment for OpenCL
- Creating a basic OpenCL program that performs vector addition (a minimal sketch follows this list)
- Using OpenCL API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads
- Using OpenCL C language to write kernels that execute on the device and manipulate data
- Using OpenCL built-in functions, variables, and libraries to perform common tasks and operations
- Using OpenCL memory spaces, such as global, local, constant, and private, to optimize data transfers and memory accesses
- Using OpenCL execution model to control the work-items, work-groups, and ND-ranges that define the parallelism
- Debugging and testing OpenCL programs using tools such as CodeXL
- Optimizing OpenCL programs using techniques such as coalescing, caching, prefetching, and profiling
CUDA
- What is CUDA?
- What are the advantages and disadvantages of CUDA?
- Setting up the development environment for CUDA
- Creating a basic CUDA program that performs vector addition (a minimal sketch follows this list)
- Using CUDA API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads
- Using CUDA C/C++ language to write kernels that execute on the device and manipulate data
- Using CUDA built-in functions, variables, and libraries to perform common tasks and operations
- Using CUDA memory spaces, such as global, shared, constant, and local, to optimize data transfers and memory accesses
- Using CUDA execution model to control the threads, blocks, and grids that define the parallelism
- Debugging and testing CUDA programs using tools such as CUDA-GDB, CUDA-MEMCHECK, and NVIDIA Nsight
- Optimizing CUDA programs using techniques such as coalescing, caching, prefetching, and profiling
ROCm
- What is ROCm?
- What are the advantages and disadvantages of ROCm?
- Setting up the development environment for ROCm
- Creating a basic ROCm program that performs vector addition
- Using ROCm API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads (a device-query and memory-management sketch follows this list)
- Using ROCm C/C++ language to write kernels that execute on the device and manipulate data
- Using ROCm built-in functions, variables, and libraries to perform common tasks and operations
- Using ROCm memory spaces, such as global, local, constant, and private, to optimize data transfers and memory accesses
- Using ROCm execution model to control the threads, blocks, and grids that define the parallelism
- Debugging and testing ROCm programs using tools such as ROCm Debugger and ROCm Profiler
- Optimizing ROCm programs using techniques such as coalescing, caching, prefetching, and profiling
HIP
- What is HIP?
- What are the advantages and disadvantages of HIP?
- Setting up the development environment for HIP
- Creating a basic HIP program that performs vector addition (a minimal sketch follows this list)
- Using HIP language to write kernels that execute on the device and manipulate data
- Using HIP built-in functions, variables, and libraries to perform common tasks and operations
- Using HIP memory spaces, such as global, shared, constant, and local, to optimize data transfers and memory accesses
- Using HIP execution model to control the threads, blocks, and grids that define the parallelism
- Debugging and testing HIP programs using tools such as ROCm Debugger and ROCm Profiler
- Optimizing HIP programs using techniques such as coalescing, caching, prefetching, and profiling
Comparison
- Comparing the features, performance, and compatibility of OpenCL, CUDA, ROCm, and HIP
- Evaluating GPU programs using benchmarks and metrics (a timing sketch follows this list)
- Learning the best practices and tips for GPU programming
- Exploring the current and future trends and challenges of GPU programming
Summary and Next Steps