GPU Programming with OpenCL

Course Code: gpuprogopencl

Duration: 28 hours

Prerequisites:

An understanding of the C/C++ language and parallel programming concepts
Basic knowledge of computer architecture and memory hierarchy
Experience with command-line tools and code editors

Audience

Developers who wish to learn how to use OpenCL to program heterogeneous devices and exploit their parallelism
Developers who wish to write portable and scalable code that can run on different platforms and devices
Programmers who wish to explore the low-level aspects of heterogeneous programming and optimize their code performance

Overview:

OpenCL is an open standard for heterogeneous programming that enables the same code to run on different platforms and devices, such as multicore CPUs, GPUs, FPGAs, and others. OpenCL exposes hardware details to the programmer and gives full control over the parallelization process. In return, it requires a good understanding of the device architecture, memory model, execution model, and optimization techniques.

This instructor-led, live training (online or onsite) is aimed at beginner-level to intermediate-level developers who wish to use OpenCL to program heterogeneous devices and exploit their parallelism.

By the end of this training, participants will be able to:

Set up a development environment that includes an OpenCL SDK, a device that supports OpenCL, and Visual Studio Code.
Create a basic OpenCL program that performs vector addition on the device and retrieves the results from the device memory.
Use the OpenCL API to query device information and to create contexts, command queues, buffers, kernels, and events.
Use the OpenCL C language to write kernels that execute on the device and manipulate data.
Use OpenCL built-in functions, extensions, and libraries to perform common tasks and operations.
Use the OpenCL host and device memory models to optimize data transfers and memory accesses.
Use the OpenCL execution model to control work-items, work-groups, and ND-ranges.
Debug and test OpenCL programs using tools such as CodeXL, Intel VTune, and NVIDIA Nsight.
Optimize OpenCL programs using techniques such as vectorization, loop unrolling, local memory, and profiling.

Format of the Course

Interactive lecture and discussion.
Lots of exercises and practice.
Hands-on implementation in a live-lab environment.

Course Customization Options

To request customized training for this course, please contact us.

Course Outline:

Introduction

What is OpenCL?
OpenCL vs CUDA vs SYCL
Overview of OpenCL features and architecture
Setting up the development environment

Getting Started

Creating a new OpenCL project using Visual Studio Code
Exploring the project structure and files
Compiling and running the program
Displaying the output using printf and fprintf

OpenCL API

Understanding the role of the OpenCL API in the host program
Using the OpenCL API to query device information and capabilities
Using the OpenCL API to create contexts, command queues, buffers, kernels, and events
Using the OpenCL API to enqueue commands, such as read, write, copy, map, unmap, execute, and wait
Using the OpenCL API to handle errors through return codes (or exceptions in the C++ bindings)

OpenCL C

Understanding the role of OpenCL C in the device program
Using OpenCL C to write kernels that execute on the device and manipulate data
Using OpenCL C data types, qualifiers, operators, and expressions
Using OpenCL C built-in functions, such as math, geometric, relational, etc.
Using OpenCL C extensions and libraries, such as atomics, images, cl_khr_fp16, etc.
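
As an illustration of the kernel-writing topics above, here is a minimal sketch of a vector-addition kernel in OpenCL C, stored as a C string the way a host program would typically pass it to clCreateProgramWithSource. The names vec_add, vec_add_source, and vec_add_reference are hypothetical and not taken from the course material. The plain C function is only a host-side stand-in for checking results; it is not part of OpenCL.

```c
#include <assert.h>
#include <stddef.h>

/* OpenCL C source for a hypothetical vec_add kernel. Each work-item
 * handles one element; the bounds check guards against a global size
 * that was rounded up past n. */
const char *const vec_add_source =
    "__kernel void vec_add(__global const float *a,\n"
    "                      __global const float *b,\n"
    "                      __global float *c,\n"
    "                      const unsigned int n)\n"
    "{\n"
    "    size_t gid = get_global_id(0);\n"
    "    if (gid < n) {\n"
    "        c[gid] = a[gid] + b[gid];\n"
    "    }\n"
    "}\n";

/* Host-side reference: the loop index plays the role of
 * get_global_id(0), so every iteration mirrors one work-item. */
void vec_add_reference(const float *a, const float *b, float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t gid = 0; gid < n; ++gid) {
        c[gid] = a[gid] + b[gid];
    }
}
```

Comparing the device output against a reference like this is one simple way to validate a first kernel before moving on to optimization.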

OpenCL Memory Model

Understanding the difference between host and device memory models
Using OpenCL memory spaces, such as global, local, constant, and private
Using OpenCL memory objects, such as buffers, images, and pipes
Using OpenCL memory access modes, such as read-only, write-only, read-write, etc.
Using the OpenCL memory consistency model and synchronization mechanisms

OpenCL Execution Model

Understanding the difference between host and device execution models
Using OpenCL work-items, work-groups, and ND-ranges to define the parallelism
Using OpenCL work-item functions, such as get_global_id, get_local_id, get_group_id, etc.
Using OpenCL synchronization and work-group functions, such as barrier and the work_group_reduce_<op> and work_group_scan_inclusive_<op> families (OpenCL 2.0 and later)
Using OpenCL work-item functions that query sizes, such as get_num_groups, get_global_size, get_local_size, etc.
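
The relationship between the work-item functions listed above can be sketched on the host in plain C. This is an emulation for a one-dimensional ND-range, assuming a zero global offset and uniform work-group sizes; the helper names round_up_global_size, ids_from_global, and num_groups are hypothetical and are not OpenCL functions.

```c
#include <assert.h>
#include <stddef.h>

/* Round a problem size up to the next multiple of the work-group
 * size, a common way to choose a 1-D global size. */
size_t round_up_global_size(size_t n, size_t local_size)
{
    assert(local_size > 0);
    return ((n + local_size - 1) / local_size) * local_size;
}

/* One-dimensional indices of a work-item (zero global offset assumed). */
typedef struct {
    size_t global_id; /* what get_global_id(0) would return */
    size_t local_id;  /* what get_local_id(0) would return */
    size_t group_id;  /* what get_group_id(0) would return */
} work_item_ids;

/* Derive the group and local indices from a global index. */
work_item_ids ids_from_global(size_t global_id, size_t local_size)
{
    work_item_ids ids;
    assert(local_size > 0);
    ids.global_id = global_id;
    ids.group_id = global_id / local_size;
    ids.local_id = global_id % local_size;
    return ids;
}

/* Number of work-groups, i.e. what get_num_groups(0) would return
 * when the global size is a multiple of the local size. */
size_t num_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}
```

For example, under these assumptions a problem of 1000 elements with a work-group size of 64 is rounded up to a global size of 1024, which is why kernels usually guard their body with a bounds check on the global index.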

Debugging

Understanding the common errors and bugs in OpenCL programs
Using Visual Studio Code debugger to inspect variables, breakpoints, call stack, etc.
Using CodeXL to debug and analyze OpenCL programs on AMD devices
Using Intel VTune to debug and analyze OpenCL programs on Intel devices
Using NVIDIA Nsight to debug and analyze OpenCL programs on NVIDIA devices

Optimization

Understanding the factors that affect the performance of OpenCL programs
Using OpenCL vector data types and vectorization techniques to improve arithmetic throughput
Using OpenCL loop unrolling and loop tiling techniques to reduce control overhead and increase locality
Using OpenCL local memory and local memory functions to optimize memory accesses and bandwidth
Using OpenCL profiling and profiling tools to measure and improve the execution time and resource utilization
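
To make the local-memory topic above more concrete, the following is a host-side sketch in plain C of a tree reduction within one work-group. It is an emulation, not device code: the scratch array stands in for __local memory, the inner loop stands in for the work-items that are active in each round, and the comments mark where a real kernel would call barrier(CLK_LOCAL_MEM_FENCE). The names reduce_work_group and MAX_LOCAL_SIZE are hypothetical, and the work-group size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOCAL_SIZE 256 /* hypothetical cap on the work-group size */

/* Sum one work-group's slice of `in` using a tree reduction over a
 * scratch array that stands in for __local memory. */
float reduce_work_group(const float *in, size_t local_size)
{
    float scratch[MAX_LOCAL_SIZE];

    /* local_size must be a power of two no larger than MAX_LOCAL_SIZE. */
    assert(in != NULL);
    assert(local_size > 0 && local_size <= MAX_LOCAL_SIZE);
    assert((local_size & (local_size - 1)) == 0);

    /* Step 1: every work-item copies one element into local memory. */
    for (size_t lid = 0; lid < local_size; ++lid) {
        scratch[lid] = in[lid];
    }
    /* A real kernel would call barrier(CLK_LOCAL_MEM_FENCE) here. */

    /* Step 2: halve the number of active work-items each round. */
    for (size_t stride = local_size / 2; stride > 0; stride /= 2) {
        for (size_t lid = 0; lid < stride; ++lid) {
            scratch[lid] += scratch[lid + stride];
        }
        /* A barrier is needed after every round, before the next reads. */
    }

    /* Work-item 0 would write this partial sum to global memory. */
    return scratch[0];
}
```

The point of the pattern is that each input element is read from global memory once, while the repeated partial sums stay in the faster local memory.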

Summary and Next Steps
