- An understanding of C/C++ language and parallel programming concepts
- Basic knowledge of computer architecture and memory hierarchy
- Experience with command-line tools and code editors
Audience
- Developers who wish to learn how to use OpenCL to program heterogeneous devices and exploit their parallelism
- Developers who wish to write portable and scalable code that can run on different platforms and devices
- Programmers who wish to explore the low-level aspects of heterogeneous programming and optimize their code performance
OpenCL is an open standard for heterogeneous programming that enables code to run on different platforms and devices, such as multicore CPUs, GPUs, and FPGAs. OpenCL exposes hardware details to the programmer and gives full control over the parallelization process. In return, it requires a good understanding of the device architecture, memory model, execution model, and optimization techniques.
This instructor-led, live training (online or onsite) is aimed at beginner-level to intermediate-level developers who wish to use OpenCL to program heterogeneous devices and exploit their parallelism.
By the end of this training, participants will be able to:
- Set up a development environment that includes an OpenCL SDK, a device that supports OpenCL, and Visual Studio Code.
- Create a basic OpenCL program that performs vector addition on the device and retrieves the results from the device memory.
- Use the OpenCL API to query device information and to create contexts, command queues, buffers, kernels, and events.
- Use the OpenCL C language to write kernels that execute on the device and manipulate data.
- Use OpenCL built-in functions, extensions, and libraries to perform common tasks and operations.
- Use the OpenCL host and device memory models to optimize data transfers and memory accesses.
- Use the OpenCL execution model to control work-items, work-groups, and ND-ranges.
- Debug and test OpenCL programs using tools such as CodeXL, Intel VTune, and NVIDIA Nsight.
- Optimize OpenCL programs using techniques such as vectorization, loop unrolling, local memory, and profiling.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Introduction
- What is OpenCL?
- OpenCL vs CUDA vs SYCL
- Overview of OpenCL features and architecture
- Setting up the Development Environment
Getting Started
- Creating a new OpenCL project using Visual Studio Code
- Exploring the project structure and files
- Compiling and running the program
- Displaying the output using printf and fprintf
OpenCL API
- Understanding the role of the OpenCL API in the host program
- Using the OpenCL API to query device information and capabilities
- Using the OpenCL API to create contexts, command queues, buffers, kernels, and events
- Using the OpenCL API to enqueue commands, such as read, write, copy, map, unmap, execute, and wait
- Using the OpenCL API to handle errors and exceptions
OpenCL C
- Understanding the role of OpenCL C in the device program
- Using OpenCL C to write kernels that execute on the device and manipulate data
- Using OpenCL C data types, qualifiers, operators, and expressions
- Using OpenCL C built-in functions, such as math, geometric, and relational functions
- Using OpenCL C extensions and libraries, such as atomic, image, and cl_khr_fp16
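As an illustration of the kind of kernel this module covers, the following is a minimal sketch of a vector-addition kernel in OpenCL C, held in a C string as it might be passed to clCreateProgramWithSource. The names vec_add, vec_add_kernel_src, and vec_add_reference are assumptions made for this example and are not taken from the course material. The host-side reference function is plain C and only mirrors what each work-item computes, so that results read back from a device could be checked against it.

```c
#include <assert.h>
#include <stddef.h>

/* OpenCL C source for a vector-addition kernel (illustrative name: vec_add).
 * Each work-item handles one element. The bounds check guards against a
 * global size that was rounded up past n. */
const char *const vec_add_kernel_src =
    "__kernel void vec_add(__global const float *a,\n"
    "                      __global const float *b,\n"
    "                      __global float *c,\n"
    "                      const unsigned int n)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    if (i < n) {\n"
    "        c[i] = a[i] + b[i];\n"
    "    }\n"
    "}\n";

/* Host-side reference (hypothetical helper): computes the same result
 * serially, one loop iteration per work-item of the kernel above. */
void vec_add_reference(const float *a, const float *b, float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

In a real host program the string would be compiled for the device at run time; the reference function needs no OpenCL runtime and can be used on its own to validate output buffers.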
OpenCL Memory Model
- Understanding the difference between host and device memory models
- Using OpenCL memory spaces, such as global, local, constant, and private
- Using OpenCL memory objects, such as buffers, images, and pipes
- Using OpenCL memory access modes, such as read-only, write-only, and read-write
- Using OpenCL memory consistency model and synchronization mechanisms
OpenCL Execution Model
- Understanding the difference between host and device execution models
- Using OpenCL work-items, work-groups, and ND-ranges to define the parallelism
- Using OpenCL work-item functions, such as get_global_id, get_local_id, and get_group_id
- Using OpenCL work-group functions, such as barrier, work_group_reduce_add, and work_group_scan_inclusive_add
- Using OpenCL size-query functions, such as get_num_groups, get_global_size, and get_local_size
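The relationship between the index functions listed in this module can be sketched on the host in plain C. The sketch assumes a one-dimensional ND-range with a zero global offset and a global size that is a multiple of the work-group size; under those assumptions get_global_id(0) equals get_group_id(0) * get_local_size(0) + get_local_id(0). The type work_item_ids and the helper names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Indices one work-item would observe in a 1-D ND-range (illustrative type). */
typedef struct {
    size_t global_id; /* what get_global_id(0) would return */
    size_t local_id;  /* what get_local_id(0) would return  */
    size_t group_id;  /* what get_group_id(0) would return  */
} work_item_ids;

/* Number of work-groups, as get_num_groups(0) would report it. */
size_t num_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Split a global index into its work-group index and local index. */
work_item_ids ids_for(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    work_item_ids ids;
    ids.global_id = global_id;
    ids.group_id = global_id / local_size;
    ids.local_id = global_id % local_size;
    return ids;
}

/* Round a problem size up to the next multiple of the work-group size,
 * a common way to choose a global size; kernels then need a bounds check. */
size_t round_up_global_size(size_t n, size_t local_size)
{
    assert(local_size > 0);
    return ((n + local_size - 1) / local_size) * local_size;
}
```

For example, with a work-group size of 4, the work-item with global index 10 sits in work-group 2 at local index 2, and a problem of 1000 elements with a work-group size of 64 would be launched with a global size of 1024.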
Debugging
- Understanding the common errors and bugs in OpenCL programs
- Using the Visual Studio Code debugger to set breakpoints and inspect variables and the call stack
- Using CodeXL to debug and analyze OpenCL programs on AMD devices
- Using Intel VTune to debug and analyze OpenCL programs on Intel devices
- Using NVIDIA Nsight to debug and analyze OpenCL programs on NVIDIA devices
Optimization
- Understanding the factors that affect the performance of OpenCL programs
- Using OpenCL vector data types and vectorization techniques to improve arithmetic throughput
- Using OpenCL loop unrolling and loop tiling techniques to reduce control overhead and increase locality
- Using OpenCL local memory and local memory functions to optimize memory accesses and bandwidth
- Using OpenCL profiling and profiling tools to measure and improve the execution time and resource utilization
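To make the local-memory topic in this module concrete, here is a host-side sketch in plain C of a tree reduction as a single work-group might perform it. The scratch array stands in for __local memory, and the comments mark where a kernel would call barrier(CLK_LOCAL_MEM_FENCE). The function name reduce_work_group, the MAX_LOCAL_SIZE limit, and the power-of-two work-group size are assumptions of this example. It emulates the access pattern only and does not run on a device.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOCAL_SIZE 256 /* illustrative upper bound on work-group size */

/* Sum the local_size elements belonging to work-group group_id.
 * Inner loops over lid play the role of the work-items in the group. */
float reduce_work_group(const float *global_in, size_t group_id,
                        size_t local_size)
{
    assert(global_in != NULL);
    assert(local_size > 0 && local_size <= MAX_LOCAL_SIZE);
    assert((local_size & (local_size - 1)) == 0); /* power of two */

    float scratch[MAX_LOCAL_SIZE]; /* stands in for __local memory */

    /* Each work-item copies one element from global to local memory. */
    for (size_t lid = 0; lid < local_size; ++lid) {
        scratch[lid] = global_in[group_id * local_size + lid];
    }
    /* On a device: barrier(CLK_LOCAL_MEM_FENCE) here. */

    /* Halve the number of active work-items at every step. */
    for (size_t stride = local_size / 2; stride > 0; stride /= 2) {
        for (size_t lid = 0; lid < stride; ++lid) {
            scratch[lid] += scratch[lid + stride];
        }
        /* On a device: barrier(CLK_LOCAL_MEM_FENCE) here. */
    }

    return scratch[0]; /* work-item 0 would write this partial sum out */
}
```

The point of the pattern is that each input element is read from global memory once, and all intermediate sums stay in the faster local memory. One partial sum per work-group remains to be combined afterwards, either on the host or in a second pass.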
Summary and Next Steps