ROCm for Windows

Course Code: rocmwindows

Duration: 21 hours

Prerequisites:

An understanding of C/C++ and parallel programming concepts
Basic knowledge of computer architecture and memory hierarchy
Experience with command-line tools and code editors
Familiarity with the Windows operating system and PowerShell

Audience

Developers who wish to learn how to install and use ROCm on Windows to program AMD GPUs and exploit their parallelism
Developers who wish to write high-performance and scalable code that can run on different AMD devices
Programmers who wish to explore the low-level aspects of GPU programming and optimize their code performance

Overview:

ROCm is an open-source platform for programming AMD GPUs. It supports OpenCL and, through the HIP language, offers source-level portability for code written for CUDA. ROCm exposes hardware details to the programmer and gives full control over the parallelization process. That control comes at a cost: it requires a good understanding of the device architecture, memory model, execution model, and optimization techniques.

ROCm for Windows is a recent development that allows users to install and use ROCm on the Windows operating system, which is widely used for personal and professional purposes. ROCm for Windows enables users to leverage the power of AMD GPUs for applications such as artificial intelligence, gaming, graphics, and scientific computing.

This instructor-led, live training (online or onsite) is aimed at beginner-level to intermediate-level developers who wish to install and use ROCm on Windows to program AMD GPUs and exploit their parallelism.

By the end of this training, participants will be able to:

Set up a development environment that includes the ROCm platform, an AMD GPU, and Visual Studio Code on Windows.
Create a basic ROCm program that performs vector addition on the GPU and retrieves the results from the GPU memory.
Use the ROCm API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads.
Use the HIP language to write kernels that execute on the GPU and manipulate data.
Use HIP built-in functions, variables, and libraries to perform common tasks and operations.
Use ROCm and HIP memory spaces, such as global, shared, constant, and local, to optimize data transfers and memory accesses.
Use ROCm and HIP execution models to control the threads, blocks, and grids that define the parallelism.
Debug and test ROCm and HIP programs using tools such as ROCm Debugger and ROCm Profiler.
Optimize ROCm and HIP programs using techniques such as coalescing, caching, prefetching, and profiling.

Format of the Course

Interactive lecture and discussion.
Lots of exercises and practice.
Hands-on implementation in a live-lab environment.

Course Customization Options

To request a customized training for this course, please contact us to arrange.

Course Outline:

Introduction

What is ROCm?
What is HIP?
ROCm vs CUDA vs OpenCL
Overview of ROCm and HIP features and architecture
ROCm for Windows vs ROCm for Linux

Installation

Installing ROCm on Windows
Verifying the installation and checking device compatibility
Updating or uninstalling ROCm on Windows
Troubleshooting common installation issues

Getting Started

Creating a new ROCm project using Visual Studio Code on Windows
Exploring the project structure and files
Compiling and running the program
Displaying the output using printf and fprintf

ROCm API

Using the ROCm API in the host program
Querying device information and capabilities
Allocating and deallocating device memory
Copying data between host and device
Launching kernels and synchronizing threads
Handling errors and exceptions
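
As a rough illustration of the host-side workflow listed above (allocate, copy in, launch, synchronize, copy out), the following sketch emulates a vector addition on the CPU in plain C++. It is not HIP code: the `Dim` struct and the functions `vector_add_thread` and `vector_add` are hypothetical names chosen for this example, and the nested loops only stand in for a kernel launch. The comments name the HIP runtime calls that would normally occupy each step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for HIP's built-in coordinates (not the HIP API).
struct Dim { unsigned x; };

// "Kernel" body: one logical thread adds one element.
// In HIP this would be a __global__ function reading threadIdx and blockIdx.
void vector_add_thread(Dim blockIdx, Dim blockDim, Dim threadIdx,
                       const float* a, const float* b, float* c,
                       std::size_t n) {
    std::size_t i =
        static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) {  // guard: the last block may be only partly used
        c[i] = a[i] + b[i];
    }
}

// Host-side driver that mirrors the usual workflow step by step.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b,
                              unsigned threads_per_block = 256) {
    assert(a.size() == b.size());
    assert(threads_per_block > 0);
    const std::size_t n = a.size();

    // 1. Allocate "device" buffers (hipMalloc in a real HIP program).
    std::vector<float> d_a(n), d_b(n), d_c(n);

    // 2. Copy host data to the device (hipMemcpy, hipMemcpyHostToDevice).
    d_a = a;
    d_b = b;

    // 3. Launch enough blocks to cover n elements, rounding up.
    const unsigned blocks = static_cast<unsigned>(
        (n + threads_per_block - 1) / threads_per_block);
    for (unsigned bx = 0; bx < blocks; ++bx) {
        for (unsigned tx = 0; tx < threads_per_block; ++tx) {
            vector_add_thread({bx}, {threads_per_block}, {tx},
                              d_a.data(), d_b.data(), d_c.data(), n);
        }
    }

    // 4. Synchronize (hipDeviceSynchronize): implicit, the loops are done.
    // 5. Copy the result back (hipMemcpy, hipMemcpyDeviceToHost).
    // 6. Device buffers are released on return (hipFree in HIP).
    return d_c;
}
```

Calling `vector_add(a, b, 2)` on two five-element vectors uses three blocks of two threads; the bounds guard keeps the unused sixth thread from writing past the end of the output.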

HIP Language

Using the HIP language in the device program
Writing kernels that execute on the GPU and manipulate data
Using data types, qualifiers, operators, and expressions
Using built-in functions, variables, and libraries

ROCm and HIP Memory Model

Using different memory spaces, such as global, shared, constant, and local
Using different memory objects, such as pointers, arrays, textures, and surfaces
Using different memory access modes, such as read-only, write-only, read-write, etc.
Using memory consistency model and synchronization mechanisms

ROCm and HIP Execution Model

Using different execution models, such as threads, blocks, and grids
Using built-in index variables, such as hipThreadIdx_x, hipBlockIdx_x, hipBlockDim_x, etc.
Using block functions, such as __syncthreads, __threadfence_block, etc.
Using grid functions, such as hipGridDim_x, hipGridSync, cooperative groups, etc.
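
To make the role of a block-level barrier such as __syncthreads more concrete, the sketch below emulates a tree reduction within a single block in plain C++. It is a CPU model under stated assumptions, not HIP code: `block_reduce_sum` is a hypothetical name, the `shared` vector stands in for a shared-memory array, the block size is assumed to be a power of two, and the end of each inner loop plays the part of the barrier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum the values owned by one block using a tree reduction.
// Assumption: the block size (input.size()) is a power of two.
float block_reduce_sum(const std::vector<float>& input) {
    const std::size_t block_size = input.size();
    assert(block_size > 0 && (block_size & (block_size - 1)) == 0);

    // Stands in for a __shared__ array visible to every thread in the block.
    std::vector<float> shared = input;

    for (std::size_t stride = block_size / 2; stride > 0; stride /= 2) {
        // Each "thread" with tid < stride adds its partner's value.
        // Reads touch indices >= stride and writes touch indices < stride,
        // so the threads of one step never conflict with each other.
        for (std::size_t tid = 0; tid < stride; ++tid) {
            shared[tid] += shared[tid + stride];
        }
        // The end of the inner loop models __syncthreads(): no thread
        // begins the next step until every thread has finished this one.
    }
    return shared[0];
}
```

On a GPU the threads of a step run concurrently, so without the barrier a thread could read a partner value that has not yet been updated; the sequential loop hides that hazard, which is why the comment marks where the barrier would go.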

Debugging

Debugging ROCm and HIP programs on Windows
Using Visual Studio Code debugger to inspect variables, breakpoints, call stack, etc.
Using ROCm Debugger to debug ROCm and HIP programs on AMD devices
Using ROCm Profiler to analyze ROCm and HIP programs on AMD devices

Optimization

Optimizing ROCm and HIP programs on Windows
Using coalescing techniques to improve memory throughput
Using caching and prefetching techniques to reduce memory latency
Using shared memory and local memory techniques to optimize memory accesses and bandwidth
Using profiling tools to measure and improve execution time and resource utilization
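
One way to reason about coalescing is to count how many memory segments a group of threads touches in a single load. The plain C++ sketch below does that arithmetic on the CPU. The function name `segments_touched`, the 64-byte segment size, and the 4-byte element size are assumptions made for illustration; real segment and wavefront sizes depend on the GPU architecture, and elements are assumed to be aligned so that none straddles a segment boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Count the distinct memory segments touched when `lanes` threads each load
// one element of `elem_bytes` bytes, with lane i reading element i * stride.
// Assumptions: 64-byte segments by default, aligned elements.
std::size_t segments_touched(std::size_t lanes, std::size_t stride,
                             std::size_t elem_bytes = 4,
                             std::size_t segment_bytes = 64) {
    assert(segment_bytes > 0);
    std::set<std::size_t> segments;
    for (std::size_t lane = 0; lane < lanes; ++lane) {
        const std::size_t byte_addr = lane * stride * elem_bytes;
        segments.insert(byte_addr / segment_bytes);  // segment index
    }
    return segments.size();
}
```

Under these assumptions, 64 lanes reading consecutive floats (stride 1) touch 4 segments, while the same lanes with a stride of 16 floats touch 64 segments, one per lane. Fewer segments per load generally means better use of memory bandwidth, which is the effect coalescing aims for.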

Summary and Next Steps
