Course Code: horovod
Duration: 7 hours
Prerequisites:
  • An understanding of machine learning, specifically deep learning
  • Familiarity with machine learning libraries (TensorFlow, Keras, PyTorch, Apache MXNet)
  • Python programming experience

Audience

  • Developers
  • Data scientists

Overview:

Horovod is an open-source framework for fast and efficient distributed deep learning training with TensorFlow, Keras, PyTorch, and Apache MXNet. It can scale a single-GPU training script to run on multiple GPUs or hosts with minimal code changes.
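To make the scaling idea above concrete, the following is a minimal sketch of synchronous data-parallel training, the general pattern behind this kind of multi-GPU scaling. It is a pure-Python simulation and does not call Horovod: the "workers" are list entries, and the names local_gradient and allreduce_average, the toy data, and the learning rate are all assumptions made for illustration. Each worker computes a gradient on its own shard of the data, the gradients are averaged across workers, and every worker applies the same update.

```python
# Pure-Python simulation of synchronous data-parallel training.
# No Horovod import: each "worker" is simply one entry in a list.

def local_gradient(w, shard):
    """Gradient of the mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_average(values):
    """Stand-in for an allreduce: every worker receives the same mean value."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

# Toy data generated from y = 2x (hypothetical, for illustration only).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
num_workers = 2
# Split the data into equal-sized shards, one per worker.
shards = [data[i::num_workers] for i in range(num_workers)]

w = 0.0
learning_rate = 0.05
for step in range(100):
    # Each worker computes a gradient on its own shard only.
    grads = [local_gradient(w, shard) for shard in shards]
    # The gradients are averaged so that all workers see the same value.
    averaged = allreduce_average(grads)
    # Every worker applies the identical update, so model replicas stay in sync.
    w -= learning_rate * averaged[0]

print(w)
```

Because the shards are the same size, the averaged gradient equals the gradient over the full data set, which is why the training loop itself needs so little modification in this pattern.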

This instructor-led, live training (online or onsite) is aimed at developers and data scientists who wish to use Horovod to run distributed deep learning training and scale it across multiple GPUs in parallel.

By the end of this training, participants will be able to:

  • Set up the development environment needed to run deep learning training.
  • Install and configure Horovod to train models with TensorFlow, Keras, PyTorch, and Apache MXNet.
  • Scale deep learning training with Horovod to run on multiple GPUs.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • This course is focused on Horovod, but other software tools and frameworks such as TensorFlow, Keras, PyTorch, and Apache MXNet may be required. Please let us know if you have specific requirements or preferences.
  • To request customized training for this course, please contact us.

Course Outline:

Introduction

  • Overview of Horovod features and concepts
  • Understanding the supported frameworks

Installing and Configuring Horovod

  • Preparing the hosting environment
  • Building Horovod for TensorFlow, Keras, PyTorch, and Apache MXNet
  • Running Horovod

Running Distributed Training

  • Modifying and running training examples with TensorFlow
  • Modifying and running training examples with Keras
  • Modifying and running training examples with PyTorch
  • Modifying and running training examples with Apache MXNet

Optimizing Distributed Training Processes

  • Running concurrent operations on multiple GPUs
  • Tuning hyperparameters
  • Enabling performance autotuning
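As an illustration of the hyperparameter tuning topic above: the learning rate usually needs revisiting when the number of workers grows, because the effective batch size grows with it. A common heuristic, shown here as a sketch and not as part of the course material, is to scale the base learning rate by the worker count and ramp up to that value over a warmup period. The function name scaled_learning_rate and all of its parameters are hypothetical.

```python
def scaled_learning_rate(base_lr, num_workers, step, warmup_steps):
    """Linear scaling heuristic with a linear warmup.

    The rate starts at base_lr and rises linearly to base_lr * num_workers
    over warmup_steps steps, then stays at that target value.
    """
    target = base_lr * num_workers
    if warmup_steps <= 0 or step >= warmup_steps:
        return target
    return base_lr + (target - base_lr) * step / warmup_steps

# Example: 4 workers, base rate 0.01, warmup over 100 steps.
for step in (0, 50, 100, 200):
    print(step, scaled_learning_rate(0.01, 4, step, 100))
```

Whether linear scaling suits a given model is an empirical question, so the scaled value is best treated as a starting point for tuning.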

Troubleshooting

Summary and Conclusion

Sites Published:

"Distributed Deep Learning with Horovod" is published on the sites for:

United Arab Emirates, Qatar, Egypt, Saudi Arabia, South Africa, Brasil, Canada, 中国, 香港, 澳門, 台灣, USA, Österreich, Schweiz, Deutschland, Czech Republic, Denmark, Estonia, Finland, Greece, Magyarország, Ireland, Luxembourg, Latvia, España, Italia, Lithuania, Nederland, Norway, Portugal, România, Sverige, Türkiye, Malta, Belgique, France, 日本, Australia, Malaysia, New Zealand, Philippines, Singapore, Thailand, Vietnam, India, Argentina, Chile, Costa Rica, Ecuador, Guatemala, Colombia, México, Panama, Peru, Uruguay, Venezuela, Polska, United Kingdom, South Korea, Pakistan, Sri Lanka, Bulgaria, Bolivia, Indonesia, Kazakhstan, Moldova, Morocco, Tunisia, Kuwait, Oman, Slovakia, Kenya, Nigeria, Botswana, Slovenia, Croatia, Serbia, Bhutan, Nepal, Uzbekistan