- System administration experience
- Experience with Linux command line
- An understanding of big data concepts
Audience
- System administrators
- DBAs
Apache Hadoop is a popular framework for processing large data sets across many computers.
This instructor-led, live training (online or onsite) is aimed at system administrators who wish to learn how to set up, deploy and manage Hadoop clusters within their organization.
By the end of this training, participants will be able to:
- Install and configure Apache Hadoop.
- Understand the four major components in the Hadoop ecosystem: HDFS, MapReduce, YARN, and Hadoop Common.
- Use the Hadoop Distributed File System (HDFS) to scale a cluster to hundreds or thousands of nodes.
- Set up HDFS to operate as a storage engine for on-premise Spark deployments.
- Set up Spark to access alternative storage solutions such as Amazon S3 and NoSQL database systems such as Redis, Elasticsearch, Couchbase, Aerospike, etc.
- Carry out administrative tasks such as provisioning, managing, monitoring and securing an Apache Hadoop cluster.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Introduction
- Introduction to Cloud Computing and Big Data solutions
- Overview of Apache Hadoop Features and Architecture
Setting up Hadoop
- Planning a Hadoop cluster (on-premise, cloud, etc.)
- Selecting the OS and Hadoop distribution
- Provisioning resources (hardware, network, etc.)
- Downloading and installing the software
- Sizing the cluster for flexibility
Working with HDFS
- Understanding the Hadoop Distributed File System (HDFS)
- Overview of HDFS Command Reference
- Accessing HDFS
- Performing Basic File Operations on HDFS
- Using S3 as a complement to HDFS
Overview of MapReduce
- Understanding Data Flow in the MapReduce Framework
- Map, Shuffle, Sort and Reduce
- Demo: Computing Top Salaries
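The map, shuffle, sort and reduce stages listed above can be sketched in plain Python. This is only an illustration of the data flow on a single machine, not Hadoop code and not the course's actual demo: the records, the per-department grouping, and every function name below are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical input records: (employee, department, salary).
RECORDS = [
    ("alice", "engineering", 120000),
    ("bob", "engineering", 95000),
    ("carol", "sales", 70000),
    ("dave", "sales", 82000),
]


def map_phase(records):
    """Map: emit one (key, value) pair per record, keyed by department."""
    for name, dept, salary in records:
        yield dept, (salary, name)


def shuffle_and_sort(pairs):
    """Shuffle and sort: group values by key, then order the keys."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped):
    """Reduce: keep the highest salary seen for each department."""
    for dept, values in grouped:
        salary, name = max(values)
        yield dept, name, salary


def top_salaries(records):
    """Run the three stages in order and collect the results."""
    return list(reduce_phase(shuffle_and_sort(map_phase(records))))


if __name__ == "__main__":
    for dept, name, salary in top_salaries(RECORDS):
        print(dept, name, salary)
```

On a real cluster the map and reduce functions would run in parallel on different nodes, and the framework would perform the shuffle and sort between them over the network.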
Working with YARN
- Understanding resource management in Hadoop
- Working with ResourceManager, NodeManager, Application Master
- Scheduling jobs under YARN
- Scheduling for large numbers of nodes and clusters
- Demo: Job scheduling
Integrating Hadoop with Spark
- Setting up storage for Spark (HDFS, Amazon S3, NoSQL, etc.)
- Understanding Resilient Distributed Datasets (RDDs)
- Creating an RDD
- Implementing RDD Transformations
- Demo: Implementing a Text Search Program for Movie Titles
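The idea behind chaining RDD transformations can be approximated without a Spark installation. The sketch below is a plain-Python stand-in, not PySpark: generators play the role of lazy filter and map transformations, and the final `list()` call plays the role of an action such as collect. The titles and the `search_titles` helper are hypothetical and are not taken from the course demo.

```python
# Hypothetical movie titles; a real job would load these from HDFS or S3.
TITLES = [
    "The Matrix",
    "Star Wars",
    "The Godfather",
    "Star Trek",
]


def search_titles(titles, term):
    """Chain two lazy transformations, then run one action.

    Nothing is evaluated until the final list() call, which mirrors
    how Spark defers work until an action is invoked.
    """
    needle = term.lower()
    # Stand-in for a filter transformation: keep titles containing the term.
    matching = (t for t in titles if needle in t.lower())
    # Stand-in for a map transformation: normalize each matching title.
    normalized = (t.upper() for t in matching)
    # Stand-in for an action: materialize the results.
    return list(normalized)


if __name__ == "__main__":
    print(search_titles(TITLES, "star"))
```

In Spark the same chain would be distributed across partitions, with each transformation recorded in the RDD lineage rather than executed immediately.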
Managing a Hadoop Cluster
- Monitoring Hadoop
- Securing a Hadoop cluster
- Adding and removing nodes
- Running a performance benchmark
- Tuning a Hadoop cluster to optimize performance
- Backup, recovery and business continuity planning
- Ensuring high availability (HA)
Upgrading and Migrating a Hadoop Cluster
- Assessing workload requirements
- Upgrading Hadoop
- Moving from on-premise to cloud and vice-versa
- Recovering from failures
Troubleshooting
Summary and Conclusion