- Basic Linux administration skills
- Basic programming skills
Audience:
The course is intended for IT specialists looking for a solution to store and process large data sets in a distributed system environment.
Goal:
In-depth knowledge of Hadoop cluster administration.
1: HDFS (17%)
- Describe the function of the HDFS daemons
- Describe the normal operation of an Apache Hadoop cluster, both in data storage and in data processing.
- Identify current features of computing systems that motivate a system like Apache Hadoop.
- Classify the major goals of HDFS design
- Given a scenario, identify an appropriate use case for HDFS Federation
- Identify the components and daemons of an HDFS HA-Quorum cluster
- Analyze the role of HDFS security (Kerberos)
- Determine the best data serialization choice for a given scenario
- Describe file read and write paths
- Identify the commands to manipulate files in the Hadoop File System Shell
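As a rough illustration of the storage side of the read and write paths listed above, the following Python sketch estimates how a file is split into blocks and how much raw disk its replicas occupy. The 128 MB block size and replication factor of 3 match the usual Hadoop 2 defaults (`dfs.blocksize`, `dfs.replication`); the function name and the example file size are assumptions made for illustration only.

```python
MB = 1024 * 1024

def hdfs_block_layout(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Estimate block count and raw storage for one file (illustrative helper)."""
    if file_size_bytes < 0 or block_size_bytes <= 0 or replication < 1:
        raise ValueError("invalid size, block size, or replication")
    # Integer ceiling division: an empty file has no blocks,
    # and the last block of a file may be only partially filled.
    blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    # A partial block occupies only the bytes actually written,
    # so raw usage is the file size times the replication factor.
    raw_bytes = file_size_bytes * replication
    return {
        "blocks": blocks,
        "block_replicas": blocks * replication,
        "raw_bytes": raw_bytes,
    }

# Hypothetical example: a 1000 MB file with the default settings above.
layout = hdfs_block_layout(1000 * MB)
print(layout)
```

With these assumptions, a 1000 MB file needs 8 blocks (7 full and 1 partial) and 24 block replicas spread across the DataNodes.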
2: YARN and MapReduce version 2 (MRv2) (17%)
- Understand how upgrading a cluster from Hadoop 1 to Hadoop 2 affects cluster settings
- Understand how to deploy MapReduce v2 (MRv2 / YARN), including all YARN daemons
- Understand the basic design strategy for MapReduce v2 (MRv2)
- Determine how YARN handles resource allocations
- Identify the workflow of a MapReduce job running on YARN
- Determine which files you must change, and how, in order to migrate a cluster from MapReduce version 1 (MRv1) to MapReduce version 2 (MRv2) running on YARN.
3: Hadoop Cluster Planning (16%)
- Identify the principal points to consider when choosing the hardware and operating systems to host an Apache Hadoop cluster.
- Analyze the choices in selecting an OS
- Understand kernel tuning and disk swapping
- Given a scenario and workload pattern, identify a hardware configuration appropriate to the scenario
- Given a scenario, determine the ecosystem components your cluster needs to run in order to fulfill the SLA
- Cluster sizing: given a scenario and frequency of execution, identify the specifics for the workload, including CPU, memory, storage, disk I/O
- Disk Sizing and Configuration, including JBOD versus RAID, SANs, virtualization, and disk sizing requirements in a cluster
- Network Topologies: understand network usage in Hadoop (for both HDFS and MapReduce) and propose or identify key network design components for a given scenario
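The cluster sizing objective above comes down to arithmetic of the following kind. This is only a sketch: the replication factor of 3 is the usual HDFS default, while the 25% headroom for temporary data, the usable disk capacity per node, and the function name are assumptions chosen for illustration, not recommendations.

```python
import math

def estimate_worker_nodes(data_tb, replication=3, temp_overhead=0.25,
                          usable_tb_per_node=24.0):
    """Estimate raw storage and worker-node count for `data_tb` of data.

    All default values are illustrative assumptions.
    """
    if data_tb < 0 or usable_tb_per_node <= 0 or not 0 <= temp_overhead < 1:
        raise ValueError("invalid sizing parameters")
    # Every HDFS block is stored `replication` times.
    replicated_tb = data_tb * replication
    # Keep a fraction of the disks free for intermediate (temporary) data.
    raw_tb = replicated_tb / (1 - temp_overhead)
    # Round up: a fraction of a node still means one more machine.
    nodes = math.ceil(raw_tb / usable_tb_per_node)
    return raw_tb, nodes

# Hypothetical scenario: 100 TB of data to store.
raw_tb, nodes = estimate_worker_nodes(data_tb=100)
print(f"raw storage: {raw_tb:.0f} TB, worker nodes: {nodes}")
```

Under these assumptions, 100 TB of data requires about 400 TB of raw disk and 17 worker nodes. A real sizing exercise would also account for data growth, compression, CPU, memory, and disk I/O, as the objectives above indicate.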
4: Hadoop Cluster Installation and Administration (25%)
- Given a scenario, identify how the cluster will handle disk and machine failures
- Analyze a logging configuration and logging configuration file format
- Understand the basics of Hadoop metrics and cluster health monitoring
- Identify the function and purpose of available tools for cluster monitoring
- Be able to install all the ecosystem components in CDH 5, including (but not limited to): Impala, Flume, Oozie, Hue, Cloudera Manager, Sqoop, Hive, and Pig
- Identify the function and purpose of available tools for managing the Apache Hadoop file system
5: Resource Management (10%)
- Understand the overall design goals of each of the Hadoop schedulers
- Given a scenario, determine how the FIFO Scheduler allocates cluster resources
- Given a scenario, determine how the Fair Scheduler allocates cluster resources under YARN
- Given a scenario, determine how the Capacity Scheduler allocates cluster resources
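To make the contrast between the schedulers above concrete, here is a deliberately simplified Python model. It compares a FIFO policy, which serves applications strictly in submission order, with max-min fair sharing, the general idea behind fair scheduling. It ignores queues, weights, preemption, and multi-resource (memory plus vcores) scheduling, and every name in it is hypothetical rather than part of any Hadoop API.

```python
def fifo_allocate(capacity, demands):
    """Grant each application its full demand in submission order."""
    allocation = []
    remaining = capacity
    for demand in demands:
        granted = min(demand, remaining)
        allocation.append(granted)
        remaining -= granted
    return allocation

def fair_allocate(capacity, demands):
    """Max-min fair share: split capacity evenly, redistributing unused shares."""
    allocation = [0.0] * len(demands)
    remaining = float(capacity)
    unsatisfied = [i for i, d in enumerate(demands) if d > 0]
    while unsatisfied and remaining > 1e-9:
        # Offer every unsatisfied application an equal slice of what is left.
        share = remaining / len(unsatisfied)
        still_unsatisfied = []
        for i in unsatisfied:
            granted = min(share, demands[i] - allocation[i])
            allocation[i] += granted
            remaining -= granted
            if demands[i] - allocation[i] > 1e-9:
                still_unsatisfied.append(i)
        unsatisfied = still_unsatisfied
    return allocation

# Hypothetical scenario: 100 containers, three applications in submission order.
demands = [80, 50, 10]
fifo_result = fifo_allocate(100, demands)
fair_result = fair_allocate(100, demands)
print("FIFO:", fifo_result)
print("Fair:", [round(x, 2) for x in fair_result])
```

In this scenario FIFO gives the first application everything it asks for and starves the third, whereas fair sharing fully serves the small application and splits the rest evenly between the two large ones.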
6: Monitoring and Logging (15%)
- Understand the functions and features of Hadoop’s metric collection abilities
- Analyze the NameNode and JobTracker Web UIs
- Understand how to monitor cluster daemons
- Identify and monitor CPU usage on master nodes
- Describe how to monitor swap and memory allocation on all nodes
- Identify how to view and manage Hadoop’s log files
- Interpret a log file
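One possible way to approach the swap and memory monitoring objective above on a Linux node is to read the kernel's `/proc/meminfo` file. The Python sketch below parses text in that format and reports the fraction of swap in use. The helper names and the sample values are assumptions for illustration; a production cluster would normally rely on a dedicated monitoring tool instead.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values

def swap_used_fraction(meminfo):
    """Return the fraction of swap in use, or 0.0 if no swap is configured."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - meminfo.get("SwapFree", 0)) / total

# Hypothetical sample; on a real node, read the text from /proc/meminfo.
sample = """MemTotal:       65843088 kB
MemFree:         1203456 kB
SwapTotal:       8388604 kB
SwapFree:        6291453 kB
"""
info = parse_meminfo(sample)
print(f"swap in use: {swap_used_fraction(info):.1%}")
```

Sustained swap usage on a node running Hadoop daemons is usually worth investigating, which is why the planning section pairs kernel tuning with disk swapping.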