Course Code: hadoopdev
Duration: 28 hours
Prerequisites:
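  • Familiarity with the Java programming language (most programming exercises are in Java)
  • Comfort in a Linux environment (able to navigate the Linux command line and edit files using vi / nano)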

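Lab environment

Zero install: there is no need to install Hadoop software on students' machines! A working Hadoop cluster will be provided for students.

Students will need the following

  • An SSH client (Linux and Mac already include an SSH client; PuTTY is recommended for Windows)
  • A browser to access the cluster; Firefox is recommended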
Overview:

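Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. This course introduces developers to the various components of the Hadoop ecosystem (HDFS, MapReduce, Pig, Hive, and HBase).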

Course Outline:

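Section 1: Introduction to Hadoop

  • Hadoop history and concepts
  • Ecosystem
  • Distributions
  • High-level architecture
  • Hadoop myths
  • Hadoop challenges
  • Hardware / software
  • Lab: first look at Hadoop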

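Section 2: HDFS

  • Design and architecture
  • Concepts (horizontal scaling, replication, data locality, rack awareness)
  • Daemons: NameNode, Secondary NameNode, DataNode
  • Communications / heartbeats
  • Data integrity
  • Read / write path
  • NameNode High Availability (HA), Federation
  • Lab: interacting with HDFS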

Section 3: MapReduce

  • Concepts and architecture
  • Daemons (MRv1): JobTracker / TaskTracker
  • Phases: driver, mapper, shuffle/sort, reducer
  • MapReduce version 1 and version 2 (YARN)
  • MapReduce internals
  • Introduction to Java MapReduce programs
  • Lab: running a sample MapReduce program
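
The phases listed above (driver, mapper, shuffle/sort, reducer) can be illustrated with a small word count. The following is a minimal sketch in plain Java that only simulates the phases in memory; it does not use the Hadoop API, and the class name `WordCountPhases` and its method names are hypothetical, chosen for illustration rather than taken from the course labs.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration of the MapReduce phases (no Hadoop dependency).
public class WordCountPhases {

    // Mapper: emit one (word, 1) pair for every word in a single input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle/sort: group all values by key; the TreeMap keeps keys in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reducer: sum the counts collected for one key.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: wire the phases together and return the final word counts.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(mapped).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {big=2, data=2, hadoop=2, on=1, stores=1}
        System.out.println(run(List.of("big data on hadoop", "hadoop stores big data")));
    }
}
```

On a real cluster the mapper and reducer run as distributed tasks and the framework performs the shuffle/sort between them; the sketch only mirrors the data flow on a single machine.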

Section 4: Pig

  • Pig vs. Java MapReduce
  • Pig job flow
  • Pig Latin language
  • ETL with Pig
  • Transformations and joins
  • User-defined functions (UDF)
  • Lab: writing Pig scripts to analyze data

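Section 5: Hive

  • Architecture and design
  • Data types
  • SQL support in Hive
  • Creating Hive tables and querying
  • Partitions
  • Joins
  • Text processing
  • Labs: several labs on processing data with Hive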

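Section 6: HBase

  • Concepts and architecture
  • HBase vs. RDBMS vs. Cassandra
  • HBase Java API
  • Time series data on HBase
  • Schema design
  • Labs: interacting with HBase using the shell; programming with the HBase Java API; schema design exercise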
Sites Published:

United Arab Emirates - Hadoop for Developers (4 days)

Qatar - Hadoop for Developers (4 days)

Egypt - Hadoop for Developers (4 days)

Saudi Arabia - Hadoop for Developers (4 days)

South Africa - Hadoop for Developers (4 days)

Brasil - Hadoop for Developers (4 days)

Canada - Hadoop for Developers (4 days)

中国 - Hadoop for Developers (4 days)

香港 - Hadoop for Developers (4 days)

澳門 - Hadoop for Developers (4 days)

台灣 - Hadoop for Developers (4 days)

USA - Hadoop for Developers (4 days)

Österreich - Hadoop for Developers (4 days)

Schweiz - Hadoop for Developers (4 days)

Deutschland - Hadoop for Developers (4 days)

Czech Republic - Hadoop for Developers (4 days)

Denmark - Hadoop for Developers (4 days)

Estonia - Hadoop for Developers (4 days)

Finland - Hadoop for Developers (4 days)

Greece - Hadoop for Developers (4 days)

Magyarország - Hadoop for Developers (4 days)

Ireland - Hadoop for Developers (4 days)

Luxembourg - Hadoop for Developers (4 days)

Latvia - Hadoop for Developers (4 days)

España - Hadoop para Desarrolladores (4 días)

Italia - Hadoop for Developers (4 days)

Lithuania - Hadoop for Developers (4 days)

Nederland - Hadoop for Developers (4 days)

Norway - Hadoop for Developers (4 days)

Portugal - Hadoop for Developers (4 days)

România - Hadoop for Developers (4 days)

Sverige - Hadoop for Developers (4 days)

Türkiye - Hadoop for Developers (4 days)

Malta - Hadoop for Developers (4 days)

Belgique - Hadoop for Developers (4 days)

France - Hadoop for Developers (4 days)

日本 - Hadoop for Developers (4 days)

Australia - Hadoop for Developers (4 days)

Malaysia - Hadoop for Developers (4 days)

New Zealand - Hadoop for Developers (4 days)

Philippines - Hadoop for Developers (4 days)

Singapore - Hadoop for Developers (4 days)

Thailand - Hadoop for Developers (4 days)

Vietnam - Hadoop for Developers (4 days)

India - Hadoop for Developers (4 days)

Argentina - Hadoop para Desarrolladores (4 días)

Chile - Hadoop para Desarrolladores (4 días)

Costa Rica - Hadoop para Desarrolladores (4 días)

Ecuador - Hadoop para Desarrolladores (4 días)

Guatemala - Hadoop para Desarrolladores (4 días)

Colombia - Hadoop para Desarrolladores (4 días)

México - Hadoop para Desarrolladores (4 días)

Panama - Hadoop para Desarrolladores (4 días)

Peru - Hadoop para Desarrolladores (4 días)

Uruguay - Hadoop para Desarrolladores (4 días)

Venezuela - Hadoop para Desarrolladores (4 días)

Polska - Hadoop for Developers (4 days)

United Kingdom - Hadoop for Developers (4 days)

South Korea - Hadoop for Developers (4 days)

Pakistan - Hadoop for Developers (4 days)

Sri Lanka - Hadoop for Developers (4 days)

Bulgaria - Hadoop for Developers (4 days)

Bolivia - Hadoop para Desarrolladores (4 días)

Indonesia - Hadoop for Developers (4 days)

Kazakhstan - Hadoop for Developers (4 days)

Moldova - Hadoop for Developers (4 days)

Morocco - Hadoop for Developers (4 days)

Tunisia - Hadoop for Developers (4 days)

Kuwait - Hadoop for Developers (4 days)

Oman - Hadoop for Developers (4 days)

Slovakia - Hadoop for Developers (4 days)

Kenya - Hadoop for Developers (4 days)

Nigeria - Hadoop for Developers (4 days)

Botswana - Hadoop for Developers (4 days)

Slovenia - Hadoop for Developers (4 days)

Croatia - Hadoop for Developers (4 days)

Serbia - Hadoop for Developers (4 days)

Bhutan - Hadoop for Developers (4 days)

Nepal - Hadoop for Developers (4 days)

Uzbekistan - Hadoop for Developers (4 days)