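- Comfortable with the Java programming language (most programming exercises are in Java)
- Comfortable in a Linux environment (able to navigate the Linux command line and edit files using vi / nano)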
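Lab environment
Zero install: there is no need to install Hadoop software on students' machines. A working Hadoop cluster will be provided for students.
Students will need the following
- An SSH client (Linux and Mac already ship with an SSH client; PuTTY is recommended for Windows)
- A browser to access the cluster; Firefox is recommended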
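Apache Hadoop is one of the most popular frameworks for processing big data on clusters of servers. This course introduces developers to the main components of the Hadoop ecosystem: HDFS, MapReduce, Pig, Hive, and HBase.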
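Section 1: Introduction to Hadoop
- Hadoop history and concepts
- Ecosystem
- Distributions
- High-level architecture
- Hadoop myths
- Hadoop challenges
- Hardware / software
- Lab: first look at Hadoop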
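Section 2: HDFS
- Design and architecture
- Concepts (horizontal scaling, replication, data locality, rack awareness)
- Daemons: NameNode, Secondary NameNode, DataNode
- Communications / heartbeats
- Data integrity
- Read / write path
- NameNode High Availability (HA), Federation
- Lab: interacting with HDFS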
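As a small illustration of the replication concept listed above, the following sketch estimates how many blocks a file occupies and how much raw cluster storage its replicas consume. It assumes a 128 MiB block size and a replication factor of 3, which are the stock defaults in Hadoop 2 and later; the training cluster may be configured differently. The class and method names are hypothetical and are not part of any Hadoop API.

```java
// Hypothetical helper for illustration only; not part of the Hadoop API.
class HdfsBlockMath {
    // Number of blocks needed for a file: ceiling of fileBytes / blockBytes.
    static long blockCount(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    // Raw storage consumed across the cluster. The last block of a file is
    // not padded to the full block size, so this is simply size * replication.
    static long rawBytes(long fileBytes, int replication) {
        return fileBytes * replication;
    }

    public static void main(String[] args) {
        long fileBytes = 1L << 30;     // 1 GiB file (assumed example size)
        long blockBytes = 128L << 20;  // 128 MiB block size (assumed default)
        int replication = 3;           // assumed default replication factor

        long blocks = blockCount(fileBytes, blockBytes);
        System.out.println("blocks = " + blocks);
        System.out.println("block replicas = " + blocks * replication);
        System.out.println("raw bytes = " + rawBytes(fileBytes, replication));
    }
}
```

Under these assumptions, a 1 GiB file is split into 8 blocks, stored as 24 block replicas, and uses 3 GiB of raw storage.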
Section 3: MapReduce
- Concepts and architecture
- Daemons (MRv1): JobTracker / TaskTracker
- Phases: driver, mapper, shuffle/sort, reducer
- MapReduce version 1 (MRv1) and version 2 (YARN)
- MapReduce internals
- Introduction to Java MapReduce programs
- Lab: running a sample MapReduce program
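The phases listed above (driver, mapper, shuffle/sort, reducer) can be sketched in plain Java with a word count. This is a single-process simulation meant only to show how data flows between the phases; it deliberately does not use the Hadoop MapReduce API, and the class and method names are hypothetical. It requires Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory simulation of the MapReduce phases; not Hadoop code.
class WordCountSketch {
    // Mapper: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Shuffle/sort: group all values by key, with keys kept in sorted order.
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reducer: collapse the list of values for one key into a single count.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: wire the phases together over the whole input.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {hadoop=1, hello=2, world=1}
        System.out.println(run(List.of("hello hadoop", "hello world")));
    }
}
```

On a real cluster, the mapper and reducer run as separate tasks on different nodes and the framework performs the shuffle/sort over the network; here all three steps are ordinary method calls so the data flow is easy to follow.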
Section 4: Pig
- Pig vs. Java MapReduce
- Pig job flow
- Pig Latin language
- ETL with Pig
- Transformations and joins
- User-defined functions (UDFs)
- Lab: writing Pig scripts to analyze data
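Section 5: Hive
- Architecture and design
- Data types
- SQL support in Hive
- Creating Hive tables and querying
- Partitions
- Joins
- Text processing
- Labs: several labs on processing data with Hive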
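Section 6: HBase
- Concepts and architecture
- HBase vs. RDBMS vs. Cassandra
- HBase Java API
- Time series data on HBase
- Schema design
- Labs: interacting with HBase using the shell; programming with the HBase Java API; schema design exercise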
Hadoop for Developers (4 days)