Programming skills (preferably Python or Scala)
Basic knowledge of SQL
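Apache Spark has a steep learning curve at the beginning, and it takes considerable effort before the first payoff. This course is designed to get participants past that difficult first stage. After taking the course, participants will understand the basics of Apache Spark, clearly distinguish RDDs from DataFrames, know the Python and Scala APIs, and understand executors, tasks, and related concepts. In line with best practices, the course focuses strongly on cloud deployment with Databricks and AWS. Students will also learn the differences between AWS EMR and AWS Glue, one of the more recent Spark services on AWS.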
Audience:
Data engineers, DevOps engineers, data scientists
Introduction:
- Apache Spark in the Hadoop ecosystem
- A short introduction to Python and Scala
Basics (theory):
- Architecture
- RDD
- Transformations and actions
- Stages, tasks, dependencies
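
As a rough illustration of the "transformations and actions" item above, here is a minimal sketch in plain Python. It is a toy model and not Spark itself: the `LazyDataset` class and its method names are hypothetical, chosen only for this example. The idea it shows is that transformations (`map`, `filter`) merely record a step, while an action (`collect`, `count`) actually runs the recorded steps.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


class LazyDataset:
    """Toy stand-in for an RDD (hypothetical; not part of the Spark API)."""

    def __init__(self, source: Iterable[Any], steps: Tuple = ()) -> None:
        self._source = list(source)
        self._steps = steps  # recorded transformations, not yet executed

    # Transformations: return a new dataset and do no work on the data.
    def map(self, fn: Callable[[Any], Any]) -> "LazyDataset":
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred: Callable[[Any], bool]) -> "LazyDataset":
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def _run(self) -> Iterator[Any]:
        # Chain the recorded steps over the source data.
        items: Iterator[Any] = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    # Actions: trigger execution of every recorded step.
    def collect(self) -> List[Any]:
        return list(self._run())

    def count(self) -> int:
        return sum(1 for _ in self._run())


calls: List[int] = []


def traced_double(x: int) -> int:
    calls.append(x)  # record that the function really ran
    return x * 2


ds = LazyDataset(range(5)).map(traced_double).filter(lambda x: x > 2)
calls_before_action = list(calls)  # still empty: nothing has executed yet
result = ds.collect()              # the action runs the whole chain
```

In this sketch `traced_double` is not called until `collect()` runs, which mirrors, in a simplified way, how Spark defers work until an action is requested.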
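Getting to know the basics in the Databricks environment (hands-on workshop):
- Exercises using the RDD API
- Basic action and transformation functions
- Pair RDDs
- Joins
- Caching strategies
- Exercises using the DataFrame API
- Spark SQL
- DataFrame: select, filter, group, sort
- UDFs (user-defined functions)
- A look at the Dataset API
- Streaming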
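
The pair RDD and join items above work on key-value records. The following plain-Python sketch mimics the results that Spark's `reduceByKey` and `join` produce on such records. It is only a semantic illustration under simplifying assumptions: the helper names `reduce_by_key` and `inner_join` and the sample data are hypothetical, nothing here is distributed, and real Spark does not guarantee the output order shown.

```python
from collections import defaultdict
from operator import add
from typing import Any, Callable, Dict, Hashable, Iterable, List, Tuple

Pair = Tuple[Hashable, Any]


def reduce_by_key(pairs: Iterable[Pair], fn: Callable[[Any, Any], Any]) -> List[Pair]:
    """Merge all values that share a key using fn (assumed associative)."""
    acc: Dict[Hashable, Any] = {}
    for key, value in pairs:
        acc[key] = fn(acc[key], value) if key in acc else value
    return list(acc.items())


def inner_join(left: Iterable[Pair], right: Iterable[Pair]) -> List[Tuple[Hashable, Tuple[Any, Any]]]:
    """Emit (key, (left_value, right_value)) for every matching key pair."""
    right_index: Dict[Hashable, List[Any]] = defaultdict(list)
    for key, value in right:
        right_index[key].append(value)
    return [
        (key, (lv, rv))
        for key, lv in left
        for rv in right_index.get(key, [])  # keys missing on the right are dropped
    ]


# Hypothetical sample data: units sold and unit prices.
sales = [("apples", 3), ("pears", 2), ("apples", 4)]
prices = [("apples", 0.5), ("plums", 1.0)]

totals = reduce_by_key(sales, add)   # [("apples", 7), ("pears", 2)]
joined = inner_join(totals, prices)  # [("apples", (7, 0.5))]
```

Only keys present on both sides survive the inner join, so `pears` and `plums` are absent from `joined`.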
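Getting to know deployment in the AWS environment (hands-on workshop):
- AWS Glue basics
- Understanding the differences between AWS EMR and AWS Glue
- Example jobs in both environments
- Understanding the pros and cons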
Extra:
- Introduction to orchestration with Apache Airflow