Programming skills (preferably Python or Scala)
Basic knowledge of SQL
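Apache Spark has a steep initial learning curve, and it takes considerable effort before the first results pay off. This course aims to get participants past that difficult first stage. After the course, participants will understand the basics of Apache Spark, clearly distinguish RDDs from DataFrames, know the Python and Scala APIs, and understand executors, tasks, and related concepts. In line with best practices, the course focuses strongly on cloud deployment with Databricks and AWS. Participants will also learn the differences between AWS EMR and AWS Glue, one of AWS's latest Spark services.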
Audience:
Data Engineers, DevOps, Data Scientists
Introduction:
- Apache Spark in the Hadoop ecosystem
- Short introduction to Python and Scala
Basics (theory):
- Architecture
- RDD
- Transformations and actions
- Stages, tasks, dependencies
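The distinction between transformations and actions can be illustrated without a cluster. The following is a minimal plain-Python sketch, not the Spark API: `LazyPipeline` is a hypothetical toy class whose `map` and `filter` only record work (like transformations), while `collect` and `count` actually run it (like actions).

```python
from typing import Callable, Iterable, List, Optional


class LazyPipeline:
    """Toy stand-in for an RDD (hypothetical, not part of Spark)."""

    def __init__(self, source: Iterable, steps: Optional[List[Callable]] = None):
        self._source = source
        self._steps = steps or []

    # "Transformations": return a new pipeline and compute nothing yet.
    def map(self, fn: Callable) -> "LazyPipeline":
        step = lambda it: (fn(x) for x in it)
        return LazyPipeline(self._source, self._steps + [step])

    def filter(self, pred: Callable) -> "LazyPipeline":
        step = lambda it: (x for x in it if pred(x))
        return LazyPipeline(self._source, self._steps + [step])

    # "Actions": replay every recorded step and produce a result.
    def collect(self) -> list:
        it = iter(self._source)
        for step in self._steps:
            it = step(it)
        return list(it)

    def count(self) -> int:
        return len(self.collect())


calls = []


def double(x):
    calls.append(x)  # record each time the function really runs
    return x * 2


pipeline = LazyPipeline([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # still 0: only a plan exists so far
result = pipeline.collect()       # the action triggers the computation
```

Each action replays the whole chain from the source, so calling `count` after `collect` runs `double` again for every element. This recomputation is what caching is meant to avoid.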
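Understanding the basics in a Databricks environment (hands-on workshop):
- Exercises using the RDD API
  - Basic actions and transformation functions
  - Pair RDDs
  - Joins
  - Caching strategies
- Exercises using the DataFrame API
  - Spark SQL
  - DataFrame: select, filter, group, sort
  - UDFs (user-defined functions)
  - A look at the Dataset API
  - Streaming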
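A pair RDD is a collection of (key, value) tuples, and operations such as `reduceByKey` and `join` work per key. The semantics can be sketched in plain Python. This is an illustration only: `reduce_by_key` and `join` below are hypothetical helpers that mimic the result of the Spark operations on small in-memory lists, while Spark itself performs them across partitions.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, Tuple

Pair = Tuple[Hashable, object]


def reduce_by_key(pairs: List[Pair], fn: Callable) -> List[Pair]:
    """Merge all values that share a key using fn (mimics reduceByKey)."""
    merged = {}
    for key, value in pairs:
        merged[key] = fn(merged[key], value) if key in merged else value
    return list(merged.items())


def join(left: List[Pair], right: List[Pair]) -> List[Pair]:
    """Inner join: one (key, (left_value, right_value)) per matching pair."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, (lv, rv)) for key, lv in left for rv in index.get(key, [])]


# Sample data invented for this illustration.
sales = [("apples", 3), ("pears", 5), ("apples", 4)]
prices = [("apples", 0.5), ("plums", 0.9)]

totals = reduce_by_key(sales, lambda a, b: a + b)
joined = join(totals, prices)  # keys present on only one side are dropped
```

Here `totals` holds one entry per key, and `joined` keeps only `"apples"`, the single key present on both sides. In Spark the order of the results is not guaranteed, so compare them as sets or sort them first.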
Understanding deployment in an AWS environment (hands-on workshop):
- AWS Glue basics
- Differences between AWS EMR and AWS Glue
- Example jobs in both environments
- Pros and cons
Extra:
- Introduction to orchestration with Apache Airflow