- General programming skills
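Spark is a data processing engine for querying, analyzing, and transforming big data. Python is a high-level programming language known for its clear syntax and code readability. PySpark is the interface that lets users work with Spark from Python.
In this instructor-led, live training, participants will learn how to use Python and Spark together to analyze big data through hands-on exercises.
By the end of this training, participants will be able to:
- Understand how to use Spark and Python together to analyze big data
- Work on exercises that mimic real-world conditions
- Use different tools and techniques for big data analysis with PySpark
Format of the course
- Part lecture, part discussion, exercises, and heavy hands-on practice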
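Introduction
Understanding Big Data
Overview of Spark
Overview of Python
Overview of PySpark
- Distributing data using the Resilient Distributed Datasets (RDD) framework
- Distributing computation using Spark API operators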
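The two ideas above (data split into partitions, computation applied per partition and then merged) can be sketched without a cluster. The following is a minimal pure-Python illustration, not Spark itself: the helper names `split_into_partitions`, `count_words_in_partition`, and `merge_counts` are hypothetical, and the sample lines are invented for the example. In PySpark the corresponding steps would typically be expressed with `parallelize`, `flatMap`/`map`, and `reduceByKey`.

```python
from collections import Counter
from functools import reduce


def split_into_partitions(records, num_partitions):
    """Deal records round-robin into lists (a stand-in for an RDD's partitions)."""
    partitions = [[] for _ in range(num_partitions)]
    for index, record in enumerate(records):
        partitions[index % num_partitions].append(record)
    return partitions


def count_words_in_partition(lines):
    """Per-partition work: count words locally, without seeing other partitions."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Combine two partial results; addition is associative, so merge order is irrelevant."""
    return left + right


# Invented sample data for illustration only.
lines = [
    "spark distributes data",
    "python drives spark",
    "spark runs python code",
]

partitions = split_into_partitions(lines, num_partitions=2)
partial_counts = [count_words_in_partition(p) for p in partitions]
word_counts = reduce(merge_counts, partial_counts, Counter())

print(word_counts["spark"], word_counts["python"])
```

The per-partition function never looks at another partition's data, which is the property that would let an engine such as Spark run those calls on different machines.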
Setting Up Python with Spark
Setting Up PySpark
Using Amazon Web Services (AWS) EC2 Instances for Spark
Setting Up Databricks
Setting Up the AWS EMR Cluster
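Learning the Basics of Python Programming
- Getting started with Python
- Using the Jupyter Notebook
- Using variables and simple data types
- Working with lists
- Using if statements
- Using user input
- Working with while loops
- Implementing functions
- Working with classes
- Working with files and exceptions
- Working with projects, data, and APIs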
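Several of the Python basics listed above fit into one short script. The sketch below is illustrative only: the `Dataset` class, the `read_numbers` helper, and the file name `readings.txt` are hypothetical names chosen for the example, and interactive user input is left out so the script runs unattended.

```python
import os
import tempfile


class Dataset:
    """A tiny class wrapping a list of numeric readings."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)

    def mean(self):
        if not self.values:  # if statement guarding the empty case
            return 0.0
        return sum(self.values) / len(self.values)


def read_numbers(path):
    """Read one number per line, skipping lines that are not numeric."""
    numbers = []
    try:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                try:
                    numbers.append(float(line))
                except ValueError:
                    continue  # ignore malformed lines
    except FileNotFoundError:
        return []  # a missing file yields an empty list
    return numbers


# Write a small sample file, then read it back.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "readings.txt")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("3\n4.5\noops\n6\n")
    readings = Dataset("readings", read_numbers(sample_path))
    missing = read_numbers(os.path.join(folder, "missing.txt"))

# while loop: halve a value until it drops below 1, counting the steps.
steps = 0
remaining = 8
while remaining >= 1:
    remaining /= 2
    steps += 1

print(readings.values, readings.mean(), missing, steps)
```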
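Learning the Basics of Spark DataFrame
- Getting started with Spark DataFrames
- Implementing basic operations with Spark
- Using groupBy and aggregate operations
- Working with timestamps and dates
Working on a Spark DataFrame Project Exercise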
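Understanding Machine Learning with MLlib
Working with MLlib, Spark, and Python for Machine Learning
Understanding Regressions
- Learning linear regression theory
- Implementing regression evaluation code
- Working on a sample linear regression exercise
- Learning logistic regression theory
- Implementing logistic regression code
- Working on a sample logistic regression exercise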
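As a rough illustration of the linear regression and evaluation topics above, here is a minimal pure-Python sketch of a one-variable least-squares fit followed by two common evaluation metrics (RMSE and R²). It is not MLlib code: the function names `fit_line` and `evaluate` are hypothetical, and the data is invented so that it lies exactly on y = 2x + 1. In PySpark, the equivalent work would presumably go through `LinearRegression` in `pyspark.ml.regression` and `RegressionEvaluator` in `pyspark.ml.evaluation`.

```python
import math


def fit_line(xs, ys):
    """Closed-form least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def evaluate(xs, ys, slope, intercept):
    """Return (RMSE, R^2) of the fitted line on the given points."""
    predictions = [slope * x + intercept for x in xs]
    residual_ss = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    mean_y = sum(ys) / len(ys)
    total_ss = sum((y - mean_y) ** 2 for y in ys)
    rmse = math.sqrt(residual_ss / len(ys))
    r2 = 1 - residual_ss / total_ss
    return rmse, r2


# Invented data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
rmse, r2 = evaluate(xs, ys, slope, intercept)
print(slope, intercept, rmse, r2)
```

On real data the metrics should be computed on a held-out test split rather than on the training points, which is the usual reason evaluation is treated as its own step.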
Understanding Random Forests and Decision Trees
- Learning tree methods theory
- Implementing decision tree and random forest code
- Working on a sample random forest classification exercise
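Working with K-means Clustering
- Understanding K-means clustering theory
- Implementing K-means clustering code
- Working on a sample clustering exercise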
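The K-means procedure named above alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The sketch below shows this on one-dimensional data in pure Python. It is an illustration under stated assumptions, not MLlib's implementation (which lives in `pyspark.ml.clustering` as `KMeans`): the function names are hypothetical, the points are invented, and the starting centroids are fixed by hand so the run is deterministic.

```python
def assign_to_nearest(points, centroids):
    """Group points by the index of their closest centroid."""
    clusters = [[] for _ in centroids]
    for point in points:
        distances = [abs(point - c) for c in centroids]
        clusters[distances.index(min(distances))].append(point)
    return clusters


def update_centroids(clusters, old_centroids):
    """Move each centroid to its cluster mean; keep the old one if the cluster is empty."""
    return [
        sum(cluster) / len(cluster) if cluster else old
        for cluster, old in zip(clusters, old_centroids)
    ]


def k_means(points, centroids, max_iterations=100):
    """Repeat assignment and update until the centroids stop moving."""
    clusters = assign_to_nearest(points, centroids)
    for _ in range(max_iterations):
        new_centroids = update_centroids(clusters, centroids)
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
        clusters = assign_to_nearest(points, centroids)
    return centroids, clusters


# Invented data with two obvious groups; initial centroids chosen by hand.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means(points, centroids=[1.0, 2.0])
print(centroids, clusters)
```

Even with a poor starting guess (both initial centroids inside the first group), the centroids settle on the two group means after a few iterations.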
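Working with Recommender Systems
Implementing Natural Language Processing
- Understanding natural language processing (NLP)
- Overview of NLP tools
- Working on a sample NLP exercise
Streaming with Spark on Python
- Overview of streaming with Spark
- Sample Spark Streaming exercise
Closing Remarks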