Course Code: sparkpythonhadoop
Duration: 21 hours
Prerequisites:
  • Experience with Spark and Hadoop
  • Python programming experience

Audience

  • Data scientists
  • Developers
Overview:

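Python is a scalable, flexible, and widely used programming language for data science and machine learning. Spark is a data processing engine for querying, analyzing, and transforming big data, while Hadoop is a software library framework for large-scale data storage and processing.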

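This instructor-led, live training (onsite or remote) is aimed at developers who wish to use and integrate Spark, Hadoop, and Python to process, analyze, and transform large and complex data sets.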

By the end of this training, participants will be able to:

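  • Set up the environment needed to start processing big data with Spark, Hadoop, and Python.
  • Understand the features, core components, and architecture of Spark and Hadoop.
  • Learn how to integrate Spark, Hadoop, and Python for big data processing.
  • Explore the tools in the Spark ecosystem (Spark MLlib, Spark Streaming, Kafka, Sqoop, and Flume).
  • Build collaborative filtering recommendation systems similar to those of Netflix, YouTube, Amazon, Spotify, and Google.
  • Use Apache Mahout to scale machine learning algorithms.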

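Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.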
Course Outline:

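Introduction

  • Overview of Spark and Hadoop features and architecture
  • Understanding big data
  • Python programming basics

Getting Started

  • Setting up Python, Spark, and Hadoop
  • Understanding data structures in Python
  • Understanding the PySpark API
  • Understanding HDFS and MapReduce

Integrating Spark and Hadoop with Python

  • Implementing Spark RDD in Python
  • Processing data using MapReduce
  • Creating distributed datasets in HDFS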
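
As a rough illustration of the MapReduce model named in this module, the sketch below counts words in plain Python with no cluster involved. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical and chosen only for this example; they are not taken from the course materials or from any Hadoop API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, mimicking the step between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on an in-memory collection of lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines read from a file in HDFS.
    print(word_count(["spark and hadoop", "python and spark"]))
```

On a real cluster the framework distributes these phases across machines. With PySpark RDDs the same pattern is commonly expressed with `flatMap`, `map`, and `reduceByKey`, though the exact approach taught in the course may differ.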

Machine Learning with Spark MLlib

Processing Big Data with Spark Streaming

Working with Recommender Systems
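
To give a feel for the collaborative filtering idea behind this module, here is a minimal user-based sketch in plain Python. It is an assumption-laden illustration, not the course's method: the helper names (`cosine_similarity`, `recommend`), the `Ratings` type alias, and the sample ratings are all hypothetical. Spark MLlib provides its own ALS-based collaborative filtering, which works differently from this similarity-weighted average.

```python
from math import sqrt
from typing import Dict, List, Tuple

# Hypothetical data shape for this sketch: user -> {item: rating}
Ratings = Dict[str, Dict[str, float]]


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors keyed by item."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings: Ratings, user: str, top_n: int = 3) -> List[Tuple[str, float]]:
    """Predict ratings for unseen items as a similarity-weighted average."""
    scores: Dict[str, float] = {}
    weights: Dict[str, float] = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        if similarity <= 0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only score items the user has not rated yet
            scores[item] = scores.get(item, 0.0) + similarity * rating
            weights[item] = weights.get(item, 0.0) + similarity
    predicted = [(item, scores[item] / weights[item]) for item in scores]
    return sorted(predicted, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical ratings used only to exercise the functions above.
    sample: Ratings = {
        "alice": {"A": 5.0, "B": 3.0},
        "bob": {"A": 5.0, "B": 3.0, "C": 4.0},
        "carol": {"A": 1.0, "D": 5.0},
    }
    print(recommend(sample, "alice"))
```

This in-memory version compares every pair of users, so it only suits small data sets. Distributed approaches such as matrix factorization are the usual choice at the scale the course targets.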

Working with Kafka, Sqoop, and Flume

Using Apache Mahout with Spark and Hadoop

Troubleshooting

Summary and Next Steps

Sites Published:

United Arab Emirates - Python, Spark, and Hadoop for Big Data

Qatar - Python, Spark, and Hadoop for Big Data

Egypt - Python, Spark, and Hadoop for Big Data

Saudi Arabia - Python, Spark, and Hadoop for Big Data

South Africa - Python, Spark, and Hadoop for Big Data

Brasil - Python, Spark, and Hadoop for Big Data

Canada - Python, Spark, and Hadoop for Big Data

中国 - Python, Spark, and Hadoop for Big Data

香港 - Python, Spark, and Hadoop for Big Data

澳門 - Python, Spark, and Hadoop for Big Data

台灣 - Python, Spark, and Hadoop for Big Data

USA - Python, Spark, and Hadoop for Big Data

Österreich - Python, Spark, and Hadoop for Big Data

Schweiz - Python, Spark, and Hadoop for Big Data

Deutschland - Python, Spark, and Hadoop for Big Data

Czech Republic - Python, Spark, and Hadoop for Big Data

Denmark - Python, Spark, and Hadoop for Big Data

Estonia - Python, Spark, and Hadoop for Big Data

Finland - Python, Spark, and Hadoop for Big Data

Greece - Python, Spark, and Hadoop for Big Data

Magyarország - Python, Spark, and Hadoop for Big Data

Ireland - Python, Spark, and Hadoop for Big Data

Luxembourg - Python, Spark, and Hadoop for Big Data

Latvia - Python, Spark, and Hadoop for Big Data

España - Python, Spark, and Hadoop for Big Data

Italia - Python, Spark, and Hadoop for Big Data

Lithuania - Python, Spark, and Hadoop for Big Data

Nederland - Python, Spark, and Hadoop for Big Data

Norway - Python, Spark, and Hadoop for Big Data

Portugal - Python, Spark, and Hadoop for Big Data

România - Python, Spark, and Hadoop for Big Data

Sverige - Python, Spark, and Hadoop for Big Data

Türkiye - Python, Spark, and Hadoop for Big Data

Malta - Python, Spark, and Hadoop for Big Data

Belgique - Python, Spark, and Hadoop for Big Data

France - Python, Spark, and Hadoop for Big Data

日本 - Python, Spark, and Hadoop for Big Data

Australia - Python, Spark, and Hadoop for Big Data

Malaysia - Python, Spark, and Hadoop for Big Data

New Zealand - Python, Spark, and Hadoop for Big Data

Philippines - Python, Spark, and Hadoop for Big Data

Singapore - Python, Spark, and Hadoop for Big Data

Thailand - Python, Spark, and Hadoop for Big Data

Vietnam - Python, Spark, and Hadoop for Big Data

India - Python, Spark, and Hadoop for Big Data

Argentina - Python, Spark, and Hadoop for Big Data

Chile - Python, Spark, and Hadoop for Big Data

Costa Rica - Python, Spark, and Hadoop for Big Data

Ecuador - Python, Spark, and Hadoop for Big Data

Guatemala - Python, Spark, and Hadoop for Big Data

Colombia - Python, Spark, and Hadoop for Big Data

México - Python, Spark, and Hadoop for Big Data

Panama - Python, Spark, and Hadoop for Big Data

Peru - Python, Spark, and Hadoop for Big Data

Uruguay - Python, Spark, and Hadoop for Big Data

Venezuela - Python, Spark, and Hadoop for Big Data

Polska - Python, Spark, and Hadoop for Big Data

United Kingdom - Python, Spark, and Hadoop for Big Data

South Korea - Python, Spark, and Hadoop for Big Data

Pakistan - Python, Spark, and Hadoop for Big Data

Sri Lanka - Python, Spark, and Hadoop for Big Data

Bulgaria - Python, Spark, and Hadoop for Big Data

Bolivia - Python, Spark, and Hadoop for Big Data

Indonesia - Python, Spark, and Hadoop for Big Data

Kazakhstan - Python, Spark, and Hadoop for Big Data

Moldova - Python, Spark, and Hadoop for Big Data

Morocco - Python, Spark, and Hadoop for Big Data

Tunisia - Python, Spark, and Hadoop for Big Data

Kuwait - Python, Spark, and Hadoop for Big Data

Oman - Python, Spark, and Hadoop for Big Data

Slovakia - Python, Spark, and Hadoop for Big Data

Kenya - Python, Spark, and Hadoop for Big Data

Nigeria - Python, Spark, and Hadoop for Big Data

Botswana - Python, Spark, and Hadoop for Big Data

Slovenia - Python, Spark, and Hadoop for Big Data

Croatia - Python, Spark, and Hadoop for Big Data

Serbia - Python, Spark, and Hadoop for Big Data

Bhutan - Python, Spark, and Hadoop for Big Data

Nepal - Python, Spark, and Hadoop for Big Data

Uzbekistan - Python, Spark, and Hadoop for Big Data