Course Code: sparksql
Duration: 7 hours
Prerequisites:
  • Experience with SQL queries
  • Programming experience in any language

Audience

  • Data analysts
  • Data scientists
  • Data engineers

Overview:

Spark SQL is Apache Spark's module for working with structured and semi-structured data. Unlike the basic RDD API, its interfaces give Spark information about the structure of both the data and the computation being performed, and Spark uses that information to optimize query execution. Two common uses for Spark SQL are:
- to execute SQL queries.
- to read data from an existing Hive installation.

In this instructor-led, live training (onsite or remote), participants will learn how to analyze various types of data sets using Spark SQL.

By the end of this training, participants will be able to:

  • Install and configure Spark SQL.
  • Perform data analysis using Spark SQL.
  • Query data sets in different formats.
  • Visualize data and query results.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request customized training for this course, please contact us.

Course Outline:

Introduction

Overview of Data Access Approaches (Hive, databases, etc.)

Overview of Spark Features and Architecture

Installing and Configuring Spark

Understanding DataFrames in Spark

Defining Tables and Importing Datasets

Querying DataFrames using SQL

Carrying out Aggregations, JOINs and Nested Queries

Uploading and Accessing Data

Querying Different Types of Data

  • JSON, Parquet, etc.

Querying Data Lakes with SQL

Troubleshooting

Summary and Conclusion
