Course Code: gbigqorng
Duration: 20 hours
Prerequisites:

Google Account

Target Audience:

Data Analysts, BI professionals
Data Engineers & Scientists
Cloud Architects & Developers
Technical Project Managers

Overview:

Training Objective

Give participants practical BigQuery skills and a strategic understanding of big data architecture across the major cloud platforms (GCP, AWS, Azure). The course emphasizes query optimization, cost efficiency, security, and real-world integration patterns.

Course Outline:

1. Big Data in the Cloud – Architectural Overview

  • Big Data Trends & Challenges

  • Comparative Cloud Platforms: GCP vs AWS vs Azure

  • Modern Architectural Patterns:

    • The Modern Data Stack

    • ELT vs ETL

    • Lakehouse Overview


2. Why BigQuery?

  • Serverless Architecture

  • Columnar Storage

  • ANSI SQL Support

  • Federated Query Capability


3. Introduction to BigQuery

  • BigQuery Fundamentals:

    • Projects, Datasets, Tables, Schemas

    • Serverless Model: Separation of Storage and Compute

    • Storage Options: Native Tables, External Tables, Temporary Tables, Materialized Views

🧪 Hands-on Labs:

  • Exploring BigQuery UI & Public Datasets

  • Writing Basic SQL: SELECT, JOIN, WHERE, GROUP BY

  • Previewing Query Cost & Execution Plan
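
As a rough illustration of the cost-preview lab, the sketch below turns a bytes-processed figure (such as the one a dry run reports) into an estimated on-demand charge. The function name and the price per TiB are assumptions made for illustration; actual rates vary by region and change over time, and the sketch ignores free-tier allowances and per-query minimums.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def estimate_on_demand_cost(bytes_processed: int, usd_per_tib: float) -> float:
    """Return an estimated cost in USD for a query that scans bytes_processed bytes."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return bytes_processed / TIB * usd_per_tib


# Hypothetical inputs: a dry run reporting 250 GiB scanned, at an assumed rate.
ASSUMED_USD_PER_TIB = 5.0  # placeholder value; check current pricing for your region
scanned_bytes = 250 * 2 ** 30
cost = estimate_on_demand_cost(scanned_bytes, ASSUMED_USD_PER_TIB)
print(f"Estimated cost: ${cost:.2f}")
```

Keeping the rate as a parameter rather than a constant inside the function makes it easy to compare scenarios, for example the same query before and after partition pruning.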


4. Querying, Optimization & Data Modeling

Intermediate SQL & Performance Tuning

  • Nested & Repeated Fields: ARRAY, STRUCT

  • Table Partitioning & Clustering

  • Optimizing Query Costs

  • Query Execution Plans, Caching, Dry Run

🧪 Hands-on Labs:

  • Working with Nested JSON

  • Creating Partitioned & Clustered Tables

  • Inspecting Query Plans (Execution Details) and INFORMATION_SCHEMA
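
To make the nested JSON lab more concrete, here is a minimal Python sketch of what flattening a repeated field does conceptually, similar in spirit to cross joining a table with UNNEST of one of its array columns. The record layout, field names, and the `unnest` helper are hypothetical and exist only for this illustration; this is not BigQuery code.

```python
import json


def unnest(record: dict, repeated_field: str) -> list[dict]:
    """Emit one flat row per element of record[repeated_field].

    The parent's other fields are copied onto every output row.
    """
    parent = {k: v for k, v in record.items() if k != repeated_field}
    rows = []
    for element in record.get(repeated_field, []):
        row = dict(parent)
        # Prefix child keys so they cannot collide with parent keys.
        row.update({f"{repeated_field}_{k}": v for k, v in element.items()})
        rows.append(row)
    return rows


# Hypothetical order record with a repeated "items" field (an array of structs).
order_json = (
    '{"order_id": 1, "customer": "acme", '
    '"items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]}'
)
for row in unnest(json.loads(order_json), "items"):
    print(row)
```

As with a cross join against an unnested array, a record whose repeated field is empty produces no output rows in this sketch.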


5. Data Modeling Patterns in BigQuery

  • Star and Snowflake Schemas

  • Denormalization Strategies

  • Schema Design for Performance

🧪 Hands-on Labs:

  • Modeling a Retail Dataset

  • Query Optimization through Schema Tuning


6. Data Ingestion, Security, and Integration

Ingestion & Integration

  • Loading Data: CSV, JSON, Avro, Parquet

  • Streaming Inserts vs Batch Loads

  • Federated Queries: Cloud Storage, Google Sheets, Cloud SQL

  • Scheduling & Automating Queries

🧪 Hands-on Labs:

  • Loading Files via Console & CLI

  • Running Federated Queries

  • Scheduled Reporting Tasks
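
For the file-loading lab, it can help to see how one source format maps to another. BigQuery's JSON loads expect newline-delimited JSON (one object per line), so the sketch below converts CSV text into that shape using only the Python standard library. The `csv_to_ndjson` helper and the sample data are hypothetical, and every value stays a string; type conversion is left to the load schema.

```python
import csv
import io
import json


def csv_to_ndjson(csv_text: str) -> str:
    """Convert CSV text with a header row into newline-delimited JSON."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # One JSON object per line, keyed by the CSV header names.
    return "\n".join(json.dumps(row) for row in reader)


# Hypothetical sample data.
sample_csv = "id,name\n1,alpha\n2,beta\n"
print(csv_to_ndjson(sample_csv))
```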


7. Governance, Access, and Cost

  • IAM Roles & Dataset/Table-Level Access

  • Audit Logging & Query History

  • Row-Level & Column-Level Security

  • Cost Optimization & Budget Allocation