Course Code: synapsebspk
Duration: 14 hours
Prerequisites:
  • Basic understanding of SQL, cloud computing concepts, and data warehousing.
Course Outline:

Day 1: Introduction to Azure Synapse & Cloud Data Best Practices

Introduction to Azure Synapse Analytics

  • Overview of Synapse architecture: serverless SQL, dedicated SQL pools, and Apache Spark pools
  • Differences between on-premises data solutions (SAS) and cloud-based Synapse
  • Synapse Studio: Workspace management, user access, and permissions
  • Cloud-based data integration: Understanding data ingestion, storage, and querying

T-SQL Querying for Ad-Hoc Analysis

  • Revisiting T-SQL basics: Joins, filters, and advanced query techniques
  • Writing cost-efficient ad hoc queries in Synapse Analytics
  • Practical exercises: Crafting complex queries while minimizing cloud costs

Cost Management in a Cloud Environment

  • Cost-efficient cloud queries: Understanding serverless vs. dedicated pools
  • Managing Synapse Analytics workloads for optimal performance and cost
  • Monitoring and controlling cloud costs: Best practices for scaling and workload management
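
As a rough illustration of the serverless vs. dedicated trade-off listed above: serverless SQL pools are billed by the amount of data each query processes, so reading fewer bytes directly lowers the bill. The sketch below is not an official calculator. The function name, the default price per TB, the binary unit conversion, and the 10 MB per-query minimum are all assumptions to check against the current Azure pricing page for your region.

```python
import math

# Assumptions (verify against current Azure pricing for your region):
# - price is charged per TB of data processed (default value is illustrative)
# - data processed is rounded up to the next MB, with a per-query minimum
# - binary units are used (1 TB = 1024 * 1024 MB)
BYTES_PER_MB = 1024 * 1024
MB_PER_TB = 1024 * 1024
MIN_BILLED_MB = 10


def estimate_serverless_cost(bytes_processed: int, price_per_tb: float = 5.0) -> float:
    """Estimate the cost of one serverless query (hypothetical helper)."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    # Round up to whole megabytes, then apply the assumed minimum charge.
    processed_mb = math.ceil(bytes_processed / BYTES_PER_MB)
    billed_mb = max(processed_mb, MIN_BILLED_MB)
    return billed_mb / MB_PER_TB * price_per_tb


if __name__ == "__main__":
    full_scan = estimate_serverless_cost(200 * 1024**3)     # query reads 200 GB
    pruned_scan = estimate_serverless_cost(20 * 1024**3)    # query reads 20 GB
    print(f"full scan:   {full_scan:.4f}")
    print(f"pruned scan: {pruned_scan:.4f}")
```

Comparing the two estimates shows why selecting only the needed columns and partitions matters for ad hoc queries. A dedicated pool is billed for provisioned capacity rather than per query, so it follows a different cost model that this sketch does not cover.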

Security and Governance in Azure Synapse

  • Managing security and compliance: Data encryption, access control, and permission management
  • Governance strategies: Avoiding accidental data overwrites and controlling pipeline versions
  • Establishing cloud governance frameworks to reduce production risks and ensure compliance

Day 2: Advanced Synapse Features & Practical Data Integration

Data Integration and Pipelines in Synapse

  • Building Synapse Pipelines: From ingestion to transformation
  • Managing pipeline versions and branches to avoid conflicts and accidental overwrites
  • Practical exercises: Creating data pipelines and handling real-world challenges

Optimizing Big Data Processing with Apache Spark Pools

  • Introduction to Apache Spark pools: When and how to use Spark for big data processing
  • Performance tuning for large datasets in Synapse using Spark and SQL pools
  • Hands-on session: Writing and optimizing Spark queries for high-performance data processing

Advanced T-SQL and Query Optimization Techniques

  • Understanding query execution plans: Identifying and resolving bottlenecks
  • Using indexing, partitioning, and query optimization techniques in Azure Synapse
  • Hands-on lab: Optimizing T-SQL queries to reduce runtime and cost in cloud environments

Monitoring, Alerts, and Governance in Practice

  • Setting up monitoring and alert systems for Synapse workloads
  • Best practices for creating governance rules: Managing environments, permissions, and cost monitoring
  • Review and wrap-up: Implementing cost-efficient, secure, and scalable workflows in Synapse