Course Code:
synapsebspk
Duration:
14 hours
Prerequisites:
- Basic understanding of SQL, cloud computing concepts, and data warehousing.
Course Outline:
Day 1: Introduction to Azure Synapse & Cloud Data Best Practices
Introduction to Azure Synapse Analytics
- Overview of Synapse architecture: serverless SQL pools, dedicated SQL pools, and Apache Spark pools
- Differences between on-premises data solutions (SAS) and cloud-based Synapse
- Synapse Studio: Workspace management, user access, and permissions
- Cloud-based data integration: Understanding data ingestion, storage, and querying
T-SQL Querying for Ad Hoc Analysis
- Revisiting T-SQL basics: Joins, filters, and advanced query techniques
- Writing cost-efficient ad hoc queries in Synapse Analytics
- Practical exercises: Crafting complex queries while minimizing cloud costs
Cost Management in a Cloud Environment
- Cost-efficient cloud queries: Understanding serverless vs. dedicated SQL pools
- Managing Synapse Analytics workloads for optimal performance and cost
- Monitoring and controlling cloud costs: Best practices for scaling and workload management
Security and Governance in Azure Synapse
- Managing security and compliance: Data encryption, access control, and permission management
- Governance strategies: Avoiding accidental data overwrites and controlling pipeline versions
- Establishing cloud governance frameworks to reduce production risks and ensure compliance
Day 2: Advanced Synapse Features & Practical Data Integration
Data Integration and Pipelines in Synapse
- Building Synapse Pipelines: From ingestion to transformation
- Managing pipeline versions and branches to avoid conflicts and accidental overwrites
- Practical exercises: Creating data pipelines and handling real-world challenges
Optimizing Big Data Processing with Apache Spark Pools
- Introduction to Apache Spark pools: When and how to use Spark for big data processing
- Performance tuning for large datasets in Synapse using Spark and SQL pools
- Hands-on session: Writing and optimizing Spark queries for high-performance data processing
Advanced T-SQL and Query Optimization Techniques
- Understanding query execution plans: Identifying and resolving bottlenecks
- Using indexing, partitioning, and query optimization techniques in Azure Synapse
- Hands-on lab: Optimizing T-SQL queries to reduce runtime and cost in cloud environments
Monitoring, Alerts, and Governance in Practice
- Setting up monitoring and alert systems for Synapse workloads
- Best practices for creating governance rules: Managing environments, permissions, and cost monitoring
- Review and wrap-up: Implementing cost-efficient, secure, and scalable workflows in Synapse