Course Code:
kafkabspk
Duration:
21 hours
Prerequisites:
- Basic system administration skills and an understanding of at least one object-oriented programming language.
Overview:
This course is for enterprise architects, developers, system administrators, and anyone else who wants to understand and use Kafka, an open-source, high-throughput distributed messaging and streaming platform for handling real-time data feeds.
Course Outline:
[DAY 1]
Kafka Overview
- Publish/Subscribe Messaging
- Messages and Batches, Schemas, Topics and Partitions, Producers and Consumers, Brokers and Clusters, Multiple Clusters
- Use Cases
- Kafka's Origin and Comparison with Alternatives
Installing Kafka
- Environment Setup, Installing a Kafka Broker, Broker Configuration, Topic Defaults, Hardware Selection, Kafka in the Cloud, Production Concerns
Kafka Producers
- Constructing a Kafka Producer
- Sending a Message to Kafka
- Configuring Producers
- Serializers
- Partitions
- Headers
- Interceptors
- Quotas and Throttling
Kafka Consumers
- Consumers, Consumer Groups and Partition Rebalance
- Creating a Kafka Consumer
- Subscribing to Topics
- The Poll Loop
- Configuring Consumers
- Commits and Offsets
- Rebalance Listeners
- Consuming Records with Specific Offsets
- Deserializers
[DAY 2]
Managing Kafka Programmatically
- AdminClient Overview
- AdminClient Lifecycle: Creating, Configuring and Closing
- Configuration Management
- Consumer Group Management
- Cluster Metadata
- Testing
Kafka Internals
- Cluster Membership
- The Controller
- KRaft
- Replication
- Request Processing
- Physical Storage
- Compaction
Reliable Data Delivery
- Reliability Guarantees
- Broker Configuration
- Using Producers in a Reliable System
- Using Consumers in a Reliable System
- Validating System Reliability
Exactly Once Semantics
- Idempotent Producer
- Transactions
Building Data Pipelines
- Considerations
- Kafka Connect Versus Producer and Consumer
- Kafka Connect
- Alternatives to Kafka Connect
[DAY 3]
Cross-Cluster Data Mirroring
- Use Cases
- Hub-and-Spoke Architecture
- Active-Active Architecture
- Active-Standby Architecture
- Apache Kafka’s MirrorMaker
Securing Kafka
- Locking Down Kafka
- Security Protocols
- Authentication, SASL, Re-authentication
- Security Updates Without Downtime
- Authorization
- Securing ZooKeeper
- Securing the Platform
Administering Kafka
- Topic Operations
- Consumer Groups
- Dynamic Configuration Changes
- Producing and Consuming
- Partition Management
Monitoring Kafka
- Metric Basics
- Service Level Objectives
- Kafka Broker Metrics
- Client Monitoring
- Lag Monitoring
Set Up and Run Kafka on Kubernetes
Summary and Conclusion