Course Code: kafkabspk
Duration: 21 hours
Prerequisites:
  • Basic system administration skills and an understanding of any object-oriented programming language.
Overview:

This course is for enterprise architects, developers, system administrators, and anyone else who wants to understand and use Kafka, an open source, high-throughput distributed messaging system and streaming platform for handling real-time data feeds.

Course Outline:

[DAY 1]

Kafka Overview

  • Publish/Subscribe Messaging
  • Messages and Batches, Schemas, Topics and Partitions, Producers and Consumers, Brokers and Clusters, Multiple Clusters
  • Use Cases
  • Kafka's Origin and Comparison with Alternatives

Installing Kafka

  • Environment Setup, Installing a Kafka Broker, Broker Configuration, Topic Defaults, Hardware Selection, Kafka in the Cloud, Production Concerns

Kafka Producers

  • Constructing a Kafka Producer
  • Sending a Message to Kafka
  • Configuring Producers
  • Serializers
  • Partitions
  • Headers
  • Interceptors
  • Quotas and Throttling

Kafka Consumers

  • Consumers, Consumer Groups and Partition Rebalance
  • Creating a Kafka Consumer
  • Subscribing to Topics
  • The Poll Loop
  • Configuring Consumers
  • Commits and Offsets
  • Rebalance Listeners
  • Consuming Records with Specific Offsets
  • Deserializers

[DAY 2]

Managing Kafka Programmatically

  • AdminClient Overview
  • AdminClient Lifecycle: Creating, Configuring and Closing
  • Configuration Management
  • Consumer Group Management
  • Cluster Metadata
  • Testing

Kafka Internals

  • Cluster Membership
  • The Controller
  • KRaft
  • Replication
  • Request Processing
  • Physical Storage
  • Compaction

Reliable Data Delivery

  • Reliability Guarantees
  • Broker Configuration
  • Using Producers in a Reliable System
  • Using Consumers in a Reliable System
  • Validating System Reliability

Exactly Once Semantics

  • Idempotent Producer
  • Transactions

Building Data Pipelines

  • Considerations
  • Kafka Connect Versus Producer and Consumer
  • Kafka Connect
  • Alternatives to Kafka Connect

[DAY 3]

Cross-Cluster Data Mirroring

  • Use Cases
  • Hub-and-Spoke Architecture
  • Active-Active Architecture
  • Active-Standby Architecture
  • Apache Kafka’s MirrorMaker

Securing Kafka

  • Locking Down Kafka
  • Security Protocols
  • Authentication, SASL, Re-authentication
  • Security Updates Without Downtime
  • Authorization
  • Securing ZooKeeper
  • Securing the Platform

Administering Kafka

  • Topic Operations
  • Consumer Groups
  • Dynamic Configuration Changes
  • Producing and Consuming
  • Partition Management

Monitoring Kafka

  • Metric Basics
  • Service Level Objectives
  • Kafka Broker Metrics
  • Client Monitoring
  • Lag Monitoring

Setting Up and Running Kafka on Kubernetes

Summary and Conclusion