Kafka

Apache Kafka is a distributed log-based stream processing platform that combines Pubsub semantics with durable ordered storage. Producers append records to named topics, while consumers read those partitions at their own pace, enabling loose coupling between services.

Kafka is one style of Messaging, but it emphasizes durable append-only logs rather than short-lived queues.

Core pieces

Brokers store topic partitions and replicate logs across the cluster for durability.
Topics + partitions provide scalability by sharding ordered logs; consumers track offsets per partition.
Consumer groups balance load by ensuring only one consumer in a group processes a partition at a time.
Schema registry / serializers (e.g. Avro, Protobuf) keep producers and consumers compatible.

Typical uses

Event streaming between microservices, change data capture pipelines, and ETL into data lakes/warehouses.
Buffering bursty workloads so downstream systems ingest at stable rates.
Real-time analytics when paired with Kafka Streams, ksqlDB, or external processors (Flink, Spark, Beam).
Replayable event processing when consumers need to read from a durable history.

Ecosystem

Kafka Connect manages scalable source/sink connectors without bespoke consumers or producers.
Kafka Streams embeds stateful stream processing inside applications without a separate cluster.
ksqlDB provides a SQL-like interface for defining persistent queries over topics.

Operational notes

Requires ZooKeeper (Kafka < 3.3) or the built-in KRaft controller quorum (Kafka 3.3+) for metadata.
Plan for disk-heavy workloads; retention policies are per-topic (time and/or size), and compaction keeps the latest value per key.
Monitor broker health via lag metrics, ISR size, and controller elections; automate rebalancing when adding brokers.
Consumer failures are often handled with retries, backoff, and sometimes a Dead Letter Queue pattern outside the main topic flow.
Compare with Kafka vs RabbitMQ when deciding between queue-oriented work distribution and log-oriented streaming.

Notes

Explorer

Kafka

Core pieces

Typical uses

Ecosystem

Operational notes

Graph View

Table of Contents

Backlinks