What are special topics in Kafka?
Kafka has several "special topics" that are automatically created and used internally by Kafka itself. These are not regular user-defined topics; they exist for internal coordination and metadata storage.
🛠️ 1. __consumer_offsets
Purpose: Stores committed offsets for Kafka consumers (used in consumer groups).
Type: Compacted topic (only the latest offset per group-topic-partition key is retained).
Key format: <group.id>-<topic>-<partition>
Used by: Kafka clients, for both automatic commits (enable.auto.commit=true) and manual commits.
Partitions: Default 50 (configurable with offsets.topic.num.partitions).
📌 Crucial for consumer rebalancing, offset recovery, and lag monitoring.
🧠 2. __transaction_state
Purpose: Stores the state of transactions for Kafka producers (in EOS mode).
Type: Compacted topic.
Used by: The Kafka transaction coordinator to track transactional IDs, their states, and pending offset commits.
📌 Required for exactly-once semantics (EOS) and transactional producers.
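As a concrete illustration, here is a minimal transactional-producer sketch (the topic name, transactional.id, and broker address are placeholders); every beginTransaction/commitTransaction cycle below is tracked by the transaction coordinator via __transaction_state:
```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TransactionalProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // The transactional.id keys this producer's state in __transaction_state
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-producer-1"); // hypothetical id

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions(); // registers with the transaction coordinator
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("orders", "key", "value")); // hypothetical topic
            producer.commitTransaction(); // commit marker recorded via __transaction_state
        }
    }
}
```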
🔄 3. __cluster_metadata (KRaft mode only)
Purpose: In KRaft mode (Kafka without ZooKeeper), this topic stores cluster metadata (e.g., brokers, topics, configs).
Used by: The KRaft metadata quorum (as its Raft log).
Appears only if: You run Kafka in KRaft (ZooKeeper-less) mode.
🔁 4. *-changelog topics (Kafka Streams apps)
Purpose: Store state store changelogs for Kafka Streams applications.
Format: One per state store, auto-created with a name like:
<application-id>-<store-name>-changelog
Type: Compacted topic.
📌 Used to restore stateful operators during app restarts.
🔄 5. __consumer_timestamps (Confluent Replicator only)
Purpose: Tracks the latest consumed offsets and timestamps of active consumer groups on the source cluster.
Useful for: Consumer offset translation during cross-cluster failover. Note this topic comes from Confluent Replicator, not Apache Kafka itself.
💬 6. Kafka Connect Internal Topics
Kafka Connect (in distributed mode) creates several internal topics:
connect-configs: Stores connector and task configurations
connect-offsets: Tracks source offsets for connectors
connect-status: Tracks the status of connectors and tasks
📌 All are compacted topics.
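The names above are just the conventional defaults; each is set in the distributed worker config. A minimal sketch of the relevant worker properties (values are illustrative):
```properties
# Kafka Connect distributed worker: internal topic settings
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
# replication factor for these internal topics (3 is typical in production)
config.storage.replication.factor=3
offset.storage.replication.factor=3
status.storage.replication.factor=3
```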
🧾 7. Schema Registry Topic
If you're using Confluent Schema Registry, it uses:
_schemas — stores all registered schemas (Avro, Protobuf, JSON Schema) and their versions
Type: Compacted topic.
🧼 8. MirrorMaker 2.0 Internal Topics
If you're replicating Kafka clusters using MirrorMaker 2.0, it creates internal topics such as:
mm2-configs.<cluster-alias>.internal
mm2-offsets.<cluster-alias>.internal
mm2-status.<cluster-alias>.internal
heartbeats and <cluster-alias>.checkpoints.internal
These are internal system topics for config tracking, offset sync/translation, and heartbeat validation.
🔐 Accessing Special Topics
By default:
The Admin API hides internal topics when listing (you must opt in to see them).
kafka-topics.sh --list does include them; pass --exclude-internal to hide them:
kafka-topics.sh --bootstrap-server <host> --list
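Programmatically, a minimal AdminClient sketch (the broker address is a placeholder) that opts in to internal topics when listing:
```java
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListTopicsOptions;

public class ListInternalTopics {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            // listInternal(true) includes topics like __consumer_offsets
            admin.listTopics(new ListTopicsOptions().listInternal(true))
                 .names().get()
                 .stream()
                 .filter(name -> name.startsWith("__"))
                 .forEach(System.out::println);
        }
    }
}
```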
🔒 Should You Write to These Topics?
❌ No – you should never directly produce to or consume from internal topics like __consumer_offsets or __transaction_state.
✅ Use Kafka APIs (producers/consumers/streams) which interact with these topics safely.
Let’s dive into how Kafka uses the __consumer_offsets topic internally, and then we’ll look at Kafka Streams changelog topics.
🔄 Part 1: __consumer_offsets – How Kafka Tracks Consumer Group Progress
✅ Purpose:
Kafka uses the __consumer_offsets topic to store committed offsets for consumer groups. It ensures that when a consumer restarts or a rebalance occurs, it can resume from the last committed offset.
🧱 Topic Structure:
Name: __consumer_offsets
Default partitions: 50 (configurable via offsets.topic.num.partitions)
Cleanup policy: compact (retains only the latest offset per key)
Replication factor: Default 3 (offsets.topic.replication.factor)
Key format: conceptually the tuple (group, topic, partition), e.g.:
OffsetCommitKey {
  group: "my-consumer-group",
  topic: "my-topic",
  partition: 0
}
🔧 How It Works (Internally):
1. Offset Commit
When a consumer commits offsets (either automatically or manually), Kafka:
Writes a message to __consumer_offsets with:
Key: Consumer group + topic + partition
Value: Committed offset and metadata (like timestamp)
Auto-commits and commitAsync() are batched and asynchronous for efficiency; commitSync() blocks until the broker acknowledges the write.
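For illustration, a minimal manual-commit sketch (topic name, group id, and broker address are placeholders); each commitSync() call results in a record appended to __consumer_offsets:
```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualCommitSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-consumer-group");       // placeholder
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");         // commit manually
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic")); // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> r : records) {
                    // process(r) ...
                }
                // appends (group, topic, partition) -> offset records to __consumer_offsets
                consumer.commitSync();
            }
        }
    }
}
```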
2. Offset Fetch
On consumer start or rebalance:
Kafka fetches the latest committed offsets from __consumer_offsets.
Because it’s a compacted topic, only the latest offset per key is read.
3. Consumer Group Coordinator
A broker is selected as the Group Coordinator.
It handles:
Heartbeats from consumers
Offset commits and fetches
Rebalancing logic
Kafka assigns each group to a coordinator by hashing the group ID onto one of the __consumer_offsets partitions; the leader of that partition acts as the group’s coordinator, as sketched below.
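Conceptually (mirroring the broker-side logic, with 50 standing in for offsets.topic.num.partitions):
```java
public class CoordinatorPartition {
    public static void main(String[] args) {
        int offsetsTopicPartitions = 50; // default offsets.topic.num.partitions
        String groupId = "my-consumer-group";
        // positive hash of the group id, modulo the partition count of __consumer_offsets;
        // the broker leading that partition serves as this group's coordinator
        int partition = (groupId.hashCode() & 0x7fffffff) % offsetsTopicPartitions;
        System.out.println("group lives on __consumer_offsets-" + partition);
    }
}
```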
4. Rebalancing
When consumers join/leave, the group coordinator triggers a rebalance.
New assignments are communicated, and the committed offsets are fetched from __consumer_offsets.
👁️ Monitoring Offsets
Consumer Lag = latest offset in the partition − committed offset in __consumer_offsets
Tools:
kafka-consumer-groups.sh --bootstrap-server <host> --describe --group <group>
Prometheus + JMX metrics, e.g. kafka.consumer:type=consumer-fetch-manager-metrics (records-lag-max)
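The same lag can also be computed programmatically; a minimal Admin API sketch (group id and broker address are placeholders):
```java
import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class LagSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            // 1. committed offsets, read on our behalf from __consumer_offsets
            Map<TopicPartition, OffsetAndMetadata> committed =
                admin.listConsumerGroupOffsets("my-consumer-group") // placeholder group
                     .partitionsToOffsetAndMetadata().get();
            // 2. log-end (latest) offsets for the same partitions
            Map<TopicPartition, OffsetSpec> latest = committed.keySet().stream()
                .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest()));
            var ends = admin.listOffsets(latest).all().get();
            // 3. lag = log-end offset - committed offset
            committed.forEach((tp, om) ->
                System.out.println(tp + " lag=" + (ends.get(tp).offset() - om.offset())));
        }
    }
}
```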
⚙️ Part 2: Kafka Streams Changelog Topics
✅ Purpose:
Kafka Streams uses state stores for operations like counting, joining, windowing, etc.
These state stores are backed up to Kafka via changelog topics, so that the processing state can be recovered after a crash or migration.
🧱 Changelog Topic Structure:
Name format: <application-id>-<store-name>-changelog
Example: pageviews-app-clicks-store-changelog
Partitions: Same as the corresponding task’s input topic
Cleanup policy: compact
Key-value: Mirrors the state store updates
🔧 How It Works:
1. State Store Writes
During processing (e.g., a count operation), Kafka Streams updates a local RocksDB store.
These updates are also written to a changelog topic in Kafka.
2. Recovery
When a Streams app instance crashes or migrates, Kafka Streams:
Reconstructs the state store by replaying the changelog topic.
Enables fast failover and recovery with full processing consistency.
3. Exactly-Once Support
In EOS mode (processing.guarantee=exactly_once_v2):
Changelog updates and output records are committed as part of the same Kafka transaction.
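Enabling this is a one-line config change; a minimal sketch (application id and broker address are placeholders):
```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class EosConfigSketch {
    static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");            // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        // exactly_once_v2: changelog writes, offset commits, and outputs share one transaction
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
        return props;
    }
}
```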
🛠 Example:
Suppose you’re counting clicks per user:
```java
// assumes a KStream<String, String> clicksStream keyed by user id
KTable<String, Long> userClicks = clicksStream
    .groupByKey()
    .count(Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("user-clicks-store"));
```
Kafka Streams:
Stores counts in user-clicks-store (a local RocksDB store)
Persists changes to the my-app-user-clicks-store-changelog Kafka topic (assuming application.id=my-app)
📌 Why Changelog Topics Matter
Make stateful stream processing fault-tolerant
Allow scaling out Kafka Streams apps (new instances can rebuild state)
Support hot standby replicas for quick failover
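Hot standbys are opt-in; a minimal sketch of the relevant setting (the value 1 is illustrative):
```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class StandbyConfigSketch {
    static Properties standbyProps() {
        Properties props = new Properties();
        // keep one warm replica of each state store on another instance,
        // maintained by continuously consuming the changelog topics
        props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
        return props;
    }
}
```
📋 Quick Comparison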
| Feature | `__consumer_offsets` | Kafka Streams Changelog Topics |
| ------------ | ---------------------------------------- | -------------------------------------------------- |
| Stores | Committed offsets of consumer groups | Updates to state stores (like counts, windows) |
| Used by | All Kafka consumers | Kafka Streams apps |
| Topic Type | Compacted | Compacted |
| Partitions | Default 50 (configurable) | Depends on state store partitioning |
| Critical for | Offset tracking, rebalance, consumer lag | Stateful recovery, EOS processing, fault tolerance |