# Kafka Architecture in KubeBlocks

KubeBlocks supports three Kafka deployment topologies:

| Topology | Node layout | Coordination | Use case |
|---|---|---|---|
| `combined` / `combined_monitor` | Single Component; every pod runs broker + controller roles | KRaft (no ZooKeeper) | Smaller clusters; simplified operations; fewer pods to manage |
| `separated` / `separated_monitor` | Controller and broker Components are independent; scale each separately | KRaft (no ZooKeeper) | Production workloads; large clusters; independent controller and broker scaling |
| `kafka2-external-zk` | Broker-only Component; coordination delegated to external ZooKeeper | ZooKeeper (external) | Legacy Kafka 2.7 deployments that already have a ZooKeeper ensemble |

The *_monitor variants add a standalone kafka-exporter Component that scrapes Kafka-specific metrics (consumer group lag, partition offsets, topic throughput) and exposes them on port 9308 for Prometheus.

> **NOTE**
>
> Configuration templates and configs: the Kafka ComponentDefinition treats main config slots (for example `kafka-configuration-tpl`) as externally managed in current addon charts. When you create a Cluster, you must wire those slots by setting `configs` on the matching component (or sharding template) to ConfigMaps whose keys match the template file names — typically the ConfigMaps shipped with the addon in `kb-system`, or your own copies in the application namespace. If provisioning fails with a message about missing templates, compare your manifest to the Kafka examples in kubeblocks-addons for the same chart version.
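As a sketch only, wiring a config slot to a ConfigMap on a component could look like the fragment below. The exact `configs` schema and slot names vary by addon chart version; treat the field layout here as an assumption and verify it against the Kafka examples in kubeblocks-addons:

```yaml
# Sketch: wiring an externally managed config slot (field names assumed).
spec:
  componentSpecs:
    - name: kafka-combine
      configs:
        - name: kafka-configuration-tpl     # config slot declared by the ComponentDefinition
          configMap:
            name: kafka-configuration-tpl   # ConfigMap whose keys match the template file names
```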


## Combined Architecture (combined / combined_monitor)

In the combined topology every pod simultaneously acts as both a broker (stores and serves partition data) and a controller (participates in the KRaft metadata quorum). There is a single KubeBlocks Component (kafka-combine) for all combined nodes, and the entire set of pods forms both the KRaft controller quorum and the broker cluster.

*[Diagram: combined topology. Producers and consumers bootstrap via the per-pod ClusterIP services `kafka-cluster-kafka-combine-advertised-listener-{n}:9092` (`podService: true`), fetch metadata, and connect directly to partition leaders; per-pod direct access uses `kafka-{n}.kafka-cluster-kafka-combine-headless:9092`. Each combined pod `kafka-{n}` (broker + controller) runs the `kafka` container (:9092 client, :9093 KRaft controller quorum, :9094 internal replication) and a `jmx-exporter` sidecar (:5556 metrics), with its own PVC for the log dir. The KRaft quorum on :9093 is the same `kafka` container on each node, not a separate metadata deployment; Raft consensus manages cluster metadata with one active controller at a time. A headless service provides per-pod DNS for the advertised listener addresses.]*

### Resource Hierarchy

```
Cluster  →  Component (kafka-combine)   →  InstanceSet  →  Pod × N
         →  Component (kafka-exporter)  →  InstanceSet  →  Pod × 1   [combined_monitor only]
```

| Resource | Role |
|---|---|
| Cluster | User-facing declaration — specifies topology, combined node count, storage, and resources |
| Component (`kafka-combine`) | Generated automatically; references the `cmpd-kafka-combine` ComponentDefinition; all pods are identical (each runs broker + controller) |
| Component (`kafka-exporter`) | Optional; present in `combined_monitor` only; references `cmpd-kafka-exporter`; scrapes Kafka cluster metrics and exposes them on port 9308 |
| InstanceSet | KubeBlocks custom workload (replaces StatefulSet); manages pods with stable identities |
| Pod | Actual running combined node; each pod gets a unique ordinal and its own PVC |

KubeBlocks provisions kafka-combine first; kafka-exporter (if present) starts after.
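A minimal Cluster manifest for the combined topology might look like the following sketch. The apiVersion, field layout, cluster name, namespace, and sizing values are illustrative (and the `configs` wiring from the note above is omitted); compare against the Kafka examples in kubeblocks-addons for your chart version:

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: kafka-cluster          # illustrative name
  namespace: demo
spec:
  clusterDef: kafka            # the Kafka addon's ClusterDefinition
  topology: combined_monitor   # or: combined
  componentSpecs:
    - name: kafka-combine      # every pod runs broker + controller (KRaft)
      replicas: 3              # odd count; quorum tolerates 1 failure
      resources:
        limits:
          cpu: "1"
          memory: 2Gi
      volumeClaimTemplates:
        - name: data           # mounted at /bitnami/kafka
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 20Gi
```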

### Containers Inside Each Pod

Each combined pod runs two containers. A `kafkatool` init container runs before the main containers start, copying the `/sasl` directory to `/shared-tools/sasl` for SASL authentication support:

| Container | Port | Purpose |
|---|---|---|
| `kafka` | 9092 (client), 9093 (controller quorum), 9094 (inter-broker) | Kafka node running in combined `broker,controller` mode (`KAFKA_CFG_PROCESS_ROLES=broker,controller`); serves producer/consumer traffic on 9092; participates in KRaft consensus on 9093; replicates partition data between brokers on 9094 |
| `jmx-exporter` | 5556 | JMX-based Prometheus metrics exporter; scrapes the local JVM's JMX registry and exposes all Kafka JMX metrics |

Each pod mounts a single PVC at /bitnami/kafka. Kafka stores the partition log data under /bitnami/kafka/data and the KRaft metadata log under /bitnami/kafka/metadata, both on the same PVC.

### KRaft Consensus

Since Kafka 3.3 (KIP-833), the KRaft metadata quorum fully replaces ZooKeeper. In the combined topology every pod is a controller-eligible node, so the whole pod set forms the controller quorum:

| KRaft concept | Description |
|---|---|
| Controller quorum | All combined pods participate in Raft consensus on port 9093 to manage cluster metadata — topic configurations, partition assignments, and ISR lists |
| Active controller | The Raft leader among controllers; all metadata writes go through it; elected automatically via Raft |
| Metadata log | An internal Kafka topic (`__cluster_metadata`) replicated across all controller-eligible pods; brokers tail this log to stay current |
| Quorum tolerance | 3 pods tolerate 1 failure; 5 pods tolerate 2 failures |

A combined node that fails loses both its broker and controller roles simultaneously. The remaining quorum elects a new active controller; partition leaders for the affected partitions are re-elected from the ISR.
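The tolerance figures above follow directly from the majority rule: a Raft quorum of n voters survives floor((n - 1) / 2) failures. A quick illustration (not KubeBlocks code):

```python
def quorum_tolerance(controllers: int) -> int:
    """Failures a Raft quorum of `controllers` voters can survive
    while still retaining a majority: floor((n - 1) / 2)."""
    if controllers < 1:
        raise ValueError("need at least one controller")
    return (controllers - 1) // 2

for n in (1, 3, 5, 7):
    print(f"{n} controller(s) tolerate {quorum_tolerance(n)} failure(s)")
```

This is also why even quorum sizes buy nothing: 4 voters tolerate the same single failure as 3, so odd counts are the norm.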

### Traffic Routing

Kafka clients do not use a single ClusterIP service for all brokers. Instead, KubeBlocks creates one per-pod ClusterIP service (via `podService: true`) for each combined pod so that every broker can advertise a unique, stable address:

| Service | Type | Port | Notes |
|---|---|---|---|
| `{cluster}-kafka-combine-advertised-listener-{n}` | ClusterIP | 9092 | One service per pod; clients use all per-pod addresses as the bootstrap seed list |
| `{cluster}-kafka-combine-headless` | Headless | — | All pods; always created; used for the internal cluster bus (ports 9093 and 9094) and operator access |

Clients bootstrap by connecting to the seed list of per-pod ClusterIP addresses on port 9092. After bootstrap, the client fetches the full cluster metadata and connects directly to the partition leader for each partition using the advertised per-pod address.

```
# Bootstrap seed list (all per-pod ClusterIP addresses)
{cluster}-kafka-combine-advertised-listener-0.{namespace}.svc.cluster.local:9092,
{cluster}-kafka-combine-advertised-listener-1.{namespace}.svc.cluster.local:9092,
{cluster}-kafka-combine-advertised-listener-2.{namespace}.svc.cluster.local:9092
```
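Because the advertised-listener services follow a fixed naming pattern, the seed list can be generated instead of hand-written. A small helper (illustrative; the naming pattern is taken from the table above):

```python
def bootstrap_seed_list(cluster: str, component: str, namespace: str,
                        replicas: int, port: int = 9092) -> str:
    """Build the comma-separated bootstrap list from the per-pod
    advertised-listener ClusterIP service DNS names."""
    return ",".join(
        f"{cluster}-{component}-advertised-listener-{n}"
        f".{namespace}.svc.cluster.local:{port}"
        for n in range(replicas)
    )

# Three combined pods in namespace "demo" (illustrative values)
print(bootstrap_seed_list("kafka-cluster", "kafka-combine", "demo", 3))
```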

### Combined ± Monitor

The combined topology provisions only the kafka-combine Component. The combined_monitor topology additionally provisions the kafka-exporter Component, which connects to the Kafka cluster and exposes consumer group lag, topic/partition offsets, and throughput metrics on port 9308. If you use combined_monitor, configure your Prometheus scrape target to point at the kafka-exporter pod on port 9308.
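A minimal static Prometheus scrape job for the exporter could look like the sketch below. The target service name and namespace are illustrative assumptions; in-cluster you would more likely use `kubernetes_sd_configs` or a ServiceMonitor to discover the exporter pod:

```yaml
# Sketch: static scrape target for the kafka-exporter (names assumed).
scrape_configs:
  - job_name: kafka-exporter
    static_configs:
      - targets:
          - kafka-cluster-kafka-exporter.demo.svc:9308
```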


## Separated Architecture (separated / separated_monitor)

In the separated topology the controller and broker roles are placed in independent KubeBlocks Components, each backed by its own InstanceSet. This allows you to scale and update controllers and brokers independently and provides clear operational isolation between the metadata plane (controllers) and the data plane (brokers).

*[Diagram: separated topology. Clients bootstrap via per-broker ClusterIP services `kafka-cluster-kafka-broker-advertised-listener-{n}:9092` (`podService: true`), plus a headless service for direct pod DNS (`kafka-{n}.kafka-cluster-kafka-broker-headless:9092`). Controller pods run the `kafka` container as a KRaft controller (:9093 quorum only) and a `jmx-exporter` sidecar (:5556 metrics, JMX on :5555), with a PVC for metadata at `/bitnami/kafka`. Broker pods run the `kafka` container as a broker (:9092 client, :9094 internal) and a `jmx-exporter` sidecar (:5556), with a PVC for data at `/bitnami/kafka/data`. KRaft metadata replication runs across the controller quorum; topic partitions replicate across brokers (ISR). The headless services give stable pod DNS for internal use (replication, HA heartbeat, operator probes) and are not client endpoints.]*

### Resource Hierarchy

```
Cluster  →  Component (kafka-controller)  →  InstanceSet  →  Pod × N
         →  Component (kafka-broker)      →  InstanceSet  →  Pod × N
         →  Component (kafka-exporter)    →  InstanceSet  →  Pod × 1  [separated_monitor only]
```

| Resource | Role |
|---|---|
| Cluster | User-facing declaration — specifies topology, controller count, broker count, storage per component type, and resources |
| Component (`kafka-controller`) | References `cmpd-kafka-controller`; controller-eligible pods only; forms the KRaft quorum; no client traffic |
| Component (`kafka-broker`) | References `cmpd-kafka-broker`; broker-only pods; serves all producer/consumer traffic; fetches metadata from controllers |
| Component (`kafka-exporter`) | Optional; present in `separated_monitor` only; scrapes Kafka metrics and exposes them on port 9308 |
| InstanceSet | KubeBlocks custom workload; manages pods within each Component with stable identities |
| Pod | Actual running process; each pod gets a unique ordinal and its own PVC |

KubeBlocks provisions the components in order: kafka-controller first (the quorum must elect an active controller before brokers can register), then kafka-broker, then kafka-exporter (if present). On termination, the order reverses.
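Assuming the component names from the hierarchy above, a Cluster manifest for the separated topology might look like this sketch. The apiVersion, field layout, volume names, and sizing are illustrative; compare against the Kafka examples in kubeblocks-addons:

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: kafka-cluster
  namespace: demo
spec:
  clusterDef: kafka
  topology: separated            # or: separated_monitor
  componentSpecs:
    - name: kafka-controller     # KRaft quorum; no client traffic
      replicas: 3                # tolerates 1 controller failure
      volumeClaimTemplates:
        - name: metadata         # KRaft metadata log
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 5Gi
    - name: kafka-broker         # data plane; scale with throughput
      replicas: 5
      volumeClaimTemplates:
        - name: data             # partition log storage
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 100Gi
```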

### Containers Inside Each Pod

Controller pods (no init containers):

| Container | Port | Purpose |
|---|---|---|
| `kafka` | 9093 (controller quorum) | Kafka node running as controller only (`KAFKA_CFG_PROCESS_ROLES=controller`); participates in KRaft Raft consensus; manages cluster metadata; does not serve client traffic |
| `jmx-exporter` | 5556 | JMX-based Prometheus metrics exporter |

Each controller pod mounts a PVC at `/bitnami/kafka` for the KRaft metadata log (`/bitnami/kafka/metadata`).

Broker pods (init container: `kafkatool` copies `/sasl` to `/shared-tools/sasl` before startup):

| Container | Port | Purpose |
|---|---|---|
| `kafka` | 9092 (client), 9094 (inter-broker) | Kafka node running as broker only (`KAFKA_CFG_PROCESS_ROLES=broker`); serves producer/consumer requests; replicates partition data to other brokers on port 9094; fetches metadata from the active controller |
| `jmx-exporter` | 5556 | JMX-based Prometheus metrics exporter |

Each broker pod mounts a PVC at `/bitnami/kafka/data` for partition log storage.

Exporter pod (present in `separated_monitor` only; no init containers):

| Container | Port | Purpose |
|---|---|---|
| `kafka-exporter` | 9308 | Standalone Kafka metrics exporter — connects to the broker cluster and exposes consumer group lag, topic/partition offsets, and throughput in Prometheus format |

### KRaft Consensus

The controller Component forms a dedicated KRaft quorum. Brokers do not participate in the quorum — they only consume the `__cluster_metadata` log and register themselves with the active controller:

| KRaft concept | Description |
|---|---|
| Controller quorum | All controller pods run Raft consensus on port 9093; they replicate the `__cluster_metadata` log |
| Active controller | The current Raft leader; brokers send all metadata updates (topic creation, ISR changes) through it |
| Broker registration | Each broker pod fetches the controller quorum address on startup and registers itself with the active controller |
| Quorum tolerance | 3 controller pods tolerate 1 failure; 5 tolerate 2. Broker count has no effect on quorum tolerance |

> **NOTE**
>
> The controller quorum size and the broker count are independently configurable in the separated topology. A common production pattern is 3 controller pods + N broker pods, where N scales with data throughput requirements.

### Traffic Routing

Controllers have no client-facing service. Only broker pods get per-pod ClusterIP services:

| Service | Type | Port | Notes |
|---|---|---|---|
| `{cluster}-kafka-broker-advertised-listener-{n}` | ClusterIP | 9092 | One per broker pod (`podService: true`); use all as bootstrap seed list |
| `{cluster}-kafka-broker-headless` | Headless | — | All broker pods; internal use (inter-broker replication on port 9094, operator access) |
| `{cluster}-kafka-controller-headless` | Headless | — | All controller pods; used by brokers to reach the controller quorum on port 9093 |

Clients connect to broker pods only. The controller headless service is for internal Kafka use:

```
# Bootstrap seed list (all broker per-pod ClusterIP addresses)
{cluster}-kafka-broker-advertised-listener-0.{namespace}.svc.cluster.local:9092,
{cluster}-kafka-broker-advertised-listener-1.{namespace}.svc.cluster.local:9092,
...
```
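The seed list plugs straight into a client's bootstrap configuration. A hedged sketch using librdkafka-style property names (as consumed by e.g. confluent-kafka); the cluster and namespace values are illustrative and no connection is made here:

```python
# Kafka client properties in librdkafka style. The SASL settings assume
# the `client` system account described later on this page; cluster and
# namespace names are illustrative placeholders.
conf = {
    "bootstrap.servers": ",".join(
        f"kafka-cluster-kafka-broker-advertised-listener-{n}"
        f".demo.svc.cluster.local:9092"
        for n in range(3)
    ),
    "security.protocol": "SASL_PLAINTEXT",
    "sasl.mechanism": "SCRAM-SHA-256",
    "sasl.username": "client",        # from the account Secret
    "sasl.password": "<from-secret>", # elided; read it from the Secret
}
print(conf["bootstrap.servers"])
```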

### Separated ± Monitor

The separated topology provisions only kafka-controller and kafka-broker. The separated_monitor topology additionally provisions the kafka-exporter Component on port 9308 for consumer group and partition metrics.


## Kafka 2.x Architecture (kafka2-external-zk)

*[Diagram: kafka2-external-zk topology. Clients bootstrap via per-broker ClusterIP services `kafka-cluster-kafka-broker-advertised-listener-{n}:9092` (`podService: true`); a headless service provides stable pod DNS for inter-broker replication on :9094 and operator access, and is not a client endpoint. Each Kafka 2.x broker pod runs the `kafka` container (:9092 client, :9094 internal) and a `jmx-exporter` sidecar (:5556 metrics) with a PVC at `/bitnami/kafka/data`; one broker holds the elected controller role. Coordination is delegated to an external ZooKeeper ensemble (e.g. `zk-0` leader, `zk-1`/`zk-2` followers) referenced via `serviceRefs[].cluster` under the `kafkaZookeeper` serviceRef: controller election via a ZK ephemeral node, topic metadata (partition assignments, ISR lists) as znodes, broker registration with failure detection via session expiry, and SASL SCRAM-SHA-256/512 credentials provisioned via `kafka-configs.sh --zookeeper`. ZooKeeper 3.5–3.9 is supported.]*

The kafka2-external-zk topology deploys Kafka 2.7 in the traditional ZooKeeper-based mode. Instead of the KRaft metadata quorum, an external ZooKeeper ensemble (deployed as a separate KubeBlocks cluster or external service) provides cluster coordination, controller election, and topic metadata storage.

### Resource Hierarchy

```
Cluster  →  Component (kafka-broker)    →  InstanceSet  →  Pod × N
         →  Component (kafka-exporter)  →  InstanceSet  →  Pod × 1
```

The kafka-broker ComponentDefinition for Kafka 2.x declares a serviceRefDeclaration named `kafkaZookeeper` (required; matches ZooKeeper 3.5–3.9). The Cluster CR must provide a serviceRef pointing to an available ZooKeeper ensemble before the broker component can start:

```yaml
spec:
  topology: kafka2-external-zk
  serviceRefs:
    - name: kafkaZookeeper
      namespace: <zk-namespace>
      cluster: <zk-cluster-name>
```

### Containers Inside Each Pod

Each broker pod runs two containers. The `kafkatool` init container copies `/sasl` to `/shared-tools/sasl` before startup, as in Kafka 3.x:

| Container | Port | Purpose |
|---|---|---|
| `kafka` | 9092 (client), 9094 (inter-broker) | Kafka 2.7 broker node; uses ZooKeeper for metadata (topic configs, controller election, ISR management); serves producer/consumer traffic on 9092; replicates partition data on 9094 |
| `jmx-exporter` | 5556 | JMX-based Prometheus metrics exporter |

Each broker pod mounts a PVC at /bitnami/kafka/data for partition log storage.

### ZooKeeper Coordination

In Kafka 2.x, ZooKeeper is responsible for all coordination tasks that KRaft handles in Kafka 3.x:

| ZooKeeper role | Description |
|---|---|
| Controller election | One broker is elected Kafka controller via a ZooKeeper ephemeral node; it manages partition leader assignments |
| Topic metadata | Topic configurations, partition assignments, and ISR lists are stored as ZooKeeper znodes |
| Broker registration | Each broker registers an ephemeral node in ZooKeeper on startup; the controller detects broker failures via session expiry |
| SASL credentials | SCRAM-SHA-256/512 credentials are provisioned using `kafka-configs.sh --zookeeper` (via the KubeBlocks `accountProvision` lifecycle action) |

> **NOTE**
>
> Kafka 2.x is a legacy topology. It requires an operational ZooKeeper ensemble and does not support KRaft metadata management or the operational simplifications available in Kafka 3.x. For new deployments, use `combined` or `separated`.

### Traffic Routing

Traffic routing is identical to Kafka 3.x brokers — per-pod ClusterIP services via advertised-listener:

| Service | Type | Port | Notes |
|---|---|---|---|
| `{cluster}-kafka-broker-advertised-listener-{n}` | ClusterIP | 9092 | One per broker pod; use all as bootstrap seed list |
| `{cluster}-kafka-broker-headless` | Headless | — | All broker pods; inter-broker replication on port 9094 |

## Partition Leader Election

Partition leader election governs data availability and is the core HA mechanism across all Kafka topologies. The active KRaft controller (or the ZooKeeper-elected controller in Kafka 2.x) manages all partition leader assignments:

| Concept | Description |
|---|---|
| Partition leader | The single broker responsible for all reads and writes for a given partition; elected by the active controller |
| Follower replicas | Brokers that replicate the leader's partition log via port 9094; they form the In-Sync Replica (ISR) set |
| ISR (In-Sync Replicas) | Replicas fully caught up with the leader; only ISR members are eligible to become the new leader |
| Leader election | When a partition leader fails, the active controller selects the next leader from the current ISR set |
| Unclean leader election | Disabled by default (`unclean.leader.election.enable=false`); enabling it risks data loss but allows recovery when the ISR is empty |

The replication factor for each topic determines how many ISR replicas exist per partition. A replication factor of 3 tolerates 1 broker failure per partition without any data loss or unavailability.
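The arithmetic behind that claim, stated with the related `min.insync.replicas` broker setting (a standard Kafka config, not specific to KubeBlocks): with replication factor r and `min.insync.replicas` = m, a partition keeps accepting `acks=all` writes as long as at most r - m replicas are down. An illustrative calculation:

```python
def writable_failures(replication_factor: int, min_insync: int) -> int:
    """Replica failures a partition survives while still accepting
    acks=all writes (which require min.insync.replicas in the ISR)."""
    if not 1 <= min_insync <= replication_factor:
        raise ValueError("min.insync.replicas must be in [1, RF]")
    return replication_factor - min_insync

# The common production setting: RF=3 with min.insync.replicas=2
print(writable_failures(3, 2))  # → 1
```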


## Automatic Failover

When a Kafka node fails, KubeBlocks and Kafka's internal protocols respond as follows:

1. **Broker failure** — the active controller detects the broker's session timeout and removes it from all ISR lists
2. **Partition leader election** — for each partition where the failed broker was leader, the controller elects a new leader from the remaining ISR members; clients refreshing metadata see the new leader
3. **Producer/consumer continuity** — clients encounter a `NOT_LEADER_OR_FOLLOWER` error, refresh metadata from any available broker, and reconnect to the new partition leaders; brief retries are normal
4. **Controller failure (combined topology)** — if the active controller's pod fails, the remaining quorum members hold a Raft election and elect a new active controller within seconds
5. **Controller failure (separated topology)** — controller Component pods independently hold a Raft election; broker pods continue serving data traffic without interruption while the quorum recovers
6. **KubeBlocks pod recovery** — KubeBlocks restarts the failed pod; on startup the broker fetches the current metadata from the active controller, re-joins the cluster, and begins catching up on missed partition data
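The leader-election step can be illustrated with a toy function. This is purely illustrative of the ISR rule (only surviving ISR members are eligible); Kafka's real controller logic also consults the assigned replica order and liveness:

```python
def elect_leader(isr, failed):
    """Toy ISR election: pick the first surviving ISR member as the new
    partition leader. Returns None when the ISR is empty, i.e. only
    unclean leader election could recover the partition."""
    survivors = [broker for broker in isr if broker != failed]
    return survivors[0] if survivors else None

print(elect_leader(["broker-1", "broker-0", "broker-2"], failed="broker-1"))
# → broker-0
```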

## System Accounts

KubeBlocks automatically creates the following Kafka SASL accounts for the broker and combined components. Credentials are stored in Secrets named `{cluster}-{component}-account-{name}`.

| Account | Purpose |
|---|---|
| `admin` | Superuser account for cluster administration — topic management, ACL configuration, quota management; injected as `KAFKA_ADMIN_USER` / `KAFKA_ADMIN_PASSWORD` and added to `super.users` |
| `client` | Default client SASL account for producer and consumer authentication — use the credentials from the `{cluster}-{component}-account-client` Secret in your Kafka client configuration |

Both accounts use SCRAM-SHA-256 and SCRAM-SHA-512 for authentication. In Kafka 3.x, credentials are stored in the __cluster_metadata log. In Kafka 2.x (kafka2-external-zk), credentials are provisioned via kafka-configs.sh --zookeeper during the KubeBlocks accountProvision lifecycle action.
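The account Secret can be consumed programmatically. A sketch that decodes credential fields from a Secret's `.data` as returned by the Kubernetes API (values are base64-encoded); the key names and sample values here are assumptions, so inspect the real `{cluster}-{component}-account-client` Secret to confirm its layout:

```python
import base64

# Illustrative stand-in for a Secret's .data map; the Kubernetes API
# returns values base64-encoded. Key names and values are assumed.
secret_data = {
    "username": base64.b64encode(b"client").decode(),
    "password": base64.b64encode(b"s3cr3t").decode(),
}

# Decode every field back to plain text for use in client config.
creds = {key: base64.b64decode(val).decode() for key, val in secret_data.items()}
print(creds["username"])  # → client
```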

> **NOTE**
>
> The kafka-controller Component in the separated topology and the kafka-exporter Component in monitor topologies do not have system accounts — accounts are only needed on the broker-eligible components (kafka-broker and kafka-combine).
