Decommission a Specific Pod in KubeBlocks-Managed Kafka Clusters

This guide explains how to decommission (take offline) specific Pods in Kafka clusters managed by KubeBlocks. Decommissioning gives you precise control over which instances leave a cluster while the cluster stays available. Use it for workload rebalancing, node maintenance, or handling failures.

Why Decommission Pods with KubeBlocks?

In traditional StatefulSet-based deployments, Kubernetes lacks the ability to decommission specific Pods. StatefulSets ensure the order and identity of Pods, and scaling down always removes the Pod with the highest ordinal number (e.g., scaling down from 3 replicas removes Pod-2 first). This limitation prevents precise control over which Pod to take offline, which can complicate maintenance, workload distribution, or failure handling.
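To make the StatefulSet behaviour concrete, the following shell sketch lists the Pods that a plain scale-in would delete. The function name statefulset_scale_in_victims is hypothetical and nothing here talks to a cluster; it only reproduces the highest-ordinal-first rule described above.

```shell
# Print the Pods a plain StatefulSet deletes when scaling in from $2 to $3 replicas.
# $1 is the StatefulSet name; Pods are removed from the highest ordinal downward.
statefulset_scale_in_victims() (
  ordinal=$(( $2 - 1 ))
  while [ "$ordinal" -ge "$3" ]; do
    echo "$1-$ordinal"
    ordinal=$(( ordinal - 1 ))
  done
)

# Scaling kafka-broker from 3 to 2 replicas can only remove kafka-broker-2,
# never kafka-broker-1.
statefulset_scale_in_victims kafka-broker 3 2
```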

KubeBlocks overcomes this limitation by enabling administrators to decommission specific Pods directly. This fine-grained control ensures high availability and allows better resource management without disrupting the entire cluster.

Prerequisites

Before proceeding, ensure the following:

Environment Setup:
- A Kubernetes cluster is up and running.
- The kubectl CLI tool is configured to communicate with your cluster.
- KubeBlocks CLI and KubeBlocks Operator are installed. Follow the installation instructions here.
Namespace Preparation: To keep resources isolated, create a dedicated namespace for this tutorial:

kubectl create ns demo
namespace/demo created

Deploy a Kafka Cluster

KubeBlocks uses a declarative approach for managing Kafka Clusters. Below is an example configuration for deploying a Kafka Cluster with 3 components: a broker, a controller, and an exporter.

Apply the following YAML configuration to deploy the cluster:


apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: kafka-separated-cluster
  namespace: demo
spec:
  terminationPolicy: Delete
  clusterDef: kafka
  topology: separated_monitor
  componentSpecs:
    - name: kafka-broker
      replicas: 1
      resources:
        limits:
          cpu: "0.5"
          memory: "0.5Gi"
        requests:
          cpu: "0.5"
          memory: "0.5Gi"
      env:
        - name: KB_KAFKA_BROKER_HEAP
          value: "-XshowSettings:vm -XX:MaxRAMPercentage=100 -Ddepth=64"
        - name: KB_KAFKA_CONTROLLER_HEAP
          value: "-XshowSettings:vm -XX:MaxRAMPercentage=100 -Ddepth=64"
        - name: KB_BROKER_DIRECT_POD_ACCESS
          value: "true"
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName: ""
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
        - name: metadata
          spec:
            storageClassName: ""
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
    - name: kafka-controller
      replicas: 1
      resources:
        limits:
          cpu: "0.5"
          memory: "0.5Gi"
        requests:
          cpu: "0.5"
          memory: "0.5Gi"
      volumeClaimTemplates:
        - name: metadata
          spec:
            storageClassName: ""
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
    - name: kafka-exporter
      replicas: 1
      resources:
        limits:
          cpu: "0.5"
          memory: "1Gi"
        requests:
          cpu: "0.1"
          memory: "0.2Gi"

NOTE

KubeBlocks creates these three components strictly in the order controller, broker, exporter, as defined in the ClusterDefinition.

Verifying the Deployment

Monitor the cluster status until it transitions to the Running state:

kubectl get cluster kafka-separated-cluster -n demo -w

Expected Output:

kubectl get cluster kafka-separated-cluster -n demo
NAME                      CLUSTER-DEFINITION   TERMINATION-POLICY   STATUS     AGE
kafka-separated-cluster   kafka                Delete               Creating   13s
kafka-separated-cluster   kafka                Delete               Running    63s

Check the Pod status:

kubectl get pods -l app.kubernetes.io/instance=kafka-separated-cluster -n demo

Expected Output:

NAME                                         READY   STATUS    RESTARTS   AGE
kafka-separated-cluster-kafka-broker-0       2/2     Running   0          13m
kafka-separated-cluster-kafka-controller-0   2/2     Running   0          13m
kafka-separated-cluster-kafka-exporter-0     1/1     Running   0          12m

Once the cluster status becomes Running, your Kafka cluster is ready for use.
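In a script, watching with -w is inconvenient because it never exits. A minimal sketch of a blocking alternative follows. It assumes kubectl v1.23 or later (for --for=jsonpath) and that the Cluster exposes its state at .status.phase; the function name wait_for_cluster_running is hypothetical.

```shell
# Block until the Cluster reports phase Running, or fail after the timeout.
# $1 = cluster name, $2 = namespace, $3 = optional timeout (default 300s).
# Assumption: the phase shown in the STATUS column is stored at .status.phase.
wait_for_cluster_running() {
  kubectl wait "cluster/$1" -n "$2" \
    --for=jsonpath='{.status.phase}'=Running \
    --timeout="${3:-300s}"
}

# Usage (requires access to the cluster):
#   wait_for_cluster_running kafka-separated-cluster demo
```

The function returns a non-zero exit code if the timeout expires first, so a calling script can stop before it touches a cluster that is not ready.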

TIP

If you are creating the cluster for the very first time, pulling the images may take some time before the Pods start running.

Decommission a Pod

Expected Workflow:

- Replica specified in onlineInstancesToOffline is removed
- Pod terminates gracefully
- Cluster transitions from Updating to Running

Before decommissioning a specific Pod from a component, make sure the component has more than one replica. If it does not, scale out the component first.

For example, the following command patches the Cluster CR to declare 3 replicas for the kafka-broker component. The index in the path is the component's zero-based position in spec.componentSpecs; kafka-broker is the first entry in the manifest above, so its index is 0.


kubectl patch cluster kafka-separated-cluster -n demo --type='json' -p='[
  {
    "op": "replace",
    "path": "/spec/componentSpecs/0/replicas",
    "value": 3
  }
]'
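A JSON Patch path addresses a component by its position in spec.componentSpecs, counted from zero, so a hard-coded index silently targets the wrong component if the list order differs. The sketch below looks the index up by name instead. The helper name component_index is hypothetical; it only processes text on stdin, and the kubectl pipeline in the usage comment is what feeds it the real component names.

```shell
# Read component names from stdin (one per line) and print the zero-based
# index of the name given as $1. Exits non-zero if the name is not found.
component_index() {
  awk -v want="$1" '
    $0 == want { print NR - 1; found = 1; exit }
    END { if (!found) exit 1 }
  '
}

# Usage (requires access to the cluster):
#   kubectl get cluster kafka-separated-cluster -n demo \
#     -o jsonpath='{range .spec.componentSpecs[*]}{.name}{"\n"}{end}' \
#     | component_index kafka-broker
```

For the manifest in this guide, kafka-broker resolves to 0, kafka-controller to 1, and kafka-exporter to 2.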

Wait until all Pods are running:

kubectl get pods -n demo -l app.kubernetes.io/instance=kafka-separated-cluster,apps.kubeblocks.io/component-name=kafka-broker

Expected Output:

NAME                                     READY   STATUS    RESTARTS   AGE
kafka-separated-cluster-kafka-broker-0   2/2     Running   0          18m
kafka-separated-cluster-kafka-broker-1   2/2     Running   0          3m33s
kafka-separated-cluster-kafka-broker-2   2/2     Running   0          2m1s

To decommission a specific Pod (e.g., 'kafka-separated-cluster-kafka-broker-1'), you can use one of the following methods:

Option 1: Using OpsRequest

Create an OpsRequest to mark the Pod as offline:


apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: kafka-separated-cluster-decommission-ops
  namespace: demo
spec:
  clusterName: kafka-separated-cluster
  type: HorizontalScaling
  horizontalScaling:
  - componentName: kafka-broker
    scaleIn:
      onlineInstancesToOffline:
        - 'kafka-separated-cluster-kafka-broker-1'  # Specifies the instance names that need to be taken offline
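The manifest above still has to be submitted to the cluster. One way to do that, sketched here, is to render it from a small shell function and pipe the result to kubectl apply. The function name render_decommission_ops and its argument order are hypothetical; the fields are copied from the manifest above and the namespace is fixed to demo, as elsewhere in this tutorial.

```shell
# Print an OpsRequest that takes one instance offline.
# $1 = OpsRequest name, $2 = cluster name, $3 = component name, $4 = Pod name.
render_decommission_ops() {
  cat <<EOF
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: $1
  namespace: demo
spec:
  clusterName: $2
  type: HorizontalScaling
  horizontalScaling:
  - componentName: $3
    scaleIn:
      onlineInstancesToOffline:
        - '$4'
EOF
}

# Usage (requires access to the cluster):
#   render_decommission_ops kafka-separated-cluster-decommission-ops \
#     kafka-separated-cluster kafka-broker kafka-separated-cluster-kafka-broker-1 \
#     | kubectl apply -f -
```

Rendering to stdout first makes it possible to review the manifest before applying it.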

Monitor the Decommissioning Process

Check the progress of the decommissioning operation:

kubectl get ops kafka-separated-cluster-decommission-ops -n demo -w

Example Output:

NAME                                       TYPE                CLUSTER                   STATUS    PROGRESS   AGE
kafka-separated-cluster-decommission-ops   HorizontalScaling   kafka-separated-cluster   Running   0/1        8s
kafka-separated-cluster-decommission-ops   HorizontalScaling   kafka-separated-cluster   Running   1/1        31s
kafka-separated-cluster-decommission-ops   HorizontalScaling   kafka-separated-cluster   Succeed   1/1        31s

Option 2: Using Cluster API

Alternatively, update the Cluster resource directly to decommission the Pod:


apiVersion: apps.kubeblocks.io/v1
kind: Cluster
spec:
  componentSpecs:
    - name: kafka-broker
      replicas: 2       # expected replicas after decommission
      offlineInstances:
        - kafka-separated-cluster-kafka-broker-1   # <----- Specify Pod to be decommissioned
 ...
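The manifest above is partial, so it cannot be applied as written. As one possible way to make the same change without editing the full Cluster manifest, the sketch below builds a JSON Patch that sets both fields in a single request. The helper name offline_instance_patch is hypothetical, and the index 0 in the usage comment assumes kafka-broker is the first entry in spec.componentSpecs, as in the manifest used in this guide.

```shell
# Print a JSON Patch that sets the replica count and the offlineInstances list
# for the component at zero-based index $1.
# $2 = replicas after decommissioning, $3 = Pod name to take offline.
offline_instance_patch() {
  printf '[{"op":"replace","path":"/spec/componentSpecs/%s/replicas","value":%s},' "$1" "$2"
  printf '{"op":"add","path":"/spec/componentSpecs/%s/offlineInstances","value":["%s"]}]\n' "$1" "$3"
}

# Usage (requires access to the cluster):
#   kubectl patch cluster kafka-separated-cluster -n demo --type=json \
#     -p "$(offline_instance_patch 0 2 kafka-separated-cluster-kafka-broker-1)"
```

Note that the add operation writes the whole offlineInstances list, so it overwrites any entries already present on that component.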

Verify the Decommissioning

After applying the updated configuration, verify the remaining Pods in the cluster:

kubectl get pods -n demo -l app.kubernetes.io/instance=kafka-separated-cluster,apps.kubeblocks.io/component-name=kafka-broker

Example Output:

NAME                                     READY   STATUS    RESTARTS   AGE
kafka-separated-cluster-kafka-broker-0   2/2     Running   0          24m
kafka-separated-cluster-kafka-broker-2   2/2     Running   0          2m1s
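Reading the Pod list by eye works for a tutorial, but a script needs a yes-or-no answer. A minimal sketch of such a check follows. The helper name pod_is_gone is hypothetical; it only inspects Pod names passed on stdin, and the kubectl pipeline in the usage comment supplies them.

```shell
# Succeed only if the Pod name given as $1 is absent from the names on stdin
# (one name per line, matched as a whole line).
pod_is_gone() {
  ! grep -qxF -- "$1"
}

# Usage (requires access to the cluster):
#   kubectl get pods -n demo -o name \
#     -l app.kubernetes.io/instance=kafka-separated-cluster \
#     | sed 's|^pod/||' \
#     | pod_is_gone kafka-separated-cluster-kafka-broker-1 && echo "decommissioned"
```

The sed step strips the pod/ prefix that -o name adds, so the comparison is against the bare Pod name.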

Summary

Key takeaways:

- Traditional StatefulSets lack precise Pod removal control
- KubeBlocks enables targeted Pod decommissioning
- Two implementation methods: OpsRequest or Cluster API

Both methods give you granular control over which Pods leave the cluster while the cluster stays available.
