Kafka Cluster Lifecycle Management

This guide demonstrates how to manage a Kafka Cluster's operational state in KubeBlocks, including:

  • Stopping the cluster to conserve resources
  • Starting a stopped cluster
  • Restarting cluster components

These operations help optimize resource usage and reduce operational costs in Kubernetes environments.

Lifecycle management operations in KubeBlocks:

Operation   Effect                              Use Case
Stop        Suspends cluster, retains storage   Cost savings, maintenance
Start       Resumes cluster operation           Restore service after pause
Restart     Recreates pods for component        Configuration changes, troubleshooting

Prerequisites

    Before proceeding, ensure the following:

    • Environment Setup:
      • A Kubernetes cluster is up and running.
      • The kubectl CLI tool is configured to communicate with your cluster.
      • KubeBlocks CLI and KubeBlocks Operator are installed. Follow the installation instructions here.
    • Namespace Preparation: To keep resources isolated, create a dedicated namespace for this tutorial:
    kubectl create ns demo
    namespace/demo created
    

    Deploy a Kafka Cluster

      KubeBlocks uses a declarative approach for managing Kafka Clusters. Below is an example configuration for deploying a Kafka Cluster with three components: kafka-broker, kafka-controller, and kafka-exporter.

      Apply the following YAML configuration to deploy the cluster:

      apiVersion: apps.kubeblocks.io/v1
      kind: Cluster
      metadata:
        name: kafka-separated-cluster
        namespace: demo
      spec:
        terminationPolicy: Delete
        clusterDef: kafka
        topology: separated_monitor
        componentSpecs:
          - name: kafka-broker
            replicas: 1
            resources:
              limits:
                cpu: "0.5"
                memory: "0.5Gi"
              requests:
                cpu: "0.5"
                memory: "0.5Gi"
            env:
              - name: KB_KAFKA_BROKER_HEAP
                value: "-XshowSettings:vm -XX:MaxRAMPercentage=100 -Ddepth=64"
              - name: KB_KAFKA_CONTROLLER_HEAP
                value: "-XshowSettings:vm -XX:MaxRAMPercentage=100 -Ddepth=64"
              - name: KB_BROKER_DIRECT_POD_ACCESS
                value: "true"
            volumeClaimTemplates:
              - name: data
                spec:
                  storageClassName: ""
                  accessModes:
                    - ReadWriteOnce
                  resources:
                    requests:
                      storage: 20Gi
              - name: metadata
                spec:
                  storageClassName: ""
                  accessModes:
                    - ReadWriteOnce
                  resources:
                    requests:
                      storage: 1Gi
          - name: kafka-controller
            replicas: 1
            resources:
              limits:
                cpu: "0.5"
                memory: "0.5Gi"
              requests:
                cpu: "0.5"
                memory: "0.5Gi"
            volumeClaimTemplates:
              - name: metadata
                spec:
                  storageClassName: ""
                  accessModes:
                    - ReadWriteOnce
                  resources:
                    requests:
                      storage: 1Gi
          - name: kafka-exporter
            replicas: 1
            resources:
              limits:
                cpu: "0.5"
                memory: "1Gi"
              requests:
                cpu: "0.1"
                memory: "0.2Gi"
      
      NOTE

      These three components will be created strictly in controller->broker->exporter order as defined in ClusterDefinition.

      Verifying the Deployment

        Monitor the cluster status until it transitions to the Running state:

        kubectl get cluster kafka-separated-cluster -n demo -w
        

        Expected Output:

        kubectl get cluster kafka-separated-cluster -n demo
        NAME                      CLUSTER-DEFINITION   TERMINATION-POLICY   STATUS     AGE
        kafka-separated-cluster   kafka                Delete               Creating   13s
        kafka-separated-cluster   kafka                Delete               Running    63s
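
In a script, watching with -w is awkward because the command never exits on its own. The following is a minimal sketch of a blocking wait instead. It assumes kubectl v1.23 or later (for `--for=jsonpath`) and assumes the Cluster resource reports its state at `.status.phase`, which is what the STATUS column above appears to reflect. `wait_for_cluster_phase` is a hypothetical helper name, not part of KubeBlocks.

```shell
#!/bin/sh
# Hypothetical helper: block until the Cluster reaches the requested phase.
# Assumption: kubectl >= 1.23 and the phase is exposed at .status.phase.
wait_for_cluster_phase() {
  phase="$1"
  cluster="${2:-kafka-separated-cluster}"
  namespace="${3:-demo}"
  timeout="${4:-300s}"
  kubectl wait "cluster/${cluster}" -n "${namespace}" \
    --for=jsonpath='{.status.phase}'="${phase}" \
    --timeout="${timeout}"
}

# Example (needs a live cluster):
#   wait_for_cluster_phase Running
```

The same helper could be called with Stopped after a stop operation, under the same assumption about the phase field.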
        

        Check the pod status:

        kubectl get pods -l app.kubernetes.io/instance=kafka-separated-cluster -n demo
        

        Expected Output:

        NAME                                         READY   STATUS    RESTARTS   AGE
        kafka-separated-cluster-kafka-broker-0       2/2     Running   0          13m
        kafka-separated-cluster-kafka-controller-0   2/2     Running   0          13m
        kafka-separated-cluster-kafka-exporter-0     1/1     Running   0          12m
        

        Once the cluster status becomes Running, your Kafka cluster is ready for use.

        TIP

        If you are creating the cluster for the very first time, it may take some time to pull images before running.

        Cluster Lifecycle Operations

        Stopping the Cluster

        Stopping a Kafka Cluster in KubeBlocks:

        1. Terminates all running pods
        2. Retains persistent storage (PVCs)
        3. Maintains cluster configuration

        This operation is ideal for:

        • Temporary cost savings
        • Maintenance windows
        • Development environment pauses

        Option 1: OpsRequest API

        Create a Stop operation request:

        apiVersion: operations.kubeblocks.io/v1alpha1
        kind: OpsRequest
        metadata:
          name: kafka-separated-cluster-stop-ops
          namespace: demo
        spec:
          clusterName: kafka-separated-cluster
          type: Stop
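
The manifest above still has to be submitted to the API server. One way to do that, sketched here under the assumption that you prefer not to keep a separate file per operation, is to render the manifest from a small shell function and pipe it to `kubectl apply -f -`. `render_ops_request` is a hypothetical helper; the fields it emits are only those shown in the manifest above.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal OpsRequest manifest for a given type.
# Only the fields used in this guide are emitted.
render_ops_request() {
  ops_type="$1"
  cluster="${2:-kafka-separated-cluster}"
  namespace="${3:-demo}"
  # Lowercase the type for the resource name, e.g. Stop -> stop.
  suffix=$(printf '%s' "${ops_type}" | tr '[:upper:]' '[:lower:]')
  cat <<EOF
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: ${cluster}-${suffix}-ops
  namespace: ${namespace}
spec:
  clusterName: ${cluster}
  type: ${ops_type}
EOF
}

# Example (needs a live cluster):
#   render_ops_request Stop | kubectl apply -f -
```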
        

        Option 2: Cluster API Patch

        Modify the cluster spec directly by patching the stop field:

        kubectl patch cluster kafka-separated-cluster -n demo --type='json' -p='[
        {
          "op": "add",
          "path": "/spec/componentSpecs/0/stop",
          "value": true
        },
        {
          "op": "add",
          "path": "/spec/componentSpecs/1/stop",
          "value": true
        },
        {
          "op": "add",
          "path": "/spec/componentSpecs/2/stop",
          "value": true
        }
        ]'
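
The patch above addresses components by position (0, 1, 2), so it has to be rewritten whenever the number of components changes. As a sketch, the JSON Patch document can be generated for any component count. `build_stop_patch` is a hypothetical helper name; the operations it emits are the same "add" operations shown above.

```shell
#!/bin/sh
# Hypothetical helper: emit a JSON Patch that sets stop=true on the first
# N entries of spec.componentSpecs.
build_stop_patch() {
  count="$1"
  i=0
  printf '['
  while [ "${i}" -lt "${count}" ]; do
    if [ "${i}" -gt 0 ]; then
      printf ','
    fi
    printf '{"op":"add","path":"/spec/componentSpecs/%d/stop","value":true}' "${i}"
    i=$((i + 1))
  done
  printf ']\n'
}

# Example (needs a live cluster):
#   kubectl patch cluster kafka-separated-cluster -n demo \
#     --type=json -p "$(build_stop_patch 3)"
```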
        

        Verifying Cluster Stop

        To confirm a successful stop operation:

        1. Check cluster status transition:

          kubectl get cluster kafka-separated-cluster -n demo -w
          

          Example Output:

          NAME                      CLUSTER-DEFINITION    TERMINATION-POLICY   STATUS     AGE
          kafka-separated-cluster   kafka                 Delete               Stopping   16m3s
          kafka-separated-cluster   kafka                 Delete               Stopped    16m55s
          
        2. Verify no running pods:

          kubectl get pods -l app.kubernetes.io/instance=kafka-separated-cluster -n demo
          

          Example Output:

          No resources found in demo namespace.
          
        3. Confirm persistent volumes remain:

          kubectl get pvc -l app.kubernetes.io/instance=kafka-separated-cluster -n demo
          

          Example Output:

          NAME                                                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
          data-kafka-separated-cluster-kafka-broker-0           Bound    pvc-ddd54e0f-414a-49ed-8e17-41e9f5082af1   20Gi       RWO            standard       <unset>                 14m
          metadata-kafka-separated-cluster-kafka-broker-0       Bound    pvc-d63b7d80-cac5-41b9-b694-6a298921003b   1Gi        RWO            standard       <unset>                 14m
          metadata-kafka-separated-cluster-kafka-controller-0   Bound    pvc-e6263eb1-405a-4090-b2bb-f92cca0ba36d   1Gi        RWO            standard       <unset>                 14m
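
The two checks above (no pods, PVCs still present) can be combined into a single scripted assertion. The sketch below counts resources by the same instance label used in this guide; `count_resources` and `verify_stopped` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: count resources of a kind that carry the cluster's
# instance label. Prints 0 when nothing matches.
count_resources() {
  kind="$1"
  cluster="${2:-kafka-separated-cluster}"
  namespace="${3:-demo}"
  kubectl get "${kind}" -n "${namespace}" \
    -l "app.kubernetes.io/instance=${cluster}" -o name 2>/dev/null \
    | grep -c . || true
}

# Succeeds only if no pods remain and at least one PVC is still present.
verify_stopped() {
  pods=$(count_resources pods "$@")
  pvcs=$(count_resources pvc "$@")
  [ "${pods}" -eq 0 ] && [ "${pvcs}" -gt 0 ]
}

# Example (needs a live cluster):
#   verify_stopped && echo "cluster is stopped, storage retained"
```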
          

        Starting the Cluster

        Starting a stopped Kafka Cluster:

        1. Recreates all pods
        2. Reattaches persistent storage
        3. Restores service endpoints

        Expected behavior:

        • Cluster returns to previous state
        • No data loss occurs
        • Services resume automatically

        Option 1: OpsRequest API

        Initiate a Start operation request:

        apiVersion: operations.kubeblocks.io/v1alpha1
        kind: OpsRequest
        metadata:
          name: kafka-separated-cluster-start-ops
          namespace: demo
        spec:
          # Specifies the name of the Cluster resource that this operation is targeting.
          clusterName: kafka-separated-cluster
          type: Start
        

        Option 2: Cluster API Patch

        Modify the cluster spec to resume operation by either setting stop: false or removing the stop field entirely. The following patch removes the field from each component:

          kubectl patch cluster kafka-separated-cluster -n demo --type='json' -p='[
          {
            "op": "remove",
            "path": "/spec/componentSpecs/0/stop"
          },
          {
            "op": "remove",
            "path": "/spec/componentSpecs/1/stop"
          },
          {
            "op": "remove",
            "path": "/spec/componentSpecs/2/stop"
          }
          ]'
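
The other way to resume, setting stop to false rather than removing the field, can be sketched as follows. It uses the JSON Patch "add" operation, which per RFC 6902 replaces an existing object member and creates it if absent, so it should behave the same whether or not the field is currently set. `resume_patch_for` is a hypothetical helper that patches a single component by index.

```shell
#!/bin/sh
# Hypothetical helper: JSON Patch that sets stop=false on one component,
# addressed by its position in spec.componentSpecs.
resume_patch_for() {
  printf '[{"op":"add","path":"/spec/componentSpecs/%d/stop","value":false}]\n' "$1"
}

# Example (needs a live cluster):
#   kubectl patch cluster kafka-separated-cluster -n demo \
#     --type=json -p "$(resume_patch_for 0)"
```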
          

        Verifying Cluster Start

        To confirm a successful start operation:

        1. Check cluster status transition:

          kubectl get cluster kafka-separated-cluster -n demo -w
          

          Example Output:

          NAME                      CLUSTER-DEFINITION     TERMINATION-POLICY   STATUS     AGE
          kafka-separated-cluster   kafka                  Delete               Updating   24m
          kafka-separated-cluster   kafka                  Delete               Running    24m
          kafka-separated-cluster   kafka                  Delete               Running    24m
          
        2. Verify pod recreation:

          kubectl get pods -n demo -l app.kubernetes.io/instance=kafka-separated-cluster
          

          Example Output:

          NAME                                         READY   STATUS    RESTARTS   AGE
          kafka-separated-cluster-kafka-broker-0       2/2     Running   0          2m4s
          kafka-separated-cluster-kafka-controller-0   2/2     Running   0          104s
          kafka-separated-cluster-kafka-exporter-0     1/1     Running   0          84s
          

        Restarting Cluster

        Restart operations provide:

        • Pod recreation without full cluster stop
        • Component-level granularity
        • Minimal service disruption

        Use cases:

        • Configuration changes requiring restart
        • Resource refresh
        • Troubleshooting

        Check Components

        There are three components in this Kafka cluster. To list them, run:

        kubectl get cluster -n demo kafka-separated-cluster -oyaml | yq '.spec.componentSpecs[].name'
        

        Expected Output:

        kafka-broker
        kafka-controller
        kafka-exporter
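
Because the JSON patches in this guide address components by position, it can help to translate a component name into its zero-based index in that list. The sketch below reads one name per line on standard input; `component_index` is a hypothetical helper. The commented usage line lists names with kubectl's jsonpath output as an alternative to yq.

```shell
#!/bin/sh
# Hypothetical helper: read component names (one per line) on stdin and
# print the zero-based position of the requested name.
# Exits non-zero if the name is not present.
component_index() {
  awk -v want="$1" '
    $0 == want { print NR - 1; found = 1; exit }
    END { if (!found) exit 1 }
  '
}

# Example (needs a live cluster):
#   kubectl get cluster kafka-separated-cluster -n demo \
#     -o jsonpath='{range .spec.componentSpecs[*]}{.name}{"\n"}{end}' \
#     | component_index kafka-broker
```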
        

        Restart a Component via OpsRequest API

        List specific components to be restarted:

        apiVersion: operations.kubeblocks.io/v1alpha1
        kind: OpsRequest
        metadata:
          name: kafka-separated-cluster-restart-ops
          namespace: demo
        spec:
          clusterName: kafka-separated-cluster
          type: Restart
          restart:
          - componentName: kafka-broker
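
The restart field is a list, so more than one component can be named in a single request. As a sketch, the manifest can be rendered for any set of component names; `render_restart_ops` is a hypothetical helper and the resource name is the one used in this guide.

```shell
#!/bin/sh
# Hypothetical helper: print a Restart OpsRequest that lists every
# component name passed as an argument.
render_restart_ops() {
  cat <<EOF
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: kafka-separated-cluster-restart-ops
  namespace: demo
spec:
  clusterName: kafka-separated-cluster
  type: Restart
  restart:
EOF
  for component in "$@"; do
    printf '  - componentName: %s\n' "${component}"
  done
}

# Example (needs a live cluster):
#   render_restart_ops kafka-broker kafka-controller | kubectl apply -f -
```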
        

        Verifying Restart Completion

        To verify a successful component restart:

        1. Track OpsRequest progress:

          kubectl get opsrequest kafka-separated-cluster-restart-ops -n demo -w
          

          Example Output:

          NAME                                  TYPE      CLUSTER                   STATUS    PROGRESS   AGE
          kafka-separated-cluster-restart-ops   Restart   kafka-separated-cluster   Running   0/1        8s
          kafka-separated-cluster-restart-ops   Restart   kafka-separated-cluster   Running   1/1        22s
          kafka-separated-cluster-restart-ops   Restart   kafka-separated-cluster   Running   1/1        23s
          kafka-separated-cluster-restart-ops   Restart   kafka-separated-cluster   Succeed   1/1        23s
          
        2. Check pod status:

          kubectl get pods -n demo -l app.kubernetes.io/instance=kafka-separated-cluster
          

          Note: Restarted pods show new creation timestamps. Only pods belonging to the kafka-broker component are restarted.
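
To compare creation timestamps directly rather than inferring them from the AGE column, the pod list can be printed with custom columns and sorted by creation time. This is a sketch; `show_pod_ages` is a hypothetical helper name and the column titles are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: list the cluster's pods with their creation
# timestamps, oldest first.
show_pod_ages() {
  kubectl get pods -n "${2:-demo}" \
    -l "app.kubernetes.io/instance=${1:-kafka-separated-cluster}" \
    -o custom-columns='NAME:.metadata.name,CREATED:.metadata.creationTimestamp' \
    --sort-by=.metadata.creationTimestamp
}

# Example (needs a live cluster):
#   show_pod_ages
```

After a broker-only restart, the kafka-broker pod should appear last in this listing.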

        Once the operation is complete, the cluster will return to the Running state.

        Summary

        In this guide, you learned how to:

        1. Stop a Kafka Cluster to suspend operations while retaining persistent storage.
        2. Start a stopped cluster to bring it back online.
        3. Restart specific cluster components to recreate their Pods without stopping the entire cluster.

        By managing the lifecycle of your Kafka Cluster, you can optimize resource utilization, reduce costs, and maintain flexibility in your Kubernetes environment. KubeBlocks exposes these operations through the OpsRequest API and the Cluster spec, so each one can be applied and verified with standard kubectl commands.

        © 2025 ApeCloud PTE. Ltd.