Kafka Cluster Lifecycle Management

This guide demonstrates how to manage a Kafka Cluster's operational state in KubeBlocks, including:

  • Stopping the cluster to conserve resources
  • Starting a stopped cluster
  • Restarting cluster components

These operations help optimize resource usage and reduce operational costs in Kubernetes environments.

Lifecycle management operations in KubeBlocks:

Operation   Effect                              Use Case
---------   ---------------------------------   --------------------------------------
Stop        Suspends cluster, retains storage   Cost savings, maintenance
Start       Resumes cluster operation           Restore service after pause
Restart     Recreates pods for a component      Configuration changes, troubleshooting

Prerequisites

Before proceeding, ensure the following:

• Environment Setup:
  • A Kubernetes cluster is up and running.
  • The kubectl CLI tool is configured to communicate with your cluster.
  • KubeBlocks CLI and KubeBlocks Operator are installed. Follow the installation instructions here.
• Namespace Preparation: To keep resources isolated, create a dedicated namespace for this tutorial:

  kubectl create ns demo
  namespace/demo created

Deploy a Kafka Cluster

KubeBlocks uses a declarative approach for managing Kafka Clusters. Below is an example configuration for deploying a Kafka Cluster with three components: kafka-broker, kafka-controller, and kafka-exporter.

Apply the following YAML configuration to deploy the cluster:

apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: kafka-separated-cluster
  namespace: demo
spec:
  terminationPolicy: Delete
  clusterDef: kafka
  topology: separated_monitor
  componentSpecs:
    - name: kafka-broker
      replicas: 1
      resources:
        limits:
          cpu: "0.5"
          memory: "0.5Gi"
        requests:
          cpu: "0.5"
          memory: "0.5Gi"
      env:
        - name: KB_KAFKA_BROKER_HEAP
          value: "-XshowSettings:vm -XX:MaxRAMPercentage=100 -Ddepth=64"
        - name: KB_KAFKA_CONTROLLER_HEAP
          value: "-XshowSettings:vm -XX:MaxRAMPercentage=100 -Ddepth=64"
        - name: KB_BROKER_DIRECT_POD_ACCESS
          value: "true"
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName: ""
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
        - name: metadata
          spec:
            storageClassName: ""
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
    - name: kafka-controller
      replicas: 1
      resources:
        limits:
          cpu: "0.5"
          memory: "0.5Gi"
        requests:
          cpu: "0.5"
          memory: "0.5Gi"
      volumeClaimTemplates:
        - name: metadata
          spec:
            storageClassName: ""
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
    - name: kafka-exporter
      replicas: 1
      resources:
        limits:
          cpu: "0.5"
          memory: "1Gi"
        requests:
          cpu: "0.1"
          memory: "0.2Gi"
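One way to submit this manifest is sketched below; the file name is only an assumption for illustration.

# Save the manifest above as kafka-separated-cluster.yaml (any name works),
# then create the Cluster in the demo namespace.
kubectl apply -f kafka-separated-cluster.yaml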
NOTE

The three components are created strictly in the order controller -> broker -> exporter, as defined in the ClusterDefinition. You can observe this ordering with the watch command shown below.
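A minimal sketch for watching the components come up in that order (the label selector is the same one used elsewhere in this guide):

# Watch pods being created: the controller pod should appear first,
# then the broker, and finally the exporter.
kubectl get pods -n demo -l app.kubernetes.io/instance=kafka-separated-cluster -w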

Verifying the Deployment

Monitor the cluster status until it transitions to the Running state:

kubectl get cluster kafka-separated-cluster -n demo -w

Expected Output:

NAME                      CLUSTER-DEFINITION   TERMINATION-POLICY   STATUS     AGE
kafka-separated-cluster   kafka                Delete               Creating   13s
kafka-separated-cluster   kafka                Delete               Running    63s

Check the pod status:

kubectl get pods -l app.kubernetes.io/instance=kafka-separated-cluster -n demo

Expected Output:

NAME                                         READY   STATUS    RESTARTS   AGE
kafka-separated-cluster-kafka-broker-0       2/2     Running   0          13m
kafka-separated-cluster-kafka-controller-0   2/2     Running   0          13m
kafka-separated-cluster-kafka-exporter-0     1/1     Running   0          12m

Once the cluster status becomes Running, your Kafka cluster is ready for use. A quick smoke test is sketched after the tip below.

TIP

If you are creating the cluster for the very first time, pulling the images may take some time before the pods reach the Running state.
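Optionally, you can run a quick smoke test against the broker. The sketch below assumes the broker container is named kafka, that the standard Kafka CLI scripts are on its PATH, and that the broker listens on port 9092; adjust these to match your image and addon configuration.

# Create a test topic from inside the broker pod (container name "kafka" and
# listener port 9092 are assumptions).
kubectl exec -n demo kafka-separated-cluster-kafka-broker-0 -c kafka -- \
  kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic lifecycle-smoke-test --partitions 1 --replication-factor 1

# List topics to confirm the broker is serving requests.
kubectl exec -n demo kafka-separated-cluster-kafka-broker-0 -c kafka -- \
  kafka-topics.sh --bootstrap-server localhost:9092 --list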

Cluster Lifecycle Operations

Stopping the Cluster

Stopping a Kafka Cluster in KubeBlocks:

1. Terminates all running pods
2. Retains persistent storage (PVCs)
3. Maintains the cluster configuration

This operation is ideal for:

• Temporary cost savings
• Maintenance windows
• Development environment pauses

Option 1: OpsRequest API

Create a Stop operation request:

apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: kafka-separated-cluster-stop-ops
  namespace: demo
spec:
  clusterName: kafka-separated-cluster
  type: Stop
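A sketch for submitting the request from a local file (the file name is illustrative) and following the operation until it completes:

# Submit the Stop OpsRequest and watch its status until it reports Succeed.
kubectl apply -f kafka-separated-cluster-stop-ops.yaml
kubectl get opsrequest kafka-separated-cluster-stop-ops -n demo -w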

Option 2: Cluster API Patch

Modify the cluster spec directly by patching the stop field:

kubectl patch cluster kafka-separated-cluster -n demo --type='json' -p='[
  { "op": "add", "path": "/spec/componentSpecs/0/stop", "value": true },
  { "op": "add", "path": "/spec/componentSpecs/1/stop", "value": true },
  { "op": "add", "path": "/spec/componentSpecs/2/stop", "value": true }
]'
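To confirm that the patch was recorded, you can read the stop flag back from the cluster spec (this mirrors the yq usage shown later in this guide):

# Each of the three components should now report stop: true.
kubectl get cluster kafka-separated-cluster -n demo -oyaml | yq '.spec.componentSpecs[].stop'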

Verifying Cluster Stop

To confirm a successful stop operation:

1. Check cluster status transition:

   kubectl get cluster kafka-separated-cluster -n demo -w

   Example Output:

   NAME                      CLUSTER-DEFINITION   TERMINATION-POLICY   STATUS     AGE
   kafka-separated-cluster   kafka                Delete               Stopping   16m3s
   kafka-separated-cluster   kafka                Delete               Stopped    16m55s

2. Verify no running pods:

   kubectl get pods -l app.kubernetes.io/instance=kafka-separated-cluster -n demo

   Example Output:

   No resources found in demo namespace.

3. Confirm persistent volumes remain:

   kubectl get pvc -l app.kubernetes.io/instance=kafka-separated-cluster -n demo

   Example Output:

   NAME                                                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
   data-kafka-separated-cluster-kafka-broker-0           Bound    pvc-ddd54e0f-414a-49ed-8e17-41e9f5082af1   20Gi       RWO            standard       <unset>                 14m
   metadata-kafka-separated-cluster-kafka-broker-0       Bound    pvc-d63b7d80-cac5-41b9-b694-6a298921003b   1Gi        RWO            standard       <unset>                 14m
   metadata-kafka-separated-cluster-kafka-controller-0   Bound    pvc-e6263eb1-405a-4090-b2bb-f92cca0ba36d   1Gi        RWO            standard       <unset>                 14m
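If you stopped the cluster via the OpsRequest API, you can also confirm that the operation itself finished successfully:

# The Stop OpsRequest should report a Succeed status once the pods are gone.
kubectl get opsrequest kafka-separated-cluster-stop-ops -n demo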

Starting the Cluster

Starting a stopped Kafka Cluster:

1. Recreates all pods
2. Reattaches persistent storage
3. Restores service endpoints

Expected behavior:

• Cluster returns to its previous state
• No data loss occurs
• Services resume automatically

Option 1: OpsRequest API

Initiate a Start operation request:

apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: kafka-separated-cluster-start-ops
  namespace: demo
spec:
  # Specifies the name of the Cluster resource that this operation is targeting.
  clusterName: kafka-separated-cluster
  type: Start
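As with the Stop operation, a sketch for submitting the request from a local file (illustrative file name) and tracking it:

# Submit the Start OpsRequest and follow it until it succeeds.
kubectl apply -f kafka-separated-cluster-start-ops.yaml
kubectl get opsrequest kafka-separated-cluster-start-ops -n demo -w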

Option 2: Cluster API Patch

Modify the cluster spec to resume operation, either by:

1. Setting stop: false (see the sketch below), or
2. Removing the stop field entirely

The following patch removes the field from every component:

kubectl patch cluster kafka-separated-cluster -n demo --type='json' -p='[
  { "op": "remove", "path": "/spec/componentSpecs/0/stop" },
  { "op": "remove", "path": "/spec/componentSpecs/1/stop" },
  { "op": "remove", "path": "/spec/componentSpecs/2/stop" }
]'
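If you prefer the first variant, keeping the field but setting it back to false, a sketch using JSON patch replace operations (the field must already exist for replace to succeed):

# Alternative to removing the field: explicitly set stop back to false
# on every component.
kubectl patch cluster kafka-separated-cluster -n demo --type='json' -p='[
  { "op": "replace", "path": "/spec/componentSpecs/0/stop", "value": false },
  { "op": "replace", "path": "/spec/componentSpecs/1/stop", "value": false },
  { "op": "replace", "path": "/spec/componentSpecs/2/stop", "value": false }
]'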

Verifying Cluster Start

To confirm a successful start operation:

1. Check cluster status transition:

   kubectl get cluster kafka-separated-cluster -n demo -w

   Example Output:

   NAME                      CLUSTER-DEFINITION   TERMINATION-POLICY   STATUS     AGE
   kafka-separated-cluster   kafka                Delete               Updating   24m
   kafka-separated-cluster   kafka                Delete               Running    24m
   kafka-separated-cluster   kafka                Delete               Running    24m

2. Verify pod recreation:

   kubectl get pods -n demo -l app.kubernetes.io/instance=kafka-separated-cluster

   Example Output:

   NAME                                         READY   STATUS    RESTARTS   AGE
   kafka-separated-cluster-kafka-broker-0       2/2     Running   0          2m4s
   kafka-separated-cluster-kafka-controller-0   2/2     Running   0          104s
   kafka-separated-cluster-kafka-exporter-0     1/1     Running   0          84s

Restarting the Cluster

Restart operations provide:

• Pod recreation without a full cluster stop
• Component-level granularity
• Minimal service disruption

Use cases:

• Configuration changes requiring a restart
• Resource refresh
• Troubleshooting

Check Components

The Kafka Cluster has three components. To get the list of components, run:

kubectl get cluster -n demo kafka-separated-cluster -oyaml | yq '.spec.componentSpecs[].name'

Expected Output:

kafka-controller
kafka-broker
kafka-exporter

Restart Components via OpsRequest API

List the specific components to be restarted:

apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: kafka-separated-cluster-restart-ops
  namespace: demo
spec:
  clusterName: kafka-separated-cluster
  type: Restart
  restart:
    - componentName: kafka-broker
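A sketch for submitting the restart request (illustrative file name); the next section shows how to track its progress:

# Submit the Restart OpsRequest; only the kafka-broker component is affected.
kubectl apply -f kafka-separated-cluster-restart-ops.yaml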

Verifying Restart Completion

To verify a successful component restart:

1. Track the OpsRequest progress:

   kubectl get opsrequest kafka-separated-cluster-restart-ops -n demo -w

   Example Output:

   NAME                                  TYPE      CLUSTER                   STATUS    PROGRESS   AGE
   kafka-separated-cluster-restart-ops   Restart   kafka-separated-cluster   Running   0/1        8s
   kafka-separated-cluster-restart-ops   Restart   kafka-separated-cluster   Running   1/1        22s
   kafka-separated-cluster-restart-ops   Restart   kafka-separated-cluster   Running   1/1        23s
   kafka-separated-cluster-restart-ops   Restart   kafka-separated-cluster   Succeed   1/1        23s

2. Check the pod status:

   kubectl get pods -n demo -l app.kubernetes.io/instance=kafka-separated-cluster

   Note: Pods show new creation timestamps after the restart. Only the pods belonging to the kafka-broker component have been restarted; a check is sketched below.
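One way to confirm that only the broker pods were recreated is to compare pod creation timestamps (custom-columns is standard kubectl output formatting):

# The kafka-broker pod should show a newer creation timestamp than the
# controller and exporter pods, which were not restarted.
kubectl get pods -n demo -l app.kubernetes.io/instance=kafka-separated-cluster \
  -o custom-columns=NAME:.metadata.name,CREATED:.metadata.creationTimestamp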

Once the operation is complete, the cluster will return to the Running state.

Summary

In this guide, you learned how to:

1. Stop a Kafka Cluster to suspend operations while retaining persistent storage.
2. Start a stopped cluster to bring it back online.
3. Restart specific cluster components to recreate their Pods without stopping the entire cluster.

By managing the lifecycle of your Kafka Cluster, you can optimize resource utilization, reduce costs, and maintain flexibility in your Kubernetes environment. KubeBlocks provides a seamless way to perform these operations, ensuring high availability and minimal disruption.
