Horizontal Scaling for Milvus Clusters with KubeBlocks

This guide explains how to perform horizontal scaling (scale-out and scale-in) on a Milvus cluster managed by KubeBlocks. You'll learn how to use both OpsRequest and direct Cluster API updates to achieve this.

Prerequisites

    Before proceeding, ensure the following:

    • Environment Setup:
      • A Kubernetes cluster is up and running.
      • The kubectl CLI tool is configured to communicate with your cluster.
      • KubeBlocks CLI and KubeBlocks Operator are installed. Follow the installation instructions here.
    • Namespace Preparation: To keep resources isolated, create a dedicated namespace for this tutorial:
    kubectl create ns demo

    Expected Output:

    namespace/demo created

    Deploy a Milvus Cluster

    Refer to Deploying a Milvus Cluster with KubeBlocks to deploy a Milvus cluster.

    Scale-out (Add Replicas)

    Expected Workflow:

    1. A new pod is provisioned and transitions from Pending to Running.
    2. The cluster status changes from Updating to Running.

    Option 1: Using Horizontal Scaling OpsRequest

    Scale out the Milvus cluster by adding one replica to the querynode component:

    apiVersion: operations.kubeblocks.io/v1alpha1
    kind: OpsRequest
    metadata:
      name: milvus-cluster-scale-out-ops
      namespace: demo
    spec:
      clusterName: milvus-cluster
      type: HorizontalScaling
      horizontalScaling:
      - componentName: querynode
        # Specifies the replica changes for scaling out components
        scaleOut:
          # Specifies the replica changes for the component.
          # add one more replica to current component
          replicaChanges: 1
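The manifest above still has to be submitted to the cluster. The following is a minimal sketch of one way to do that from a shell. `render_scaling_ops` is a hypothetical helper, not part of KubeBlocks, and it assumes the same cluster name, namespace, and field names shown in the manifest.

```shell
# Hypothetical helper (not part of KubeBlocks): prints a HorizontalScaling
# OpsRequest manifest for the given direction, component, and replica delta.
# Cluster name and namespace are the ones used in this tutorial.
render_scaling_ops() {
  local direction="$1"   # "out" or "in"
  local component="$2"   # for example: querynode
  local delta="$3"       # number of replicas to add or remove
  local field
  case "$direction" in
    out) field="scaleOut" ;;
    in)  field="scaleIn" ;;
    *)   echo "direction must be 'out' or 'in'" >&2; return 1 ;;
  esac
  cat <<EOF
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: milvus-cluster-scale-${direction}-ops
  namespace: demo
spec:
  clusterName: milvus-cluster
  type: HorizontalScaling
  horizontalScaling:
  - componentName: ${component}
    ${field}:
      replicaChanges: ${delta}
EOF
}
```

To submit the request, pipe the rendered manifest to kubectl: `render_scaling_ops out querynode 1 | kubectl apply -f -`.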

    Monitor the progress of the scaling operation:

    kubectl get ops milvus-cluster-scale-out-ops -n demo -w

    Expected Result:

    NAME                           TYPE                CLUSTER          STATUS    PROGRESS   AGE
    milvus-cluster-scale-out-ops   HorizontalScaling   milvus-cluster   Running   0/1        9s
    milvus-cluster-scale-out-ops   HorizontalScaling   milvus-cluster   Running   1/1        16s
    milvus-cluster-scale-out-ops   HorizontalScaling   milvus-cluster   Succeed   1/1        16s
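In a script, watching with `-w` is less convenient than blocking until the operation finishes. The sketch below polls the OpsRequest instead. It assumes the STATUS column shown above is exposed at `.status.phase`, and that `Failed`, `Cancelled`, and `Aborted` are the terminal failure phases. `wait_for_ops` is a hypothetical helper name.

```shell
# Hypothetical helper: polls an OpsRequest until it reaches a terminal phase.
# Assumption: the phase is readable at .status.phase, and Failed, Cancelled,
# and Aborted are the terminal failure values.
wait_for_ops() {
  local name="$1" namespace="${2:-demo}" timeout="${3:-300}" interval="${4:-5}"
  local elapsed=0 phase
  while [ "$elapsed" -lt "$timeout" ]; do
    phase="$(kubectl get ops "$name" -n "$namespace" -o jsonpath='{.status.phase}')"
    case "$phase" in
      Succeed)
        echo "OpsRequest $name succeeded"
        return 0 ;;
      Failed|Cancelled|Aborted)
        echo "OpsRequest $name ended in phase $phase" >&2
        return 1 ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "Timed out waiting for OpsRequest $name" >&2
  return 2
}
```

For example, `wait_for_ops milvus-cluster-scale-out-ops demo 600` waits up to ten minutes and returns a non-zero status if the operation fails or times out.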

    Option 2: Direct Cluster API Update

    Alternatively, you can perform a direct update to the replicas field in the Cluster resource:

    apiVersion: apps.kubeblocks.io/v1
    kind: Cluster
    spec:
      componentSpecs:
      - name: querynode
        replicas: 3 # increase replicas from 2 to 3 by 1
      ...

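    Or patch the Cluster CR with the following command. The index in the path (4 here) is the zero-based position of the target component in spec.componentSpecs, so adjust it to match your cluster:

    kubectl patch cluster milvus-cluster -n demo --type='json' -p='[
      {
        "op": "replace",
        "path": "/spec/componentSpecs/4/replicas",
        "value": 3
      }
    ]'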
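A JSON patch path that hard-codes a list index breaks if the component order differs in your cluster. The sketch below looks up the index by component name before patching. `component_index` and `scale_component` are hypothetical helper names; the kubectl calls use standard `-o jsonpath` output and `kubectl patch --type=json`.

```shell
# Hypothetical helper: prints the zero-based index of a component in
# .spec.componentSpecs so the JSON patch path does not hard-code it.
component_index() {
  local cluster="$1" component="$2" namespace="${3:-demo}"
  local names name i=0
  names="$(kubectl get cluster "$cluster" -n "$namespace" \
    -o jsonpath='{.spec.componentSpecs[*].name}')" || return 1
  for name in $names; do
    if [ "$name" = "$component" ]; then
      echo "$i"
      return 0
    fi
    i=$((i + 1))
  done
  echo "component $component not found in cluster $cluster" >&2
  return 1
}

# Hypothetical helper: sets the replica count of one component by name.
scale_component() {
  local cluster="$1" component="$2" replicas="$3" namespace="${4:-demo}"
  local idx
  idx="$(component_index "$cluster" "$component" "$namespace")" || return 1
  kubectl patch cluster "$cluster" -n "$namespace" --type=json \
    -p "[{\"op\":\"replace\",\"path\":\"/spec/componentSpecs/${idx}/replicas\",\"value\":${replicas}}]"
}
```

For example, `scale_component milvus-cluster querynode 3` resolves the index of querynode and then applies the same patch as the manual command.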

    Verify Scale-Out

    After applying the operation, a new pod is created and the Milvus cluster status changes from Updating to Running. List the querynode pods to confirm:

    kubectl get pods -n demo -l app.kubernetes.io/instance=milvus-cluster,apps.kubeblocks.io/component-name=querynode

    Example Output:

    NAME                         READY   STATUS    RESTARTS   AGE
    milvus-cluster-querynode-0   1/1     Running   0          85m
    milvus-cluster-querynode-1   1/1     Running   0          87m
    milvus-cluster-querynode-2   1/1     Running   0          99m
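The visual check above can also be scripted. This is a minimal sketch that counts the Running pods of one component and compares the count with the expected replica number. The helper names are hypothetical; the label keys are the ones used in the command above.

```shell
# Hypothetical helper: counts pods of one component whose phase is Running.
count_running_pods() {
  local cluster="$1" component="$2" namespace="${3:-demo}"
  kubectl get pods -n "$namespace" \
    -l "app.kubernetes.io/instance=${cluster},apps.kubeblocks.io/component-name=${component}" \
    --field-selector=status.phase=Running --no-headers 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical helper: succeeds only if the Running pod count matches.
verify_replicas() {
  local cluster="$1" component="$2" expected="$3" namespace="${4:-demo}"
  local actual
  actual="$(count_running_pods "$cluster" "$component" "$namespace")"
  if [ "$actual" -eq "$expected" ]; then
    echo "OK: $actual/$expected $component pods are Running"
  else
    echo "Mismatch: $actual/$expected $component pods are Running" >&2
    return 1
  fi
}
```

For example, `verify_replicas milvus-cluster querynode 3` returns a non-zero status until all three querynode pods are Running.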

    Scale-in (Remove Replicas)

    Expected Workflow:

    1. The selected replica (the one with the largest ordinal) is removed.
    2. The pod is terminated gracefully.
    3. The cluster status changes from Updating to Running.
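To see in advance which pod the workflow above would remove, you can pick the pod with the largest ordinal from a pod listing. This sketch assumes pod names end in `-<ordinal>`, as in the example outputs in this guide. `highest_ordinal_pod` is a hypothetical helper name.

```shell
# Hypothetical helper: reads pod names on stdin (one per line) and prints the
# one with the largest numeric suffix after the last "-".
highest_ordinal_pod() {
  awk -F- '
    { n = $NF + 0; if (NR == 1 || n > max) { max = n; name = $0 } }
    END { if (NR > 0) print name }
  '
}
```

For example, `kubectl get pods -n demo -l app.kubernetes.io/instance=milvus-cluster,apps.kubeblocks.io/component-name=querynode -o name | highest_ordinal_pod` prints the scale-in candidate.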

    Option 1: Using Horizontal Scaling OpsRequest

    Scale in the Milvus cluster by removing one replica from the querynode component:

    apiVersion: operations.kubeblocks.io/v1alpha1
    kind: OpsRequest
    metadata:
      name: milvus-cluster-scale-in-ops
      namespace: demo
    spec:
      clusterName: milvus-cluster
      type: HorizontalScaling
      horizontalScaling:
      - componentName: querynode
        # Specifies the replica changes for scaling in components
        scaleIn:
          # Specifies the replica changes for the component.
          # remove one replica from current component
          replicaChanges: 1

    Monitor progress:

    kubectl get ops milvus-cluster-scale-in-ops -n demo -w

    Expected Result:

    NAME                          TYPE                CLUSTER          STATUS    PROGRESS   AGE
    milvus-cluster-scale-in-ops   HorizontalScaling   milvus-cluster   Running   0/1        8s
    milvus-cluster-scale-in-ops   HorizontalScaling   milvus-cluster   Running   1/1        24s
    milvus-cluster-scale-in-ops   HorizontalScaling   milvus-cluster   Succeed   1/1        24s

    Option 2: Direct Cluster API Update

    Alternatively, you can perform a direct update to the replicas field in the Cluster resource:

    apiVersion: apps.kubeblocks.io/v1
    kind: Cluster
    spec:
      componentSpecs:
      - name: querynode
        replicas: 2 # decrease replicas from 3 to 2 by 1

    Or patch the Cluster CR with the following command:

    kubectl patch cluster milvus-cluster -n demo --type='json' -p='[
      {
        "op": "replace",
        "path": "/spec/componentSpecs/4/replicas",
        "value": 2
      }
    ]'

    Verify Scale-In

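    After the operation completes, list the querynode pods to confirm that one replica was removed:

    kubectl get pods -n demo -l app.kubernetes.io/instance=milvus-cluster,apps.kubeblocks.io/component-name=querynode

    Example Output (Two Pods):

    NAME                         READY   STATUS    RESTARTS   AGE
    milvus-cluster-querynode-0   1/1     Running   0          101m
    milvus-cluster-querynode-1   1/1     Running   0          102m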

    NOTE

    A Milvus cluster consists of five components. This tutorial scales only the querynode component; you can scale the other components in the same way.
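Before scaling another component, it helps to see the component names and their current replica counts. The sketch below reads both from the Cluster resource using kubectl's JSONPath `range` syntax. `list_components` is a hypothetical helper name.

```shell
# Hypothetical helper: prints "<component-name><TAB><replicas>" for every
# entry in .spec.componentSpecs of the given Cluster.
list_components() {
  local cluster="$1" namespace="${2:-demo}"
  kubectl get cluster "$cluster" -n "$namespace" \
    -o jsonpath='{range .spec.componentSpecs[*]}{.name}{"\t"}{.replicas}{"\n"}{end}'
}
```

For example, `list_components milvus-cluster` lists each component with its replica count, one per line.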

    Best Practices

    When performing horizontal scaling:

    • Scale during low-traffic periods when possible
    • Monitor cluster health during scaling operations
    • Verify sufficient resources exist before scaling out
    • Consider storage requirements for new replicas
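Two of the checks above can be approximated with plain kubectl. This is a rough sketch only: allocatable capacity is the total a node offers to pods, not what is currently free, and a Pending pod is just one common symptom of insufficient resources. Both helper names are hypothetical.

```shell
# Hypothetical helper: shows allocatable CPU and memory per node.
# Note: this is total allocatable capacity, not currently free capacity.
show_node_allocatable() {
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
}

# Hypothetical helper: lists pods stuck in Pending in the given namespace,
# which often indicates that a new replica could not be scheduled.
list_pending_pods() {
  local namespace="${1:-demo}"
  kubectl get pods -n "$namespace" --field-selector=status.phase=Pending --no-headers
}
```

Run `show_node_allocatable` before scaling out and `list_pending_pods demo` afterwards to spot replicas that could not be scheduled.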

    Cleanup

    To remove all created resources, delete the Milvus cluster along with its namespace:

    kubectl delete cluster milvus-cluster -n demo
    kubectl delete ns demo

    Summary

    In this guide you learned how to:

    • Perform scale-out operations to add replicas to a Milvus cluster.
    • Perform scale-in operations to remove replicas from a Milvus cluster.
    • Use both OpsRequest and direct Cluster API updates for horizontal scaling.

    KubeBlocks ensures seamless scaling with minimal disruption to your database operations.

    © 2025 ApeCloud PTE. Ltd.