Horizontal Scaling for Milvus Clusters with KubeBlocks

This guide explains how to perform horizontal scaling (scale-out and scale-in) on a Milvus cluster managed by KubeBlocks. You'll learn how to use both OpsRequest and direct Cluster API updates to achieve this.

Prerequisites

    Before proceeding, ensure the following:

    • Environment Setup:
      • A Kubernetes cluster is up and running.
      • The kubectl CLI tool is configured to communicate with your cluster.
      • The KubeBlocks CLI and KubeBlocks Operator are installed (see the KubeBlocks installation instructions).
    • Namespace Preparation: To keep resources isolated, create a dedicated namespace for this tutorial:
    kubectl create ns demo
    namespace/demo created
    

    Deploy a Milvus Cluster

    Refer to Deploying a Milvus Cluster with KubeBlocks to deploy a Milvus cluster. The examples in this guide assume a cluster named milvus-cluster in the demo namespace.

    Scale-out (Add Replicas)

    Expected Workflow:

    1. A new pod is provisioned and transitions from Pending to Running.
    2. The cluster status changes from Updating to Running.

    Option 1: Using Horizontal Scaling OpsRequest

    Scale out the Milvus cluster by adding one replica to the querynode component:

    apiVersion: operations.kubeblocks.io/v1alpha1
    kind: OpsRequest
    metadata:
      name: milvus-cluster-scale-out-ops
      namespace: demo
    spec:
      clusterName: milvus-cluster
      type: HorizontalScaling
      horizontalScaling:
      - componentName: querynode
        # Specifies the replica changes for scaling out the component
        scaleOut:
          # Number of replicas to add to the component;
          # here, one more replica is added
          replicaChanges: 1
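
The manifest above still has to be submitted to the cluster. One way to do that is sketched below, assuming a POSIX shell and kubectl access: a small helper renders the manifest, and the output is piped to kubectl apply. The function name render_scale_ops is made up for this example; the field values are the ones used in this guide.

```shell
# Render a HorizontalScaling OpsRequest manifest on stdout.
# render_scale_ops is a hypothetical helper, not part of KubeBlocks.
#   $1: OpsRequest name
#   $2: scaleOut or scaleIn
#   $3: component name (for example querynode)
#   $4: number of replicas to add or remove
render_scale_ops() {
  cat <<EOF
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: $1
  namespace: demo
spec:
  clusterName: milvus-cluster
  type: HorizontalScaling
  horizontalScaling:
  - componentName: $3
    $2:
      replicaChanges: $4
EOF
}

# Usage (requires kubectl access to the cluster):
#   render_scale_ops milvus-cluster-scale-out-ops scaleOut querynode 1 | kubectl apply -f -
```

Saving the manifest to a file and running kubectl apply -f on that file works just as well; the helper only avoids keeping two nearly identical files for scale-out and scale-in.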
    

    Monitor the progress of the scaling operation:

    kubectl get ops milvus-cluster-scale-out-ops -n demo -w
    

    Expected Result:

    NAME                             TYPE                CLUSTER          STATUS    PROGRESS   AGE
    milvus-cluster-scale-out-ops     HorizontalScaling   milvus-cluster   Running   0/1        9s
    milvus-cluster-scale-out-ops     HorizontalScaling   milvus-cluster   Running   1/1        16s
    milvus-cluster-scale-out-ops     HorizontalScaling   milvus-cluster   Succeed   1/1        16s
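
In a script, a call that blocks until the operation finishes is often more convenient than watching with -w. The sketch below polls the OpsRequest until it reaches a final state. It assumes the phase is exposed at .status.phase and that Failed is the terminal failure value; Succeed matches the output above, but the field path and the Failed value are assumptions to check against your KubeBlocks version. The names wait_for_ops and POLL_INTERVAL are hypothetical.

```shell
# Poll an OpsRequest until it reports Succeed or Failed, or attempts run out.
#   $1: OpsRequest name, $2: namespace, $3: maximum number of polls (default 60)
# Assumption: the phase is readable at .status.phase.
wait_for_ops() {
  name="$1"
  ns="$2"
  attempts="${3:-60}"
  while [ "$attempts" -gt 0 ]; do
    phase=$(kubectl get ops "$name" -n "$ns" -o jsonpath='{.status.phase}')
    case "$phase" in
      Succeed) echo "ok"; return 0 ;;
      Failed)  echo "failed"; return 1 ;;
    esac
    attempts=$((attempts - 1))
    # Seconds between polls; override with POLL_INTERVAL.
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "timeout"
  return 2
}

# Usage (requires kubectl access to the cluster):
#   wait_for_ops milvus-cluster-scale-out-ops demo
```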
    

    Option 2: Direct Cluster API Update

    Alternatively, you can perform a direct update to the replicas field in the Cluster resource:

    apiVersion: apps.kubeblocks.io/v1
    kind: Cluster
    spec:
      componentSpecs:
        - name: querynode
          replicas: 3 # increase replicas from 2 to 3
    ...
    

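    Or you can patch the Cluster CR with the following command. The path assumes that querynode is the fifth entry (index 4) in spec.componentSpecs; adjust the index if your cluster lists its components in a different order: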

    kubectl patch cluster milvus-cluster -n demo --type='json' -p='[
      {
        "op": "replace",
        "path": "/spec/componentSpecs/4/replicas",
        "value": 3
      }
    ]'
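
A JSON patch addresses list entries by position, so a hard-coded index silently targets the wrong component if the order of componentSpecs differs. The sketch below looks the index up by component name before building the patch. The helper names component_index and replicas_patch are made up for this example; the kubectl calls in the usage comment use standard jsonpath output.

```shell
# Print the zero-based position of component $1 within the names that follow.
# Returns 1 without printing anything if the component is not found.
component_index() {
  target="$1"
  shift
  i=0
  for name in "$@"; do
    if [ "$name" = "$target" ]; then
      echo "$i"
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# Build the JSON patch body for a component index ($1) and replica count ($2).
replicas_patch() {
  printf '[{"op":"replace","path":"/spec/componentSpecs/%s/replicas","value":%s}]' "$1" "$2"
}

# Usage (requires kubectl access to the cluster):
#   names=$(kubectl get cluster milvus-cluster -n demo -o jsonpath='{.spec.componentSpecs[*].name}')
#   idx=$(component_index querynode $names)   # $names is intentionally unquoted
#   kubectl patch cluster milvus-cluster -n demo --type=json -p "$(replicas_patch "$idx" 3)"
```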
    

    Verify Scale-Out

    After applying the operation, a new pod is created and the Milvus cluster status goes from Updating to Running. List the querynode pods to confirm:

    kubectl get pods -n demo -l app.kubernetes.io/instance=milvus-cluster,apps.kubeblocks.io/component-name=querynode
    

    Example Output:

    NAME                         READY   STATUS    RESTARTS   AGE
    milvus-cluster-querynode-0   1/1     Running   0          85m
    milvus-cluster-querynode-1   1/1     Running   0          87m
    milvus-cluster-querynode-2   1/1     Running   0          99m
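
To check the result in a script rather than by eye, the pod listing can be reduced to a single number. This is a minimal sketch that counts pods whose STATUS is Running and whose READY column shows all containers ready; count_ready is a hypothetical helper name, and it expects the default kubectl get pods columns without the header row.

```shell
# Count pods that are Running with all containers ready.
# Reads `kubectl get pods --no-headers` output on stdin:
# column 2 is READY (for example 1/1), column 3 is STATUS.
count_ready() {
  awk '{
    split($2, r, "/")
    if ($3 == "Running" && r[1] == r[2]) n++
  } END { print n + 0 }'
}

# Usage (requires kubectl access to the cluster):
#   kubectl get pods -n demo --no-headers \
#     -l app.kubernetes.io/instance=milvus-cluster,apps.kubeblocks.io/component-name=querynode \
#     | count_ready
```

After the scale-out in this guide, the count should match the new replica count of 3.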
    

    Scale-in (Remove Replicas)

    Expected Workflow:

    1. Selected replica (the one with the largest ordinal) is removed
    2. Pod is terminated gracefully
    3. Cluster status changes from Updating to Running
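
Before scaling in, it can help to know which pod will go away. Following the workflow above, that is the pod with the largest ordinal. The sketch below picks that pod from a listing; highest_ordinal_pod is a hypothetical helper name, and the ordinal is assumed to be the number after the last dash in the pod name, as in the example output in this guide.

```shell
# Print the pod name with the largest trailing ordinal.
# Reads one pod per line on stdin; the pod name must be the first column.
highest_ordinal_pod() {
  awk '{
    n = $1
    sub(/.*-/, "", n)   # keep only the text after the last dash
    if (NR == 1 || n + 0 > max) { max = n + 0; pod = $1 }
  } END { if (NR > 0) print pod }'
}

# Usage (requires kubectl access to the cluster):
#   kubectl get pods -n demo --no-headers \
#     -l app.kubernetes.io/instance=milvus-cluster,apps.kubeblocks.io/component-name=querynode \
#     | highest_ordinal_pod
```

The comparison is numeric, so an ordinal of 10 ranks above 9.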

    Option 1: Using Horizontal Scaling OpsRequest

    Scale in the Milvus cluster by removing one replica from the querynode component:

    apiVersion: operations.kubeblocks.io/v1alpha1
    kind: OpsRequest
    metadata:
      name: milvus-cluster-scale-in-ops
      namespace: demo
    spec:
      clusterName: milvus-cluster
      type: HorizontalScaling
      horizontalScaling:
      - componentName: querynode
        # Specifies the replica changes for scaling in components
        scaleIn:
          # Specifies the replica changes for the component.
          # remove one replica from current component
          replicaChanges: 1
    

    Monitor progress:

    kubectl get ops milvus-cluster-scale-in-ops -n demo -w
    

    Expected Result:

    NAME                          TYPE                CLUSTER          STATUS    PROGRESS   AGE
    milvus-cluster-scale-in-ops   HorizontalScaling   milvus-cluster   Running   0/1        8s
    milvus-cluster-scale-in-ops   HorizontalScaling   milvus-cluster   Running   1/1        24s
    milvus-cluster-scale-in-ops   HorizontalScaling   milvus-cluster   Succeed   1/1        24s
    

    Option 2: Direct Cluster API Update

    Alternatively, you can perform a direct update to the replicas field in the Cluster resource:

    apiVersion: apps.kubeblocks.io/v1
    kind: Cluster
    spec:
      componentSpecs:
        - name: querynode
          replicas: 2 # decrease replicas from 3 to 2
    

    Or you can patch the Cluster CR with the following command. As before, the path assumes that querynode is at index 4 in spec.componentSpecs:

    kubectl patch cluster milvus-cluster -n demo --type='json' -p='[
      {
        "op": "replace",
        "path": "/spec/componentSpecs/4/replicas",
        "value": 2
      }
    ]'
    

    Verify Scale-In

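    After the operation completes, list the querynode pods to confirm that one replica has been removed:

    kubectl get pods -n demo -l app.kubernetes.io/instance=milvus-cluster,apps.kubeblocks.io/component-name=querynode
    

    Example Output (two pods):
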
    NAME                         READY   STATUS    RESTARTS   AGE
    milvus-cluster-querynode-0   1/1     Running   0          101m
    milvus-cluster-querynode-1   1/1     Running   0          102m
    
    NOTE

    A Milvus cluster consists of five components. This tutorial scales only one of them (querynode); you can scale the other components in the same way.

    Best Practices

    When performing horizontal scaling:

    • Scale during low-traffic periods when possible
    • Monitor cluster health during scaling operations
    • Verify sufficient resources exist before scaling out
    • Consider storage requirements for new replicas

    Cleanup

    To remove all created resources, delete the Milvus cluster along with its namespace:

    kubectl delete cluster milvus-cluster -n demo
    kubectl delete ns demo
    

    Summary

    In this guide you learned how to:

    • Perform scale-out operations to add replicas to a Milvus cluster.
    • Perform scale-in operations to remove replicas from a Milvus cluster.
    • Use both OpsRequest and direct Cluster API updates for horizontal scaling.

    KubeBlocks ensures seamless scaling with minimal disruption to your database operations.

    © 2025 ApeCloud PTE. Ltd.