KubeBlocks
BlogsKubeBlocks Cloud
Overview
Quickstart

Topologies

Milvus Standalone Cluster
Milvus Cluster

Operations

Lifecycle Management
Vertical Scaling
Horizontal Scaling
Manage Milvus Services
Decommission Milvus Replica

Monitoring

Observability for Milvus Clusters

tpl

  1. Prerequisites
  2. Deploy a Milvus Cluster
  3. Vertical Scale
  4. Best Practices & Considerations
  5. Verification
  6. Key Benefits of Vertical Scaling with KubeBlocks
  7. Cleanup
  8. Summary

Vertical Scaling for Milvus Standalone Clusters with KubeBlocks

This guide demonstrates how to vertically scale a Milvus Cluster managed by KubeBlocks by adjusting compute resources (CPU and memory) while maintaining the same number of replicas.

Vertical scaling modifies compute resources (CPU and memory) for Milvus instances while maintaining replica count. Key characteristics:

  • Non-disruptive: When properly configured, maintains availability during scaling
  • Granular: Adjust CPU, memory, or both independently
  • Reversible: Scale up or down as needed

KubeBlocks ensures minimal impact during scaling operations by following a controlled, role-aware update strategy: Role-Aware Replicas (Primary/Secondary Replicas)

  • Secondary replicas update first – Non-leader pods are upgraded to minimize disruption.
  • Primary updates last – Only after all secondaries are healthy does the primary pod restart.
  • Cluster state progresses from Updating → Running once all replicas are stable.

Role-Unaware Replicas (Ordinal-Based Scaling) If replicas have no defined roles, updates follow Kubernetes pod ordinal order:

  • Highest ordinal first (e.g., pod-2 → pod-1 → pod-0) to ensure deterministic rollouts.

Prerequisites

    Before proceeding, ensure the following:

    • Environment Setup:
      • A Kubernetes cluster is up and running.
      • The kubectl CLI tool is configured to communicate with your cluster.
      • KubeBlocks CLI and KubeBlocks Operator are installed. Follow the installation instructions here.
    • Namespace Preparation: To keep resources isolated, create a dedicated namespace for this tutorial:
    kubectl create ns demo
    namespace/demo created
    

    Deploy a Milvus Cluster

    Please refer to Deploying a Milvus Cluster with KubeBlocks to deploy a milvus cluster.

    Vertical Scale

    Expected Workflow:

    1. Pods are updated in pod ordinal order, from highest to lowest, (e.g., pod-2 → pod-1 → pod-0)
    2. Cluster status transitions from Updating to Running

    Check Components

    There are five components in Milvus Cluster. To get the list of components,

    kubectl get cluster -n demo milvus-cluster -oyaml | yq '.spec.componentSpecs[].name'
    

    Expected Output:

    proxy
    mixcoord
    datanode
    indexnode
    querynode
    

    Option 1: Using VerticalScaling OpsRequest

    Apply the following YAML to scale up the resources for the querynode component:

    apiVersion: operations.kubeblocks.io/v1alpha1
    kind: OpsRequest
    metadata:
      name: milvus-cluster-vscale-ops
      namespace: demo
    spec:
      clusterName: milvus-cluster
      type: VerticalScaling
      verticalScaling:
      - componentName: querynode
        requests:
          cpu: '1'
          memory: 1Gi
        limits:
          cpu: '1'
          memory: 1Gi
    

    You can check the progress of the scaling operation with the following command:

    kubectl -n demo get ops milvus-cluster-vscale-ops -w
    

    Expected Result:

    NAME                        TYPE              CLUSTER          STATUS    PROGRESS   AGE
    milvus-cluster-vscale-ops   VerticalScaling   milvus-cluster   Running   0/2        33s
    milvus-cluster-vscale-ops   VerticalScaling   milvus-cluster   Running   1/2        55s
    milvus-cluster-vscale-ops   VerticalScaling   milvus-cluster   Running   2/2        88s
    

    Option 2: Direct Cluster API Update

    Alternatively, you may update spec.componentSpecs.resources field to the desired resources for vertical scale.

    apiVersion: apps.kubeblocks.io/v1
    kind: Cluster
    spec:
      componentSpecs:
        - name: querynode
          replicas: 1
          resources:
            requests:
              cpu: "1"       # Update the resources to your need.
              memory: "1Gi"  # Update the resources to your need.
            limits:
              cpu: "1"       # Update the resources to your need.
              memory: "1Gi"  # Update the resources to your need.
      ...
    
    NOTE

    Milvus Cluster consists of five components. This tutorial shows how to perform changes to one component. You may perform changes to other components in the same way.

    Best Practices & Considerations

    Planning:

    • Scale during maintenance windows or low-traffic periods
    • Verify Kubernetes cluster has sufficient resources
    • Check for any ongoing operations before starting

    Execution:

    • Maintain balanced CPU-to-Memory ratios
    • Set identical requests/limits for guaranteed QoS

    Post-Scaling:

    • Monitor resource utilization and application performance
    • Consider adjusting Milvus parameters if needed

    Verification

    Verify the updated resources by inspecting the cluster configuration or Pod details:

    kbcli cluster describe milvus-cluster -n demo
    

    Expected Output:

    Resources Allocation:
    COMPONENT   INSTANCE-TEMPLATE   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE-SIZE   STORAGE-CLASS
    milvus                          1 / 1                1Gi / 1Gi               data:20Gi      <none>
    

    Key Benefits of Vertical Scaling with KubeBlocks

    • Seamless Scaling: Pods are recreated in a specific order to ensure minimal disruption.
    • Dynamic Resource Adjustments: Easily scale CPU and memory based on workload requirements.
    • Flexibility: Choose between OpsRequest for dynamic scaling or direct API updates for precise control.
    • Improved Availability: The cluster remains operational during the scaling process, maintaining high availability.

    Cleanup

    To remove all created resources, delete the Milvus Cluster along with its namespace:

    kubectl delete cluster milvus-cluster -n demo
    kubectl delete ns demo
    

    Summary

    In this guide, you learned how to:

    1. Deploy a Milvus Cluster managed by KubeBlocks.
    2. Perform vertical scaling by increasing or decreasing resources for the milvus component.
    3. Use both OpsRequest and direct Cluster API updates to adjust resource allocations.

    Vertical scaling is a powerful tool for optimizing resource utilization and adapting to changing workload demands, ensuring your Milvus Cluster remains performant and resilient.

    © 2025 ApeCloud PTE. Ltd.