Vertical Scaling for Milvus Clusters with KubeBlocks

This guide demonstrates how to vertically scale a Milvus Cluster managed by KubeBlocks. Vertical scaling adjusts the compute resources (CPU and memory) of Milvus instances while keeping the replica count unchanged. Key characteristics:

  • Non-disruptive: When properly configured, maintains availability during scaling
  • Granular: Adjust CPU, memory, or both independently
  • Reversible: Scale up or down as needed

KubeBlocks minimizes impact during scaling operations by following a controlled, role-aware update strategy.

Role-Aware Replicas (Primary/Secondary Replicas)

  • Secondary replicas update first: non-leader pods are restarted before the leader to minimize disruption.
  • Primary updates last: the primary pod restarts only after all secondaries are healthy.
  • Cluster state progresses from Updating → Running once all replicas are stable.

Role-Unaware Replicas (Ordinal-Based Scaling)

If replicas have no defined roles, updates follow Kubernetes pod ordinal order:

  • Highest ordinal first (e.g., pod-2 → pod-1 → pod-0) to ensure deterministic rollouts.
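
As a rough illustration of the ordinal-based order described above, the following shell sketch prints the sequence in which pods would be restarted. The function name `update_order` and the `pod` name prefix are hypothetical and exist only for this example; KubeBlocks performs the ordering internally.

```shell
# Hypothetical helper: print pod names from the highest ordinal down to 0.
# $1 = pod name prefix, $2 = replica count
update_order() {
  prefix=$1
  i=$(( $2 - 1 ))
  while [ "$i" -ge 0 ]; do
    echo "${prefix}-${i}"
    i=$(( i - 1 ))
  done
}

# With three replicas this prints pod-2, pod-1, pod-0 (one per line).
update_order pod 3
```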

Prerequisites

    Before proceeding, ensure the following:

    • Environment Setup:
      • A Kubernetes cluster is up and running.
      • The kubectl CLI tool is configured to communicate with your cluster.
      • KubeBlocks CLI and KubeBlocks Operator are installed. If they are not, follow the KubeBlocks installation instructions first.
    • Namespace Preparation: To keep resources isolated, create a dedicated namespace for this tutorial:
    kubectl create ns demo

    Expected Output:

    namespace/demo created

    Deploy a Milvus Cluster

    Refer to Deploying a Milvus Cluster with KubeBlocks to deploy a Milvus Cluster. The rest of this guide assumes the cluster is named milvus-cluster and runs in the demo namespace.

    Vertical Scale

    Expected Workflow:

    1. Pods are updated in pod ordinal order, from highest to lowest (e.g., pod-2 → pod-1 → pod-0)
    2. Cluster status transitions from Updating to Running

    Check Components

    A Milvus Cluster has five components. To list them, run:

    kubectl get cluster -n demo milvus-cluster -oyaml | yq '.spec.componentSpecs[].name'

    Expected Output:

    proxy
    mixcoord
    datanode
    indexnode
    querynode
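
Before changing anything, it can help to record what each component currently requests. The following is a minimal sketch, not part of the original procedure: the function name `show_component_resources` is hypothetical, and it assumes the cluster name and namespace used in this guide as defaults. It relies only on standard `kubectl` JSONPath output.

```shell
# Hypothetical helper: print each component name with its current resources stanza.
# $1 = cluster name (default: milvus-cluster), $2 = namespace (default: demo)
show_component_resources() {
  kubectl get cluster "${1:-milvus-cluster}" -n "${2:-demo}" \
    -o jsonpath='{range .spec.componentSpecs[*]}{.name}{"\t"}{.resources}{"\n"}{end}'
}
```

Running `show_component_resources` before and after scaling gives a simple way to compare the two states.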

    Option 1: Using VerticalScaling OpsRequest

    Apply the following YAML to scale up the resources for the querynode component:

    apiVersion: operations.kubeblocks.io/v1alpha1
    kind: OpsRequest
    metadata:
      name: milvus-cluster-vscale-ops
      namespace: demo
    spec:
      clusterName: milvus-cluster
      type: VerticalScaling
      verticalScaling:
      - componentName: querynode
        requests:
          cpu: '1'
          memory: 1Gi
        limits:
          cpu: '1'
          memory: 1Gi

    You can check the progress of the scaling operation with the following command:

    kubectl -n demo get ops milvus-cluster-vscale-ops -w

    Expected Result:

    NAME                        TYPE              CLUSTER          STATUS    PROGRESS   AGE
    milvus-cluster-vscale-ops   VerticalScaling   milvus-cluster   Running   0/2        33s
    milvus-cluster-vscale-ops   VerticalScaling   milvus-cluster   Running   1/2        55s
    milvus-cluster-vscale-ops   VerticalScaling   milvus-cluster   Running   2/2        88s
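
In scripts, watching with `-w` is less convenient than blocking until the operation finishes. The sketch below uses `kubectl wait` with a JSONPath condition (available in kubectl 1.23 and later). The function name `wait_for_ops` is hypothetical, and the terminal phase value `Succeed` is an assumption about the OpsRequest status; check `kubectl get ops` output in your environment before relying on it.

```shell
# Hypothetical helper: block until the OpsRequest reaches the assumed
# terminal phase "Succeed", or until the timeout expires.
# $1 = OpsRequest name, $2 = namespace, $3 = timeout (default: 10m)
wait_for_ops() {
  kubectl wait "opsrequest/$1" -n "$2" \
    --for=jsonpath='{.status.phase}'=Succeed \
    --timeout="${3:-10m}"
}
```

For the OpsRequest in this guide, the call would be `wait_for_ops milvus-cluster-vscale-ops demo`. A non-zero exit status means the phase was not reached within the timeout.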

    Option 2: Direct Cluster API Update

    Alternatively, update the spec.componentSpecs[].resources field of the Cluster resource to the desired values.

    apiVersion: apps.kubeblocks.io/v1
    kind: Cluster
    spec:
      componentSpecs:
        - name: querynode
          replicas: 1
          resources:
            requests:
              cpu: "1"       # Adjust to your needs.
              memory: "1Gi"  # Adjust to your needs.
            limits:
              cpu: "1"       # Adjust to your needs.
              memory: "1Gi"  # Adjust to your needs.
          ...

    NOTE

    A Milvus Cluster consists of five components. This tutorial changes only one of them (querynode); you can change the others in the same way.
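
One way to apply such a change without editing the whole manifest is a JSON patch. The following is a sketch under stated assumptions, not the documented KubeBlocks workflow: the function name `patch_component_resources` is hypothetical, and the component is addressed by its zero-based position in spec.componentSpecs, which you must confirm for your cluster (in the component listing shown earlier, querynode is the fifth entry, so its index would be 4).

```shell
# Hypothetical helper: set identical requests and limits on one component.
# $1 = zero-based index of the component in spec.componentSpecs
# $2 = CPU quantity (e.g. 1), $3 = memory quantity (e.g. 1Gi)
patch_component_resources() {
  # JSON Patch "add" creates the resources field or overwrites an existing one.
  patch=$(printf '[{"op":"add","path":"/spec/componentSpecs/%s/resources","value":{"requests":{"cpu":"%s","memory":"%s"},"limits":{"cpu":"%s","memory":"%s"}}}]' \
    "$1" "$2" "$3" "$2" "$3")
  kubectl patch cluster milvus-cluster -n demo --type=json -p "$patch"
}
```

For example, `patch_component_resources 4 1 1Gi` would request 1 CPU and 1Gi of memory for the component at index 4. Addressing by index is fragile if the component order changes, so editing the manifest by name remains the safer choice for long-lived automation.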

    Best Practices & Considerations

    Planning:

    • Scale during maintenance windows or low-traffic periods
    • Verify Kubernetes cluster has sufficient resources
    • Check for any ongoing operations before starting
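
The planning checks above can be sketched with two standard `kubectl` queries. The function name `preflight_checks` is hypothetical. Note that allocatable capacity is the total a node can offer to pods, not what is currently free; `kubectl describe node` shows how much is already allocated.

```shell
# Hypothetical helper: run two read-only checks before scaling.
# $1 = namespace (default: demo)
preflight_checks() {
  # Allocatable CPU and memory per node.
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
  # OpsRequests in the namespace; any that are still running should finish first.
  kubectl get opsrequest -n "${1:-demo}"
}
```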

    Execution:

    • Maintain balanced CPU-to-Memory ratios
    • Set identical requests/limits for guaranteed QoS
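
To check whether identical requests and limits actually produced the Guaranteed QoS class, you can read the class Kubernetes assigned to each pod. This is a minimal sketch: the function name `show_qos` is hypothetical, and the label selector `app.kubernetes.io/instance=<cluster name>` is an assumption about how the cluster's pods are labeled. Kubernetes assigns Guaranteed only when every container in the pod has matching requests and limits, so a sidecar without them results in Burstable.

```shell
# Hypothetical helper: list each pod of the cluster with its QoS class.
# $1 = cluster name (default: milvus-cluster), $2 = namespace (default: demo)
show_qos() {
  kubectl get pods -n "${2:-demo}" \
    -l "app.kubernetes.io/instance=${1:-milvus-cluster}" \
    -o custom-columns='POD:.metadata.name,QOS:.status.qosClass'
}
```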

    Post-Scaling:

    • Monitor resource utilization and application performance
    • Consider adjusting Milvus parameters if needed

    Verification

    Verify the updated resources by inspecting the cluster configuration or Pod details:

    kbcli cluster describe milvus-cluster -n demo

    Expected Output:

    Resources Allocation:
    COMPONENT   INSTANCE-TEMPLATE   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE-SIZE   STORAGE-CLASS
    milvus                          1 / 1                1Gi / 1Gi               data:20Gi      <none>
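
The Pod details can also be inspected directly with `kubectl`. The sketch below is an illustration under assumptions: the function name `show_pod_resources` is hypothetical, the two label keys are assumed to be present on pods created by KubeBlocks, and the first container in each pod is assumed to be the main Milvus container.

```shell
# Hypothetical helper: print the resources of the first container in each pod
# of one component.
# $1 = component name (default: querynode)
show_pod_resources() {
  kubectl get pods -n demo \
    -l "app.kubernetes.io/instance=milvus-cluster,apps.kubeblocks.io/component-name=${1:-querynode}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources}{"\n"}{end}'
}
```

If the selector returns nothing, `kubectl get pods -n demo --show-labels` shows the labels that are actually set.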

    Key Benefits of Vertical Scaling with KubeBlocks

    • Seamless Scaling: Pods are recreated in a specific order to ensure minimal disruption.
    • Dynamic Resource Adjustments: Easily scale CPU and memory based on workload requirements.
    • Flexibility: Choose between OpsRequest for dynamic scaling or direct API updates for precise control.
    • Improved Availability: The cluster remains operational while pods are restarted one at a time.

    Cleanup

    To remove all created resources, delete the Milvus Cluster along with its namespace:

    kubectl delete cluster milvus-cluster -n demo
    kubectl delete ns demo

    Summary

    In this guide, you learned how to:

    1. Deploy a Milvus Cluster managed by KubeBlocks.
    2. Perform vertical scaling by increasing or decreasing resources for a Milvus component (querynode in this guide).
    3. Use both OpsRequest and direct Cluster API updates to adjust resource allocations.

    Vertical scaling is a powerful tool for optimizing resource utilization and adapting to changing workload demands, ensuring your Milvus Cluster remains performant and resilient.

    © 2025 ApeCloud PTE. Ltd.