Vertical Scaling for Qdrant Clusters with KubeBlocks

This guide demonstrates how to vertically scale a Qdrant Cluster managed by KubeBlocks by adjusting compute resources (CPU and memory) while maintaining the same number of replicas.

Vertical scaling modifies compute resources (CPU and memory) for Qdrant instances while maintaining replica count. Key characteristics:

  • Non-disruptive: When properly configured, maintains availability during scaling
  • Granular: Adjust CPU, memory, or both independently
  • Reversible: Scale up or down as needed

KubeBlocks ensures minimal impact during scaling operations by following a controlled, role-aware update strategy; a quick way to check which strategy applies to your replicas is sketched after the lists below.

Role-Aware Replicas (Primary/Secondary Replicas)

  • Secondary replicas update first – Non-leader pods are upgraded to minimize disruption.
  • Primary updates last – Only after all secondaries are healthy does the primary pod restart.
  • Cluster state progresses from Updating → Running once all replicas are stable.

Role-Unaware Replicas (Ordinal-Based Scaling)

If replicas have no defined roles, updates follow Kubernetes pod ordinal order:

  • Highest ordinal first (e.g., pod-2 → pod-1 → pod-0) to ensure deterministic rollouts.
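To see which strategy applies to your cluster, you can inspect the role labels on the pods. This is a minimal sketch that assumes KubeBlocks exposes replica roles through the kubeblocks.io/role pod label; Qdrant replicas typically carry no role, so an empty column means ordinal-based updates apply.

```bash
# List the cluster's pods and show the KubeBlocks role label (if any).
# An empty ROLE column means the replicas are role-unaware and will be
# updated in descending pod ordinal order.
kubectl get pods -n demo \
  -l app.kubernetes.io/instance=qdrant-cluster \
  -L kubeblocks.io/role
```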

Prerequisites

Before proceeding, ensure the following:

• Environment Setup:
  • A Kubernetes cluster is up and running.
  • The kubectl CLI tool is configured to communicate with your cluster.
  • KubeBlocks CLI and KubeBlocks Operator are installed. Follow the installation instructions here.
• Namespace Preparation: To keep resources isolated, create a dedicated namespace for this tutorial:

```bash
kubectl create ns demo
```

Expected Output:

```
namespace/demo created
```
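If you are unsure whether the operator is ready, a quick check like the sketch below can help; it assumes KubeBlocks was installed into the default kb-system namespace, so adjust the namespace if your installation differs.

```bash
# Confirm the KubeBlocks operator pods are running before deploying clusters.
# "kb-system" is the assumed default installation namespace.
kubectl get pods -n kb-system
```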

    Deploy a Qdrant Cluster

      KubeBlocks uses a declarative approach for managing Qdrant Clusters. Below is an example configuration for deploying a Qdrant Cluster with 3 replicas.

      Apply the following YAML configuration to deploy the cluster:

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: qdrant-cluster
  namespace: demo
spec:
  terminationPolicy: Delete
  clusterDef: qdrant
  topology: cluster
  componentSpecs:
    - name: qdrant
      serviceVersion: 1.10.0
      replicas: 3
      resources:
        limits:
          cpu: "0.5"
          memory: "0.5Gi"
        requests:
          cpu: "0.5"
          memory: "0.5Gi"
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName: ""
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
```
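The manifest can be applied with kubectl; the filename below is only an illustrative placeholder.

```bash
# Save the manifest to a file (name is arbitrary) and apply it to the cluster.
kubectl apply -f qdrant-cluster.yaml
```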

      Verifying the Deployment

        Monitor the cluster status until it transitions to the Running state:

```bash
kubectl get cluster qdrant-cluster -n demo -w
```

        Expected Output:

```
NAME             CLUSTER-DEFINITION   TERMINATION-POLICY   STATUS     AGE
qdrant-cluster   qdrant               Delete               Creating   49s
qdrant-cluster   qdrant               Delete               Running    62s
```

        Check the pod status and roles:

```bash
kubectl get pods -l app.kubernetes.io/instance=qdrant-cluster -n demo
```

        Expected Output:

```
NAME                      READY   STATUS    RESTARTS   AGE
qdrant-cluster-qdrant-0   2/2     Running   0          1m43s
qdrant-cluster-qdrant-1   2/2     Running   0          1m28s
qdrant-cluster-qdrant-2   2/2     Running   0          1m14s
```

        Once the cluster status becomes Running, your Qdrant cluster is ready for use.

        TIP

        If you are creating the cluster for the very first time, it may take some time to pull images before running.
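As an optional sanity check, you can confirm that Qdrant itself is answering requests. The sketch below assumes KubeBlocks exposes a ClusterIP service named qdrant-cluster-qdrant on Qdrant's default HTTP port 6333; the actual service name may differ, so list the services in the namespace first.

```bash
# Confirm the service name first (the name used below is an assumption).
kubectl get svc -n demo

# Forward the Qdrant HTTP port to your workstation.
kubectl port-forward svc/qdrant-cluster-qdrant 6333:6333 -n demo &

# A healthy Qdrant instance responds on /collections.
curl http://localhost:6333/collections
```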

        Vertical Scale

        Expected Workflow:

1. Pods are updated in pod ordinal order, from highest to lowest (e.g., pod-2 → pod-1 → pod-0)
        2. Cluster status transitions from Updating to Running
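To observe this rollout while a scaling operation is in progress, you can watch the pods in a second terminal; this simple sketch reuses the instance label shown earlier in this guide.

```bash
# Watch pod restarts during the scaling operation; pods should cycle from the
# highest ordinal (qdrant-cluster-qdrant-2) down to the lowest (qdrant-cluster-qdrant-0).
kubectl get pods -n demo -l app.kubernetes.io/instance=qdrant-cluster -w
```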

        Option 1: Using VerticalScaling OpsRequest

        Apply the following YAML to scale up the resources for the qdrant component:

```yaml
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: qdrant-cluster-vscale-ops
  namespace: demo
spec:
  clusterName: qdrant-cluster
  type: VerticalScaling
  verticalScaling:
    - componentName: qdrant
      requests:
        cpu: '1'
        memory: 1Gi
      limits:
        cpu: '1'
        memory: 1Gi
```

        You can check the progress of the scaling operation with the following command:

```bash
kubectl -n demo get ops qdrant-cluster-vscale-ops -w
```

        Expected Result:

```
NAME                        TYPE              CLUSTER          STATUS    PROGRESS   AGE
qdrant-cluster-vscale-ops   VerticalScaling   qdrant-cluster   Running   0/3        32s
qdrant-cluster-vscale-ops   VerticalScaling   qdrant-cluster   Running   1/3        55s
qdrant-cluster-vscale-ops   VerticalScaling   qdrant-cluster   Running   2/3        82s
qdrant-cluster-vscale-ops   VerticalScaling   qdrant-cluster   Running   3/3        2m13s
```
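If the OpsRequest appears stuck, inspecting its conditions and events usually reveals the cause (for example, insufficient node resources); this is a generic kubectl sketch rather than a KubeBlocks-specific command.

```bash
# Show the OpsRequest's conditions and recent events for troubleshooting.
kubectl describe ops qdrant-cluster-vscale-ops -n demo
```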

        Option 2: Direct Cluster API Update

Alternatively, you can update the spec.componentSpecs.resources field directly to the desired values to vertically scale the component.

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
spec:
  componentSpecs:
    - name: qdrant
      replicas: 3
      resources:
        requests:
          cpu: "1"        # Update the resources to your need.
          memory: "1Gi"   # Update the resources to your need.
        limits:
          cpu: "1"        # Update the resources to your need.
          memory: "1Gi"   # Update the resources to your need.
  ...
```
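One way to make this change without editing the full manifest is a JSON patch. The sketch below assumes qdrant is the first entry (index 0) in componentSpecs, so confirm the ordering in your Cluster object before running it.

```bash
# Patch only the resources of the first component (assumed to be "qdrant").
kubectl patch cluster qdrant-cluster -n demo --type='json' -p='[
  {"op": "replace", "path": "/spec/componentSpecs/0/resources", "value": {
    "requests": {"cpu": "1", "memory": "1Gi"},
    "limits":   {"cpu": "1", "memory": "1Gi"}
  }}
]'
```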

        Best Practices & Considerations

        Planning:

        • Scale during maintenance windows or low-traffic periods
        • Verify Kubernetes cluster has sufficient resources
        • Check for any ongoing operations before starting

        Execution:

        • Maintain balanced CPU-to-Memory ratios
• Set identical requests and limits to obtain the Guaranteed QoS class

        Post-Scaling:

        • Monitor resource utilization and application performance
        • Consider adjusting Qdrant parameters if needed
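Related to the request/limit guidance above, you can confirm the resulting QoS class once scaling completes; this is a plain kubectl sketch using one of the pods created earlier in this guide.

```bash
# Pods whose requests equal their limits for every container are assigned the
# "Guaranteed" QoS class, which protects them from eviction under node pressure.
kubectl get pod qdrant-cluster-qdrant-0 -n demo \
  -o jsonpath='{.status.qosClass}{"\n"}'
```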

        Verification

        Verify the updated resources by inspecting the cluster configuration or Pod details:

```bash
kbcli cluster describe qdrant-cluster -n demo
```

        Expected Output:

```
Resources Allocation:
COMPONENT   INSTANCE-TEMPLATE   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE-SIZE   STORAGE-CLASS
qdrant                          1 / 1                1Gi / 1Gi               data:20Gi      <none>
```
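If kbcli is not available, the same information can be read from the Pod spec with plain kubectl:

```bash
# Print the resources configured for every container in the first replica.
kubectl get pod qdrant-cluster-qdrant-0 -n demo \
  -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.resources}{"\n"}{end}'
```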

        Key Benefits of Vertical Scaling with KubeBlocks

        • Seamless Scaling: Pods are recreated in a specific order to ensure minimal disruption.
        • Dynamic Resource Adjustments: Easily scale CPU and memory based on workload requirements.
        • Flexibility: Choose between OpsRequest for dynamic scaling or direct API updates for precise control.
        • Improved Availability: The cluster remains operational during the scaling process, maintaining high availability.

        Cleanup

        To remove all created resources, delete the Qdrant Cluster along with its namespace:

```bash
kubectl delete cluster qdrant-cluster -n demo
kubectl delete ns demo
```

        Summary

        In this guide, you learned how to:

        1. Deploy a Qdrant Cluster managed by KubeBlocks.
        2. Perform vertical scaling by increasing or decreasing resources for the qdrant component.
        3. Use both OpsRequest and direct Cluster API updates to adjust resource allocations.

        Vertical scaling is a powerful tool for optimizing resource utilization and adapting to changing workload demands, ensuring your Qdrant Cluster remains performant and resilient.
