PostgreSQL High Availability Architecture

This page describes how KubeBlocks deploys a PostgreSQL high-availability (HA) cluster on Kubernetes — covering the resource hierarchy, pod internals, traffic routing, and automatic failover.

[Architecture diagram] Clients reach the cluster through the ClusterIP service pg-cluster-postgresql (5432 for PostgreSQL, 6432 for pgbouncer), whose endpoints always select the pod labeled kubeblocks.io/role=primary, or through the headless service pg-cluster-postgresql-headless for stable per-pod DNS and direct replica access. Each of the three pods (postgresql-0 as primary, postgresql-1 and postgresql-2 as replicas) runs the postgresql (Patroni), pgbouncer, dbctl (kbagent), and pg-exporter containers and mounts its own 20 Gi PVC. The primary streams WAL to both replicas (sync or async, configurable). Patroni uses the Kubernetes API as its DCS: a ConfigMap for configuration, an Endpoints leader lease, and Secrets for system-account passwords. The KubeBlocks Operator watches and reconciles the Cluster → Component → InstanceSet → Pod hierarchy, probes pod roles, updates role labels, and drives switchover and scaling operations.

Resource Hierarchy

KubeBlocks models a PostgreSQL cluster as a hierarchy of Kubernetes custom resources:

Cluster  →  Component  →  InstanceSet  →  Pod × N
| Resource | Role |
| --- | --- |
| Cluster | User-facing declaration — specifies topology, replicas, storage, and resources |
| Component | Generated automatically; references a ComponentDefinition that describes container specs, lifecycle actions, and services |
| InstanceSet | KubeBlocks custom workload (replaces StatefulSet); manages pods with stable identities and role awareness |
| Pod | Actual running instance; each pod gets a unique ordinal and its own PVC |
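
A minimal Cluster manifest makes this hierarchy concrete. The sketch below is illustrative: the names (pg-cluster, the postgresql component) are placeholders, and the field names follow the apps.kubeblocks.io/v1alpha1 API, which may differ in newer KubeBlocks releases.

```yaml
# Illustrative Cluster manifest; KubeBlocks generates the Component,
# InstanceSet, and Pods from this single resource.
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: pg-cluster
  namespace: default
spec:
  clusterDefinitionRef: postgresql
  terminationPolicy: Delete
  componentSpecs:
    - name: postgresql
      replicas: 3                     # one primary + two replicas
      resources:
        limits:
          cpu: "1"
          memory: 1Gi
      volumeClaimTemplates:
        - name: data                  # becomes PVC data-{ordinal} per pod
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 20Gi
```

Applying this one resource is enough: the operator creates the Component, the InstanceSet, and the three pods with their PVCs.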

Containers Inside Each Pod

Every PostgreSQL pod runs four containers:

| Container | Ports | Purpose |
| --- | --- | --- |
| postgresql (Patroni + Spilo) | 5432 (PG), 8008 (Patroni API) | Database engine with built-in HA via Patroni |
| pgbouncer | 6432 | Per-pod connection pool (always enabled, no on/off switch) |
| dbctl (kbagent) | 5001 | Role probe endpoint — KubeBlocks queries GET /v1.0/getrole every second |
| pg-exporter | 9187 | Prometheus metrics exporter |

An init container (pg-init-container) runs postgres-pre-setup.sh before the main containers start to initialize the data directory and configuration files.

Each pod mounts its own PVC (data-{ordinal}, default 20 Gi) at /home/postgres/pgdata, providing independent persistent storage.

High Availability via Patroni

KubeBlocks uses Patroni for PostgreSQL HA, with the Kubernetes API as the DCS (Distributed Configuration Store):

| Kubernetes Object | Purpose |
| --- | --- |
| ConfigMap {scope}-config | Cluster-wide Patroni configuration; TTL 30 s, loop interval 10 s |
| Endpoints {scope} | Leader lease — the pod holding this lease is the primary |
| Secret account-* | Auto-generated passwords for system accounts |

On startup, every pod joins the Patroni cluster under the same scope. Patroni holds a leader election by acquiring the Endpoints lease; the winner becomes primary and the others stream WAL as replicas.
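
The leader lease is ordinary Kubernetes metadata. As a rough sketch (annotation keys vary across Patroni versions, so treat this as illustrative rather than verbatim), the {scope} Endpoints object carries the leader record in its annotations:

```yaml
# Illustrative shape of the Patroni leader record; inspect a real one with:
#   kubectl get endpoints pg-cluster-postgresql -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: pg-cluster-postgresql          # the Patroni {scope}
  annotations:
    leader: pg-cluster-postgresql-0    # member currently holding the lease
    ttl: "30"                          # lease expires if not renewed in time
    renewTime: "2026-01-01T00:00:10Z"  # last heartbeat timestamp
```

Because the lease is just an annotated object, a replica takes over simply by winning the update race on it once the TTL has expired.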

Traffic Routing

KubeBlocks creates two services for each PostgreSQL cluster:

| Service | Type | Ports | Selector |
| --- | --- | --- | --- |
| {cluster}-postgresql | ClusterIP | 5432 (PG), 6432 (pgbouncer) | kubeblocks.io/role=primary |
| {cluster}-postgresql-headless | Headless | — | all pods |

The key mechanism is roleSelector: primary on the ClusterIP service. KubeBlocks probes each pod's role via dbctl every second and updates the pod label kubeblocks.io/role. The service's Endpoints therefore always point at the current primary — no VIP or external load balancer required.
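
The generated read/write Service is, at its core, a plain Kubernetes Service with the role label in its selector. A sketch (the exact set of labels KubeBlocks adds may vary by version):

```yaml
# Approximation of the generated read/write Service.
apiVersion: v1
kind: Service
metadata:
  name: pg-cluster-postgresql                # {cluster}-postgresql
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/instance: pg-cluster   # illustrative instance label
    kubeblocks.io/role: primary              # only the current primary matches
  ports:
    - name: tcp-postgresql
      port: 5432
      targetPort: 5432
    - name: tcp-pgbouncer
      port: 6432
      targetPort: 6432
```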

  • Read/write traffic: connect to {cluster}-postgresql:5432 or :6432 (pgbouncer)
  • Direct pod access (e.g., read replicas, Patroni heartbeats): use the headless service DNS pod-N.{cluster}-postgresql-headless.{namespace}.svc.cluster.local
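
The two access paths can be written out concretely. The cluster name, namespace, and ordinal below are placeholders; only the DNS pattern follows from the services described above.

```shell
CLUSTER=pg-cluster
NAMESPACE=default

# Read/write: the ClusterIP service always resolves to the current primary.
RW_HOST="${CLUSTER}-postgresql.${NAMESPACE}.svc.cluster.local"

# Direct access to one replica via the headless service (stable per-pod DNS).
ORDINAL=1
POD_HOST="${CLUSTER}-postgresql-${ORDINAL}.${CLUSTER}-postgresql-headless.${NAMESPACE}.svc.cluster.local"

echo "$RW_HOST"    # pg-cluster-postgresql.default.svc.cluster.local
echo "$POD_HOST"   # pg-cluster-postgresql-1.pg-cluster-postgresql-headless.default.svc.cluster.local

# From inside the cluster you would then connect with, e.g.:
#   psql "host=$RW_HOST port=5432 user=postgres"    # PostgreSQL directly
#   psql "host=$RW_HOST port=6432 user=postgres"    # through pgbouncer
```
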
NOTE

pgbouncer proxies connections to the local pod's PostgreSQL instance (its own pod IP), not to the primary. Traffic steering to the primary is handled by the ClusterIP service's role selector, not by pgbouncer.

Automatic Failover

When the primary pod fails, the following sequence restores service automatically:

  1. Primary pod crashes (process death, node failure, network partition)
  2. Patroni detects heartbeat timeout — the leader lease expires (TTL ≈ 30 s)
  3. Replica acquires the lease — the first healthy replica to write the Endpoints object wins and promotes itself to primary
  4. KubeBlocks detects the role change — the dbctl role probe returns primary for the new winner
  5. Pod label updated — kubeblocks.io/role=primary is applied to the new primary pod
  6. Service Endpoints switch — the ClusterIP service automatically routes traffic to the new primary

Total failover time is typically within 30–60 seconds, bounded by Patroni's TTL and the role probe interval.

For a planned switchover (e.g., maintenance), KubeBlocks calls the Patroni switchover API via switchover.sh, which performs a graceful demotion of the current primary and promotion of a chosen replica with no data loss.
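
A planned switchover is typically triggered declaratively through an OpsRequest. The sketch below uses illustrative names, and the exact field spelling (e.g., clusterRef vs. clusterName) depends on the KubeBlocks API version:

```yaml
# Illustrative switchover request: gracefully demote the current primary
# and promote the chosen replica.
apiVersion: apps.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: pg-switchover
spec:
  clusterRef: pg-cluster                      # clusterName in some releases
  type: Switchover
  switchover:
    - componentName: postgresql
      instanceName: pg-cluster-postgresql-1   # replica to promote
```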

System Accounts

KubeBlocks automatically creates and manages the following PostgreSQL system accounts. Passwords are auto-generated and stored in Secrets named {cluster}-{component}-account-{name}.

| Account | Role | Purpose |
| --- | --- | --- |
| postgres | Superuser | Default admin account |
| kbadmin | Superuser | KubeBlocks internal management |
| kbdataprotection | Superuser | Backup and restore operations |
| kbprobe | Monitor (read-only) | Health checks |
| kbmonitoring | Monitor | Prometheus metrics collection |
| kbreplicator | Replication | Account used by replicas for streaming replication |
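
Given the naming scheme above, retrieving a password is a get-plus-decode. The kubectl line needs a running cluster and uses illustrative names; the decode step itself can be demonstrated locally:

```shell
# Fetch the auto-generated postgres password (requires a running cluster;
# the Secret name follows {cluster}-{component}-account-{name}):
#   kubectl get secret pg-cluster-postgresql-account-postgres \
#     -o jsonpath='{.data.password}' | base64 -d
#
# Secret data is base64-encoded, so the final decode step works like this:
ENCODED=$(printf 's3cret' | base64)
PASSWORD=$(printf '%s' "$ENCODED" | base64 -d)
echo "$PASSWORD"   # prints: s3cret
```
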

© 2026 KUBEBLOCKS INC