PostgreSQL High Availability Architecture

This page describes how KubeBlocks deploys a PostgreSQL high-availability (HA) cluster on Kubernetes — covering the resource hierarchy, pod internals, traffic routing, and automatic failover.

[Architecture diagram] Clients connect through the read/write Service pg-cluster-postgresql-postgresql (ClusterIP, :5432 PostgreSQL / :6432 pgbouncer), whose selector kubeblocks.io/role=primary keeps the Endpoints pointed at the current primary; three pods (postgresql-0 primary, postgresql-1 and postgresql-2 replicas) each run four containers: postgresql with Patroni (:5432 pg, :8008 Patroni API), pgbouncer (:6432), the dbctl role probe (:5001 /v1.0/getrole), and pg-exporter (:9187), each backed by its own PVC (data-N, 20Gi). The primary streams WAL to the replicas (sync/async configurable). A headless service provides stable pod DNS for internal use (replication, HA heartbeat, operator probes) and is not a client endpoint. Every Patroni agent reads and writes the Kubernetes API via :8008 as its DCS: ConfigMap {scope}-config holds cluster configuration (TTL 30 s), ConfigMap {scope} holds the leader lease and heartbeat (polled every 10 s), and account-* Secrets hold system-account passwords. On failover, leader election via the Kubernetes lock moves the lease and the Service re-routes.

Resource Hierarchy

KubeBlocks models a PostgreSQL cluster as a hierarchy of Kubernetes custom resources:

Cluster  →  Component  →  InstanceSet  →  Pod × N
| Resource | Role |
| --- | --- |
| `Cluster` | User-facing declaration; specifies topology, replicas, storage, and resources |
| `Component` | Generated automatically; references a `ComponentDefinition` that describes container specs, lifecycle actions, and services |
| `InstanceSet` | KubeBlocks custom workload (replaces StatefulSet); manages pods with stable identities and role awareness |
| `Pod` | Actual running instance; each pod gets a unique ordinal and its own PVC |
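The hierarchy starts from a single user-authored manifest. The sketch below shows a minimal `Cluster` resource that would produce the three-pod topology described on this page; all names and values (namespace, resource sizes, the `apps.kubeblocks.io/v1` API version) are illustrative and may differ across KubeBlocks versions.

```yaml
# Hypothetical minimal Cluster manifest; KubeBlocks generates the
# Component, InstanceSet, Pods, and PVCs from this declaration.
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: pg-cluster
  namespace: demo
spec:
  terminationPolicy: Delete
  componentSpecs:
    - name: postgresql            # becomes the Component
      componentDef: postgresql    # ComponentDefinition with container specs
      replicas: 3                 # InstanceSet manages postgresql-0..2
      resources:
        limits:
          cpu: "1"
          memory: 1Gi
      volumeClaimTemplates:
        - name: data              # yields PVCs data-0, data-1, data-2
          spec:
            accessModes: [ReadWriteOnce]
            resources:
              requests:
                storage: 20Gi
```

Applying this manifest is the only step the user performs; everything lower in the hierarchy is reconciled by the operator.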

Containers Inside Each Pod

Every PostgreSQL pod runs four containers:

| Container | Port | Purpose |
| --- | --- | --- |
| `postgresql` (Patroni + Spilo) | 5432 (PG), 8008 (Patroni API) | Database engine with built-in HA via Patroni |
| `pgbouncer` | 6432 | Per-pod connection pool (always enabled, no on/off switch) |
| `dbctl` | 5001 | Role probe sidecar; KubeBlocks queries `GET /v1.0/getrole` every second to detect the current role |
| `exporter` | 9187 | Prometheus metrics exporter |

An init container (pg-init-container) runs postgres-pre-setup.sh before the main containers start to initialize the data directory and configuration files.

Each pod mounts its own PVC (data-{ordinal}, default 20 Gi) at /home/postgres/pgdata, providing independent persistent storage.

High Availability via Patroni

KubeBlocks uses Patroni for PostgreSQL HA. By default the Kubernetes API is used as the DCS (Distributed Configuration Store); etcd can optionally be used instead by providing a serviceRef to an etcd cluster.
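Switching the DCS to etcd is done declaratively on the component. The fragment below is a sketch of such a service reference; the exact field names (`serviceRefs`, `clusterServiceSelector`) and the referenced etcd cluster are assumptions and should be checked against the KubeBlocks API version in use.

```yaml
# Hypothetical componentSpecs fragment: point Patroni at an external
# etcd cluster as its DCS instead of the Kubernetes API.
componentSpecs:
  - name: postgresql
    componentDef: postgresql
    replicas: 3
    serviceRefs:
      - name: etcd                      # name the ComponentDefinition expects
        clusterServiceSelector:
          cluster: my-etcd-cluster      # assumed pre-existing etcd Cluster
          service:
            component: etcd
            service: headless
            port: client
```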

| Kubernetes Object | Purpose |
| --- | --- |
| ConfigMap `{scope}-config` | Cluster-wide Patroni configuration; TTL 30 s, loop interval 10 s |
| ConfigMap `{scope}` | Leader lease; the node holding this ConfigMap's leader annotation is primary (`KUBERNETES_USE_CONFIGMAPS=true`) |
| Secret `account-*` | Auto-generated passwords for system accounts |

On startup, every pod joins the Patroni cluster under the same scope. Patroni holds a leader election by atomically acquiring the leader annotation on the {scope} ConfigMap; the winner becomes primary and the others stream WAL as replicas.
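The leader lease is nothing more than annotations on the `{scope}` ConfigMap. The sketch below shows its approximate shape when Kubernetes is the DCS; the annotation set and values are managed by Patroni and are illustrative here, not an exact dump.

```yaml
# Illustrative leader-lease ConfigMap (KUBERNETES_USE_CONFIGMAPS=true).
# Patroni updates these annotations on every heartbeat; whichever member
# holds "leader" is the primary.
apiVersion: v1
kind: ConfigMap
metadata:
  name: pg-cluster-postgresql             # the Patroni {scope}
  annotations:
    leader: pg-cluster-postgresql-0       # current lock holder / primary
    acquireTime: "2025-01-01T00:00:00+00:00"
    renewTime: "2025-01-01T00:00:10+00:00"
    ttl: "30"                             # lease expires if not renewed
```

Because the update is an atomic compare-and-swap against the Kubernetes API, at most one member can hold the lease at a time.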

Traffic Routing

KubeBlocks creates two services for each PostgreSQL cluster:

| Service | Type | Ports | Selector |
| --- | --- | --- | --- |
| `{cluster}-postgresql-postgresql` | ClusterIP | 5432 (PG), 6432 (pgbouncer) | `kubeblocks.io/role=primary` |
| `{cluster}-postgresql-headless` | Headless | n/a | all pods |

The key mechanism is roleSelector: primary on the ClusterIP service. KubeBlocks probes each pod's role via dbctl every second and updates the pod label kubeblocks.io/role. The service's Endpoints therefore always point at the current primary — no VIP or external load balancer required.
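The resulting read/write Service looks approximately like the sketch below; the selector labels are the essential detail, while the exact label set KubeBlocks applies may vary.

```yaml
# Sketch of the operator-managed read/write Service. The role label in
# the selector is what makes Endpoints follow the current primary.
apiVersion: v1
kind: Service
metadata:
  name: pg-cluster-postgresql-postgresql
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/instance: pg-cluster   # assumed instance label
    kubeblocks.io/role: primary              # updated by the role probe
  ports:
    - name: tcp-postgresql
      port: 5432
      targetPort: 5432
    - name: tcp-pgbouncer
      port: 6432
      targetPort: 6432
```

When the `kubeblocks.io/role` label moves to another pod, the Endpoints controller recomputes membership and traffic shifts without any client-side change.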

  • Read/write traffic: connect to {cluster}-postgresql-postgresql:5432 or :6432 (pgbouncer)
  • Direct pod access (e.g., read replicas, Patroni heartbeats): use the headless service DNS pod-N.{cluster}-postgresql-headless.{namespace}.svc.cluster.local
NOTE

pgbouncer proxies connections to the local pod's PostgreSQL instance (its own pod IP), not to the primary. Traffic steering to the primary is handled by the ClusterIP service's role selector, not by pgbouncer.
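From an application's point of view, connecting is ordinary Kubernetes service DNS plus credentials from a Secret. The fragment below is a hypothetical client `Deployment` env section; the Secret name follows the `{cluster}-{component}-account-{name}` convention described under System Accounts and is an assumption here.

```yaml
# Hypothetical client pod env: address the read/write Service and pull
# credentials from the auto-generated account Secret.
env:
  - name: PGHOST
    value: pg-cluster-postgresql-postgresql.demo.svc.cluster.local
  - name: PGPORT
    value: "6432"                 # pgbouncer; use "5432" to bypass the pool
  - name: PGUSER
    valueFrom:
      secretKeyRef:
        name: pg-cluster-postgresql-account-postgres  # assumed Secret name
        key: username
  - name: PGPASSWORD
    valueFrom:
      secretKeyRef:
        name: pg-cluster-postgresql-account-postgres
        key: password
```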

Automatic Failover

When the primary pod fails, the following sequence restores service automatically:

  1. Primary pod crashes (process death, node failure, network partition)
  2. Patroni detects heartbeat timeout — the leader lease expires (TTL ≈ 30 s)
  3. Replica acquires the lease — the first healthy replica to atomically update the leader annotation on the {scope} ConfigMap wins and promotes itself to primary
  4. KubeBlocks detects the role change — the dbctl role probe returns primary for the new winner
  5. Pod label updated — kubeblocks.io/role=primary is applied to the new primary pod
  6. Service Endpoints switch — the ClusterIP service automatically routes traffic to the new primary

Total failover time is typically within 30–60 seconds, bounded by Patroni's TTL and the role probe interval.

For a planned switchover (e.g., maintenance), KubeBlocks calls the Patroni switchover API via switchover.sh, which performs a graceful demotion of the current primary and promotion of a chosen replica with no data loss.
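In practice a planned switchover is requested declaratively rather than by calling Patroni directly. The sketch below shows an `OpsRequest` of type `Switchover`; the API group and field names (`operations.kubeblocks.io/v1alpha1`, `clusterName`, `instanceName`) vary across KubeBlocks versions, so treat this as an illustration of the shape, not an exact manifest.

```yaml
# Hypothetical planned-switchover request. KubeBlocks translates this
# into the Patroni switchover call (switchover.sh) described above.
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: pg-switchover
  namespace: demo
spec:
  clusterName: pg-cluster
  type: Switchover
  switchover:
    - componentName: postgresql
      instanceName: pg-cluster-postgresql-1   # replica to promote
```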

System Accounts

KubeBlocks automatically creates and manages the following PostgreSQL system accounts. Passwords are auto-generated and stored in Secrets named {cluster}-{component}-account-{name}.

| Account | Role | Purpose |
| --- | --- | --- |
| `postgres` | Superuser | Default admin account |
| `kbadmin` | Superuser | KubeBlocks internal management |
| `kbdataprotection` | Superuser | Backup and restore operations |
| `kbprobe` | Monitor (`pg_monitor` role) | Health checks and liveness monitoring; role detection and `kubeblocks.io/role` label updates are driven by the `dbctl` sidecar probe, not this account |
| `kbmonitoring` | Monitor (`pg_monitor` role) | Prometheus metrics collection |
| `kbreplicator` | Replication | PostgreSQL streaming replication; a standby connects to the primary with this account to pull WAL |
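Each account's credentials live in a plain Opaque Secret following the naming convention above. The sketch below shows the expected shape for the `postgres` account of the example cluster; the `username`/`password` keys and placeholder values are illustrative.

```yaml
# Shape of an auto-generated system-account Secret, named
# {cluster}-{component}-account-{name}; values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: pg-cluster-postgresql-account-postgres
  namespace: demo
type: Opaque
stringData:
  username: postgres
  password: "<generated-by-kubeblocks>"
```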
