This page describes how KubeBlocks deploys an Elasticsearch cluster on Kubernetes — covering the resource hierarchy, pod internals, node roles, and built-in HA through Elasticsearch's cluster coordination protocol.
(Diagram: clients reach the cluster at `es-cluster-dit-http:9200`; the DIT component handles all client traffic.)

KubeBlocks models an Elasticsearch cluster as a hierarchy of Kubernetes custom resources:
Cluster → Component → InstanceSet → Pod × N
| Resource | Role |
|---|---|
| Cluster | User-facing declaration — specifies topology, node roles, shard counts, replicas, and resources |
| Component | Generated automatically; references a ComponentDefinition that describes container specs, lifecycle actions, and services |
| InstanceSet | KubeBlocks custom workload (replaces StatefulSet); manages pods with stable identities and role awareness |
| Pod | Actual running instance; each pod gets a unique ordinal and its own PVC |
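The hierarchy above is driven entirely by the user-facing Cluster resource. A minimal manifest might look like the following sketch — the API version, component names, and all values are illustrative and should be checked against the kubeblocks-addons Elasticsearch chart for your KubeBlocks version:

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: es-cluster
spec:
  clusterDef: elasticsearch     # references the Elasticsearch addon
  topology: multi-node          # selects the master + dit layout
  componentSpecs:
    - name: master              # dedicated master-eligible component
      replicas: 3
      resources:
        limits:
          cpu: "1"
          memory: 2Gi
      volumeClaimTemplates:
        - name: data            # becomes the per-pod PVC
          spec:
            accessModes: [ReadWriteOnce]
            resources:
              requests:
                storage: 20Gi
    - name: dit                 # data + ingest + coordinating group
      replicas: 3
      resources:
        limits:
          cpu: "2"
          memory: 4Gi
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes: [ReadWriteOnce]
            resources:
              requests:
                storage: 100Gi
```

From this one resource, KubeBlocks generates a Component per topology entry, an InstanceSet per Component, and the pods and PVCs beneath them.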
Every Elasticsearch pod runs three main containers. Three init containers also run at startup:
- prepare-plugins — stages plugin files from a plugin image into a shared volume
- install-plugins — installs those plugins and prepares the filesystem layout
- install-es-agent — copies the es-agent binary into the container's local bin path
| Container | Port | Purpose |
|---|---|---|
| elasticsearch | 9200 (HTTP REST), 9300 (transport/inter-node) | Elasticsearch engine handling indexing, search, and cluster coordination |
| es-agent | 8080 | Sidecar agent for configuration management and lifecycle operations |
| exporter | 9114 | Prometheus metrics exporter (elasticsearch-exporter) |
Each pod mounts its own PVC for the Elasticsearch data directory (/usr/share/elasticsearch/data), providing independent persistent storage per node.
Elasticsearch nodes are assigned roles (master-eligible, data, ingest, coordinating, and so on). The table below summarizes what each role is responsible for and what it means for capacity and HA.
| Node role | Responsibility |
|---|---|
| Master-eligible | Participates in leader election; manages cluster state, index mappings, and shard allocation |
| Data | Stores shard data; handles indexing and search requests for its assigned shards |
| Ingest | Pre-processes documents before indexing via ingest pipelines |
| Coordinating (optional) | Routes client requests to the appropriate data nodes and aggregates results |
In smaller deployments, one process can hold several roles. In production, splitting roles across nodes improves stability.
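Role assignment ultimately lands in each node's elasticsearch.yml. A sketch of what a dedicated data-plus-ingest node's configuration might contain — this uses the 7.x+ `node.roles` syntax and illustrative values; the actual file is rendered from the ComponentDefinition's config templates:

```yaml
# elasticsearch.yml fragment (7.x+ role syntax; values illustrative)
cluster.name: es-cluster
network.host: 0.0.0.0
node.roles: [data, ingest]   # not master-eligible; stores shards, runs pipelines

# A dedicated master-eligible node would instead declare:
#   node.roles: [master]
# A pure coordinating node declares an empty role list:
#   node.roles: []
```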
In the kubeblocks-addons Elasticsearch chart, spec.topology selects a layout. KubeBlocks creates one Component per entry in that topology; component names are short labels (master, dit, mdit, …), while the Elasticsearch role set is defined inside the image/config for each layout.
| Topology (spec.topology) | Components created | Notes |
|---|---|---|
| single-node | mdit | Single-node layout |
| multi-node (chart default) | master, dit | Split layout: dedicated master Component plus a dit Component for the remaining node group |
| m-dit | master, dit | Same component names as multi-node; the chart distinguishes the layouts for ordering/defaults |
| mdit | mdit | Combined multi-role naming under one component |
| m-d-i-t | m, d, i, t | Dedicated components per role family (master / data / ingest / coordinating) |
Service names look like {cluster}-{component}-http — the {component} segment is the KubeBlocks Component name above (for example mdit, master, dit), not the long English phrase “master-eligible”. Use your Cluster’s status or kubectl get component -n <ns> to see the exact names for a running cluster.
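The naming convention is mechanical enough to capture in a small helper — purely illustrative, mirroring the {cluster}-{component}-http pattern described above:

```python
def http_service_name(cluster: str, component: str) -> str:
    """Build the client-facing HTTP service name for a KubeBlocks
    Elasticsearch component: {cluster}-{component}-http."""
    return f"{cluster}-{component}-http"

# The component segment comes from the topology's component names,
# not from role phrases like "master-eligible":
print(http_service_name("es-cluster", "mdit"))    # single-node layout
print(http_service_name("es-cluster", "master"))  # multi-node layout
```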
Elasticsearch provides built-in HA through its own cluster coordination protocol (quorum-based master election). No external coordinator is required:
| Mechanism | Description |
|---|---|
| Master election | Master-eligible nodes elect a leader via quorum voting; requires (N/2 + 1) master-eligible nodes to agree |
| Shard replication | Each index shard has a primary and one or more replica shards; replicas are placed on different nodes for fault tolerance |
| Primary shard promotion | If a node holding a primary shard fails, Elasticsearch automatically promotes an in-sync replica to primary |
| Cluster state replication | The elected master replicates cluster state changes to all nodes before acknowledging writes |
In Elasticsearch 7.x and later, the quorum is configured via the voting configuration — a set of master-eligible nodes whose agreement is required. Split-brain prevention relies on maintaining quorum rather than the legacy minimum_master_nodes setting (which was removed in 7.0). KubeBlocks provides separate ComponentDefinition configurations for Elasticsearch 6.x, 7.x, and 8.x.
KubeBlocks creates three Services for each Elasticsearch component:
| Service | Type | Port | Notes |
|---|---|---|---|
| {cluster}-{component}-http | ClusterIP | 9200 | Client REST API endpoint; selects all pods (no roleSelector) |
| {cluster}-{component}-agent | ClusterIP | 8080 | es-agent sidecar endpoint; used for lifecycle operations and configuration management |
| {cluster}-{component}-headless | Headless | 9200, 9300 | Per-pod DNS — inter-node transport and operator probes |
The component name depends on the topology — for example mdit (single combined node) or master/dit (multi-node). Client applications send REST API requests to {cluster}-{component}-http:9200. Inter-node transport communication (shard replication, cluster state synchronization) uses port 9300 over the headless service so each pod is individually addressable:
{pod-name}.{cluster}-{component}-headless.{namespace}.svc.cluster.local:9300
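The per-pod transport address can be assembled from the pod ordinal. A small illustrative helper, assuming the default InstanceSet pod naming of {cluster}-{component}-{ordinal} (verify actual pod names with `kubectl get pods` in your namespace):

```python
def transport_endpoint(cluster: str, component: str, ordinal: int,
                       namespace: str = "default") -> str:
    """Per-pod transport (port 9300) address via the headless service.

    Assumes the default InstanceSet pod name {cluster}-{component}-{ordinal}.
    """
    pod = f"{cluster}-{component}-{ordinal}"
    svc = f"{cluster}-{component}-headless"
    return f"{pod}.{svc}.{namespace}.svc.cluster.local:9300"

print(transport_endpoint("es-cluster", "mdit", 0))
# es-cluster-mdit-0.es-cluster-mdit-headless.default.svc.cluster.local:9300
```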
When an Elasticsearch node fails, the cluster responds automatically: if the failed node was the elected master, the remaining master-eligible nodes elect a new one; if it held primary shards, in-sync replicas on surviving nodes are promoted to primary; and once the node rejoins (or a replacement pod is scheduled), shards are re-replicated to restore the configured replica count.
Recovery time depends on shard size and network throughput, but no manual intervention is required.
KubeBlocks automatically manages the following Elasticsearch system accounts. Passwords are auto-generated and stored in a Secret named {cluster}-{component}-account-{name}.
| Account | Role | Purpose |
|---|---|---|
| elastic | Superuser | Built-in Elasticsearch superuser; used for cluster setup, index management, and security configuration |
| kibana_system | Monitor / manage index | Built-in account used by Kibana to communicate with Elasticsearch |
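Retrieving a password starts from the Secret naming convention above. An illustrative helper that builds the Secret name and a kubectl command to decode it — the `password` key follows the usual KubeBlocks Secret layout, but confirm it against your Secret's actual data keys:

```python
import shlex

def account_secret(cluster: str, component: str, account: str) -> str:
    """Secret name per the {cluster}-{component}-account-{name} convention."""
    return f"{cluster}-{component}-account-{account}"

def decode_command(cluster: str, component: str, account: str,
                   namespace: str = "default") -> str:
    """kubectl one-liner that prints the stored password (assumes the
    Secret stores it under the 'password' data key)."""
    secret = account_secret(cluster, component, account)
    return (f"kubectl get secret {shlex.quote(secret)} -n {shlex.quote(namespace)} "
            "-o jsonpath='{.data.password}' | base64 -d")

print(account_secret("es-cluster", "mdit", "elastic"))
# es-cluster-mdit-account-elastic
```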