Version: release-0.8

Backup and restore

This tutorial uses Oracle MySQL as an example to show how to create backups and restore data in KubeBlocks. The full PR can be found at Learn KubeBlocks Add-on.

Backups can be classified along several dimensions: by method (volume snapshot backup or file backup), by content (data backup or log backup), by volume (full backup or incremental backup), and by timing (scheduled backup or on-demand backup).

This tutorial shows how to implement the two most frequently used types in KubeBlocks, snapshot backup and file backup.

Snapshot backup relies on the volume snapshot capability of Kubernetes.
File backup relies on backup tools provided by database engines.

The table below gives a quick overview of the basic KubeBlocks backup concepts, which are elaborated later in this tutorial.

📎 Table 1. Terminology

| Term | Description | Scope |
| --- | --- | --- |
| Backup | Backup object. It defines the entity to be backed up. | Namespace |
| BackupPolicy | Backup policy. It defines the policy for each backup type, such as scheduling, retention time, and tools. | Namespace |
| BackupTool | Backup tool. It is the carrier of backup tools in KubeBlocks and should implement the backup and restoration logic of the corresponding tool. | Cluster |
| BackupPolicyTemplate | Template of backup policy. It is the bridge between the backup and ClusterDefinition. When creating a cluster, KubeBlocks automatically generates a default backup policy for each cluster according to BackupPolicyTemplate. | Cluster |

Before you start

Finish the configuration in Add an add-on to KubeBlocks.
Understand basic Kubernetes concepts, such as Pod, PVC, PV, and VolumeSnapshot.

Step 1. Prepare environment

Install CSI Driver.

Volume snapshots are only available with CSI drivers, so make sure a CSI driver is properly configured in your Kubernetes cluster.
- For a local environment, you can quickly install csi-hostpath-driver as a KubeBlocks add-on.
```
kbcli addon enable csi-hostpath-driver
```
- For a cloud environment, configure the corresponding CSI Driver based on your environment.
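To confirm that the environment can take volume snapshots, the checks can be sketched as a small shell function. This is a minimal sketch, assuming kubectl is configured for the target cluster and that the standard snapshot CRDs from the Kubernetes external-snapshotter project are in use. The function name check_snapshot_support is hypothetical.

```shell
# Hypothetical helper: list the objects that snapshot backups depend on.
check_snapshot_support() {
  # CSI drivers registered in the cluster
  kubectl get csidrivers
  # The VolumeSnapshot CRD must exist for snapshot backups to work
  kubectl get crd volumesnapshots.snapshot.storage.k8s.io
  # A VolumeSnapshotClass should exist for the CSI driver in use
  kubectl get volumesnapshotclass
}
```

Run check_snapshot_support after enabling the driver. If any of the three commands reports that the resource is not found, snapshot backups are unlikely to work.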

Set a default StorageClass so that clusters can be created without specifying one, then check the result.

kubectl get sc
>
NAME                        PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
csi-hostpath-sc (default)   hostpath.csi.k8s.io     Delete          WaitForFirstConsumer   true                   35s
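If no StorageClass is marked as (default) in the output, one way to set it is to add the standard Kubernetes default-class annotation. The sketch below assumes kubectl access to the cluster. The function name set_default_storageclass is hypothetical.

```shell
# Hypothetical helper: mark a StorageClass as the cluster default.
# $1: StorageClass name, for example csi-hostpath-sc
set_default_storageclass() {
  kubectl patch storageclass "$1" \
    -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
}
```

For example, set_default_storageclass csi-hostpath-sc marks the StorageClass shown above as the default. Only one StorageClass should carry this annotation at a time.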

Step 2. Specify a volume type

Specify a volume type in the ClusterDefinition. This setting is required.

  componentDefs:
    - name: mysql-compdef
      characterType: mysql
      workloadType: Stateful
      service:
        ports:
          - name: mysql
            port: 3306
            targetPort: mysql
      volumeTypes:
        - name: data
          type: data

volumeTypes specifies the name and type of each volume.

There are mainly two kinds of volume types (volumeTypes.type):

data: Data information
log: Log information

KubeBlocks supports different backup methods for data and logs. In this tutorial, only data volume information is configured.
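To check which volume types an installed ClusterDefinition declares, the field can be read back with a JSONPath query. This is a sketch, assuming the ClusterDefinition has been applied to the cluster and follows the spec.componentDefs[].volumeTypes layout shown above. The function name show_volume_types is hypothetical.

```shell
# Hypothetical helper: print the volumeTypes of every component definition.
# $1: ClusterDefinition name, for example oracle-mysql
show_volume_types() {
  kubectl get clusterdefinition "$1" \
    -o jsonpath='{.spec.componentDefs[*].volumeTypes}'
}
```

For the definition in this tutorial, show_volume_types oracle-mysql should list a single entry whose name and type are both data.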

Step 3. Add backup configuration

Prepare BackupPolicyTemplate.yml and BackupTool.yml to add the backup configuration.

BackupPolicy template

It is a template of backup policy, and covers:

Which cluster components to back up
Whether the backups are scheduled
How to set up snapshot backup
How to set up file backup

apiVersion: apps.kubeblocks.io/v1alpha1
kind: BackupPolicyTemplate
metadata:
  name: oracle-mysql-backup-policy-template
  labels:
    clusterdefinition.kubeblocks.io/name: oracle-mysql # Specify scope through labels (Required)
spec:
  clusterDefinitionRef: oracle-mysql  # Specify the scope, indicating which ClusterDef generates the cluster
  backupPolicies:
  - componentDefRef: mysql-compdef    # Specify the scope, indicating which component is involved
    schedule:                         # Specify the timing of scheduled backups and startup status
      snapshot:
        enable: true                  # Enable scheduled snapshot backups
        cronExpression: "0 18 * * *"
      datafile:                       # Disable scheduled datafile backups
        enable: false
        cronExpression: "0 18 * * *"        
    snapshot:                         # Snapshot backup, which keeps the latest 5 versions by default
      backupsHistoryLimit: 5
    datafile:                         # Datafile backup which depends on backup tools
      backupToolName: oracle-mysql-xtrabackup

If a scheduled task is enabled, KubeBlocks creates a CronJob in the background.

After a new cluster is created, KubeBlocks discovers the corresponding template name by clusterdefinition.kubeblocks.io/name and creates the corresponding BackupPolicy.
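The objects involved in this flow can be inspected with kubectl. The following is a sketch rather than an official procedure: the plural resource names are inferred from the apiVersion and kind values in this tutorial and are assumptions, and the function name show_backup_policy_objects is hypothetical.

```shell
# Hypothetical helper: show the template, the generated policies, and CronJobs.
# $1: ClusterDefinition name used in the template label
show_backup_policy_objects() {
  cluster_def="$1"
  # Templates matched through the clusterdefinition label (cluster scoped)
  kubectl get backuppolicytemplates.apps.kubeblocks.io \
    -l "clusterdefinition.kubeblocks.io/name=${cluster_def}"
  # BackupPolicy objects generated for clusters (namespaced)
  kubectl get backuppolicies.dataprotection.kubeblocks.io --all-namespaces
  # CronJobs created for enabled scheduled backups
  kubectl get cronjobs --all-namespaces
}
```

Calling show_backup_policy_objects oracle-mysql after creating a cluster is a quick way to see whether a BackupPolicy and, for enabled schedules, a CronJob were generated.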

note

If you have added a BackupPolicyTemplate but no default BackupPolicy is created for the new cluster, check the following:

Whether clusterDefinitionRef is correct.
Whether the BackupPolicyTemplate label is correct.
Whether there are multiple BackupPolicyTemplates. If so, mark one as the default template using annotations.
```
  annotations:
   dataprotection.kubeblocks.io/is-default-policy-template: "true"
```
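Instead of editing the manifest, the same annotation can be applied to an existing template with kubectl annotate. This is a sketch, assuming the resource name backuppolicytemplates.apps.kubeblocks.io (inferred from the apiVersion used in this tutorial). The function name mark_default_template is hypothetical.

```shell
# Hypothetical helper: mark one BackupPolicyTemplate as the default.
# $1: template name, for example oracle-mysql-backup-policy-template
mark_default_template() {
  kubectl annotate backuppolicytemplates.apps.kubeblocks.io "$1" \
    dataprotection.kubeblocks.io/is-default-policy-template=true --overwrite
}
```

For example, mark_default_template oracle-mysql-backup-policy-template sets the annotation on the template defined above. The --overwrite flag replaces any existing value.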

BackupTool

note

BackupTool mainly serves datafile backup. If you only need snapshot backups, there is no need to configure BackupTool.

BackupTool.yml describes the detailed execution logic of a backup tool. It should cover:

Image of backup tools
Scripts of backup
Scripts of restore

apiVersion: dataprotection.kubeblocks.io/v1alpha1
kind: BackupTool
metadata:
  name: oracle-mysql-xtrabackup
  labels:
spec:
  image: docker.io/perconalab/percona-xtrabackup:8.0.32  # Back up via xtrabackup
  env:                         # Inject the name of dependent environment variables
    - name: DATA_DIR
      value: /var/lib/mysql
  physical:
    restoreCommands:           # Restore commands
      - sh
      - -c
      ...
  backupCommands:             # Backup commands
    - sh
    - -c
    ...

The configuration of BackupTool is closely related to the tools used.

For example, if you back up via Percona Xtrabackup, you need to fill in scripts in backupCommands and restoreCommands.
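To give a sense of what such scripts might contain, here is a minimal sketch of a streamed Percona XtraBackup backup and restore. It is not the script used by the add-on. DATA_DIR comes from the BackupTool env above, while DB_HOST, DB_USER, DB_PASSWORD, BACKUP_DIR, the file name backup.xbstream, and the function names are assumptions made for illustration.

```shell
# Hypothetical backup step: stream a full backup into one file.
# DB_HOST, DB_USER, DB_PASSWORD and BACKUP_DIR are assumed to be set.
run_backup() {
  mkdir -p "${BACKUP_DIR}" &&
  xtrabackup --backup --stream=xbstream \
    --host="${DB_HOST}" --user="${DB_USER}" --password="${DB_PASSWORD}" \
    --datadir="${DATA_DIR}" > "${BACKUP_DIR}/backup.xbstream"
}

# Hypothetical restore step: unpack, prepare, and copy back.
# --copy-back expects DATA_DIR to be empty.
run_restore() {
  work_dir="$(mktemp -d)" &&
  # Unpack the stream into a scratch directory
  xbstream -x -C "${work_dir}" < "${BACKUP_DIR}/backup.xbstream" &&
  # Apply the redo log so that the data files are consistent
  xtrabackup --prepare --target-dir="${work_dir}" &&
  # Copy the prepared files into the MySQL data directory
  xtrabackup --copy-back --target-dir="${work_dir}" --datadir="${DATA_DIR}" &&
  rm -rf "${work_dir}"
}
```

In a BackupTool, logic like run_backup would go under backupCommands and logic like run_restore under restoreCommands, adapted to whatever variables and paths the real environment provides.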

Step 4. Back up and restore a cluster

With everything ready, try to back up a cluster and restore data to a new cluster.

4.1 Create a cluster

Since BackupPolicyTemplate has been added, after a cluster is created, KubeBlocks can discover the backup policy and create a BackupPolicy for this cluster.

Create a cluster.

Using kbcli:

kbcli cluster create mycluster --cluster-definition oracle-mysql

Using Helm:

helm install mysql ./tutorial-2-backup-restore/oracle-mysql

View the backup policy of this cluster.

kbcli cluster list-backup-policy mycluster

4.2 Snapshot backups

kbcli cluster backup mycluster --type snapshot

--type specifies the backup type, either snapshot or datafile.

If there are multiple backup policies, specify which one to use with the --policy flag.
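Because snapshot backup relies on Kubernetes volume snapshots, the result can also be checked from the Kubernetes side. This is a sketch, assuming the resource name backups.dataprotection.kubeblocks.io (inferred from the API group used in this tutorial). The function name show_snapshot_backups is hypothetical.

```shell
# Hypothetical helper: list Backup objects and the VolumeSnapshots behind them.
# $1: namespace of the cluster (defaults to "default")
show_snapshot_backups() {
  ns="${1:-default}"
  # Backup objects created by the backup command
  kubectl get backups.dataprotection.kubeblocks.io -n "${ns}"
  # VolumeSnapshot objects created through the CSI driver
  kubectl get volumesnapshots -n "${ns}"
}
```

Running show_snapshot_backups in the namespace of mycluster should show one Backup and a matching VolumeSnapshot once the backup has been taken.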

4.3 Datafile backups

KubeBlocks supports backing up to local storage and to cloud object storage. The following example backs up to local storage.

Modify the BackupPolicy and specify the PVC name in spec.datafile.persistentVolumeClaim.name, as shown below.

  spec:
    datafile:
      backupToolName: oracle-mysql-xtrabackup
      backupsHistoryLimit: 7
      persistentVolumeClaim:
        name: mycluster-backup-pvc
        createPolicy: IfNotPresent
        initCapacity: 20Gi
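As an alternative to editing the object by hand, the PVC settings can be applied with a merge patch. This is a sketch, assuming the resource name backuppolicies.dataprotection.kubeblocks.io and the field layout shown above. The function name set_backup_pvc is hypothetical, and the BackupPolicy name must be taken from your own cluster.

```shell
# Hypothetical helper: point the datafile backup of a BackupPolicy at a PVC.
# $1: BackupPolicy name, $2: PVC name
set_backup_pvc() {
  policy="$1"
  pvc="$2"
  kubectl patch backuppolicies.dataprotection.kubeblocks.io "${policy}" --type merge \
    -p "{\"spec\":{\"datafile\":{\"persistentVolumeClaim\":{\"name\":\"${pvc}\",\"createPolicy\":\"IfNotPresent\",\"initCapacity\":\"20Gi\"}}}}"
}
```

A merge patch changes only the listed fields, so backupToolName and backupsHistoryLimit keep their current values.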

Then create the backup with --type set to datafile.

kbcli cluster backup mycluster --type datafile

4.4 Create a cluster from backups

Check the backups.
```
kbcli cluster list-backups
```

Select a backup and create a cluster.

kbcli cluster restore <clusterName> --backup <backup-name>

A new cluster is created from the selected backup.

caution

Some databases create the root account and password only during the first initialization.

Therefore, although a new root account and password are generated when a cluster is restored from a backup, they do not take effect. Log in with the root account and password of the original cluster.
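If the original credentials are stored in a Kubernetes Secret, they can be read back with kubectl. The sketch below assumes a Secret named after the cluster with the suffix -conn-credential and a key named password. Both are assumptions about how the add-on stores credentials, so verify them with kubectl get secrets before relying on this. The function name get_root_password is hypothetical.

```shell
# Hypothetical helper: print the root password of the original cluster.
# Assumes a Secret named <cluster>-conn-credential with a "password" key.
# $1: name of the original cluster
get_root_password() {
  kubectl get secret "${1}-conn-credential" \
    -o jsonpath='{.data.password}' | base64 --decode
}
```

For example, get_root_password mycluster prints the password to use when logging in to a cluster restored from a backup of mycluster.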

Reference

For more details on the backup and restore function of KubeBlocks, refer to Backup and Restore.

Appendix

A.1 Cluster data protection policies

KubeBlocks provides several data protection policies for stateful clusters, and each policy determines what data is kept when a cluster is deleted. Try the following scenarios:

If you delete a cluster using kbcli cluster delete, will the backup still be available?
If you change the terminationPolicy of a cluster to WipeOut and then delete it, will the backup still be available?
If you change the terminationPolicy of a cluster to DoNotTerminate and then delete it, what will happen?

note

Refer to the data protection policies of KubeBlocks via Termination Policy.
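To try these scenarios, the terminationPolicy of an existing cluster can be changed with a merge patch before deleting it. This is a sketch, assuming the resource name clusters.apps.kubeblocks.io and that terminationPolicy sits under spec. The function name set_termination_policy is hypothetical.

```shell
# Hypothetical helper: change the terminationPolicy of a cluster.
# $1: cluster name, $2: policy such as WipeOut or DoNotTerminate
set_termination_policy() {
  kubectl patch clusters.apps.kubeblocks.io "$1" --type merge \
    -p "{\"spec\":{\"terminationPolicy\":\"$2\"}}"
}
```

For example, run set_termination_policy mycluster WipeOut, delete the cluster with kbcli cluster delete mycluster, and then run kbcli cluster list-backups to see which backups remain.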

A.2 Monitor backup progress

In Step 4, you have created a backup using the backup subcommand.

kbcli cluster backup mycluster --type snapshot

A new backup object is generated and you can view the progress by running the describe-backup subcommand.

kbcli cluster describe-backup <your-back-up-name>
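For scripts that need to wait until a backup finishes, progress can be polled from the Backup object. This is a sketch under stated assumptions: the resource name backups.dataprotection.kubeblocks.io, the status.phase field, and the phase values Completed and Failed are assumptions about the Backup API, and the names wait_for_backup and POLL_INTERVAL are hypothetical.

```shell
# Hypothetical helper: poll a Backup until it completes, fails, or times out.
# $1: backup name, $2: namespace (default "default"), $3: max polls (default 60)
wait_for_backup() {
  backup="$1"
  ns="${2:-default}"
  tries="${3:-60}"
  while [ "${tries}" -gt 0 ]; do
    phase="$(kubectl get backups.dataprotection.kubeblocks.io "${backup}" \
      -n "${ns}" -o jsonpath='{.status.phase}')"
    echo "backup ${backup}: ${phase:-unknown}"
    case "${phase}" in
      Completed) return 0 ;;
      Failed) return 1 ;;
    esac
    tries=$((tries - 1))
    # Seconds between polls; override with POLL_INTERVAL
    sleep "${POLL_INTERVAL:-5}"
  done
  # Gave up before the backup reached a final phase
  return 1
}
```

The function returns 0 only when the phase is Completed, so it can be chained with a restore command, for example wait_for_backup <backup-name> && kbcli cluster restore <clusterName> --backup <backup-name>.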
