Migrate data to ApeCloud MySQL by AWS DMS
- Using the public network and network load balancer may incur expenses.
- The following tutorial is based on the prerequisite that ApeCloud MySQL is deployed on AWS EKS. Using other Kubernetes clusters to deploy ApeCloud MySQL is not included.
Network configuration
Expose the target ApeCloud MySQL network
The Kubernetes ClusterIP of ApeCloud MySQL is exposed by default in the EKS environment. But the migration task of DMS (Database Migration Service) runs in an independent Replication Instance, in which the Replication Instance can be set with the same VPC used by the Kubernetes clusters, but visiting ClusterIP still fails. This solution aims to connect this part of the network.
KubeBlocks native solution
Before you start
-
Install KubeBlocks: You can install KubeBlocks by kbcli or by Helm.
-
Enable the AWS loadbalancer controller add-on.
kbcli addon list
kbcli addon enable aws-load-balancer-controller
>
addon.extensions.kubeblocks.io/aws-load-balancer-controller enabledIf the loadbalancer is not enabled successfully, it may relate to your environment since the loadbalancer add-on relies on the EKS environment.
Check your EKS environment and enable this add-on again. For enabling add-on details, refer to Enable add-ons.
Steps
-
Create an ApeCloud MySQL cluster on AWS. Refer to Create an ApeCloud MySQL cluster for details.
-
Fill in the cluster name and run the command below to expose the external IP of the cluster.
kbcli cluster expose mysql-cluster --enable=true --type='vpc'
noteFor the above
kbcli cluster expose
command, the available value for--type
arevpc
andinternet
. Use--type=vpc
for access within the same VPC and--type=internet
for cross-VPC access under the public network.Run the command below to view the external IP:Port address which can be accessed by the same VPC machine but outside the EKS cluster.
kbcli cluster describe mysql-cluster | grep -A 3 Endpoints
>
Endpoints:
COMPONENT MODE INTERNAL EXTERNAL
mysql ReadWrite 10.100.51.xxx:3306 172.31.35.xxx:3306 -
Configure the external IP:Port as the target endpoint on AWS DMS.
This operation generates an ENI (Elastic Network Interface) on EC2. If the quota of the low-spec machine is small, pay more attention to the available level of ENI.
For the corresponding ENI specifications, refer to Elastic network interfaces - Amazon Elastic Compute Cloud.
Use Network Load Balancer (NLB) to expose the service
-
Install Load Balancer Controller on EKS.
For installation details, refer to Installing the AWS Load Balancer Controller add-on.
For how to create NLB in a cluster, refer to Network load balancing on Amazon EKS.
-
Create a service that uses NLB to expose the ApeCloud MySQL service.
Configure
metadata.name
,metadata.annotations
,metadata.labels
, andspec.selector
according to your actual environment.cat <<EOF | kubectl apply -f -
kind: Service
apiVersion: v1
metadata:
name: apecloud-mysql-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip
alb.ingress.kubernetes.io/scheme: internet-facing
service.beta.kubernetes.io/aws-load-balancer-subnets: <subnet name1>,<subnet name2>
labels:
apps.kubeblocks.io/component-name: mysql
app.kubernetes.io/instance: <apecloud-mysql clustername>
app.kubernetes.io/managed-by: kubeblocks
app.kubernetes.io/name: apecloud-mysql
spec:
externalTrafficPolicy: Cluster
type: LoadBalancer
selector:
apps.kubeblocks.io/component-name: mysql
app.kubernetes.io/instance: <apecloud-mysql clustername>
app.kubernetes.io/managed-by: kubeblocks
kubeblocks.io/role: leader
ports:
- name: http
protocol: TCP
port: 3306
targetPort: mysql
EOF -
Check whether this new service and NLB run normally.
kubectl get svc
>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
apecloud-mysql-service LoadBalancer 10.100.xx.xx k8s-xx-xx-xx.elb.cn-northwest-1.amazonaws.com.cn 3306:xx/TCPMake sure the server runs normally and can generate EXTERNAL-IP. Meanwhile, verify whether the NLB state is
Active
by the AWS console, then you can access the cluster by EXTERNAL-IP:Port.
Expose the source network
There exist four different conditions for the source network. Choose one method to expose the source network according to your actual environment.
-
Alibaba Cloud ApsaraDB RDS
Use the public network. Refer to Apply for or release a public endpoint for an ApsaraDB RDS for MySQL instance to release a public endpoint then create an endpoint in AWS DMS.
-
RDS within the same VPC in AWS
You only need to specify an RDS when creating an endpoint in DMS and no extra operation is required.
For creating an endpoint, refer to step 2 in Configure AWS DMS tasks.
-
RDS within different VPCs in AWS
Use the public network to create an endpoint. Refer to this document to make public network access available, then create an endpoint in AWS DMS.
For creating an endpoint, refer to step 2 in Configure AWS DMS tasks.
-
MySQL in AWS EKS
Use NLB to expose the service.
-
Install Load Balancer Controller.
For installation details, refer to Installing the AWS Load Balancer Controller add-on.
For how to create NLB in a cluster, refer to Network load balancing on Amazon EKS.
-
Create the service using NLB.
Make sure the value of
some.label.key
inmetadata.labels
is consistent with the value of ApeCloud MySQL you created.Configure
port
andtargetPort
inspec.ports
according to your current environment.cat <<EOF | kubectl apply -f -
kind: Service
apiVersion: v1
metadata:
name: mysql-local-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip
alb.ingress.kubernetes.io/scheme: internet-facing
service.beta.kubernetes.io/aws-load-balancer-subnets: ${subnet name1},${subnet name2}
labels:
some.label.key: some-label-value
spec:
externalTrafficPolicy: Cluster
type: LoadBalancer
selector:
some.label.key: some-label-value
ports:
- name: http
protocol: TCP
port: 3306
targetPort: 3306
EOF -
Make sure Service and NLB run normally.
Refer to step 3 in Use Network Load Balancer (NLB) to expose the service for details.
-
Configure AWS DMS tasks
Pay attention to the following potential issues during the migration task.
-
Double write
During the migration, make sure no business is writing to the target data instance. Otherwise, double write occurs.
-
Disk space of the target instance
Since the transfer tool uses a concurrent write model when writing to the target database, out-of-order writes may occur, which may trigger page splitting and cause the data space of the target database to be slightly enlarged compared with that of the original instance. It is recommended to plan appropriately when allocating the storage size of the target database, for example, at least 1.5 times the current storage size of the source database.
-
DDL and onlineDDL
Locked structure changes often affect the speed of data migration.
The lock-free structure change is based on the rename of the temporary table in principle, which causes data problems if the migration object is not the whole database migration.
For example, if the migration object chooses to migrate db1.table1 to the target, and an onlineDDL is performed on db1.table1 on the source database during the process, the data of db1.table1 on the target database will be inconsistent with the source database.
It should be noted that the way some database management tools initiate DDL is performed by using lock-free mutation by default.
Migration is a short-term behavior. To avoid unnecessary troubles, it is recommended not to perform DDL operations during the migration process.
-
BinLog retention hours
The incrementally migrating process of data transmission relies on the BinLog of the source database.
It is recommended to extend the BinLog retention hours to avoid a long-term interruption and the situation that the BinLog of the source database is cleared during recovery, resulting in the migration not being resumed.
For example, in AWS RDS, connect to the database and run the command below:
# View configuration
# Input:
call mysql.rds_show_configuration;
# Output: Pay attention to the BinLog retention hours.
+------------------------+-------+-----------------------------------------------------------------------------------------------------------+
| name | value | description |
+------------------------+-------+-----------------------------------------------------------------------------------------------------------+
| binlog retention hours | 8 | binlog retention hours specifies the duration in hours before binary logs are automatically deleted. |
| source delay | 0 | source delay specifies replication delay in seconds between current instance and its master. |
| target delay | 0 | target delay specifies replication delay in seconds between current instance and its future read-replica. |
+------------------------+-------+-----------------------------------------------------------------------------------------------------------+
# Adjust the retention hours to 72 hours
# Input:
call mysql.rds_set_configuration('binlog retention hours', 72);
Steps:
-
Create a Replication Instance for migration.
Go to DMS -> Replication Instance and click Create replication instance.
cautionSelect the VPC that you have configured in EKS.
-
Create endpoints.
Go to DMS -> Endpoints and click Create endpoint.
Create the source endpoint and target endpoint respectively. If the target endpoint is the RDS instance, check Select RDS DB instance to configure it.
After configuration, specify a replication instance to test the connection.
-
Create migration tasks.
Click Create task and configure the task according to the instructions.
Pay attention to the following parameters.
-
Migration Type
AWS DMS provides three migration types:
- Migrate existing data: AWS DMS migrates only your existing data. Changes to your source data aren’t captured and applied to your target.
- Migrate existing data and replicate ongoing changes: AWS DMS migrates both existing data and ongoing data changes, i.e. the existing data before the migration task and the data changes during the migration task will be synchronized to the target instance.
- Replicate data changes only: AWS DMS only migrates the ongoing data changes. If you select this type, you can use CDC start mode for source transactions to specify a location and migrate the data changes. For this tutorial, select Migrate existing data and replicate ongoing changes.
-
Target table preparation mode
The target table preparation mode specifies the initial mode of the data structure. You can click the Info link beside the options to view the definition of each mode. For example, if ApeCloud MySQL is a newly created empty instance, you can select Do nothing mode.
In addition, create a database on ApeCloud MySQL before migration because AWS DMS does not create a database.
-
Turn on validation
It is recommended to enable this function.
-
Batch-optimized apply
It is recommended to enable this function as this function enables you to write target instances in batch and can improve the write speed.
-
Full load tuning settings: Maximum number of tables to load in parallel
This number decides how many concurrencies DMS uses to get source table data. Theoretically speaking, this can cause pressure on the source table during the full-load migration. Lower this number when the business in the source table is delicate.
-
Table Mapping
Table mapping decides which tables in the database are used for migration and can also apply easy conversions. It is recommended to enable Wizard mode to configure this parameter.
-
-
Start the migration task.
Switch applications
Before you start
- Make sure DMS migration tasks run normally. If you perform a validation task, make sure the results are as expected.
- To differentiate conversation and improve data security, it is recommended to create and authorize a database account solely for migration.
- It is recommended to switch applications during business off-peak hours because for safety concerns during the switching process, it is necessary to stop business write.
Steps:
-
Make sure the transmission task runs normally.
Pay attention to Status, Last updated in Table statistics, and CDC latency target in CloudWatch metrics.
You can also refer to this document to verify the migration task.
-
Pause business and prohibit new business write in the source database.
-
Verify the transmission task status again to make sure the task runs normally and the running status lasts at least 1 minute.
Refer to step 1 above to observe whether the link is normal and whether latency exists.
-
Use the target database to resume business.
-
Verify the migration with business.