Velero: A Handy Tool for Kubernetes Backup and Restore

Introduction to Velero

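Velero is an open-source, cloud-native disaster recovery and migration tool written in Go. It can safely back up, restore, and migrate Kubernetes cluster resources and data. "Velero" is Spanish for sailboat, which fits the naming style of the Kubernetes community. Heptio, the company that developed Velero, has since been acquired by VMware. Velero works with standard Kubernetes clusters on both private and public clouds. Beyond disaster recovery it also handles resource migration, moving containerized applications from one cluster to another.
This walkthrough pairs Velero with an object store. MinIO serves as the storage backend here, though other supported object stores would work as well. Start by picking a spare machine (velero-server below) and installing Docker and MinIO on it; the velero client is installed later on the master node.
The stock etcd backup and restore only works on the whole cluster. If I only want to back up or restore all the pods under one namespace, etcd cannot do it. This is where Velero comes in: it can restore the data of a single, specified namespace.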

Install Docker

root@velero-server:~# cd /usr/local/src/
root@velero-server:/usr/local/src# wget https://image.oaali.com/download/docker_auto_ops.tar.gz
# Extract the Docker installation package
root@velero-server:/usr/local/src# tar xf docker_auto_ops.tar.gz
root@velero-server:/usr/local/src# ls
docker docker_auto_ops.tar.gz
root@velero-server:/usr/local/src# cd docker/
root@velero-server:/usr/local/src/docker# ls
containerd containerd-shim-runc-v2 daemon.json docker dockerd docker-proxy docker.socket
containerd.service ctr deploy_docker.sh docker-24.0.8.tgz docker-init docker.service runc
root@velero-server:/usr/local/src/docker# bash deploy_docker.sh
root@velero-server:/usr/local/src/docker# docker info
Client:
Version: 24.0.8
Context: default
Debug Mode: false

Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 24.0.8

Install MinIO

MinIO is deployed here as a Docker container.

# Pull the MinIO image first
root@velero-server:/usr/local/src/docker# docker pull minio/minio:RELEASE.2024-12-18T13-15-44Z
# Create the local data directory to mount
root@velero-server:~# mkdir -p /data/minio
# Run MinIO
root@velero-server:~# docker run --name minio \
-p 9000:9000 \
-p 9001:9001 \
-d --restart=always \
-e "MINIO_ROOT_USER=admin" \
-e "MINIO_ROOT_PASSWORD=12345678" \
-v /data/minio/data:/data \
minio/minio:RELEASE.2024-12-18T13-15-44Z server /data \
--console-address '0.0.0.0:9001'
2e574f55d13d5d1e3519fe4bf5e07b4a09c91f5bc43ca733a9bb7909ae363b51
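Before opening the console, it can help to confirm that the MinIO API port actually answers. The sketch below polls MinIO's liveness endpoint (`/minio/health/live`) with curl. The helper name `wait_for_minio`, the retry count, and the `MINIO_WAIT_SECONDS` variable are illustrative assumptions, not part of the original setup.

```shell
#!/bin/sh
# Poll the MinIO liveness endpoint until it answers or the retries run out.
# wait_for_minio is a hypothetical helper; the default URL is the API port
# published by the docker run command above (9000, not the 9001 console).
wait_for_minio() {
    url="${1:-http://192.168.1.49:9000}"
    tries="${2:-10}"                      # assumed retry budget
    i=1
    while [ "$i" -le "$tries" ]; do
        # -f turns HTTP errors into a non-zero exit; -sS hides the progress bar
        if curl -fsS -o /dev/null "$url/minio/health/live"; then
            echo "minio is live at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "${MINIO_WAIT_SECONDS:-2}"  # assumed pause between attempts
    done
    echo "minio did not become live at $url" >&2
    return 1
}

# Example: wait_for_minio http://192.168.1.49:9000 10
```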

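Access via browser

Once MinIO is deployed, open the console in a browser at http://192.168.1.49:9001. After confirming that the page loads, log in with the user name and password set at deployment time and create a bucket, which will be needed later.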
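The bucket can also be created from the command line rather than in the console. This is only a sketch: it assumes the MinIO client `mc` is installed on the machine, the alias name `velero-minio` is arbitrary, and the bucket name `velerodata` is the one passed to `velero install` later in this post.

```shell
#!/bin/sh
# Create the Velero bucket with the MinIO client instead of the web console.
# create_velero_bucket is a hypothetical helper; "velero-minio" is an arbitrary alias.
create_velero_bucket() {
    endpoint="${1:-http://192.168.1.49:9000}"
    bucket="${2:-velerodata}"
    # Register the endpoint and the root credentials under a local alias
    mc alias set velero-minio "$endpoint" admin 12345678 || return 1
    # --ignore-existing keeps the call idempotent if the bucket already exists
    mc mb --ignore-existing "velero-minio/$bucket"
}

# Example: create_velero_bucket http://192.168.1.49:9000 velerodata
```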

Install Velero

Velero is open-sourced by VMware. It has to be installed on a machine where the kubectl command works, so here it goes on the master node.

Velero client

# https://github.com/vmware-tanzu/velero/releases/download/v1.15.1/velero-v1.15.1-linux-amd64.tar.gz
# Download velero
root@k8s-master-50:~# cd /usr/local/src/
root@k8s-master-50:/usr/local/src# wget https://github.com/vmware-tanzu/velero/releases/download/v1.15.1/velero-v1.15.1-linux-amd64.tar.gz
# Extract the archive
root@k8s-master-50:/usr/local/src# tar xf velero-v1.15.1-linux-amd64.tar.gz
root@k8s-master-50:/usr/local/src# ls
velero-v1.15.1-linux-amd64 velero-v1.15.1-linux-amd64.tar.gz
root@k8s-master-50:/usr/local/src# ll velero-v1.15.1-linux-amd64/velero
-rwxr-xr-x 1 root root 102019022 Dec 27 13:55 velero-v1.15.1-linux-amd64/velero*
root@k8s-master-50:/usr/local/src# cp velero-v1.15.1-linux-amd64/velero /usr/local/bin/
root@k8s-master-50:/usr/local/src# velero version
Client:
Version: v1.15.1
Git commit: 32499fc287815058802c1bc46ef620799cca7392
<error getting server version: no matches for kind "ServerStatusRequest" in version "velero.io/v1">
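The `error getting server version` line in the output above is expected at this stage, since the server side has not been installed into the cluster yet. As a small sketch, the `--client-only` flag of `velero version` skips the server lookup; the wrapper name `velero_client_version` is just for illustration.

```shell
#!/bin/sh
# Print only the client version. --client-only skips the query to the cluster,
# so no ServerStatusRequest error shows up before the server is installed.
velero_client_version() {
    velero version --client-only
}

# Example: velero_client_version
```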

Velero server

# Create a directory; all Velero configuration files below are kept here
root@k8s-master-50:/usr/local/src# mkdir /data/velero -p
root@k8s-master-50:/usr/local/src# cd /data/velero/
root@k8s-master-50:/data/velero#

Configure Velero credentials

root@k8s-master-50:/data/velero# vim velero-auth.txt

[default]
aws_access_key_id = admin
aws_secret_access_key = 12345678
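For scripted setups, the same credentials file can be written without opening an editor. The following is a minimal sketch; `write_velero_auth` is a hypothetical helper, and the owner-only permissions are my own addition, on the reasoning that the file holds the MinIO root password in clear text.

```shell
#!/bin/sh
# Write an AWS-style credentials file for Velero's S3-compatible plugin.
# Arguments: target path, access key, secret key.
write_velero_auth() {
    path="$1"; key="$2"; secret="$3"
    (
        # New files created in this subshell are readable by the owner only
        umask 077
        cat > "$path" <<EOF
[default]
aws_access_key_id = $key
aws_secret_access_key = $secret
EOF
    )
}

# Example: write_velero_auth /data/velero/velero-auth.txt admin 12345678
```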

Prepare the user CSR file

root@k8s-master-50:/data/velero# vim awsuser-csr.json
{
  "CN": "awsuser",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "k8s",
      "OU": "System"
    }
  ]
}

Certificate signing environment

The cfssl tool is needed to sign the certificate.

root@k8s-master-50:/data/velero# apt install golang-cfssl
root@k8s-master-50:/data/velero# wget https://github.com/cloudflare/cfssl/releases/download/v1.6.5/cfssljson_1.6.5_linux_amd64
root@k8s-master-50:/data/velero# wget https://github.com/cloudflare/cfssl/releases/download/v1.6.5/cfssl-certinfo_1.6.5_linux_amd64
root@k8s-master-50:/data/velero# wget https://github.com/cloudflare/cfssl/releases/download/v1.6.5/cfssl_1.6.5_linux_amd64
# Move the files and rename them to drop the version number
root@k8s-master-50:/data/velero# mv cfssl-certinfo_1.6.5_linux_amd64 /usr/local/bin/cfssl-certinfo
root@k8s-master-50:/data/velero# mv cfssljson_1.6.5_linux_amd64 /usr/local/bin/cfssljson
root@k8s-master-50:/data/velero# mv cfssl_1.6.5_linux_amd64 /usr/local/bin/cfssl
root@k8s-master-50:/data/velero# chmod +x /usr/local/bin/cfssl*
# Sign the certificate
root@k8s-master-50:/data/velero# /usr/local/bin/cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem -ca-key=/etc/kubernetes/ssl/ca-key.pem -config=/etc/kubeasz/clusters/k8s-01/ssl/ca-config.json -profile=kubernetes ./awsuser-csr.json | cfssljson -bare awsuser
2025/01/02 17:04:28 [INFO] generate received request
2025/01/02 17:04:28 [INFO] received CSR
2025/01/02 17:04:28 [INFO] generating key: rsa-2048
2025/01/02 17:04:28 [INFO] encoded CSR
2025/01/02 17:04:28 [INFO] signed certificate with serial number 706349371693351049949854028719990556839999418873
2025/01/02 17:04:28 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
# Copy the generated certificate and key into /etc/kubernetes/ssl
root@k8s-master-50:/data/velero# cp awsuser-key.pem /etc/kubernetes/ssl/
root@k8s-master-50:/data/velero# cp awsuser.pem /etc/kubernetes/ssl/
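With client-certificate authentication, Kubernetes takes the user name from the certificate's CN and the groups from its O fields. That is why the `CN` in awsuser-csr.json has to match the `--user=awsuser` used in the cluster role binding further down. A quick way to double-check what was signed is to print the subject and expiry with openssl. This is a sketch only, and `show_cert_identity` is a made-up helper name.

```shell
#!/bin/sh
# Show the subject and expiry date of a signed client certificate.
# The subject should contain the CN "awsuser"; exact spacing in the output
# differs between OpenSSL versions.
show_cert_identity() {
    cert="${1:-/etc/kubernetes/ssl/awsuser.pem}"
    openssl x509 -in "$cert" -noout -subject -enddate
}

# Example: show_cert_identity /data/velero/awsuser.pem
```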

Generate the cluster kubeconfig file

root@k8s-master-50:/data/velero# export KUBE_APISERVER="https://192.168.1.90:6443"
# Set the cluster parameters:
root@k8s-master-50:/data/velero# kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=./awsuser.kubeconfig
Cluster "kubernetes" set.
# Set the client certificate credentials:
root@k8s-master-50:/data/velero# kubectl config set-credentials awsuser \
--client-certificate=/etc/kubernetes/ssl/awsuser.pem \
--client-key=/etc/kubernetes/ssl/awsuser-key.pem \
--embed-certs=true \
--kubeconfig=./awsuser.kubeconfig
User "awsuser" set.
# Set the context parameters:
root@k8s-master-50:/data/velero# kubectl config set-context kubernetes \
--cluster=kubernetes \
--user=awsuser \
--namespace=velero-system \
--kubeconfig=./awsuser.kubeconfig
Context "kubernetes" created.
# Set the default context:
root@k8s-master-50:/data/velero# kubectl config use-context kubernetes --kubeconfig=awsuser.kubeconfig
Switched to context "kubernetes".
# Create the awsuser account in the k8s cluster (bound to cluster-admin):
root@k8s-master-50:/data/velero# kubectl create clusterrolebinding awsuser --clusterrole=cluster-admin --user=awsuser
clusterrolebinding.rbac.authorization.k8s.io/awsuser created
# Create the namespace:
root@k8s-master-50:/data/velero# kubectl create ns velero-system
namespace/velero-system created
# Run the Velero installation
root@k8s-master-50:/data/velero# velero --kubeconfig ./awsuser.kubeconfig \
install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.3.1 \
--bucket velerodata \
--secret-file ./velero-auth.txt \
--use-volume-snapshots=false \
--namespace velero-system \
--backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://192.168.1.49:9000
CustomResourceDefinition/backuprepositories.velero.io: attempting to create resource
CustomResourceDefinition/backuprepositories.velero.io: attempting to create resource client
...
Deployment/velero: created
Velero is installed! ⛵ Use 'kubectl logs deployment/velero -n velero-system' to view the status.
root@k8s-master-50:/data/velero# kubectl get pod -n velero-system
NAME READY STATUS RESTARTS AGE
velero-78649846cf-2nvnz 1/1 Running 0 7m14s
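A running pod does not yet prove that Velero can reach the bucket. As a hedged follow-up check, the sketch below lists the backup storage location (its phase should read Available once the MinIO endpoint, bucket, and credentials line up) and tails the server log. The helper name `check_velero_install` is hypothetical; the kubeconfig path and namespace are the ones used above.

```shell
#!/bin/sh
# Post-install checks, using the kubeconfig and namespace from the install step.
check_velero_install() {
    kubeconfig="${1:-./awsuser.kubeconfig}"
    ns="${2:-velero-system}"
    # The default backup storage location should report the phase "Available"
    velero backup-location get --kubeconfig "$kubeconfig" --namespace "$ns" || return 1
    # Recent server logs usually reveal bucket or credential problems
    kubectl --kubeconfig "$kubeconfig" logs deployment/velero -n "$ns" --tail=20
}

# Example: check_velero_install ./awsuser.kubeconfig velero-system
```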

Back up the Kubernetes cluster with Velero

With the deployment finished, Velero is ready to take backups.

# Back up one specific namespace
root@k8s-master-50:/data/velero# velero backup create dklwj-k8s-25112 --include-namespaces myserver -n velero-system
Backup request "dklwj-k8s-25112" submitted successfully.
Run `velero backup describe dklwj-k8s-25112` or `velero backup logs dklwj-k8s-25112` for more details.

# Define a timestamp so that backup names on MinIO never collide
root@k8s-master-50:~# DATE=`date +%Y%m%d%H%M%S`
root@k8s-master-50:/data/velero# velero backup create myserver-backup-${DATE} \
--include-namespaces myserver \
--kubeconfig=./awsuser.kubeconfig \
--namespace velero-system

Backup request "myserver-backup-20250116165618" submitted successfully.
Run `velero backup describe myserver-backup-20250116165618` or `velero backup logs myserver-backup-20250116165618` for more details.
# Verify the backup once it has finished
root@k8s-master-50:/data/velero# velero backup describe myserver-backup-20250116165618 --kubeconfig=./awsuser.kubeconfig --namespace velero-system
Name: myserver-backup-20250116165618
Namespace: velero-system
Labels: velero.io/storage-location=default
Annotations: velero.io/resource-timeout=10m0s
velero.io/source-cluster-k8s-gitversion=v1.23.5
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=23

Phase: Completed


Namespaces:
Included: myserver
Excluded: <none>

Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto

Label selector: <none>

Or label selector: <none>

Storage Location: default

Velero-Native Snapshot PVs: auto
Snapshot Move Data: false
Data Mover: velero

TTL: 720h0m0s

CSISnapshotTimeout: 10m0s
ItemOperationTimeout: 4h0m0s

Hooks: <none>

Backup Format Version: 1.1.0

Started: 2025-01-16 16:57:55 +0800 CST
Completed: 2025-01-16 16:57:56 +0800 CST

Expiration: 2025-02-15 16:57:54 +0800 CST

Total items to be backed up: 12
Items backed up: 12

Backup Volumes:
Velero-Native Snapshots: <none included>

CSI Snapshots: <none included>

Pod Volume Backups: <none included>

HooksAttempted: 0
HooksFailed: 0
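The backups above are one-off runs. For recurring backups, Velero also has `velero schedule create`, which takes a cron expression and names each resulting backup after the schedule plus a timestamp. The sketch below wraps it for a single namespace; the helper name, the `-daily` suffix, the 02:00 cron expression, and the 7-day TTL are all illustrative assumptions rather than settings from this environment.

```shell
#!/bin/sh
# Create a recurring backup of one namespace.
# Arguments: namespace, optional cron expression, optional TTL.
create_namespace_schedule() {
    target_ns="$1"
    cron="${2:-0 2 * * *}"   # assumed default: every day at 02:00
    ttl="${3:-168h}"         # assumed default: keep each backup for 7 days
    velero schedule create "${target_ns}-daily" \
        --schedule="$cron" \
        --include-namespaces "$target_ns" \
        --ttl "$ttl" \
        --kubeconfig=./awsuser.kubeconfig \
        --namespace velero-system
}

# Example: create_namespace_schedule myserver
```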

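After the backup completes, open the MinIO console and check that the backup files have landed in the bucket.

The backup with Velero succeeded. So how do we restore it? No rush, read on.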
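The same check can be done from the terminal. This is a sketch under two assumptions: the MinIO client `mc` is installed, and an alias (here the hypothetical `velero-minio`) already points at the MinIO endpoint. Velero keeps each backup under `backups/<backup-name>/` inside the bucket.

```shell
#!/bin/sh
# List backups as seen by the Velero server, then the objects stored in MinIO.
# list_velero_backups is a hypothetical helper; the argument is <mc-alias>/<bucket>.
list_velero_backups() {
    bucket_path="${1:-velero-minio/velerodata}"
    # Backups known to the Velero server
    velero backup get --kubeconfig=./awsuser.kubeconfig --namespace velero-system || return 1
    # Objects in the bucket, one directory per backup
    mc ls --recursive "$bucket_path/backups/"
}

# Example: list_velero_backups velero-minio/velerodata
```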

Restore with Velero

# Simulate a pod being deleted
root@k8s-master-50:/data/velero# kubectl delete pod net-test1 -n myserver
pod "net-test1" deleted
# Start restoring the pod
root@k8s-master-50:/data/velero# velero restore create --from-backup dklwj-k8s-25112 --wait \
--kubeconfig=./awsuser.kubeconfig \
--namespace velero-system
Restore request "dklwj-k8s-25112-20250102175358" submitted successfully.
Waiting for restore to complete. You may safely press ctrl-c to stop waiting - your restore will continue in the background.

Restore completed with status: Completed. You may check for more information using the commands `velero restore describe dklwj-k8s-25112-20250102175358` and `velero restore logs dklwj-k8s-25112-20250102175358`.
# Check the pod
root@k8s-master-50:/data/velero# kubectl get pod -n myserver
NAME READY STATUS RESTARTS AGE
net-test1 1/1 Running 0 44s

# Delete all controllers and services in myserver
root@k8s-master-50:/data/velero# kubectl delete deployment tomcat-app1-deployment -n myserver
deployment.apps "tomcat-app1-deployment" deleted
root@k8s-master-50:/data/velero# kubectl delete service tomcat-app1-service -n myserver
service "tomcat-app1-service" deleted
root@k8s-master-50:/data/velero# kubectl get deployment -A
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system calico-kube-controllers 1/1 1 1 19d
kube-system coredns 2/2 2 2 19d
kubernetes-dashboard dashboard-metrics-scraper 1/1 1 1 18d
kubernetes-dashboard kubernetes-dashboard 1/1 1 1 18d
velero-system velero 1/1 1 1 13d
root@k8s-master-50:/data/velero# kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 19d
kube-system kube-dns ClusterIP 10.100.0.2 <none> 53/UDP,53/TCP,9153/TCP 19d
kubernetes-dashboard dashboard-metrics-scraper ClusterIP 10.100.206.89 <none> 8000/TCP 18d
kubernetes-dashboard kubernetes-dashboard NodePort 10.100.110.26 <none> 443:30004/TCP 18d
# Restore
root@k8s-master-50:/data/velero# velero restore create --from-backup myserver-backup-20250116165618 --wait \
> --kubeconfig=./awsuser.kubeconfig \
> --namespace velero-system
Restore request "myserver-backup-20250116165618-20250116171021" submitted successfully.
Waiting for restore to complete. You may safely press ctrl-c to stop waiting - your restore will continue in the background.
.
Restore completed with status: Completed. You may check for more information using the commands `velero restore describe myserver-backup-20250116165618-20250116171021` and `velero restore logs myserver-backup-20250116165618-20250116171021`.
# Check the deployment and service after the restore
root@k8s-master-50:/data/velero# kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 19d
kube-system kube-dns ClusterIP 10.100.0.2 <none> 53/UDP,53/TCP,9153/TCP 19d
kubernetes-dashboard dashboard-metrics-scraper ClusterIP 10.100.206.89 <none> 8000/TCP 18d
kubernetes-dashboard kubernetes-dashboard NodePort 10.100.110.26 <none> 443:30004/TCP 18d
myserver tomcat-app1-service NodePort 10.100.226.94 <none> 80:38031/TCP 34s
root@k8s-master-50:/data/velero# kubectl get deployment -A
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system calico-kube-controllers 1/1 1 1 19d
kube-system coredns 2/2 2 2 19d
kubernetes-dashboard dashboard-metrics-scraper 1/1 1 1 18d
kubernetes-dashboard kubernetes-dashboard 1/1 1 1 18d
myserver tomcat-app1-deployment 1/1 1 1 45s
velero-system velero 1/1 1 1 13d
# Check whether the pods in the myserver namespace have been restored
root@k8s-master-50:/data/velero# kubectl get pod -n myserver
NAME READY STATUS RESTARTS AGE
tomcat-app1-deployment-768db4c897-xqpvv 1/1 Running 0 59s
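Both restores above put the resources back into their original namespace. Velero can also restore into a different namespace through the `--namespace-mappings` flag of `velero restore create`, which is handy for inspecting a backup without touching the live workload. The following is a sketch only: `restore_into_namespace` is a hypothetical helper and `myserver-restore` in the example is a made-up target name.

```shell
#!/bin/sh
# Restore a backup into a different namespace, leaving the source namespace alone.
# Arguments: backup name, source namespace, destination namespace.
restore_into_namespace() {
    backup="$1"; src_ns="$2"; dst_ns="$3"
    velero restore create --from-backup "$backup" \
        --namespace-mappings "${src_ns}:${dst_ns}" \
        --wait \
        --kubeconfig=./awsuser.kubeconfig \
        --namespace velero-system
}

# Example: restore_into_namespace myserver-backup-20250116165618 myserver myserver-restore
```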

https://www.dklwj.com/2025/01/k8s-backup-and-recovery-artifact-Velero.html
Author: 阿伟
Published: January 14, 2025