Etcd Backup and Restore in a Kubernetes Cluster, Plus a Backup Script

 

First, why back up at all? My etcd is deployed as a 3-node cluster, which in theory already counts as highly available.

As everyone knows, etcd is the configuration store at the heart of a Kubernetes cluster. It communicates with the kube-apiserver, and the data for every write operation ultimately lands in etcd, which shows how critical etcd is to the cluster.

But a highly available cluster does not protect against human error: a teammate's accidental operation in Kubernetes can still destroy data, so etcd's data should be backed up to improve data safety.

When backing up and restoring etcd data, pay attention to which API version is used: there are v2 and v3. Here I use the etcd v3 API.

Environment:

Kubernetes cluster, binary deployment

etcd v3.4

etcd-1 192.168.31.61

etcd-2 192.168.31.63

etcd-3 192.168.31.66

 

Backup:

# Create the backup directory on every node

[root@master-1 ~]# mkdir /opt/etcd/bak

 

# Take the backup on the etcd-1 node with the command below

# ETCDCTL_API=3 selects the etcd v3 API

# /opt/etcd/bak/snap-2020-0101.db is the path where the snapshot is saved

# --cacert, --cert, and --key point to the CA certificate and etcd's server certificate and key

 

[root@master-1 bin]# ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot save /opt/etcd/bak/snap-2020-0101.db \
--endpoints=https://192.168.31.61:2379 \
--cacert=/opt/etcd/ssl/ca.pem \
--cert=/opt/etcd/ssl/server.pem \
--key=/opt/etcd/ssl/server-key.pem

{"level":"info","ts":1609649619.228145,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"/opt/etcd/bak/snap-2020-0101.db.part"}

{"level":"info","ts":"2021-01-03T12:53:39.249+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}

{"level":"info","ts":1609649619.2496758,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"https://192.168.31.61:2379"}

{"level":"info","ts":"2021-01-03T12:53:39.361+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}

{"level":"info","ts":1609649619.379133,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"https://192.168.31.61:2379","size":"4.3 MB","took":0.150711783}

{"level":"info","ts":1609649619.3793566,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"/opt/etcd/bak/snap-2020-0101.db"}

Snapshot saved at /opt/etcd/bak/snap-2020-0101.db

 

# Inspect the saved snapshot

[root@master-1 ~]# ll /opt/etcd/bak

total 4212

-rw------- 1 root root 4309024 Jan 3 12:53 snap-2020-0101.db

 

# Check the snapshot status

[root@master-1 bak]# ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot status snap-2020-0101.db

ad0f41b3, 174429, 1547, 4.3 MB
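The four comma-separated fields in this output are the snapshot's hash, revision, total key count, and total size. A small sketch that labels them (the sample line is the output above; in practice you would pipe the real `etcdctl snapshot status` output into the awk):

```shell
# Label the fields of `etcdctl snapshot status` default output.
status="ad0f41b3, 174429, 1547, 4.3 MB"
echo "$status" | awk -F', ' '{printf "hash=%s revision=%s totalKeys=%s totalSize=%s\n", $1, $2, $3, $4}'
# hash=ad0f41b3 revision=174429 totalKeys=1547 totalSize=4.3 MB
```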

 

# Copy the snapshot into the backup directory on the other hosts

[root@master-1 bak]# scp snap-2020-0101.db root@192.168.31.63:/opt/etcd/bak/

snap-2020-0101.db 100% 4208KB 23.4MB/s 00:00

[root@master-1 bak]# scp snap-2020-0101.db root@192.168.31.66:/opt/etcd/bak/

snap-2020-0101.db

 

 

Restore:

Simulating a failure:

# Check the current Deployment and Pods

[root@master-1 cfg]# kubectl get pods,deployment

NAME READY STATUS RESTARTS AGE

pod/web-test-5cdbd79b55-87pqt 1/1 Running 1 4d15h

pod/web-test-5cdbd79b55-p54nq 1/1 Running 1 4d15h

pod/web-test-5cdbd79b55-r9swh 1/1 Running 1 4d14h

pod/web-test-5cdbd79b55-t8pcx 1/1 Running 1 4d14h

NAME READY UP-TO-DATE AVAILABLE AGE

deployment.apps/web-test 4/4 4 4 5d20h

 

# Delete the deployment

[root@master-1 bak]# kubectl delete deployments web-test

deployment.apps "web-test" deleted

 

# Check the Pods again: the default namespace no longer has any Pods

[root@master-1 bak]# kubectl get pods

No resources found in default namespace.

 

 

Restoring the snapshot:

1. Stop kube-apiserver and etcd on every node

# Note: how to stop them depends on how the cluster was deployed:

  •               For a kubeadm deployment, you have to remove the etcd and kube-apiserver YAML manifests; these components run as static pods, so trying to delete them like a Deployment has no effect
  •               For a binary deployment, simply stop the kube-apiserver process on the master node and the etcd process on each etcd node

 

# My Kubernetes cluster here is deployed from binaries

On the master node:

[root@master-1 bak]# systemctl stop kube-apiserver

 

On each etcd node:

[root@master-1 bak]# systemctl stop etcd

 

 

2. Remove the etcd data directory on each node (better yet, rename it aside so you keep a fallback copy)

[root@master-1 bak]# mv /var/lib/etcd/default.etcd /var/lib/etcd/default.etcd-2020-0101-bak

 

3. Restore on each node

# (Note: the etcd node names, IP addresses, and cluster token differ per environment.) Don't forget to point at the snapshot file you backed up

etcd-1:

[root@master-1 / ]# ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot restore /opt/etcd/bak/snap-2020-0101.db \
--name etcd-1 \
--initial-cluster="etcd-1=https://192.168.31.61:2380,etcd-2=https://192.168.31.63:2380,etcd-3=https://192.168.31.66:2380" \
--initial-cluster-token=etcd-cluster \
--initial-advertise-peer-urls=https://192.168.31.61:2380 \
--data-dir=/var/lib/etcd/default.etcd

{"level":"info","ts":1609651738.9956837,"caller":"snapshot/v3_snapshot.go:296","msg":"restoring snapshot","path":"/opt/etcd/bak/snap-2020-0101.db","wal-dir":"/var/lib/etcd/default.etcd/member/wal","data-dir":"/var/lib/etcd/default.etcd","snap-dir":"/var/lib/etcd/default.etcd/member/snap"}

{"level":"info","ts":1609651739.1263688,"caller":"mvcc/kvstore.go:380","msg":"restored last compact revision","meta-bucket-name":"meta","meta-bucket-name-key":"finishedCompactRev","restored-compact-revision":173190}

{"level":"info","ts":1609651739.1434276,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"bc5dd24e13e697c0","local-member-id":"0","added-peer-id":"72130f86e474b7bb","added-peer-peer-urls":["https://192.168.31.66:2380"]}

{"level":"info","ts":1609651739.1435857,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"bc5dd24e13e697c0","local-member-id":"0","added-peer-id":"b10f0bac3883a232","added-peer-peer-urls":["https://192.168.31.61:2380"]}

{"level":"info","ts":1609651739.143635,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"bc5dd24e13e697c0","local-member-id":"0","added-peer-id":"b46624837acedac9","added-peer-peer-urls":["https://192.168.31.63:2380"]}

{"level":"info","ts":1609651739.1764905,"caller":"snapshot/v3_snapshot.go:309","msg":"restored snapshot","path":"/opt/etcd/bak/snap-2020-0101.db","wal-dir":"/var/lib/etcd/default.etcd/member/wal","data-dir":"/var/lib/etcd/default.etcd","snap-dir":"/var/lib/etcd/default.etcd/member/snap"}

 

etcd-2:

[root@node-1 / ]# ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot restore /opt/etcd/bak/snap-2020-0101.db \
--name etcd-2 \
--initial-cluster="etcd-1=https://192.168.31.61:2380,etcd-2=https://192.168.31.63:2380,etcd-3=https://192.168.31.66:2380" \
--initial-cluster-token=etcd-cluster \
--initial-advertise-peer-urls=https://192.168.31.63:2380 \
--data-dir=/var/lib/etcd/default.etcd

{"level":"info","ts":1609651738.9956837,"caller":"snapshot/v3_snapshot.go:296","msg":"restoring snapshot","path":"/opt/etcd/bak/snap-2020-0101.db","wal-dir":"/var/lib/etcd/default.etcd/member/wal","data-dir":"/var/lib/etcd/default.etcd","snap-dir":"/var/lib/etcd/default.etcd/member/snap"}

{"level":"info","ts":1609651739.1263688,"caller":"mvcc/kvstore.go:380","msg":"restored last compact revision","meta-bucket-name":"meta","meta-bucket-name-key":"finishedCompactRev","restored-compact-revision":173190}

{"level":"info","ts":1609651739.1434276,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"bc5dd24e13e697c0","local-member-id":"0","added-peer-id":"72130f86e474b7bb","added-peer-peer-urls":["https://192.168.31.66:2380"]}

{"level":"info","ts":1609651739.1435857,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"bc5dd24e13e697c0","local-member-id":"0","added-peer-id":"b10f0bac3883a232","added-peer-peer-urls":["https://192.168.31.61:2380"]}

{"level":"info","ts":1609651739.143635,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"bc5dd24e13e697c0","local-member-id":"0","added-peer-id":"b46624837acedac9","added-peer-peer-urls":["https://192.168.31.63:2380"]}

{"level":"info","ts":1609651739.1764905,"caller":"snapshot/v3_snapshot.go:309","msg":"restored snapshot","path":"/opt/etcd/bak/snap-2020-0101.db","wal-dir":"/var/lib/etcd/default.etcd/member/wal","data-dir":"/var/lib/etcd/default.etcd","snap-dir":"/var/lib/etcd/default.etcd/member/snap"}

 

etcd-3:

[root@node-2 / ]# ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot restore /opt/etcd/bak/snap-2020-0101.db \
--name etcd-3 \
--initial-cluster="etcd-1=https://192.168.31.61:2380,etcd-2=https://192.168.31.63:2380,etcd-3=https://192.168.31.66:2380" \
--initial-cluster-token=etcd-cluster \
--initial-advertise-peer-urls=https://192.168.31.66:2380 \
--data-dir=/var/lib/etcd/default.etcd

{"level":"info","ts":1609651738.9956837,"caller":"snapshot/v3_snapshot.go:296","msg":"restoring snapshot","path":"/opt/etcd/bak/snap-2020-0101.db","wal-dir":"/var/lib/etcd/default.etcd/member/wal","data-dir":"/var/lib/etcd/default.etcd","snap-dir":"/var/lib/etcd/default.etcd/member/snap"}

{"level":"info","ts":1609651739.1263688,"caller":"mvcc/kvstore.go:380","msg":"restored last compact revision","meta-bucket-name":"meta","meta-bucket-name-key":"finishedCompactRev","restored-compact-revision":173190}

{"level":"info","ts":1609651739.1434276,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"bc5dd24e13e697c0","local-member-id":"0","added-peer-id":"72130f86e474b7bb","added-peer-peer-urls":["https://192.168.31.66:2380"]}

{"level":"info","ts":1609651739.1435857,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"bc5dd24e13e697c0","local-member-id":"0","added-peer-id":"b10f0bac3883a232","added-peer-peer-urls":["https://192.168.31.61:2380"]}

{"level":"info","ts":1609651739.143635,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"bc5dd24e13e697c0","local-member-id":"0","added-peer-id":"b46624837acedac9","added-peer-peer-urls":["https://192.168.31.63:2380"]}

{"level":"info","ts":1609651739.1764905,"caller":"snapshot/v3_snapshot.go:309","msg":"restored snapshot","path":"/opt/etcd/bak/snap-2020-0101.db","wal-dir":"/var/lib/etcd/default.etcd/member/wal","data-dir":"/var/lib/etcd/default.etcd","snap-dir":"/var/lib/etcd/default.etcd/member/snap"}
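The three restore invocations above differ only in --name and the advertised peer URL, so they are easy to generate from one helper. A sketch under that observation (restore_cmd is a hypothetical helper that only prints the command; you would then run the printed command on the matching node; the cluster membership string is the one from this environment):

```shell
# Print the etcdctl restore command for one node of this cluster.
# Usage: restore_cmd NAME PEER_IP SNAPSHOT
restore_cmd() {
    local name="$1" peer_ip="$2" snap="$3"
    local cluster="etcd-1=https://192.168.31.61:2380,etcd-2=https://192.168.31.63:2380,etcd-3=https://192.168.31.66:2380"
    printf 'ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot restore %s \\\n' "$snap"
    printf -- '--name %s \\\n' "$name"
    printf -- '--initial-cluster="%s" \\\n' "$cluster"
    printf -- '--initial-cluster-token=etcd-cluster \\\n'
    printf -- '--initial-advertise-peer-urls=https://%s:2380 \\\n' "$peer_ip"
    printf -- '--data-dir=/var/lib/etcd/default.etcd\n'
}

# Example: generate the command for etcd-2
restore_cmd etcd-2 192.168.31.63 /opt/etcd/bak/snap-2020-0101.db
```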

 

# Start kube-apiserver and etcd on their respective nodes

On the master node:

[root@master-1 bak]# systemctl start kube-apiserver

 

On each etcd node:

[root@master-1 bak]# systemctl start etcd

 

# Verify that the data has been restored:

[root@master-1 cfg]# kubectl get pods,deployment

NAME READY STATUS RESTARTS AGE

pod/web-test-5cdbd79b55-87pqt 1/1 Running 1 4d15h

pod/web-test-5cdbd79b55-p54nq 1/1 Running 1 4d15h

pod/web-test-5cdbd79b55-r9swh 1/1 Running 1 4d14h

pod/web-test-5cdbd79b55-t8pcx 1/1 Running 1 4d14h

NAME READY UP-TO-DATE AVAILABLE AGE

deployment.apps/web-test 4/4 4 4 5d20h

As you can see, the deleted Pods have come back.

 

 

An etcd backup script

# Adjust endpoints, paths, and certificate locations to your own environment

[root@master-1 /]# vim /opt/etcd/back_etcd.sh

#!/bin/bash
set -e

# Append all output (including errors) to the backup log
mkdir -p /opt/etcd/log
exec >> /opt/etcd/log/backup_etcd.log 2>&1

Date=$(date +%Y-%m-%d-%H-%M)
EtcdEndpoints="https://192.168.31.61:2379"
EtcdCmd="/opt/etcd/bin/etcdctl"
BackupDir="/opt/etcd/bak"
BackupFile="snapshot.db.$Date"
cacertfile="/opt/etcd/ssl/ca.pem"
certfile="/opt/etcd/ssl/server.pem"
keyfile="/opt/etcd/ssl/server-key.pem"

mkdir -p "$BackupDir"
echo "$(date) backup etcd..."
export ETCDCTL_API=3
"$EtcdCmd" snapshot save "$BackupDir/$BackupFile" \
    --endpoints="$EtcdEndpoints" \
    --cacert="$cacertfile" --cert="$certfile" --key="$keyfile"
echo "$(date) backup done!"
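To run the script on a schedule, it can be driven from cron. A hypothetical crontab entry (added via crontab -e on the etcd node; the daily 02:00 schedule and 7-day retention window are assumptions, tune them to your environment):

```
# Run the backup every day at 02:00, then prune snapshots older than 7 days
0 2 * * * /bin/bash /opt/etcd/back_etcd.sh
30 2 * * * find /opt/etcd/bak -name 'snapshot.db.*' -mtime +7 -delete
```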

 

#### A further note ####

Don't keep backups on only one node. Even though etcd itself is a cluster, if the single node holding the backups is suddenly lost, you will have nothing to restore from when the day comes.

So it is best to keep backups on at least 2 nodes, and to monitor the backup files so you notice immediately when backups stop being produced.
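A sketch of pushing the newest snapshot to the other backup nodes (the peer IPs are the etcd nodes from this environment; newest_snapshot is a hypothetical helper):

```shell
# newest_snapshot DIR: print the most recently modified snapshot in DIR.
newest_snapshot() {
    ls -1t "$1"/snapshot.db.* 2>/dev/null | head -n 1
}

# Hypothetical usage on etcd-1, after the backup script has run:
# for peer in 192.168.31.63 192.168.31.66; do
#     scp "$(newest_snapshot /opt/etcd/bak)" "root@${peer}:/opt/etcd/bak/"
# done
```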

 

That's it for this post.

2021-01-04, Nanjing

Source: https://www.cnaaa.net. Please credit the source when reposting: https://www.cnaaa.net/archives/7002
