Kubernetes: Cluster Upgrades
- TAGS: Kubernetes
Upgrading a Binary etcd Cluster
Steps (see the sketch after this list):
- Back up the etcd data
- Download the new etcd release package
- Stop etcd
- Replace the etcd and etcdctl binaries
- Start etcd
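A minimal per-node sketch of this flow, assuming the binaries live in /usr/local/bin, the certificate paths match the commands later in this section, and a hypothetical TARGET version variable picked from the changelog:

# Hypothetical per-node etcd upgrade sketch; TARGET and all paths are assumptions.
TARGET=v3.4.13

# 1. Back up (run once, against a healthy member)
ETCDCTL_API=3 etcdctl \
  --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem \
  --cert=/etc/kubernetes/pki/etcd/etcd.pem \
  --key=/etc/kubernetes/pki/etcd/etcd-key.pem \
  --endpoints https://10.4.7.107:2379 \
  snapshot save "$(date +%Y%m%d-%H%M).db"

# 2. Download and unpack the target release
curl -LO "https://github.com/etcd-io/etcd/releases/download/${TARGET}/etcd-${TARGET}-linux-amd64.tar.gz"
tar xf "etcd-${TARGET}-linux-amd64.tar.gz"

# 3. Stop, replace, start -- one member at a time
systemctl stop etcd
\cp -f "etcd-${TARGET}-linux-amd64/etcd" "etcd-${TARGET}-linux-amd64/etcdctl" /usr/local/bin/
systemctl start etcd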
Notes
Scenario: upgrading from Kubernetes 1.20 to Kubernetes 1.21.
Check the official changelog:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md
and look for an entry like the one below, which records the etcd version a release is built against:
kubeadm installs etcd v3.4.13 when creating cluster v1.19 (#97244, @pacoxu)
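A quick way to pull such entries out of the changelog, assuming the raw.githubusercontent.com path below is where the rendered file lives:

# Fetch the raw changelog and grep for the etcd version it mentions (URL is an assumption).
curl -s https://raw.githubusercontent.com/kubernetes/kubernetes/master/CHANGELOG/CHANGELOG-1.21.md | grep -i 'etcd v3'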
etcd upgrade procedure
# v2
etcdctl --ca-file /etc/kubernetes/pki/etcd/etcd-ca.pem \
  --key-file /etc/kubernetes/pki/etcd/etcd-key.pem \
  --cert-file /etc/kubernetes/pki/etcd/etcd.pem \
  --endpoints https://10.4.7.107:2379,https://10.4.7.108:2379,https://10.4.7.109:2379 \
  member list

# v3
export ETCDCTL_API=3
etcdctl --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem \
  --cert=/etc/kubernetes/pki/etcd/etcd.pem \
  --key=/etc/kubernetes/pki/etcd/etcd-key.pem \
  --endpoints https://10.4.7.107:2379,https://10.4.7.108:2379,https://10.4.7.109:2379 \
  member list
2075659d8c0e93a, started, k8s-master02, https://10.4.7.108:2380, https://10.4.7.108:2379, false
7626dc6cd4e892c1, started, k8s-master03, https://10.4.7.109:2380, https://10.4.7.109:2379, false
badc9b068aea441f, started, k8s-master01, https://10.4.7.107:2380, https://10.4.7.107:2379, false

## or
$ etcdctl --endpoints="10.4.7.109:2379,10.4.7.108:2379,10.4.7.107:2379" \
  --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem \
  --cert=/etc/kubernetes/pki/etcd/etcd.pem \
  --key=/etc/kubernetes/pki/etcd/etcd-key.pem \
  endpoint status --write-out=table
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|    ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 10.4.7.109:2379 | 7626dc6cd4e892c1 |  3.4.13 |  8.5 MB |     false |      false |       235 |      99457 |              99457 |        |
| 10.4.7.108:2379 |  2075659d8c0e93a |  3.4.13 |  8.5 MB |     false |      false |       235 |      99457 |              99457 |        |
| 10.4.7.107:2379 | badc9b068aea441f |  3.4.13 |  8.5 MB |      true |      false |       235 |      99457 |              99457 |        |
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
# Back up
etcdctl --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem \
  --cert=/etc/kubernetes/pki/etcd/etcd.pem \
  --key=/etc/kubernetes/pki/etcd/etcd-key.pem \
  --endpoints https://10.4.7.107:2379 \
  snapshot save 20210610-2301.db
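It is worth sanity-checking the snapshot before touching any member; etcdctl snapshot status prints its hash, revision, and size:

# Verify the snapshot file just written (same ETCDCTL_API=3 environment as above).
ETCDCTL_API=3 etcdctl snapshot status 20210610-2301.db --write-out=table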
# On master02, stop etcd
systemctl stop etcd
which etcd
/usr/local/bin/etcd
# On master01, check the cluster: master02 is now down
etcdctl --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem \
  --cert=/etc/kubernetes/pki/etcd/etcd.pem \
  --key=/etc/kubernetes/pki/etcd/etcd-key.pem \
  --endpoints https://10.4.7.107:2379,https://10.4.7.108:2379,https://10.4.7.109:2379 \
  endpoint health
{"level":"warn","ts":"2021-06-11T00:13:10.066+0800","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-36d6a97d-a057-42a8-8079-d58b87a52b13/10.4.7.108:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 10.4.7.108:2379: connect: connection refused\""}
https://10.4.7.107:2379 is healthy: successfully committed proposal: took = 18.702919ms
https://10.4.7.109:2379 is healthy: successfully committed proposal: took = 20.945198ms
https://10.4.7.108:2379 is unhealthy: failed to commit proposal: context deadline exceeded

# Unpack the new etcd release and copy the binaries to master02
tar xf etcd-v3.4.13-linux-amd64.tar.gz
cd etcd-v3.4.13-linux-amd64
scp etcd etcdctl k8s-master02:/usr/local/bin/
# On master02, verify the new binary and start etcd
etcdctl version
etcdctl version: 3.4.13
API version: 3.4
systemctl start etcd

# If etcd fails to start, edit its config
vim /etc/etcd/etcd.config.yml
## change the log output setting to:
log-outputs: [default]
Upgrading the remaining nodes
# Upgrade master03 with the same steps as above
# The leader node (master01 in the table above) is upgraded last, also with the same steps
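Once all three members are back, confirm every endpoint is healthy and reports the expected version before moving on to Kubernetes itself:

# All members should be healthy and show the new VERSION in the table.
etcdctl --endpoints="10.4.7.107:2379,10.4.7.108:2379,10.4.7.109:2379" \
  --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem \
  --cert=/etc/kubernetes/pki/etcd/etcd.pem \
  --key=/etc/kubernetes/pki/etcd/etcd-key.pem \
  endpoint health
etcdctl --endpoints="10.4.7.107:2379,10.4.7.108:2379,10.4.7.109:2379" \
  --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem \
  --cert=/etc/kubernetes/pki/etcd/etcd.pem \
  --key=/etc/kubernetes/pki/etcd/etcd-key.pem \
  endpoint status --write-out=table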
Upgrading Binary Kubernetes
Upgrading the master nodes
Download the release you are upgrading to.
Upgrade master01
tar -xf kubernetes-server-linux-amd64.tar.gz
cd /kubernetes/server/bin
./kubectl version
# Upgrade kube-apiserver
cd /kubernetes/server/bin
systemctl stop kube-apiserver
which kube-apiserver
/usr/local/bin/kube-apiserver
cp -rp kube-apiserver /usr/local/bin/kube-apiserver
/usr/local/bin/kube-apiserver --version
systemctl daemon-reload
systemctl restart kube-apiserver
tail -f /var/log/messages
# Upgrade kube-controller-manager and kube-scheduler
cd /kubernetes/server/bin
systemctl stop kube-controller-manager kube-scheduler
\cp -rp kube-controller-manager kube-scheduler /usr/local/bin/
systemctl restart kube-controller-manager
systemctl status kube-controller-manager
tail -f /var/log/messages
systemctl restart kube-scheduler
tail -f /var/log/messages
# Upgrade kube-proxy
cd /kubernetes/server/bin
systemctl stop kube-proxy
\cp -rp kube-proxy /usr/local/bin/
systemctl restart kube-proxy
systemctl status kube-proxy
# Upgrade kubectl
cd /kubernetes/server/bin
cp -rp kubectl /usr/local/bin/
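A quick check that every replaced binary now reports the new version (assumes the /usr/local/bin install path used throughout):

# Print the version of each upgraded binary.
for bin in kube-apiserver kube-controller-manager kube-scheduler kube-proxy; do
  "/usr/local/bin/${bin}" --version
done
/usr/local/bin/kubectl version --client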
Upgrade the other master nodes
Copy the binaries from master01 to the other master nodes:
scp kube-apiserver kube-controller-manager kube-scheduler kube-proxy kubectl k8s-master02:/tmp/
scp kube-apiserver kube-controller-manager kube-scheduler kube-proxy kubectl k8s-master03:/tmp/
On master02 and master03, repeat the same steps as on master01 (see the sketch below).
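A condensed sketch of that, assuming the binaries landed in /tmp as copied above:

# On master02 / master03: install the copied binaries and bounce the services.
systemctl stop kube-apiserver kube-controller-manager kube-scheduler kube-proxy
\cp -rp /tmp/kube-apiserver /tmp/kube-controller-manager /tmp/kube-scheduler /tmp/kube-proxy /tmp/kubectl /usr/local/bin/
systemctl daemon-reload
systemctl restart kube-apiserver kube-controller-manager kube-scheduler kube-proxy
systemctl status kube-apiserver kube-controller-manager kube-scheduler kube-proxy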
Upgrading the nodes and Calico
Recommendation: upgrade kubelet and Calico together, one node at a time.
master02
# Drain the node to take it out of service
kubectl drain k8s-master02 --delete-local-data --force --ignore-daemonsets
systemctl stop kubelet
cp -rp kubelet /usr/local/bin/kubelet
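Note that kubectl 1.20 deprecated --delete-local-data in favor of --delete-emptydir-data; on newer clients the equivalent drain is:

# Same drain on kubectl >= 1.20, where the flag was renamed.
kubectl drain k8s-master02 --delete-emptydir-data --force --ignore-daemonsets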
Calico upgrade
Docs: https://docs.projectcalico.org/maintenance/kubernetes-upgrade
Install: https://docs.projectcalico.org/getting-started/kubernetes/self-managed-onprem/onpremises
Upgrade master02
# On master01
curl https://docs.projectcalico.org/manifests/calico.yaml -O
vim calico.yaml
# In the calico-node DaemonSet, change the update strategy to:
#   updateStrategy:
#     type: OnDelete
kubectl apply -f calico.yaml
kubectl get po -n kube-system -owide

# Bring master02 back into service
kubectl uncordon k8s-master02

# On master02
systemctl restart kubelet

# Back on master01: watch the calico pod on the master02 node
kubectl get po -n kube-system -owide
kubectl describe po calico-node-dwgwe -n kube-system   ## this one runs on master02
# The node now starts pulling the new calico image
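With updateStrategy: OnDelete, the DaemonSet controller only recreates a calico-node pod from the new template once the old pod is deleted, which is what makes this one-node-at-a-time rollout possible. Deleting the pod triggers the image pull explicitly (the pod name is the placeholder from above):

# Force the calico-node pod on master02 to be recreated from the new template.
kubectl delete po calico-node-dwgwe -n kube-system
kubectl get po -n kube-system -owide -w    # watch the replacement come up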
Upgrade master01
systemctl stop kubelet
cp -rp kubelet /usr/local/bin/
systemctl start kubelet
systemctl status kubelet
kubectl get node
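Presumably master01 should get the same drain/uncordon treatment around its kubelet swap as master02 did; a sketch:

# Drain master01 before stopping kubelet; uncordon once it is Ready again.
kubectl drain k8s-master01 --delete-local-data --force --ignore-daemonsets
# ... stop kubelet, replace the binary, start kubelet (as above) ...
kubectl uncordon k8s-master01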
master03: same as above
Upgrading Binary CoreDNS
Download
master01
git clone [email protected]:coredns/deployment.git
cd deployment/kubernetes

# Back up the current CoreDNS objects
mkdir bak
kubectl get cm coredns -n kube-system -oyaml > bak/coredns-cm.yaml
kubectl get deploy coredns -n kube-system -oyaml > bak/coredns-dp.yaml
kubectl get clusterrole coredns -oyaml > bak/coredns-cr.yaml
kubectl get clusterrolebinding coredns -oyaml > bak/coredns-crb.yaml

# Deploy the new version
./deploy.sh -s | kubectl apply -f -
kubectl get po -n kube-system
kubectl logs -f coredns-csegwe-zege -n kube-system

# Test
kubectl get po
kubectl exec -it nginx-fegwe-egew -- bash
## inside the container
curl kubernetes:443
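The curl check only proves the service name resolves; a more direct DNS test through the upgraded CoreDNS (pod names are the placeholders from above, and nslookup availability in the image is an assumption):

# Resolve the kubernetes service via cluster DNS.
kubectl exec -it nginx-fegwe-egew -- nslookup kubernetes.default.svc.cluster.local
# Check the version CoreDNS logs at startup (k8s-app=kube-dns label per the deployment manifests).
kubectl logs -n kube-system -l k8s-app=kube-dns | grep -i 'CoreDNS-'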