
[CKA][Practice] 5. Cluster Maintenance

백곰곰 2023. 2. 26. 15:21

Practice Test - OS Upgrades

```bash
$ alias k=kubectl

## drain node01
$ k drain node01
node/node01 cordoned
error: unable to drain node "node01" due to error:cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-flannel/kube-flannel-ds-296dv, kube-system/kube-proxy-9lkjz, continuing command...
There are pending nodes to be drained:
 node01
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-flannel/kube-flannel-ds-296dv, kube-system/kube-proxy-9lkjz

## retry with the DaemonSet-ignore option
$ k drain node01 --ignore-daemonsets
node/node01 already cordoned
Warning: ignoring DaemonSet-managed Pods: kube-flannel/kube-flannel-ds-296dv, kube-system/kube-proxy-9lkjz
evicting pod default/blue-987f68cb5-pnqg6
evicting pod default/blue-987f68cb5-6k5cd
pod/blue-987f68cb5-6k5cd evicted
pod/blue-987f68cb5-pnqg6 evicted
node/node01 drained

$ k uncordon node01
node/node01 uncordoned

## if a pod on the node is not managed by a ReplicaSet, draining
## evicts it and it is NOT recreated on another node
$ k drain node01 --force --ignore-daemonsets
Warning: deleting Pods that declare no controller: default/hr-app; ignoring DaemonSet-managed Pods: kube-flannel/kube-flannel-ds-296dv, kube-system/kube-proxy-9lkjz
evicting pod default/hr-app
pod/hr-app evicted
node/node01 drained

## when you only want to block NEW pods from being scheduled
$ k cordon node01
node/node01 cordoned

$ k get no -o wide
NAME           STATUS                     ROLES           AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
controlplane   Ready                      control-plane   27m   v1.26.0   192.3.248.12   <none>        Ubuntu 20.04.5 LTS   5.4.0-1100-gcp   containerd://1.6.6
node01         Ready,SchedulingDisabled   <none>          26m   v1.26.0   192.3.248.3    <none>        Ubuntu 20.04.5 LTS   5.4.0-1100-gcp   containerd://1.6.6
```

Practice Test - Cluster Upgrade Process

  • Check the available upgrade versions
```bash
$ kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.25.0
[upgrade/versions] kubeadm version: v1.25.0
I0225 21:00:47.708667   16334 version.go:256] remote version is much newer: v1.26.1; falling back to: stable-1.25
[upgrade/versions] Target version: v1.25.6
[upgrade/versions] Latest version in the v1.25 series: v1.25.6

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT       TARGET
kubelet     2 x v1.25.0   v1.25.6

Upgrade to the latest version in the v1.25 series:

COMPONENT                 CURRENT   TARGET
kube-apiserver            v1.25.0   v1.25.6
kube-controller-manager   v1.25.0   v1.25.6
kube-scheduler            v1.25.0   v1.25.6
kube-proxy                v1.25.0   v1.25.6
CoreDNS                   v1.9.3    v1.9.3
etcd                      3.5.4-0   3.5.4-0

You can now apply the upgrade by executing the following command:

        kubeadm upgrade apply v1.25.6

Note: Before you can perform this upgrade, you have to update kubeadm to v1.25.6.

_____________________________________________________________________

The table below shows the current state of component configs as understood
by this version of kubeadm.

Configs that have a "yes" mark in the "MANUAL UPGRADE REQUIRED" column require
manual config upgrade or resetting to kubeadm defaults before a successful
upgrade can be performed. The version to manually upgrade to is denoted in the
"PREFERRED VERSION" column.

API GROUP                 CURRENT VERSION   PREFERRED VERSION   MANUAL UPGRADE REQUIRED
kubeproxy.config.k8s.io   v1alpha1          v1alpha1            no
kubelet.config.k8s.io     v1beta1           v1beta1             no
_____________________________________________________________________
```
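
Before applying the upgrade, it can help to confirm which package versions the apt repository actually offers (a sketch assuming the usual apt-based install, where the kube packages may be pinned with `apt-mark hold`):

```bash
# List the kubeadm versions available from the configured repository.
apt update
apt-cache madison kubeadm

# Release the hold, install the target version, then re-hold so a
# routine 'apt upgrade' cannot move the version unexpectedly.
apt-mark unhold kubeadm
apt-get install -y kubeadm=1.26.0-00
apt-mark hold kubeadm
```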
  • Cluster Upgrade
```bash
$ k drain controlplane --ignore-daemonsets
node/controlplane already cordoned
Warning: ignoring DaemonSet-managed Pods: kube-flannel/kube-flannel-ds-xsgl4, kube-system/kube-proxy-gw4dk
evicting pod kube-system/coredns-565d847f94-rmmng
evicting pod default/blue-5db6db69f7-556ld
evicting pod kube-system/coredns-565d847f94-c5mpg
evicting pod default/blue-5db6db69f7-4x6kz
pod/blue-5db6db69f7-556ld evicted
pod/blue-5db6db69f7-4x6kz evicted
pod/coredns-565d847f94-c5mpg evicted
pod/coredns-565d847f94-rmmng evicted
node/controlplane drained

## upgrade kubeadm on the controlplane
$ apt-get install -y kubeadm=1.26.0-00
$ kubeadm upgrade apply v1.26.0

## upgrade kubelet
$ apt-get install -y kubelet=1.26.0-00
$ systemctl restart kubelet

$ k get no -o wide
NAME           STATUS                     ROLES           AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
controlplane   Ready,SchedulingDisabled   control-plane   92m   v1.26.0   192.2.221.12   <none>        Ubuntu 20.04.5 LTS   5.4.0-1100-gcp   containerd://1.6.6
node01         Ready                      <none>          92m   v1.25.0   192.2.221.3    <none>        Ubuntu 20.04.5 LTS   5.4.0-1100-gcp   containerd://1.6.6

$ k uncordon controlplane
node/controlplane uncordoned

## upgrade the worker node
$ k drain node01 --ignore-daemonsets
$ ssh node01
$ apt-get update
$ apt-get install -y kubeadm=1.26.0-00
$ apt-get install -y kubelet=1.26.0-00
$ kubeadm upgrade node    ## upgrades the local kubelet configuration
$ systemctl restart kubelet
$ exit                    ## back to the controlplane
$ k uncordon node01
```
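
With both nodes upgraded, a brief verification pass (standard commands, nothing lab-specific):

```bash
# Every node should now report VERSION v1.26.0.
kubectl get nodes

# Confirm the tooling versions on the node you are logged into.
kubeadm version -o short
kubelet --version

# Control-plane static pods should be running the new image tags.
kubectl get pods -n kube-system
```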

Practice Test - Backup and Restore methods (1)

```bash
$ export ETCDCTL_API=3

## fails: etcdctl needs the endpoint and TLS options
$ etcdctl snapshot save /opt/snapshot-pre-boot.db
Error: rpc error: code = Unavailable desc = transport is closing

## look up the cert/key paths from the etcd static pod
$ k describe po etcd-controlplane -n kube-system | grep crt
      --cert-file=/etc/kubernetes/pki/etcd/server.crt
      --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
      --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
      --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
$ k describe po etcd-controlplane -n kube-system | grep key
      --key-file=/etc/kubernetes/pki/etcd/server.key
      --peer-key-file=/etc/kubernetes/pki/etcd/peer.key

$ etcdctl --endpoints=192.6.236.9:2379 \
    --cacert="/etc/kubernetes/pki/etcd/ca.crt" \
    --cert="/etc/kubernetes/pki/etcd/server.crt" \
    --key="/etc/kubernetes/pki/etcd/server.key" \
    snapshot save /opt/snapshot-pre-boot.db
Snapshot saved at /opt/snapshot-pre-boot.db

## restore
$ ETCDCTL_API=3 etcdctl snapshot restore --data-dir /opt/restore-dir /opt/snapshot-pre-boot.db
$ vi /etc/kubernetes/manifests/etcd.yaml
```

In `etcd.yaml`, only the `etcd-data` hostPath changes to the restored directory:

```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://192.6.236.9:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://192.6.236.9:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --experimental-watch-progress-notify-interval=5s
    - --initial-advertise-peer-urls=https://192.6.236.9:2380
    - --initial-cluster=controlplane=https://192.6.236.9:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://192.6.236.9:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://192.6.236.9:2380
    - --name=controlplane
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: registry.k8s.io/etcd:3.5.6-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health?exclude=NOSPACE&serializable=true
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health?serializable=false
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /opt/restore-dir    # changed from /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}
```
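
A saved snapshot can be sanity-checked before it is ever needed (a quick sketch; the hash, revision, and size values depend on the cluster):

```bash
# Prints the snapshot hash, revision, total key count, and size.
ETCDCTL_API=3 etcdctl snapshot status /opt/snapshot-pre-boot.db --write-out=table
```

On etcd 3.5+ the same offline subcommands are also exposed through the `etcdutl` binary.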

Practice Test - Backup and Restore methods (2)

```bash
$ k config get-clusters
NAME
cluster1
cluster2

$ k config use-context cluster1
$ k get no

## cluster2 uses an external etcd; switch context and inspect it
$ k config use-context cluster2
$ ssh cluster2-controlplane

## find the etcd endpoint from the kube-apiserver flags
$ ps -ef | grep etcd
root 1708 1362 0 03:49 ? 00:04:54 kube-apiserver --advertise-address=192.6.113.23 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.pem --etcd-certfile=/etc/kubernetes/pki/etcd/etcd.pem --etcd-keyfile=/etc/kubernetes/pki/etcd/etcd-key.pem --etcd-servers=https://192.6.113.12:2379 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-account-signing-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/12 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
root 10380 10280 0 05:29 pts/0 00:00:00 grep etcd

## log in to the external etcd server
$ ssh 192.6.113.12
$ ps -ef | grep etcd
etcd 813 1 0 03:49 ? 00:01:50 /usr/local/bin/etcd --name etcd-server --data-dir=/var/lib/etcd-data --cert-file=/etc/etcd/pki/etcd.pem --key-file=/etc/etcd/pki/etcd-key.pem --peer-cert-file=/etc/etcd/pki/etcd.pem --peer-key-file=/etc/etcd/pki/etcd-key.pem --trusted-ca-file=/etc/etcd/pki/ca.pem --peer-trusted-ca-file=/etc/etcd/pki/ca.pem --peer-client-cert-auth --client-cert-auth --initial-advertise-peer-urls https://192.6.113.12:2380 --listen-peer-urls https://192.6.113.12:2380 --advertise-client-urls https://192.6.113.12:2379 --listen-client-urls https://192.6.113.12:2379,https://127.0.0.1:2379 --initial-cluster-token etcd-cluster-1 --initial-cluster etcd-server=https://192.6.113.12:2380 --initial-cluster-state new
root 1305 963 0 05:34 pts/0 00:00:00 grep etcd

## list the etcd members
$ ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/etcd/pki/ca.pem \
    --cert=/etc/etcd/pki/etcd.pem \
    --key=/etc/etcd/pki/etcd-key.pem \
    member list
e8ac847a8c08805b, started, etcd-server, https://192.6.113.12:2380, https://192.6.113.12:2379, false
```
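
A faster way to tell a stacked etcd apart from an external one, before digging through `ps` output, is to look for the static pod manifest (a sketch; paths assume a default kubeadm setup):

```bash
# Stacked etcd (cluster1): etcd runs as a static pod on the controlplane.
ls /etc/kubernetes/manifests/        # etcd.yaml present => stacked

# External etcd (cluster2): no etcd.yaml; the apiserver's --etcd-servers
# flag points at an address that is not a cluster node.
ps -ef | grep kube-apiserver | tr ' ' '\n' | grep -- --etcd-servers
```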
  • cluster1 backup
```bash
$ k config use-context cluster1

## look up the cert/key paths from the etcd static pod
$ k describe po etcd-cluster1-controlplane -n kube-system | grep cert
      --cert-file=/etc/kubernetes/pki/etcd/server.crt
      --client-cert-auth=true
      --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
      --peer-client-cert-auth=true
      /etc/kubernetes/pki/etcd from etcd-certs (rw)
  etcd-certs:
$ k describe po etcd-cluster1-controlplane -n kube-system | grep ca
Priority Class Name:  system-node-critical
      --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
      --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
$ k describe po etcd-cluster1-controlplane -n kube-system | grep client
Annotations:          kubeadm.kubernetes.io/etcd.advertise-client-urls: https://192.6.113.21:2379
      --advertise-client-urls=https://192.6.113.21:2379
      --client-cert-auth=true
      --listen-client-urls=https://127.0.0.1:2379,https://192.6.113.21:2379
      --peer-client-cert-auth=true
$ k describe po etcd-cluster1-controlplane -n kube-system | grep key
      --key-file=/etc/kubernetes/pki/etcd/server.key
      --peer-key-file=/etc/kubernetes/pki/etcd/peer.key

## take the snapshot on the controlplane node
$ ssh cluster1-controlplane
$ ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    snapshot save /opt/cluster1.db

## exit back to the student-node and copy the snapshot over
$ exit
$ cd /opt
$ scp cluster1-controlplane:/opt/cluster1.db .
```
  • cluster2 restore (external etcd)
```bash
## check the etcd server IP from the kube-apiserver flags
$ ssh cluster2-controlplane
$ ps -ef | grep etcd
root 1708 1362 0 03:49 ? 00:06:25 kube-apiserver --advertise-address=192.6.113.23 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.pem --etcd-certfile=/etc/kubernetes/pki/etcd/etcd.pem --etcd-keyfile=/etc/kubernetes/pki/etcd/etcd-key.pem --etcd-servers=https://192.6.113.12:2379 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-account-signing-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/12 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
$ exit

## back on the student-node: copy the snapshot to the etcd server
$ scp /opt/cluster2.db 192.6.113.12:/opt/.
$ ssh 192.6.113.12

## restore into a new data directory and hand it to the etcd user
$ ETCDCTL_API=3 etcdctl snapshot restore --data-dir="/var/lib/etcd-data-new" /opt/cluster2.db
$ chown -R etcd:etcd /var/lib/etcd-data-new

## point --data-dir at the restored directory
$ vi /etc/systemd/system/etcd.service
```

`etcd.service` after the edit (`--data-dir` now points at `/var/lib/etcd-data-new`):

```ini
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target

[Service]
User=etcd
Type=notify
ExecStart=/usr/local/bin/etcd \
  --name etcd-server \
  --data-dir=/var/lib/etcd-data-new \
  --cert-file=/etc/etcd/pki/etcd.pem \
  --key-file=/etc/etcd/pki/etcd-key.pem \
  --peer-cert-file=/etc/etcd/pki/etcd.pem \
  --peer-key-file=/etc/etcd/pki/etcd-key.pem \
  --trusted-ca-file=/etc/etcd/pki/ca.pem \
  --peer-trusted-ca-file=/etc/etcd/pki/ca.pem \
  --peer-client-cert-auth \
  --client-cert-auth \
  --initial-advertise-peer-urls https://192.6.113.12:2380 \
  --listen-peer-urls https://192.6.113.12:2380 \
  --advertise-client-urls https://192.6.113.12:2379 \
  --listen-client-urls https://192.6.113.12:2379,https://127.0.0.1:2379 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-cluster etcd-server=https://192.6.113.12:2380 \
  --initial-cluster-state new
Restart=on-failure
RestartSec=5
LimitNOFILE=40000
```

```bash
$ systemctl daemon-reload
$ systemctl restart etcd
```
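
Once etcd restarts, it is worth confirming that it came up on the restored data directory and answers a health check (same cert paths as above; a small sketch):

```bash
# The service should be active, and the process should show the new data dir.
systemctl status etcd --no-pager
ps -ef | grep etcd | grep data-dir

# Health check against the restored member.
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/pki/ca.pem \
  --cert=/etc/etcd/pki/etcd.pem \
  --key=/etc/etcd/pki/etcd-key.pem \
  endpoint health
```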