A reader's Kubernetes cluster had pods that, once scheduled onto a certain node, stayed in ContainerCreating indefinitely. They paid me to resolve the fault, and I'm writing up the solution here for anyone else who runs into it.
Solution
It started with a plain nginx pod (no volume mounts) getting stuck on that node, and describe showed nothing useful; later, pods with NFS PVCs also failed to start there.
Environment
$ kubectl get node -o wide
NAME     STATUS   ROLES    AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
master   Ready    master   2y129d  v1.18.3   xxx.xx.xx.9   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.5
work01   Ready    <none>   2y129d  v1.18.3   xxx.xx.xx.1   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.5
work02   Ready    <none>   2y129d  v1.18.3   xxx.xx.xx.3   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.5
work03   Ready    <none>   2y129d  v1.18.3   xxx.xx.xx.4   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.5
orphaned pod xxx found, but
The kubelet log was flooded with messages like this:
kubelet_volumes.go:154] orphaned pod "xxx" found, but volume paths are still present on disk : There were a total of 84 errors similar to this. Turn up verbosity to see them.
When a pod moves to another node or is deleted, some of its directories can be left behind under the kubelet --root-dir on the old node (by default /var/lib/kubelet/pods, in directories named after the pod UID). You can run find over such a directory to confirm what it contains, and check its etc-hosts file for the pod's hostname; then use kubectl get pod to see whether a pod with that name still exists. If it does not, the directory is an orphan and can be cleaned up by hand. As I recall, someone later submitted a PR so that kubelet periodically cleans up these directories itself.
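The check described above can be scripted. This is my own minimal sketch, not a tool from the incident; the function name is made up, it assumes the default --root-dir of /var/lib/kubelet, and the cluster lookup for live UIDs is kept outside the function:

```shell
# Sketch: list pod directories on disk whose UID no longer belongs to a
# live pod. The live-UID list is passed in as text, so the cluster call
# (e.g. kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.uid}{"\n"}{end}')
# stays outside and the logic can be exercised locally.
find_orphaned_pod_dirs() {
    pods_dir="$1"   # e.g. /var/lib/kubelet/pods
    live_uids="$2"  # newline-separated UIDs of pods the API server knows about
    for d in "$pods_dir"/*/; do
        [ -d "$d" ] || continue
        uid=$(basename "$d")
        # print only directories whose UID is absent from the live list
        printf '%s\n' "$live_uids" | grep -qx "$uid" || echo "$d"
    done
}
```

Inspect each reported directory (including its containers and etc-hosts files) before deleting anything; only rm -rf a directory once you are sure it is an orphan.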
Failed to get system container stats
After clearing the errors above one by one, the log showed the following:
summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
This means kubelet could not pull stats from Docker, which usually points to Docker itself being unhealthy, but the Docker logs looked normal. I asked whether Docker could be restarted; they said it had already been restarted that morning, so the cause had to lie elsewhere.
They mentioned that df -h would hang. That hang needed to be fixed before restarting Docker again, because both Docker and kubelet perform filesystem operations while collecting stats. I installed strace and took a look:
stat("/var/lib/kubelet/pods/e3b61daa-86a7-4e2d-8c82-bdd96e7a6da2/volumes/kubernetes.io~secret/jenkins-admin-token-6479r", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=140, ...}) = 0
stat("/var/lib/kubelet/pods/c527ccff-e623-48e1-90be-11930facc11b/volumes/kubernetes.io~secret/default-token-r9mv9", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=140, ...}) = 0
stat("/sys/fs/bpf", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=0, ...}) = 0
stat("/var/lib/kubelet/pods/e3b61daa-86a7-4e2d-8c82-bdd96e7a6da2/volumes/kubernetes.io~nfs/jenkins", {st_mode=S_IFDIR|0777, st_size=3688, ...}) = 0
stat("/var/lib/kubelet/pods/c527ccff-e623-48e1-90be-11930facc11b/volumes/kubernetes.io~nfs/pv-es-master", ^C^Z
[1]+  Stopped    strace df -h
The output shows exactly which path the hang occurs on; lazily unmount it:
umount -lf /var/lib/kubelet/pods/c527ccff-e623-48e1-90be-11930facc11b/volumes/kubernetes.io~nfs/pv-es-master
I repeated this until df -h no longer hung, then restarted Docker, after which nginx pods could run on the node again. One leftover container could not be deleted; it only disappeared from docker ps -a after I removed its directory under /var/lib/docker/containers/xxxxx and restarted Docker once more.
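That probe-then-umount loop can be sketched as a small script. This is my own sketch, not a tool used in the incident; it bounds each check with timeout so a dead mount cannot wedge the shell the way df did:

```shell
# Sketch: stat every given mount point under a time limit; mounts whose
# server no longer answers hit the timeout (exit 124) and are reported
# so they can then be released with `umount -lf`.
find_hung_mounts() {
    timeout_s="$1"; shift
    for m in "$@"; do
        # a healthy mount answers stat quickly; a dead NFS mount blocks
        # until `timeout` kills the stat
        timeout "$timeout_s" stat -t "$m" >/dev/null 2>&1 || echo "$m"
    done
}

# on a real node, feed it the NFS mount points from /proc/mounts:
#   find_hung_mounts 3 $(awk '$3 ~ /^nfs/ {print $2}' /proc/mounts)
# then: umount -lf <each reported path>
```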
nfs
With nginx running again, pods that use a PVC still failed to start on the node; after a while, describe showed:
$ kubectl -n content-dev get pod content-754c9964bc-8dbxw -o wide
NAME                       READY   STATUS              RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
content-754c9964bc-8dbxw   0/1     ContainerCreating   0          63s   <none>   work02   <none>           <none>
$ kubectl -n content-dev describe pod content-754c9964bc-8dbxw
...
Events:
  Type     Reason       Age   From      Message
  ----     ------       ----  ----      -------
  Warning  FailedMount  8s    kubelet   Unable to attach or mount volumes: unmounted volumes=[nfs], unattached volumes=[nfs default-token-vdjpz]: timed out waiting for the condition
Check which PVC the pod uses:
$ kubectl get deploy content -n content-dev -o yaml
...
      volumes:
      - name: nfs
        persistentVolumeClaim:
          claimName: datanfs-pvc
$ kubectl -n content-dev get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
datanfs-pvc   Bound    pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf   10Gi       RWO            nfs-client     39d
Then look at the PV behind this PVC:
$ kubectl -n content-dev get pv pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                     STORAGECLASS   REASON   AGE
pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf   10Gi       RWO            Delete           Bound    content-dev/datanfs-pvc   nfs-client              39d
$ kubectl -n content-dev get pv pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: cluster.local/nfs-client-nfs-client-provisioner
  creationTimestamp: "2023-07-12T05:03:19Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
  resourceVersion: "269296403"
  selfLink: /api/v1/persistentvolumes/pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
  uid: add520cb-6506-47b0-b21a-24b1c92e9d56
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: datanfs-pvc
    namespace: content-dev
    resourceVersion: "253312376"
    uid: 9865b525-2cb5-4a2a-b7a7-036ca9f524cf
  nfs:
    path: /volume3/cloudxxx-pre/content-dev-datanfs-pvc-pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
    server: xxx.xx.xx.50
  persistentVolumeReclaimPolicy: Delete
  storageClassName: nfs-client
  volumeMode: Filesystem
status:
  phase: Bound
The PV is NFS-backed. On work02, use showmount to check whether this host is allowed to mount the export:
$ showmount -e xxx.xx.xx.50
Export list for xxx.xx.xx.50:
/volume1/secondary          *
/volume1/primary            *
/volume3/devnfs             *
/volume4/xxxxxxxx-demo      *
/volume4/xxx2               *
/volume4/xxx                *
/volume4/xxxxxxxx           *
/volume4/xxxxxxxx-xxx       *
/volume4/xxxxxxxx-xxxsoft   *
/volume4/k8s-nfs            *
/volume3/cicd               *
/volume3/xxxxxxxx-pre       *
/volume2/web                xxx.xxx.xxx.235,xxx.xxx.xxx.211,xxx.xxx.xxx.51,xxx.xxx.xxx.50,xxx.xxx.xxx.55,xxx.xxx.xxx.238
/volume3/VSPHERE-NFS-LUN1   xxx.xx.xx.167,xxx.xx.xx.166,xxx.xx.xx.165,xxx.xx.xx.164,xxx.xx.xx.163,xxx.xx.xx.162,xxx.xx.xx.161,xxx.xx.xx.160
/volume1/文件共享目录        xxx.xx.xx.11
The host is allowed, so the NFS server's /etc/exports is fine. Next, look at kubelet's mount process:
$ ps aux | grep pvc-9865b525
root  163806  0.0  0.0  123632  1056  ?  S  Aug18  0:00  /usr/bin/mount -t nfs xxx.xx.xx.50:/volume3/cloudxxx-pre/content-dev-datanfs-pvc-pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf /var/lib/kubelet/pods/cd743cb6-839c-4a15-9203-5321a0ed0666/volumes/kubernetes.io~nfs/pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
Trying the mount by hand also hung:
$ mkdir test1111
$ mount.nfs xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
^C
Check whether the NFS kernel modules are loaded:
$ lsmod | grep nfs
nfsv3                  43720  0
nfsd                  351321  13
nfs_acl                12837  2 nfsd,nfsv3
auth_rpcgss            59415  2 nfsd,rpcsec_gss_krb5
nfsv4                 584056  3
dns_resolver           13140  1 nfsv4
nfs                   262045  4 nfsv3,nfsv4
lockd                  98048  3 nfs,nfsd,nfsv3
grace                  13515  2 nfsd,lockd
fscache                64980  2 nfs,nfsv4
sunrpc                358543  33 nfs,nfsd,rpcsec_gss_krb5,auth_rpcgss,lockd,nfsv3,nfsv4,nfs_acl
They are there. From working with NAS boxes before, I knew different kernels support different NFS versions, so I tried them one by one:
$ mount.nfs -o vers=3 xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
$ umount test1111
$ mount.nfs -o vers=4 xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
^C
$ mount.nfs -o vers=4.0 xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
$ umount test1111
$ mount.nfs -o vers=4.1 xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
^C
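The trial-and-error above can be wrapped in a small probe. In this sketch of mine, the mount command is passed in as a parameter (so the loop can be exercised with a stub instead of root privileges and a live NFS server), and the version list mirrors what was tried by hand:

```shell
# Sketch: try mounting with each NFS version under a time limit and report
# which ones succeed; a version the kernel/server pair cannot negotiate
# simply hangs and is killed by `timeout`.
probe_nfs_versions() {
    mount_cmd="$1"   # e.g. mount.nfs; a stub like `true` when testing
    src="$2"         # server:/export
    mnt="$3"         # local mount point
    for v in 3 4.0 4.1; do
        if timeout 10 $mount_cmd -o "vers=$v" "$src" "$mnt" >/dev/null 2>&1; then
            echo "vers=$v: ok"
            umount "$mnt" 2>/dev/null || true
        else
            echo "vers=$v: failed or hung"
        fi
    done
}

# on the node from this incident it would be run as:
#   probe_nfs_versions mount.nfs xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
```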
So a plain mount negotiates the newest NFS version by default, but on this host only versions 3 and 4.0 actually work; the mount version therefore has to be pinned on the PV.
$ kubectl -n content-dev edit pv pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
  mountOptions:
  - nfsvers=4.0
After that the pod was created successfully:
$ kubectl -n content-dev describe pod content-754c9964bc-8dbxw
Events:
  Type     Reason       Age                  From      Message
  ----     ------       ----                 ----      -------
  Warning  FailedMount  4m18s (x4 over 11m)  kubelet   Unable to attach or mount volumes: unmounted volumes=[nfs], unattached volumes=[nfs default-token-vdjpz]: timed out waiting for the condition
  Warning  FailedMount  2m                   kubelet   Unable to attach or mount volumes: unmounted volumes=[nfs], unattached volumes=[default-token-vdjpz nfs]: timed out waiting for the condition
  Warning  FailedMount  11s                  kubelet   MountVolume.SetUp failed for volume "pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf" : mount failed: signal: terminated
    Mounting command: systemd-run
    Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/c61fd1a0-cb8f-4cc9-84c9-122a7d24cde6/volumes/kubernetes.io~nfs/pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf --scope -- mount -t nfs xxx.xx.xx.50:/volume3/cloudxxx-pre/content-dev-datanfs-pvc-pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf /var/lib/kubelet/pods/c61fd1a0-cb8f-4cc9-84c9-122a7d24cde6/volumes/kubernetes.io~nfs/pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
    Output: Running scope as unit run-164051.scope.
  Normal   Pulled       10s                  kubelet   Container image "xxx.xx.xx.215/content/content:dev-247" already present on machine
  Normal   Created      10s                  kubelet   Created container spring-boot
  Normal   Started      10s                  kubelet   Started container spring-boot
Next, add mountOptions to all the other existing PVs in the same way, and clean up the hung mount processes on every node:
ps aux | grep -P 'mount.+nf[s]'
Find the PIDs and clean them up with kill -9.
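A sketch of that cleanup, with the PID extraction split into a function (the function name is mine) so it can be checked against captured ps output before anything gets killed:

```shell
# Sketch: pull the PID column out of `ps aux` lines that are NFS mount
# commands. The nf[s] bracket trick keeps the grep process itself from
# matching its own command line.
nfs_mount_pids() {
    grep -P 'mount.+nf[s]' | awk '{print $2}'
}

# to actually clean up on a node:
#   ps aux | nfs_mount_pids | xargs -r kill -9
```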
Handling future PVs
The cluster uses nfs-provisioner, so every PVC gets its PV from the StorageClass. The already-created PVs were fixed by hand above; to make sure newly provisioned ones also carry mountOptions, the StorageClass needs the same change:
$ kubectl get sc nfs-client
NAME         PROVISIONER                                       RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs-client   cluster.local/nfs-client-nfs-client-provisioner   Delete          Immediate           true                   499d
$ kubectl edit sc nfs-client
  mountOptions:
  - nfsvers=4.0
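The per-PV edits done by hand earlier can also be generated instead of typed. This is my own dry-run-style sketch: it only prints a kubectl patch command per PV so the changes can be reviewed first, then the output can be piped to sh to apply; PV names are expected on stdin, e.g. from kubectl get pv -o name:

```shell
# Sketch: for each PV name read on stdin, emit a `kubectl patch` that adds
# the nfsvers=4.0 mountOptions. Nothing is applied until the output is
# piped into a shell.
pv_mount_option_patches() {
    while read -r pv; do
        printf "kubectl patch %s --type merge -p '%s'\n" \
            "$pv" '{"spec":{"mountOptions":["nfsvers=4.0"]}}'
    done
}

# usage on the cluster:
#   kubectl get pv -o name | pv_mount_option_patches        # review
#   kubectl get pv -o name | pv_mount_option_patches | sh   # apply
```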