Situation
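After installing k3s on my server, I spun up Jenkins as a Service just to try it out.
But when I tried to reach it from outside at <SERVER IP>:<PORT>, the browser told me 'page not found' ..
Hmm .. so I checked whether I had forgotten to open the port, but it was already open on the iptime router.
??? So I asked big bro GPT, and it had me check the node status with kubectl describe nodes (this is a single-node setup for now, so only one node shows up).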
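The describe output is long, so here is a minimal sketch for pulling out just the conditions that matter. The helper name pressure_conditions is made up for illustration; it only assumes the column layout of the Conditions table shown in this post (Type first, Status second).

```shell
# Hypothetical helper: reads `kubectl describe nodes` output on stdin and
# prints the Type of every *Pressure condition whose Status column is True.
# Event lines start with Normal/Warning, so they never match.
pressure_conditions() {
  awk '$1 ~ /Pressure$/ && $2 == "True" { print $1 }'
}

# Usage on a live cluster (not executed here):
#   kubectl describe nodes | pressure_conditions
```

If this prints nothing, none of the pressure conditions are currently True.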
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Sun, 02 Feb 2025 07:22:50 +0000 Sun, 02 Feb 2025 06:48:06 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure True Sun, 02 Feb 2025 07:22:50 +0000 Sun, 02 Feb 2025 07:10:25 +0000 KubeletHasDiskPressure kubelet has disk pressure
PIDPressure False Sun, 02 Feb 2025 07:22:50 +0000 Sun, 02 Feb 2025 06:48:06 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sun, 02 Feb 2025 07:22:50 +0000 Sun, 02 Feb 2025 06:48:06 +0000 KubeletReady kubelet is posting ready status
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 37m kube-proxy
Warning PossibleMemoryBackedVolumesOnDisk 37m kubelet The tmpfs noswap option is not supported. Memory-backed volumes (e.g. secrets, emptyDirs, etc.) might be swapped to disk and should no longer be considered secure.
Normal Starting 37m kubelet Starting kubelet.
Warning InvalidDiskCapacity 37m kubelet invalid capacity 0 on image filesystem
Normal NodeAllocatableEnforced 37m kubelet Updated Node Allocatable limit across pods
Normal NodeHasSufficientMemory 37m (x2 over 37m) kubelet Node status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 37m (x2 over 37m) kubelet Node status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 37m (x2 over 37m) kubelet Node status is now: NodeHasSufficientPID
Normal NodeReady 37m kubelet Node status is now: NodeReady
Normal NodePasswordValidationComplete 37m k3s-supervisor Deferred node password secret validation complete
Normal Synced 37m cloud-node-controller Node synced successfully
Normal RegisteredNode 37m node-controller Node gausslab-hq event: Registered Node gausslab-hq in Controller
Warning FreeDiskSpaceFailed 32m kubelet Failed to garbage collect required amount of images. Attempted to free 15598348697 bytes, but only found 0 bytes eligible to free.
Warning ImageGCFailed 32m kubelet Failed to garbage collect required amount of images. Attempted to free 15598348697 bytes, but only found 0 bytes eligible to free.
Warning FreeDiskSpaceFailed 27m kubelet Failed to garbage collect required amount of images. Attempted to free 15599892889 bytes, but only found 0 bytes eligible to free.
Warning ImageGCFailed 27m kubelet Failed to garbage collect required amount of images. Attempted to free 15599892889 bytes, but only found 0 bytes eligible to free.
Warning FreeDiskSpaceFailed 22m kubelet Failed to garbage collect required amount of images. Attempted to free 15601461657 bytes, but only found 0 bytes eligible to free.
Warning ImageGCFailed 22m kubelet Failed to garbage collect required amount of images. Attempted to free 15601461657 bytes, but only found 0 bytes eligible to free.
Warning FreeDiskSpaceFailed 17m kubelet Failed to garbage collect required amount of images. Attempted to free 15602899353 bytes, but only found 0 bytes eligible to free.
Warning ImageGCFailed 17m kubelet Failed to garbage collect required amount of images. Attempted to free 15602899353 bytes, but only found 0 bytes eligible to free.
Normal NodeHasDiskPressure 15m kubelet Node gausslab-hq status is now: NodeHasDiskPressure
Warning EvictionThresholdMet 14m (x9 over 15m) kubelet Attempting to reclaim ephemeral-storage
Warning FreeDiskSpaceFailed 12m kubelet Failed to garbage collect required amount of images. Attempted to free 15357118873 bytes, but only found 0 bytes eligible to free.
Warning FreeDiskSpaceFailed 7m53s kubelet Failed to garbage collect required amount of images. Attempted to free 15358769561 bytes, but only found 0 bytes eligible to free.
Warning FreeDiskSpaceFailed 2m53s kubelet Failed to garbage collect required amount of images. Attempted to free 15360424345 bytes, but only found 0 bytes eligible to free.
So the disk is running out ..? How am I supposed to fix this
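To get a feel for the numbers in those warnings, here is a small sketch that converts the byte counts into GiB; the first value, 15598348697 bytes, works out to roughly 14.5 GiB that kubelet wanted to free. The function name bytes_to_gib is hypothetical, and the df paths in the comment assume the default k3s data directory.

```shell
# Convert a byte count from a kubelet event into GiB (1 GiB = 1024^3 bytes).
bytes_to_gib() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Example with the first FreeDiskSpaceFailed value from the events:
#   bytes_to_gib 15598348697

# Checking actual free space on the node (assumption: k3s keeps its data
# under the default /var/lib/rancher/k3s):
#   df -h / /var/lib/rancher/k3s
```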
Solution
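I kept asking big bro GPT, but at some point I thought, uh, this doesn't seem right .. ;; so I started googling
Kubernetes node tainted with disk-pressure
Turns out an absolute legend had already solved this exact issue.
kubectl drain --delete-emptydir-data --ignore-daemonsets <node-name> && kubectl uncordon <node-name>
And when I ran the command?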
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal NodeNotSchedulable 32s kubelet Node status is now: NodeNotSchedulable
Normal NodeSchedulable 22s kubelet Node status is now: NodeSchedulable
Surprisingly, everything went back to normal.
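The events only show the node becoming schedulable again, so it may be worth confirming the DiskPressure condition itself. A minimal sketch, assuming the node name gausslab-hq from the events earlier; the wrapper name disk_pressure_status is made up for illustration.

```shell
# Print the Status (True/False/Unknown) of the DiskPressure condition for
# one node. $1 is the node name.
disk_pressure_status() {
  kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}'
}

# Usage on a live cluster (not executed here):
#   disk_pressure_status gausslab-hq
```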
I asked big bro GPT why this happened in the first place, and it said:
💡 In summary:
- Disk pressure occurred
  - kubelet's garbage collection (GC) failed, so images and containers were not cleaned up.
  - The lack of disk space triggered the EvictionThresholdMet event, and pods were forcibly terminated (Evicted).
- Fix
  - kubectl drain --delete-emptydir-data --ignore-daemonsets <node-name>
    - Drains the node, evicting its pods (DaemonSet-managed pods are left alone),
    - and deletes emptyDir data to free up space.
  - kubectl uncordon <node-name>
    - Re-enables the node so that pods can be scheduled on it again.
is what it told me.
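The two steps above can be sketched as one small function. This is just a convenience wrapper around the same commands, not anything official; the name recycle_node is made up.

```shell
# Sketch: drain a node, then uncordon it only if the drain succeeded.
# Note: drain cordons the node first, so if the drain fails the node stays
# unschedulable until `kubectl uncordon` is run by hand.
recycle_node() {
  node="${1:-}"
  if [ -z "$node" ]; then
    echo "usage: recycle_node <node-name>" >&2
    return 1
  fi
  kubectl drain --delete-emptydir-data --ignore-daemonsets "$node" &&
    kubectl uncordon "$node"
}

# Usage on a live cluster (not executed here):
#   recycle_node gausslab-hq
```

Keep in mind that --delete-emptydir-data throws away whatever the pods had stored in emptyDir volumes, so this is only reasonable when that data is disposable.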
So I tried connecting to Jenkins again, and? Still can't connect ; ;
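Since the drain evicted the pods, one thing worth checking next is whether the Jenkins pod actually came back up. A rough sketch, assuming the usual column layout of kubectl get pods -A (NAMESPACE NAME READY STATUS RESTARTS AGE); the helper name not_running_pods is hypothetical, and this is only a first check, not a confirmed cause.

```shell
# Hypothetical helper: reads `kubectl get pods -A --no-headers` on stdin and
# prints namespace/name and STATUS for pods that are neither Running nor
# Completed.
not_running_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}

# Usage on a live cluster (not executed here):
#   kubectl get pods -A --no-headers | not_running_pods
#   kubectl get svc -A    # shows the type and ports of the Jenkins service
```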