Configure Graceful Node Shutdown and lengthen max inhibitor delay

* Configure Kubelet Graceful Node Shutdown to detect system shutdown
events and stop running containers gracefully when possible
* Allow up to 30s for critical pods to shut down gracefully
* Allow up to 15s for regular pods to shut down gracefully
* Mark the node NotReady promptly, instead of waiting for
health checks
* Kubelet uses systemd inhibitor locks to delay shutdown for a limited
number of seconds
* Raise the max inhibitor delay (InhibitDelayMaxSec) from the 5s default to 45s
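
The timings above fit together as simple arithmetic: `shutdownGracePeriod` is the total budget, `shutdownGracePeriodCriticalPods` is the part reserved for critical pods, and regular pods get the remainder. A minimal sketch of that calculation follows, using the values from this commit. The `to_seconds` helper and the variable names are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert a duration such as "45s" or "2m" to whole seconds.
# Only the "s" and "m" suffixes are handled in this sketch.
to_seconds() {
  case "$1" in
    *m) echo $(( ${1%m} * 60 )) ;;
    *s) echo "${1%s}" ;;
    *)  echo "$1" ;;
  esac
}

# Values set by this commit.
shutdown_grace_period="45s"                # Kubelet shutdownGracePeriod
shutdown_grace_period_critical_pods="30s"  # Kubelet shutdownGracePeriodCriticalPods
inhibit_delay_max_sec="45s"                # logind InhibitDelayMaxSec

total=$(to_seconds "$shutdown_grace_period")
critical=$(to_seconds "$shutdown_grace_period_critical_pods")
inhibit=$(to_seconds "$inhibit_delay_max_sec")

# Regular pods get whatever the critical pods do not reserve.
regular=$(( total - critical ))
echo "regular pods: ${regular}s, critical pods: ${critical}s"

# logind caps how long an inhibitor lock may delay shutdown, so the cap
# should be at least as long as the total grace period.
if [ "$inhibit" -lt "$total" ]; then
  echo "warning: InhibitDelayMaxSec (${inhibit}s) is shorter than shutdownGracePeriod (${total}s)"
fi
```

With the values in this commit the script prints `regular pods: 15s, critical pods: 30s` and no warning.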

Verify systemd inhibitor locks are present:

```
sudo systemd-inhibit --list
WHO     UID USER PID  COMM    WHAT     WHY                                        MODE
kubelet 0   root 4581 kubelet shutdown Kubelet needs time to handle node shutdown delay
```
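
The same check can be scripted. The sketch below assumes the column layout shown in the listing above (WHO first, WHAT sixth, MODE last) and reads the listing from stdin, so it can be tried against a pasted sample. The function name `has_kubelet_shutdown_lock` is hypothetical.

```shell
#!/bin/sh
# Succeeds when the inhibitor listing on stdin contains a lock held by
# "kubelet" whose WHAT column mentions "shutdown" and whose MODE is "delay".
has_kubelet_shutdown_lock() {
  awk '$1 == "kubelet" && $6 ~ /shutdown/ && $NF == "delay" { found = 1 }
       END { exit !found }'
}

# Sample input copied from the listing above, so the sketch runs without systemd.
sample='WHO     UID USER PID  COMM    WHAT     WHY                                        MODE
kubelet 0   root 4581 kubelet shutdown Kubelet needs time to handle node shutdown delay'

if printf '%s\n' "$sample" | has_kubelet_shutdown_lock; then
  echo "kubelet shutdown inhibitor present"
fi
```

On a node, the listing would come from `sudo systemd-inhibit --list | has_kubelet_shutdown_lock` instead of the sample.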

Tail journal logs and then shut down a node via systemctl reboot
or via the cloud console to watch containers shut down.
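
One way to follow along is sketched below. It assumes the Kubelet runs as a systemd unit named `kubelet`, and the filter simply keeps lines mentioning "shutdown" rather than matching any specific Kubelet log message. The function name `filter_shutdown_lines` is hypothetical.

```shell
#!/bin/sh
# Keep only lines from stdin that mention "shutdown" (case-insensitive).
filter_shutdown_lines() {
  grep -i 'shutdown'
}

# On a node, one might run (assumes a kubelet.service unit):
#   journalctl --unit kubelet --follow | filter_shutdown_lines
# and then, from a second session:
#   sudo systemctl reboot
```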

Rel:

* https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/
* https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
* https://github.com/kubernetes/kubernetes/issues/107043
* https://github.com/coreos/fedora-coreos-tracker/issues/821
* https://www.freedesktop.org/software/systemd/man/systemd-inhibit.html
* https://github.com/kubernetes/kubernetes/blob/release-1.24/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go
* https://github.com/godbus/dbus/blob/master/conn.go
Dalton Hubble 2022-08-28 09:49:28 -07:00
parent 76d92e9c2d
commit 393a38deff
21 changed files with 146 additions and 0 deletions

View File

@@ -6,6 +6,12 @@ Notable changes between versions.
 * Kubernetes [v1.25.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#v1250)
 * Disable LocalStorageCapacityIsolationFSQuotaMonitoring feature gate ([#1220](https://github.com/poseidon/typhoon/pull/1220))
+* Migrate most Kubelet flags to KubeletConfiguration file ([#1219](https://github.com/poseidon/typhoon/pull/1219))
+* Configure Kubelet Graceful Node Shutdown ([#1222](https://github.com/poseidon/typhoon/pull/1222))
+  * Allow up to 30s for critical pods to gracefully shutdown on node shutdown
+  * Allow up to 15s for regular pods to gracefully shutdown on node shutdown
+  * Mark node NotReady promptly on node shutdown
+  * Lengthen systemd inhibitor lock max delay from 5s to 45s
 ### Fedora CoreOS

View File

@@ -154,6 +154,8 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
@@ -194,6 +196,11 @@ storage:
 echo "Retry applying manifests"
 sleep 5
 done
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 contents:
 inline: |

View File

@@ -122,10 +122,17 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
 volumePluginDir: /var/lib/kubelet/volumeplugins
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 contents:
 inline: |

View File

@@ -153,6 +153,8 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
@@ -193,6 +195,11 @@ storage:
 echo "Retry applying manifests"
 sleep 5
 done
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 mode: 0644
 contents:

View File

@@ -121,10 +121,17 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
 volumePluginDir: /var/lib/kubelet/volumeplugins
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 mode: 0644
 contents:

View File

@@ -149,6 +149,8 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
@@ -189,6 +191,11 @@ storage:
 echo "Retry applying manifests"
 sleep 5
 done
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 contents:
 inline: |

View File

@@ -117,10 +117,17 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
 volumePluginDir: /var/lib/kubelet/volumeplugins
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 contents:
 inline: |

View File

@@ -149,6 +149,8 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
@@ -189,6 +191,11 @@ storage:
 echo "Retry applying manifests"
 sleep 5
 done
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 mode: 0644
 contents:

View File

@@ -117,10 +117,17 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
 volumePluginDir: /var/lib/kubelet/volumeplugins
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 mode: 0644
 contents:

View File

@@ -159,6 +159,8 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
@@ -199,6 +201,11 @@ storage:
 echo "Retry applying manifests"
 sleep 5
 done
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 contents:
 inline: |

View File

@@ -113,10 +113,17 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
 volumePluginDir: /var/lib/kubelet/volumeplugins
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 contents:
 inline: |

View File

@@ -160,6 +160,8 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
@@ -200,6 +202,11 @@ storage:
 echo "Retry applying manifests"
 sleep 5
 done
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 mode: 0644
 contents:

View File

@@ -118,10 +118,17 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
 volumePluginDir: /var/lib/kubelet/volumeplugins
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 mode: 0644
 contents:

View File

@@ -156,6 +156,8 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
@@ -196,6 +198,11 @@ storage:
 echo "Retry applying manifests"
 sleep 5
 done
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 contents:
 inline: |

View File

@@ -122,10 +122,17 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
 volumePluginDir: /var/lib/kubelet/volumeplugins
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 contents:
 inline: |

View File

@@ -158,6 +158,8 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
@@ -198,6 +200,11 @@ storage:
 echo "Retry applying manifests"
 sleep 5
 done
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 mode: 0644
 contents:

View File

@@ -121,10 +121,17 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
 volumePluginDir: /var/lib/kubelet/volumeplugins
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 mode: 0644
 contents:

View File

@@ -148,6 +148,8 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
@@ -188,6 +190,11 @@ storage:
 echo "Retry applying manifests"
 sleep 5
 done
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 contents:
 inline: |

View File

@@ -116,10 +116,17 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
 volumePluginDir: /var/lib/kubelet/volumeplugins
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 contents:
 inline: |

View File

@@ -148,6 +148,8 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
@@ -188,6 +190,11 @@ storage:
 echo "Retry applying manifests"
 sleep 5
 done
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 mode: 0644
 contents:

View File

@@ -116,10 +116,17 @@ storage:
 featureGates:
 LocalStorageCapacityIsolationFSQuotaMonitoring: false
 rotateCertificates: true
+shutdownGracePeriod: 45s
+shutdownGracePeriodCriticalPods: 30s
 staticPodPath: /etc/kubernetes/manifests
 readOnlyPort: 0
 resolvConf: /run/systemd/resolve/resolv.conf
 volumePluginDir: /var/lib/kubelet/volumeplugins
+- path: /etc/systemd/logind.conf.d/inhibitors.conf
+contents:
+inline: |
+[Login]
+InhibitDelayMaxSec=45s
 - path: /etc/sysctl.d/max-user-watches.conf
 mode: 0644
 contents: