* Disable Kubelet Graceful Node Shutdown on worker nodes (enabled in
Kubernetes v1.25.0 https://github.com/poseidon/typhoon/pull/1222)
* Graceful node shutdown shutdown allows 30s for critical pods to
shutdown and 15s for regular pods to shutdown before releasing the
inhibitor lock to allow the host to shutdown
* Unfortunately, both pods and the node are shutdown at the same
time at the end of the 45s period without further configuration
options. As a result, regular pods and the node are shutdown at the
same time. In practice, enabling this feature leaves Error or Completed
pods in kube-apiserver state until manually cleaned up. This feature
is not ready for general use
* Fix issue where Error/Completed pods are accumulating whenever any
node restarts (or auto-updates), visible in kubectl get pods
* This issue wasn't apparent in initial testing and seems to only
affect non-critical pods (due to critical pods being killed earlier)
But its very apparent on our real clusters
Rel: https://github.com/kubernetes/kubernetes/issues/110755
* Kubernetes v1.25.0 moved the LocalStorageCapacityIsolationFSQuotaMonitoring
feature from alpha to beta, but it breaks Kubelet updating ConfigMaps in
Pods, as shown by conformance tests
* Kubernetes is rolling LocalStorageCapacityIsolationFSQuotaMonitoring back
to alpha so its not enabled by default, but that will require a release
* Disable the feature gate directly as a workaround for now to make
Kubernetes v1.25.0 usable
```
FailedMount: MountVolume.SetUp failed for volume "configmap-volume" : requesting quota on existing directory /var/lib/kubelet/pods/f09fae17-ff16-4a05-aab3-7b897cb5b732/volumes/kubernetes.io~configmap/configmap-volume but different pod 673ad247-abf0-434e-99eb-1c3f57d7fdaa a4568e94-2b2d-438f-a4bd-c9edc814e478
```
Rel:
* https://github.com/kubernetes/kubernetes/pull/112076
* https://github.com/kubernetes/kubernetes/pull/107329
* Change the workers managed instance group to health check nodes
via HTTP probe of the kube-proxy port 10256 /healthz endpoints
* Advantages: kube-proxy is a lower value target (in case there
were bugs in firewalls) that Kubelet, its more representative than
health checking Kubelet (Kubelet must run AND kube-proxy Daemonset
must be healthy), and its already used by kube-proxy liveness probes
(better discoverability via kubectl or alerts on pods crashlooping)
* Another motivator is that GKE clusters also use kube-proxy port
10256 checks to assess node health
* When a worker managed instance group's (MIG) instance template
changes (including machine type, disk size, or Butane snippets
but excluding new AMIs), use Google Cloud's rolling update features
to ensure instances match declared state
* Ignore new AMIs since Fedora CoreOS and Flatcar Linux nodes
already auto-update and reboot themselves
* Rolling updates will create surge instances, wait for health
checks, then delete old instances (0 unavilable instances)
* Instances are replaced to ensure new Ignition/Butane snippets
are respected
* Add managed instance group autohealing (i.e. health checks) to
ensure new instances' Kubelet is running
Renames
* Name apiserver and kubelet health checks consistently
* Rename MIG from `${var.name}-worker-group` to `${var.name}-worker`
Rel: https://cloud.google.com/compute/docs/instance-groups/rolling-out-updates-to-managed-instance-groups
* Requires poseidon v0.11+ and Flatcar Linux 3185.0.0+ (action required)
* Previously, Flatcar Linux configs have been parsed as Container
Linux Configs to Ignition v2.2.0 specs by poseidon/ct
* Flatcar Linux starting in 3185.0.0 now supports Ignition v3.x specs
(which are rendered from Butane Configs, like Fedora CoreOS)
* poseidon/ct v0.11.0 adds support for the flatcar Butane Config
variant so that Flatcar Linux can use Ignition v3.x
Rel:
* [Flatcar Support](https://flatcar-linux.org/docs/latest/provisioning/ignition/specification/#ignition-v3)
* [poseidon/ct support](https://github.com/poseidon/terraform-provider-ct/pull/131)
* Use the official Kinvolk Flatcar Linux image on Google Cloud
* Change `os_image` from a custom image name to `flatcar-stable`
(default), `flatcar-beta`, or `flatcar-alpha` (**action required**)
* Change `os_image` from a required to an optional variable
* Promote Typhoon on Flatcar Linux / Google Cloud to stable
* Remove docs about needing to upload a Flatcar Linux image
manually on Google Cloud and drop support for custom images
* Both Flatcar Linux and Fedora CoreOS use systemd-resolved,
but they setup /etc/resolv.conf symlinks differently
* Prefer using /run/systemd/resolve/resolv.conf directly, which
also updates to reflect runtime changes (e.g. resolvectl)
* Update `null` provider to allow use of v3.1.x releases,
instead of being stuck on v2.1.2
* Update min versions in terraform-render-boostrap
https://github.com/poseidon/terraform-render-bootstrap/pull/287
* Document the recommended versions of Terraform cloud providers
* Remove `/sys/fs/cgroup/systemd` mount since Flatcar Linux
uses cgroups v2
* Flatcar Linux's `docker` switched from the `cgroupfs` to
`systemd` driver without notice
* Add `node_taints` variable to worker modules to set custom
initial node taints on cloud platforms that support auto-scaling
worker pools of heterogeneous nodes (i.e. AWS, Azure, GCP)
* Worker pools could use custom `node_labels` to allowed workloads
to select among differentiated nodes, while custom `node_taints`
allows a worker pool's nodes to be tainted as special to prevent
scheduling, except by workloads that explicitly tolerate the
taint
* Expose `daemonset_tolerations` in AWS, Azure, and GCP kubernetes
cluster modules, to determine whether `kube-system` components
should tolerate the custom taint (advanced use covered in docs)
Rel: #550, #663Closes#429