typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-10-03 07:24:38 +02:00

Author	SHA1	Message	Date
Dalton Hubble	f04e1d25a8	Add Flatcar Linux ARM64 support on Azure * Kinvolk now publishes Flatcar Linux images for ARM64 * For now, amd64 image must specify a plan while arm64 images must NOT specify a plan due to how Kinvolk publishes. Rel: https://github.com/flatcar/Flatcar/issues/872	2022-10-17 08:36:57 -07:00
Dalton Hubble	b68f8bb2a9	Switch Azure Fedora CoreOS default worker type * Change default Azure worker_type from Standard_DS1_v2 to Standard_D2as_v5 * Get 2 VCPU, 7 GiB, 12500Mbps (vs 1 VCPU, 3.5GiB, 750 Mbps) * Small increase in pay-as-you-go price ($53.29 -> $62.78) * Small increase in spot price ($5.64/mo -> $7.37/mo) * Change from Intel to AMD EPYC (`D2as_v5` cheaper than `D2s_v5`) Rel: * https://github.com/poseidon/typhoon/pull/1248 * https://learn.microsoft.com/en-us/azure/virtual-machines/dasv5-dadsv5-series#dasv5-series * https://learn.microsoft.com/en-us/azure/virtual-machines/dv2-dsv2-series#dsv2-series	2022-10-13 21:23:57 -07:00
Dalton Hubble	651151805d	Update Kubernetes from v1.25.2 to v1.25.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#v1253	2022-10-13 21:02:39 -07:00
Dalton Hubble	8d2c8b8db6	Switch to Flatcar Azure gen2 images and change worker type * Switch from Azure Hypervisor generation 1 to generation 2 * Change default Azure `worker_type` from Standard_DS1_v2 to Standard_D2as_v5 * Get 2 VCPU, 7 GiB, 12500Mbps (vs 1 VCPU, 3.5GiB, 750 Mbps) * Small increase in pay-as-you-go price ($53.29 -> $62.78) * Small increase in spot price ($5.64/mo -> $7.37/mo) * Change from Intel to AMD EPYC (`D2as_v5` cheaper than `D2s_v5`) Notes: Azure makes you accept terms for each plan: ``` az vm image terms accept --publish kinvolk --offer flatcar-container-linux-free --plan stable-gen2 ``` Rel: * https://learn.microsoft.com/en-us/azure/virtual-machines/dasv5-dadsv5-series#dasv5-series * https://learn.microsoft.com/en-us/azure/virtual-machines/dv2-dsv2-series#dsv2-series	2022-10-13 09:57:52 -07:00
Dalton Hubble	675ac63159	Remove note about not supporting ARM64 with Calico CNI * Calico v3.22.0 introduced multi-arch container images so Typhoon's ARM64 support has allowed choosing Calico CNI since Typhoon v1.23.5	2022-10-11 23:21:02 -07:00
Dalton Hubble	b4c8b1729c	Switch addons images from k8s.gcr.io to registry.k8s.io * Switch addon manifests to use the new Kubernetes image registry Rel: * https://github.com/poseidon/typhoon/pull/1206 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#moved-container-registry-service-from-k8sgcrio-to-registryk8sio	2022-10-09 16:14:28 -07:00
Dalton Hubble	e82241169a	Update Prometheus from v2.38.0 to v2.39.1 * https://github.com/prometheus/prometheus/releases/tag/v2.39.1	2022-10-09 16:12:35 -07:00
dependabot[bot]	ffe4929ff6	Bump mkdocs-material from 8.5.3 to 8.5.6 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.5.3 to 8.5.6. - [Release notes](https://github.com/squidfunk/mkdocs-material/releases) - [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG) - [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.5.3...8.5.6) --- updated-dependencies: - dependency-name: mkdocs-material dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2022-10-09 14:44:06 -07:00
dependabot[bot]	88b3925318	Bump pymdown-extensions from 9.5 to 9.6 Bumps [pymdown-extensions](https://github.com/facelessuser/pymdown-extensions) from 9.5 to 9.6. - [Release notes](https://github.com/facelessuser/pymdown-extensions/releases) - [Commits](https://github.com/facelessuser/pymdown-extensions/compare/9.5...9.6) --- updated-dependencies: - dependency-name: pymdown-extensions dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2022-10-03 15:34:37 -07:00
dependabot[bot]	29876dc85a	Bump mkdocs from 1.3.1 to 1.4.0 Bumps [mkdocs](https://github.com/mkdocs/mkdocs) from 1.3.1 to 1.4.0. - [Release notes](https://github.com/mkdocs/mkdocs/releases) - [Commits](https://github.com/mkdocs/mkdocs/compare/1.3.1...1.4.0) --- updated-dependencies: - dependency-name: mkdocs dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2022-10-03 14:49:24 -07:00
dependabot[bot]	7e29e35457	Bump mkdocs-material from 8.5.2 to 8.5.3 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.5.2 to 8.5.3. - [Release notes](https://github.com/squidfunk/mkdocs-material/releases) - [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG) - [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.5.2...8.5.3) --- updated-dependencies: - dependency-name: mkdocs-material dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2022-09-28 08:57:03 -07:00
Dalton Hubble	3ee462a24c	Update Kubernetes from v1.25.1 to v1.25.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#v1252	2022-09-22 08:15:30 -07:00
Dalton Hubble	f833b7205d	Sync recommended Terraform providers in docs v1.25.1	2022-09-20 08:30:15 -07:00
Dalton Hubble	558e293f78	Update Nginx Ingress and Grafana addons	2022-09-20 08:28:30 -07:00
Dalton Hubble	90782ea820	Remove workaround for preventing search . propagation * Kubelet v1.25.1 has the fix https://github.com/kubernetes/kubernetes/pull/112157	2022-09-19 22:37:02 -07:00
dependabot[bot]	8dc7cc614c	Bump mkdocs-material from 8.4.4 to 8.5.2 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.4.4 to 8.5.2. - [Release notes](https://github.com/squidfunk/mkdocs-material/releases) - [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG) - [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.4.4...8.5.2) --- updated-dependencies: - dependency-name: mkdocs-material dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2022-09-19 22:16:32 -07:00
Dalton Hubble	74d4d56dbd	Remove workaround for v1.25.0 ConfigMap rendering issue * LocalStorageCapacityIsolationFSQuotaMonitoring was reverted back to alpha in v1.25.1, so we don't need to explicitly disable it anymore Rel: https://github.com/kubernetes/kubernetes/issues/112081	2022-09-19 09:10:24 -07:00
Dalton Hubble	5abe84b520	Update etcd from v3.5.4 to v3.5.5 * https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.5.md#v355	2022-09-15 09:01:45 -07:00
Dalton Hubble	951209d113	Update Cilium from v1.12.1 to v1.12.2 * https://github.com/cilium/cilium/releases/tag/v1.12.2	2022-09-15 08:28:37 -07:00
Dalton Hubble	09751cc0e8	Update Kubernetes from v1.25.0 to v1.25.1 * https://github.com/kubernetes/kubernetes/releases/tag/v1.25.1	2022-09-15 08:23:22 -07:00
Dalton Hubble	c14300f0be	Update Calico from v3.23.3 to v3.24.1 * https://github.com/projectcalico/calico/releases/tag/v3.24.1	2022-09-14 08:09:38 -07:00
dependabot[bot]	37de9ca2ae	Bump mkdocs-material from 8.4.2 to 8.4.4 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.4.2 to 8.4.4. - [Release notes](https://github.com/squidfunk/mkdocs-material/releases) - [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG) - [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.4.2...8.4.4) --- updated-dependencies: - dependency-name: mkdocs-material dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2022-09-14 07:42:59 -07:00
Dalton Hubble	1786e34f33	Revert Graceful Node Shutdown feature * Disable Kubelet Graceful Node Shutdown on worker nodes (enabled in Kubernetes v1.25.0 https://github.com/poseidon/typhoon/pull/1222) * Graceful node shutdown allows 30s for critical pods to shut down and 15s for regular pods to shut down before releasing the inhibitor lock to allow the host to shut down * Unfortunately, without further configuration options, regular pods and the node are shut down at the same time at the end of the 45s period. In practice, enabling this feature leaves Error or Completed pods in kube-apiserver state until manually cleaned up. This feature is not ready for general use * Fix issue where Error/Completed pods are accumulating whenever any node restarts (or auto-updates), visible in kubectl get pods * This issue wasn't apparent in initial testing and seems to only affect non-critical pods (due to critical pods being killed earlier), but it's very apparent on our real clusters Rel: https://github.com/kubernetes/kubernetes/issues/110755	2022-09-10 14:58:44 -07:00
Dalton Hubble	5f612c82e2	Update kube-state-metrics and Grafana addons	2022-09-01 08:58:32 -07:00
Dalton Hubble	e60a321185	Sync Terraform providers shown in docs v1.25.0	2022-09-01 08:07:15 -07:00
dependabot[bot]	5ad74883fe	Bump mkdocs-material from 8.4.1 to 8.4.2 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.4.1 to 8.4.2. - [Release notes](https://github.com/squidfunk/mkdocs-material/releases) - [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG) - [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.4.1...8.4.2) --- updated-dependencies: - dependency-name: mkdocs-material dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2022-09-01 08:06:34 -07:00
Dalton Hubble	4ad473cd3c	Add workaround patch to strip "search ." from resolv.conf * systemd adds "search ." to /run/systemd/resolve/resolv.conf on hosts with an FQDN hostname * Kubelet v1.25 began propagating "search ." from the host node into containers' `/etc/resolv.conf` * musl-based DNS resolvers don't behave correctly when `search .` is used in their `/etc/resolv.conf`. This breaks Alpine images * Adapt the same workaround used by OpenShift to strip the "search ." * This only applies to bare-metal Typhoon nodes (where hostnames are set to FQDNs); nodes on cloud platforms aren't affected in the Typhoon configuration Kubernetes tracking issue: https://github.com/kubernetes/kubernetes/issues/112135 Rel: * https://github.com/systemd/systemd/pull/17201 * https://github.com/kubernetes/kubernetes/pull/109441 * https://github.com/coreos/fedora-coreos-tracker/issues/1287 * https://github.com/openshift/okd-machine-os/pull/159	2022-08-31 08:05:45 -07:00
Dalton Hubble	393a38deff	Configure Graceful Node Shutdown and lengthen max inhibitor delay * Configure Kubelet Graceful Node Shutdown to detect system shutdown events and stop running containers gracefully when possible * Allow up to 30s for critical pods to gracefully shutdown * Allow up to 15s for regular pods to gracefully shutdown * Node will be marked as NotReady promptly, instead of having to wait for health checks * Kubelet uses systemd inhibitor locks to delay shutdown for a limited number of seconds * Raise the default max inhibitor time from 5s to 45s Verify systemd inhibitor locks are present: ``` sudo systemd-inhibit --list WHO UID USER PID COMM WHAT WHY MODE kubelet 0 root 4581 kubelet shutdown Kubelet needs time to handle node shutdown delay ``` Tail journal logs and then shutdown a node via systemctl reboot or via the cloud console to watch container shutdown Rel: * https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/ * https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/ * https://github.com/kubernetes/kubernetes/issues/107043 * https://github.com/coreos/fedora-coreos-tracker/issues/821 * https://www.freedesktop.org/software/systemd/man/systemd-inhibit.html * https://github.com/kubernetes/kubernetes/blob/release-1.24/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go * https://github.com/godbus/dbus/blob/master/conn.go	2022-08-28 10:37:33 -07:00
Dalton Hubble	76d92e9c2d	Change podman log-driver from journald to k8s-file * When podman runs the Kubelet container, logging to journald means log lines are duplicated in the journal. journalctl -u kubelet shows Kubelet's logs and the same log messages from podman. Using the k8s-file driver alleviates this problem * Fix Kubelet and etcd-member logs to be more readable and reduce unnecessary Kubelet log volume	2022-08-27 17:15:22 -07:00
Dalton Hubble	275fc0f9e8	Disable LocalStorageCapacityIsolationFSQuotaMonitoring feature * Kubernetes v1.25.0 moved the LocalStorageCapacityIsolationFSQuotaMonitoring feature from alpha to beta, but it breaks Kubelet updating ConfigMaps in Pods, as shown by conformance tests * Kubernetes is rolling LocalStorageCapacityIsolationFSQuotaMonitoring back to alpha so it's not enabled by default, but that will require a release * Disable the feature gate directly as a workaround for now to make Kubernetes v1.25.0 usable ``` FailedMount: MountVolume.SetUp failed for volume "configmap-volume" : requesting quota on existing directory /var/lib/kubelet/pods/f09fae17-ff16-4a05-aab3-7b897cb5b732/volumes/kubernetes.io~configmap/configmap-volume but different pod 673ad247-abf0-434e-99eb-1c3f57d7fdaa a4568e94-2b2d-438f-a4bd-c9edc814e478 ``` Rel: * https://github.com/kubernetes/kubernetes/pull/112076 * https://github.com/kubernetes/kubernetes/pull/107329	2022-08-27 09:49:35 -07:00
Dalton Hubble	3fb59a3289	Migrate most Kubelet flags to KubeletConfiguration file * Add a KubeletConfiguration file to replace most Kubelet flags, to prepare for upcoming changes * Pass Kubelet the --config flag to specify the location of the KubeletConfiguration * Remove flags / configuration where it matches the defaults * Remove --cgroups-per-qos, defaults to true * Remove --container-runtime, defaults to remote * Remove enforce-node-allocatable=pods, defaults to pods Rel: * https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/ * https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/	2022-08-27 09:28:15 -07:00
Dalton Hubble	a31dbceac6	Update Kubernetes from v1.24.4 to v1.25.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md	2022-08-25 09:18:14 -07:00
dependabot[bot]	1dcf56127b	Bump mkdocs-material from 8.4.0 to 8.4.1 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.4.0 to 8.4.1. - [Release notes](https://github.com/squidfunk/mkdocs-material/releases) - [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG) - [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.4.0...8.4.1) --- updated-dependencies: - dependency-name: mkdocs-material dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2022-08-23 08:53:12 -07:00
Dalton Hubble	bf06412dfd	Update Prometheus and Grafana addons v1.24.4	2022-08-21 08:56:00 -07:00
Dalton Hubble	505818b7d5	Update docs showing the terraform plan resources count * Although I don't plan to keep these in sync, some users are confused when the docs don't match the actual resource count	2022-08-21 08:52:35 -07:00
Dalton Hubble	0d27811265	Update recommended Terraform provider versions	2022-08-18 09:08:55 -07:00
Dalton Hubble	c13d060b38	Add docs for GCP MIG update and AWS instance refresh * Document that worker instances are rolling replaced when changes to their configuration are applied	2022-08-18 09:02:38 -07:00
Dalton Hubble	e87d5aabc3	Adjust Google Cloud worker health checks to use kube-proxy healthz * Change the workers managed instance group to health check nodes via HTTP probe of the kube-proxy port 10256 /healthz endpoints * Advantages: kube-proxy is a lower value target (in case there were bugs in firewalls) than Kubelet, it's more representative than health checking Kubelet (Kubelet must run AND kube-proxy Daemonset must be healthy), and it's already used by kube-proxy liveness probes (better discoverability via kubectl or alerts on pods crashlooping) * Another motivator is that GKE clusters also use kube-proxy port 10256 checks to assess node health	2022-08-17 20:50:52 -07:00
Dalton Hubble	760b4cd5ee	Update Kubernetes from v1.24.3 to v1.24.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1244	2022-08-17 20:09:30 -07:00
Dalton Hubble	fcd8ff2b17	Update Cilium from v1.12.0 to v1.12.1 * https://github.com/cilium/cilium/releases/tag/v1.12.1	2022-08-17 08:53:56 -07:00
dependabot[bot]	ef2d2af0c7	Bump mkdocs-material from 8.3.9 to 8.4.0 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.3.9 to 8.4.0. - [Release notes](https://github.com/squidfunk/mkdocs-material/releases) - [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG) - [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.3.9...8.4.0) --- updated-dependencies: - dependency-name: mkdocs-material dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2022-08-16 08:29:51 -07:00
dependabot[bot]	8e2027ed2d	Bump pygments from 2.12.0 to 2.13.0 Bumps [pygments](https://github.com/pygments/pygments) from 2.12.0 to 2.13.0. - [Release notes](https://github.com/pygments/pygments/releases) - [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES) - [Commits](https://github.com/pygments/pygments/compare/2.12.0...2.13.0) --- updated-dependencies: - dependency-name: pygments dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2022-08-16 08:26:45 -07:00
Dalton Hubble	52427a4271	Refresh instances in autoscaling group when launch configuration changes * Changes to worker launch configurations start an autoscaling group instance refresh to replace instances * Instance refresh creates surge instances, waits for a warm-up period, then deletes old instances * Changing worker_type, disk_, worker_price, worker_target_groups, or Butane worker_snippets on existing worker nodes will replace instances New AMIs or changing `os_stream` will be ignored, to allow Fedora CoreOS or Flatcar Linux to keep themselves updated * Previously, new launch configurations were made in the same way, but not applied to instances unless manually replaced	2022-08-14 21:43:49 -07:00
Dalton Hubble	20b76d6e00	Roll instance template changes to worker managed instance groups * When a worker managed instance group's (MIG) instance template changes (including machine type, disk size, or Butane snippets but excluding new AMIs), use Google Cloud's rolling update features to ensure instances match declared state * Ignore new AMIs since Fedora CoreOS and Flatcar Linux nodes already auto-update and reboot themselves * Rolling updates will create surge instances, wait for health checks, then delete old instances (0 unavailable instances) * Instances are replaced to ensure new Ignition/Butane snippets are respected * Add managed instance group autohealing (i.e. health checks) to ensure new instances' Kubelet is running Renames * Name apiserver and kubelet health checks consistently * Rename MIG from `${var.name}-worker-group` to `${var.name}-worker` Rel: https://cloud.google.com/compute/docs/instance-groups/rolling-out-updates-to-managed-instance-groups	2022-08-14 13:06:53 -07:00
Dalton Hubble	6facfca4ed	Switch Kubernetes image registry from k8s.gcr.io to registry.k8s.io * Announce: https://groups.google.com/g/kubernetes-sig-testing/c/U7b_im9vRrM Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/319	2022-08-13 16:16:21 -07:00
Dalton Hubble	ed8c6a5aeb	Upgrade CoreDNS from v1.8.5 to v1.9.3 Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/318	2022-08-13 15:43:03 -07:00
Dalton Hubble	003af72cc8	Rename google-cloud/fedora-coreos/kubernetes/workers fcc to butane * Should have been part of https://github.com/poseidon/typhoon/pull/1203	2022-08-13 15:40:16 -07:00
Dalton Hubble	b321b90a4f	Update Grafana from v9.0.6 to v9.0.7	2022-08-13 15:39:44 -07:00
Dalton Hubble	e5d0e2d48b	Rename Fedora CoreOS fcc directory to butane * Align both Fedora CoreOS and Flatcar Linux keeping Butane Configs in a directory called butane	2022-08-10 09:10:18 -07:00
Dalton Hubble	679f8b878f	Update Grafana from v9.0.5 to v9.0.6	2022-08-10 08:23:04 -07:00
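
Commit 4ad473cd3c above describes stripping `search .` from the resolv.conf that Kubelet propagates into containers. As a rough illustration only, the filtering step might look like the Python sketch below. This is not the patch Typhoon or OpenShift ship: the function name `strip_search_dot` and the sample input are hypothetical, and the sketch assumes that only a bare `search .` line needs to be dropped.

```python
def strip_search_dot(resolv_conf: str) -> str:
    """Return resolv.conf text with any bare "search ." line removed.

    Assumption: a line is dropped only when its fields are exactly
    ["search", "."]; every other line is kept unchanged.
    """
    kept = []
    for line in resolv_conf.splitlines():
        if line.split() == ["search", "."]:
            continue  # the entry that musl-based resolvers mishandle
        kept.append(line)
    # Re-terminate each kept line with a newline.
    return "".join(line + "\n" for line in kept)


if __name__ == "__main__":
    # Hypothetical host resolv.conf on a node with an FQDN hostname.
    sample = "nameserver 10.0.0.1\nsearch .\n"
    print(strip_search_dot(sample), end="")
```

A real node would apply this kind of filter to a copy of the host file and point Kubelet at the cleaned copy. That wiring is left out here because the commit message does not spell it out.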


1468 Commits