typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2024-12-26 06:19:33 +01:00

Author	SHA1	Message	Date
Dalton Hubble	5f612c82e2	Update kube-state-metrics and Grafana addons	2022-09-01 08:58:32 -07:00
Dalton Hubble	e60a321185	Sync Terraform providers shown in docs	2022-09-01 08:07:15 -07:00
Dalton Hubble	4ad473cd3c	Add workaround patch to strip "search ." from resolv.conf * systemd adds "search ." to hosts /run/systemd/resolve/resolv.conf on hosts with a fqdn hostname * Kubelet v1.25 began propagating "search ." from the host node into containers' `/etc/resolv.conf` * musl-based DNS resolvers don't behave correctly when `search .` is used in their `/etc/resolv.conf`. This breaks Alpine images * Adapt the same workaround used by Openshift to strip the "search ." * This only applies to bare-metal Typhoon nodes (where hostnames are set to fqdn's), nodes on cloud platforms aren't affected in the Typhoon configuration Kubernetes tracking issue: https://github.com/kubernetes/kubernetes/issues/112135 Rel: * https://github.com/systemd/systemd/pull/17201 * https://github.com/kubernetes/kubernetes/pull/109441 * https://github.com/coreos/fedora-coreos-tracker/issues/1287 * https://github.com/openshift/okd-machine-os/pull/159	2022-08-31 08:05:45 -07:00
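The workaround in the commit above amounts to dropping one line from the resolver file before containers see it. The following is a minimal sketch of that filtering step only, assuming a hypothetical helper named `strip_search_dot`; it is not the actual Typhoon or OpenShift patch, which this log does not show.

```python
def strip_search_dot(resolv_conf: str) -> str:
    """Return resolv.conf text with any bare `search .` line removed.

    Hypothetical helper for illustration; other lines are kept unchanged.
    """
    kept = []
    for line in resolv_conf.splitlines():
        # Split on whitespace so "search   ." (extra spaces) also matches.
        if line.split() == ["search", "."]:
            continue
        kept.append(line)
    # Re-terminate every kept line with a newline.
    return "".join(entry + "\n" for entry in kept)


example = "nameserver 10.0.0.1\nsearch .\n"
cleaned = strip_search_dot(example)
```

A `search example.com` line is left alone, since only the bare `search .` entry is reported to confuse musl-based resolvers.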
Dalton Hubble	393a38deff	Configure Graceful Node Shutdown and lengthen max inhibitor delay * Configure Kubelet Graceful Node Shutdown to detect system shutdown events and stop running containers gracefully when possible * Allow up to 30s for critical pods to gracefully shutdown * Allow up to 15s for regular pods to gracefully shutdown * Node will be marked as NotReady promptly, instead of having to wait for health checks * Kubelet uses systemd inhibitor locks to delay shutdown for a limited number of seconds * Raise the default max inhibitor time from 5s to 45s Verify systemd inhibitor locks are present: ``` sudo systemd-inhibit --list WHO UID USER PID COMM WHAT WHY MODE kubelet 0 root 4581 kubelet shutdown Kubelet needs time to handle node shutdown delay ``` Tail journal logs and then shutdown a node via systemctl reboot or via the cloud console to watch container shutdown Rel: * https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/ * https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/ * https://github.com/kubernetes/kubernetes/issues/107043 * https://github.com/coreos/fedora-coreos-tracker/issues/821 * https://www.freedesktop.org/software/systemd/man/systemd-inhibit.html * https://github.com/kubernetes/kubernetes/blob/release-1.24/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go * https://github.com/godbus/dbus/blob/master/conn.go	2022-08-28 10:37:33 -07:00
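The numbers in the commit above fit together as simple arithmetic: 30s for critical pods plus 15s for regular pods is 45s, which would not fit under the previous 5s inhibitor limit. A small sketch of that check, assuming a hypothetical `ShutdownBudget` type (the name and the "must fit" rule are this example's reading of the message, not Kubelet code):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShutdownBudget:
    """Hypothetical model of the grace periods named in the commit message."""
    critical_pods_s: int
    regular_pods_s: int

    @property
    def total_s(self) -> int:
        # Total time Kubelet would like to delay shutdown.
        return self.critical_pods_s + self.regular_pods_s

    def fits(self, inhibitor_max_s: int) -> bool:
        # Assumption: the delay is only useful if the inhibitor limit allows it.
        return self.total_s <= inhibitor_max_s


budget = ShutdownBudget(critical_pods_s=30, regular_pods_s=15)
```

With these values, `budget.fits(5)` is false and `budget.fits(45)` is true, which matches the reason given for raising the default max inhibitor time.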
Dalton Hubble	76d92e9c2d	Change podman log-driver from journald to k8s-file * When podman runs the Kubelet container, logging to journald means log lines are duplicated in the journal. journalctl -u kubelet shows Kubelet's logs and the same log messages from podman. Using the k8s-file driver alleviates this problem * Fix Kubelet and etcd-member logs to be more readable and reduce unnecessary Kubelet log volume	2022-08-27 17:15:22 -07:00
Dalton Hubble	bf06412dfd	Update Prometheus and Grafana addons	2022-08-21 08:56:00 -07:00
Dalton Hubble	0d27811265	Update recommended Terraform provider versions	2022-08-18 09:08:55 -07:00
Dalton Hubble	c13d060b38	Add docs for GCP MIG update and AWS instance refresh * Document that worker instances are rolling replaced when changes to their configuration are applied	2022-08-18 09:02:38 -07:00
Dalton Hubble	e87d5aabc3	Adjust Google Cloud worker health checks to use kube-proxy healthz * Change the workers managed instance group to health check nodes via HTTP probe of the kube-proxy port 10256 /healthz endpoints * Advantages: kube-proxy is a lower value target (in case there were bugs in firewalls) than Kubelet, it's more representative than health checking Kubelet (Kubelet must run AND kube-proxy Daemonset must be healthy), and it's already used by kube-proxy liveness probes (better discoverability via kubectl or alerts on pods crashlooping) * Another motivator is that GKE clusters also use kube-proxy port 10256 checks to assess node health	2022-08-17 20:50:52 -07:00
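The HTTP probe described in the commit above can be approximated from any machine that can reach a worker node. This is a rough sketch using only the Python standard library; the function names and the 2 second timeout are assumptions for illustration, and the real check is performed by Google Cloud's health checker rather than by code like this.

```python
import urllib.error
import urllib.request


def healthz_url(node_ip: str, port: int = 10256, path: str = "/healthz") -> str:
    """Build the kube-proxy health URL (port and path taken from the commit)."""
    return f"http://{node_ip}:{port}{path}"


def node_is_healthy(node_ip: str, timeout: float = 2.0) -> bool:
    """Return True only if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(healthz_url(node_ip), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False
```

For example, `node_is_healthy("10.0.0.7")` would probe `http://10.0.0.7:10256/healthz`, where the address is a placeholder.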
Dalton Hubble	760b4cd5ee	Update Kubernetes from v1.24.3 to v1.24.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1244	2022-08-17 20:09:30 -07:00
Dalton Hubble	fcd8ff2b17	Update Cilium from v1.12.0 to v1.12.1 * https://github.com/cilium/cilium/releases/tag/v1.12.1	2022-08-17 08:53:56 -07:00
Dalton Hubble	52427a4271	Refresh instances in autoscaling group when launch configuration changes * Changes to worker launch configurations start an autoscaling group instance refresh to replace instances * Instance refresh creates surge instances, waits for a warm-up period, then deletes old instances * Changing worker_type, disk_, worker_price, worker_target_groups, or Butane worker_snippets on existing worker nodes will replace instances New AMIs or changing `os_stream` will be ignored, to allow Fedora CoreOS or Flatcar Linux to keep themselves updated * Previously, new launch configurations were made in the same way, but not applied to instances unless manually replaced	2022-08-14 21:43:49 -07:00
Dalton Hubble	20b76d6e00	Roll instance template changes to worker managed instance groups * When a worker managed instance group's (MIG) instance template changes (including machine type, disk size, or Butane snippets but excluding new AMIs), use Google Cloud's rolling update features to ensure instances match declared state * Ignore new AMIs since Fedora CoreOS and Flatcar Linux nodes already auto-update and reboot themselves * Rolling updates will create surge instances, wait for health checks, then delete old instances (0 unavailable instances) * Instances are replaced to ensure new Ignition/Butane snippets are respected * Add managed instance group autohealing (i.e. health checks) to ensure new instances' Kubelet is running Renames * Name apiserver and kubelet health checks consistently * Rename MIG from `${var.name}-worker-group` to `${var.name}-worker` Rel: https://cloud.google.com/compute/docs/instance-groups/rolling-out-updates-to-managed-instance-groups	2022-08-14 13:06:53 -07:00
Dalton Hubble	6facfca4ed	Switch Kubernetes image registry from k8s.gcr.io to registry.k8s.io * Announce: https://groups.google.com/g/kubernetes-sig-testing/c/U7b_im9vRrM Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/319	2022-08-13 16:16:21 -07:00
Dalton Hubble	ed8c6a5aeb	Upgrade CoreDNS from v1.8.5 to v1.9.3 Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/318	2022-08-13 15:43:03 -07:00
Dalton Hubble	b321b90a4f	Update Grafana from v9.0.6 to v9.0.7	2022-08-13 15:39:44 -07:00
Dalton Hubble	679f8b878f	Update Grafana from v9.0.5 to v9.0.6	2022-08-10 08:23:04 -07:00
Dalton Hubble	87a8278c9d	Improve AWS autoscaling group and launch config names * Rename launch configuration to use a name_prefix named after the cluster and worker to improve identifiability * Shorten AWS autoscaling group name to not include the launch config id. Years ago this used to be needed to update the ASG but the AWS provider detects changes to the launch configuration just fine	2022-08-08 20:46:08 -07:00
Dalton Hubble	93b7f2554e	Remove ineffective iptables-legacy.stamp * Typhoon Fedora CoreOS is already using iptables nf_tables since F36. The file to pin to legacy iptables was renamed to /etc/coreos/iptables-legacy.stamp	2022-08-08 20:27:21 -07:00
Dalton Hubble	62d47ad3f0	Update Cilium from v1.11.7 to v1.12.0 * https://github.com/cilium/cilium/releases/tag/v1.12.0	2022-08-08 19:59:03 -07:00
Dalton Hubble	6eb7861f96	Update Grafana liveness and readiness probes * Use the liveness and readiness probes that Grafana recommends * Update Grafana from v9.0.3 to v9.0.5	2022-08-08 09:22:44 -07:00
Dalton Hubble	4a469513dd	Migrate Flatcar Linux from Ignition spec v2.3.0 to v3.3.0 * Requires poseidon v0.11+ and Flatcar Linux 3185.0.0+ (action required) * Previously, Flatcar Linux configs have been parsed as Container Linux Configs to Ignition v2.2.0 specs by poseidon/ct * Flatcar Linux starting in 3185.0.0 now supports Ignition v3.x specs (which are rendered from Butane Configs, like Fedora CoreOS) * poseidon/ct v0.11.0 adds support for the flatcar Butane Config variant so that Flatcar Linux can use Ignition v3.x Rel: * [Flatcar Support](https://flatcar-linux.org/docs/latest/provisioning/ignition/specification/#ignition-v3) * [poseidon/ct support](https://github.com/poseidon/terraform-provider-ct/pull/131)	2022-08-03 08:32:52 -07:00
Dalton Hubble	47d8431fe0	Fix bug provisioning multi-controller clusters on Google Cloud * Google Cloud Terraform provider resource google_dns_record_set's name field provides the full domain name with a trailing ".". This isn't a new behavior, Google has behaved this way as long as I can remember * etcd domain names are passed to the bootstrap module to generate TLS certificates. What seems to be new(ish?) is that etcd peers see example.foo and example.foo. as different domains during TLS SANs validation. As a result, clusters with multiple controller nodes fail to run etcd-member, which manifests as cluster provisioning hanging. Single controller/master clusters (default) are unaffected * Fix etcd-member.service error in multi-controller clusters: ``` "error":"x509: certificate is valid for conformance-etcd0.redacted., conform-etcd1.redacted., conform-etcd2.redacted., not conform-etcd1.redacted"} ```	2022-08-02 20:21:02 -07:00
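The mismatch in the commit above comes down to `example.foo.` and `example.foo` being compared as different strings. A minimal sketch of the normalization idea, assuming hypothetical helper names and using plain string comparison as a stand-in for real x509 SAN validation:

```python
def normalize_fqdn(name: str) -> str:
    """Drop a single trailing dot from a fully qualified domain name."""
    return name[:-1] if name.endswith(".") else name


def san_matches(cert_sans: list, peer_name: str) -> bool:
    # Simplified stand-in for TLS SAN validation: exact string match only.
    return peer_name in cert_sans


# Names as a DNS record resource might report them (trailing dot included).
raw_sans = ["conform-etcd0.redacted.", "conform-etcd1.redacted."]
fixed_sans = [normalize_fqdn(name) for name in raw_sans]
```

Here `san_matches(raw_sans, "conform-etcd1.redacted")` fails while the same lookup against `fixed_sans` succeeds, mirroring the etcd-member error quoted in the commit.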
Dalton Hubble	256b87812e	Remove Terraform template provider dependency * Use Terraform builtin templatefile functionality * Remove dependency on deprecated Terraform template provider Rel: * https://registry.terraform.io/providers/hashicorp/template/2.2.0 * https://github.com/poseidon/terraform-render-bootstrap/pull/293	2022-08-02 18:15:03 -07:00
Dalton Hubble	c6794f1007	Update Calico from v3.23.1 to v3.23.3 * https://github.com/projectcalico/calico/releases/tag/v3.23.3	2022-07-30 18:15:33 -07:00
Dalton Hubble	7f445b0dba	Add release note about master to main branch rename * Update Terraform provider versions	2022-07-19 18:12:37 -07:00
Dalton Hubble	f42b45451b	Update Cilium from v1.11.6 to v1.11.7 * https://github.com/cilium/cilium/releases/tag/v1.11.7	2022-07-19 09:06:15 -07:00
Dalton Hubble	767a653baa	Update Prometheus, Grafana, and ingress-nginx addons * Update ingress-nginx RBAC Role to include coordination.k8s.io leases permissions that are required with ingress-nginx v1.3.0	2022-07-15 20:19:12 -07:00
Dalton Hubble	42bf82b325	Update Prometheus and Grafana addons * Bump recommended Terraform provider versions	2022-07-02 11:28:34 -07:00
Dalton Hubble	07df0c2552	Add warning about Terraform AWS provider version * Sync Terraform provider versions with those used internally	2022-06-23 21:31:20 -07:00
Dalton Hubble	8398182956	Update Cilium and Calico CNI providers * Update Cilium from v1.11.5 to v1.11.6 * Update Calico from v3.22.2 to v3.23.1	2022-06-18 19:29:01 -07:00
Dalton Hubble	2a8915fee9	Update Prometheus, kube-state-metrics, and Grafana addons * Update monitoring addons	2022-06-18 18:32:17 -07:00
Dalton Hubble	31c7f0ba0e	Update nginx-ingress addon from v1.2.0 to v1.2.1 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.2.1	2022-05-31 16:37:57 +01:00
Dalton Hubble	b8549a1e32	Update Cilium from v1.11.4 to v1.11.5 * https://github.com/poseidon/terraform-render-bootstrap/pull/309	2022-05-31 15:23:07 +01:00
Dalton Hubble	8e8bf305c3	Update Prometheus and Grafana addons	2022-05-31 14:29:55 +01:00
Dalton Hubble	b0e0b132e4	Update Kubernetes from v1.23.6 to v1.24.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1240	2022-05-04 08:27:14 -07:00
Dalton Hubble	02f78fbd1a	Update Grafana from v8.4.5 to v8.5.1	2022-05-02 08:19:41 -07:00
Dalton Hubble	a122867748	Update nginx-ingress, Prometheus, and Grafana addons * Sync addons with versions used in Poseidon	2022-04-27 21:02:32 -07:00
Dalton Hubble	91b38bf3fd	Update etcd from v3.5.2 to v3.5.4 * https://github.com/etcd-io/etcd/releases/tag/v3.5.4	2022-04-27 20:57:02 -07:00
Dalton Hubble	d7f55c4e46	Remove use of deprecated `key_algorithm` field in TLS assets * Fixes warning about use of deprecated field `key_algorithm` in the `hashicorp/tls` provider. The key algorithm can now be inferred directly from the private key so resources don't have to output and pass around the algorithm	2022-04-20 19:52:03 -07:00
Dalton Hubble	80c6e2e7e6	Update Kubernetes from v1.23.5 to v1.23.6 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1236	2022-04-20 19:39:05 -07:00
Dalton Hubble	fddd8ac69d	Fix Flatcar Linux nodes on Google Cloud not ignoring image changes * Add `boot_disk[0].initialize_params` to the ignored fields for the controller nodes * Nodes will auto-update, Terraform should not attempt to delete and recreate nodes (especially controllers!). Lack of this ignore causes Terraform to propose deleting controller nodes when Flatcar Linux releases new images * Matches the configuration on Typhoon Fedora CoreOS (which does not have the issue)	2022-04-20 18:53:00 -07:00
Dalton Hubble	2f7d2a92e0	Update Cilium and Calico CNI providers * Update Cilium from v1.11.3 to v1.11.4 * Update Calico from v3.22.1 to v3.22.2	2022-04-19 08:28:52 -07:00
Dalton Hubble	d91408258b	Update nginx-ingress, Prometheus, and Grafana addons	2022-04-04 08:53:29 -07:00
Dalton Hubble	2df1873b7f	Update Cilium from v1.11.2 to v1.11.3 * https://github.com/cilium/cilium/releases/tag/v1.11.3	2022-04-01 16:44:30 -07:00
Dalton Hubble	93ebfc7dd0	Allow upgrading Azure Terraform Provider to v3.x * Change subnet references to source and destinations prefixes (plural) * Remove references to a resource group in some load balancing components, which no longer require it (inferred) * Rename `worker_address_prefix` output to `worker_address_prefixes`	2022-04-01 16:36:53 -07:00
Dalton Hubble	b47edca6be	Refresh Prometheus rules and Grafana dashboards * Update Prometheus rules and Grafana dashboards * Add new networking dashboards	2022-03-19 17:08:00 -07:00
Dalton Hubble	e61d4b92da	Update Kubernetes from v1.23.4 to v1.23.5 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1235	2022-03-16 21:01:41 -07:00
Dalton Hubble	dca745fa4a	Update monitoring addon components * Update Prometheus, kube-state-metrics, and Grafana	2022-03-11 11:50:16 -08:00
Dalton Hubble	661347fa71	Update nginx-ingress from v1.1.1 to v1.1.2 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.1.2	2022-03-11 11:42:33 -08:00
Dalton Hubble	69770b4827	Update Calico from v3.21.2 to v3.22.1 * https://github.com/projectcalico/calico/releases/tag/v3.22.1 * Fix https://github.com/projectcalico/calico/issues/5011	2022-03-11 11:22:29 -08:00
Dalton Hubble	f797f97675	Update Cilium from v1.11.1 to v1.11.2 * https://github.com/cilium/cilium/releases/tag/v1.11.2	2022-03-11 10:08:24 -08:00
Dalton Hubble	6cf40722de	Revert kube-state-metrics upgrade * kube-state-metrics:v2.4.0 isn't published, skip it	2022-02-21 19:57:47 -08:00
Dalton Hubble	c230cdec46	Update Grafana and kube-state-metrics addons	2022-02-21 19:36:16 -08:00
Dalton Hubble	9aa99f1996	Allow upgrading AWS Terraform provider to v4.x * https://github.com/hashicorp/terraform-provider-aws/releases/tag/v4.0.0	2022-02-17 09:35:15 -08:00
Dalton Hubble	28a42238c4	Update nginx-ingress, Prometheus, and Grafana addons * Align `nginx-ingress` `--controller-class` with `IngressClass` to provide a better example (e.g. if extended to multiple ingress controllers)	2022-02-17 08:58:29 -08:00
Dalton Hubble	6c70d06937	Update etcd from v3.5.1 to v3.5.2 * https://github.com/etcd-io/etcd/releases/tag/v3.5.2	2022-02-07 08:10:17 -08:00
Dalton Hubble	cf4beeba34	Change default CNI provider from Calico to Cilium * Cilium (v1.8) was added to Typhoon in v1.18.5 in June 2020 and it's become more impressive since then. It's currently the leading CNI provider choice. * Calico has grown complex, has lots of CRDs, masks its management complexity with an operator (which we won't use), doesn't provide multi-arch images, and hasn't been compatible with Kubernetes v1.23 (with ipvs) for several releases. * Both have CNCF conformance quirks (flannel used for conformance), but that's not the main factor in choosing the default	2022-02-07 08:07:00 -08:00
Dalton Hubble	e06ee042ee	Switch to using Flatcar Linux images on Google Cloud * Use the official Kinvolk Flatcar Linux image on Google Cloud * Change `os_image` from a custom image name to `flatcar-stable` (default), `flatcar-beta`, or `flatcar-alpha` (action required) * Change `os_image` from a required to an optional variable * Promote Typhoon on Flatcar Linux / Google Cloud to stable * Remove docs about needing to upload a Flatcar Linux image manually on Google Cloud and drop support for custom images	2022-01-28 21:04:10 -08:00
Dalton Hubble	a527f73f5a	Update Kubernetes from v1.23.2 to v1.23.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1233	2022-01-27 09:23:37 -08:00
Dalton Hubble	3da8c1575c	Update nginx-ingress and Grafana addons	2022-01-19 21:09:21 -08:00
Dalton Hubble	dedd17d085	Upgrade to DigitalOcean Terraform provider v2.x * Remove deprecated `private_networking` parameter	2022-01-19 18:32:17 -08:00
Dalton Hubble	e274a451ff	Update Kubernetes from v1.23.1 to v1.23.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1232	2022-01-19 17:59:49 -08:00
Dalton Hubble	2265ab5375	Remove Kubelet `--network-plugin=cni` flag * Now that `docker-shim` is no longer used, the Kubelet flag is no longer needed and will be removed in v1.24	2022-01-14 10:43:07 -08:00
Dalton Hubble	08ea9776f3	Mask docker.service to prevent socket activation * Kubelet now uses `containerd` as the container runtime, but `docker.service` still starts when `docker.sock` is probed because the service is socket activated. Prevent this by masking the `docker.service` unit	2022-01-14 10:31:47 -08:00
Dalton Hubble	2e8bc99164	Remove `template` provider usage from terraform-render-bootstrap	2022-01-14 10:27:24 -08:00
Dalton Hubble	beb9f1477a	Add experimental Flatcar Linux arm64 support on AWS * Add `arch` variable to Flatcar Linux AWS `kubernetes` and `workers` modules. Accept `amd64` (default) or `arm64` to support native arm64/aarch64 clusters or mixed/hybrid clusters with arm64 workers * Requires `flannel` or `cilium` CNI Similar to https://github.com/poseidon/typhoon/pull/875	2022-01-14 10:24:48 -08:00
Dalton Hubble	f544a9c71f	Switch Fedora CoreOS from docker-shim to containerd * Migrate from `docker-shim` to `containerd` in preparation for Kubernetes v1.24.0 dropping `docker-shim` support * Much consideration was given to the container runtime choice. https://github.com/poseidon/typhoon/issues/899 provides relevant rationales	2022-01-13 09:17:29 -08:00
Dalton Hubble	50215e373b	Add Prometheus config for monitoring Kubernetes Ingress * Allow Kubernetes Ingress resources to be probed via Blackbox Exporter (if present) if annotated `prometheus.io/probe: "true"` * Fix probes of Services via Blackbox Exporter. Require Blackbox Exporter to be deployed in the same `monitoring` namespace, be named `blackbox-exporter`, and use port 8080	2021-12-29 11:57:50 -08:00
Dalton Hubble	a9f9c59b91	Configure Prometheus to allow a custom scrape query param * Set `prometheus.io/param` on a Kubernetes Service to scrape the service endpoints and pass a custom query parameter * For example, scrape Consul with `?format=prometheus` ```yaml kind: Service metadata: annotations: prometheus.io/scrape: 'true' prometheus.io/port: '8500' prometheus.io/path: /v1/agent/metrics prometheus.io/param: format=prometheus ```	2021-12-29 11:47:10 -08:00
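To make the effect of the annotations in the commit above concrete, the sketch below builds the URL that would be scraped for one endpoint. It is an illustration only: Prometheus does this through its own relabeling configuration, and the `scrape_url` helper, the default port, and the default path are assumptions of this example.

```python
from urllib.parse import parse_qsl, urlencode


def scrape_url(host: str, annotations: dict):
    """Return the scrape URL for an annotated Service endpoint, or None."""
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    port = annotations.get("prometheus.io/port", "80")        # assumed default
    path = annotations.get("prometheus.io/path", "/metrics")  # assumed default
    # The param annotation holds "key=value"; re-encode it as a query string.
    query = urlencode(parse_qsl(annotations.get("prometheus.io/param", "")))
    url = f"http://{host}:{port}{path}"
    return f"{url}?{query}" if query else url


consul_annotations = {
    "prometheus.io/scrape": "true",
    "prometheus.io/port": "8500",
    "prometheus.io/path": "/v1/agent/metrics",
    "prometheus.io/param": "format=prometheus",
}
consul_url = scrape_url("10.2.0.5", consul_annotations)
```

The host `10.2.0.5` is a placeholder endpoint address; the annotation values are the Consul example from the commit message.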
Dalton Hubble	6ed048eb65	Workaround Terraform v1.1 file provisioner regression * Terraform v1.1 changed the behavior of provisioners and `remote-exec` in a way that breaks support for expansions in commands (including file provisioner, where `destination` is part of an `scp` command) * Terraform will likely revert the change eventually, but I suspect it will take a while * Instead, we can stop relying on Terraform's expansion behavior. `/home/core` is a suitable choice for `$HOME` on both Flatcar Linux and Fedora CoreOS (hardlink `/var/home/core`) Rel: https://github.com/hashicorp/terraform/issues/30243	2021-12-28 13:25:23 -08:00
Dalton Hubble	9e3807798f	Update Kubernetes from v1.23.0 to v1.23.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1231	2021-12-20 08:36:19 -08:00
Dalton Hubble	ef9c6aa423	Switch Flatcar Linux to using containerd CRI * Use containerd as the Kubernetes Container Runtime	2021-12-15 08:42:13 -08:00
Dalton Hubble	bb5e5811ec	Update Prometheus and Grafana addons	2021-12-15 08:16:46 -08:00
Dalton Hubble	16aa997604	Fix Azure `backend_address_pool_id` deprecation warning * Change to `backend_address_pool_ids` list	2021-12-14 10:26:08 -08:00
Dalton Hubble	43c6558aaf	Update nginx-ingress and monitoring addons	2021-12-10 11:29:49 -08:00
Dalton Hubble	125008fbb3	Update Cilium from v1.10.5 to v1.11.0 * https://github.com/cilium/cilium/releases/tag/v1.11.0	2021-12-10 11:26:05 -08:00
Dalton Hubble	136107b448	Set Kubelet resolver config to /run/systemd/resolve/resolv.conf * Both Flatcar Linux and Fedora CoreOS use systemd-resolved, but they setup /etc/resolv.conf symlinks differently * Prefer using /run/systemd/resolve/resolv.conf directly, which also updates to reflect runtime changes (e.g. resolvectl)	2021-12-10 08:22:30 -08:00
Dalton Hubble	e97c1cc9e5	Enable Kubernetes aggregation by default * Change `enable_aggregation` default from false to true * These days, Kubernetes control plane components emit annoying messages related to assumptions baked into the Kubernetes API Aggregation Layer if you don't enable it. Further the conformance tests force you to remember to enable it if you care about passing those * This change is motivated by eliminating annoyances, rather than any enthusiasm for Kubernetes' aggregation features Rel: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/	2021-12-09 17:30:35 -08:00
Dalton Hubble	41f739891b	Normalize CA certs mounts in static Pods and kube-proxy * Mount both /etc/ssl/certs and /etc/pki into control plane static pods and kube-proxy, rather than choosing one based on a variable (set based on Flatcar Linux or Fedora CoreOS) * Remove deprecated `--port` from `kube-scheduler` static Pod	2021-12-09 09:56:37 -08:00
Dalton Hubble	861021ee98	Update Kubernetes from v1.22.4 to v1.23.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1230 * With Calico, add missing caliconodestatuses CRD added in v3.21.0 https://github.com/poseidon/terraform-render-bootstrap/pull/289	2021-12-09 09:28:41 -08:00
Dalton Hubble	c1d28e6f61	Change default disk_iops on Flatcar Linux * Same as #1073, but for Flatcar Linux on AWS as well	2021-12-07 16:52:55 -08:00
Dalton Hubble	9c626c9dbd	Change default `disk_iops` from unset to 3000 * Since v1.21.3 switched controllers default disk type from `gp2` to `gp3`, an iops diff has been shown (harmless, but annoying) * Controller nodes default to a 30GB `gp3` disk. `gp3` disks do respect `iops` and the corresponding default is 3000	2021-12-07 15:44:09 -08:00
Dalton Hubble	5d7b6f611e	Update nginx-ingress and Prometheus exporter addons	2021-11-21 09:28:17 -08:00
Dalton Hubble	93594292eb	Update Kubernetes from v1.22.3 to v1.22.4 * Update flannel from v0.15.0 to v0.15.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1224	2021-11-17 19:53:32 -08:00
Dalton Hubble	94b2793e40	Update CoreDNS from v1.8.4 to v1.8.6 * https://coredns.io/2021/10/07/coredns-1.8.6-release/	2021-11-12 21:09:04 -08:00
Dalton Hubble	4fd43b39ad	Fix Flatcar Linux docker driver and add cgroups v2 * Remove `/sys/fs/cgroup/systemd` mount since Flatcar Linux uses cgroups v2 * Flatcar Linux's `docker` switched from the `cgroupfs` to `systemd` driver without notice	2021-11-12 21:07:20 -08:00
Dalton Hubble	65083aca7d	Update Calico and Flannel CNI providers * Update Calico from v3.20.2 to v3.21.0 * Update Flannel from v0.14.0 to v0.15.0	2021-11-12 11:03:39 -08:00
Dalton Hubble	07db4c1143	Allow use of google Terraform provider v4.0+ * https://github.com/hashicorp/terraform-provider-google/releases/tag/v4.0.0	2021-11-11 10:17:58 -08:00
Dalton Hubble	b934a13605	Update Prometheus and Grafana addons	2021-11-07 17:00:40 -08:00
Dalton Hubble	cd005a0b27	Prepare for v1.22.3 release	2021-10-28 11:58:55 -07:00
Dalton Hubble	dd4a5a4e7e	Update Kubernetes from v1.22.2 to v1.22.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1223	2021-10-28 10:11:06 -07:00
Dalton Hubble	af835f976f	Update flannel from v0.13.0 to v0.14.0 * https://github.com/flannel-io/flannel/releases/tag/v0.14.0	2021-10-28 10:09:06 -07:00
Dalton Hubble	17dce49982	Update etcd from v3.5.0 to v3.5.1 * https://github.com/etcd-io/etcd/releases/tag/v3.5.1	2021-10-17 11:28:27 -07:00
Dalton Hubble	5744e10329	Update Cilium from v1.10.4 to v1.10.5 * https://github.com/cilium/cilium/releases/tag/v1.10.5	2021-10-17 11:26:59 -07:00
Dalton Hubble	20748536df	Update nginx-ingress from v1.0.2 to v1.0.4 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.4	2021-10-17 11:17:43 -07:00
Dalton Hubble	f2e6256dd9	Update Prometheus, kube-state-metrics, and Grafana * Update monitoring addons	2021-10-17 11:15:39 -07:00
Dalton Hubble	f8162b9be3	Update Calico from v3.20.1 to v3.20.2 * Use Calico's iptables legacy vs nft auto-detection	2021-10-11 20:28:48 -07:00
Dalton Hubble	15117fb95b	Update Prometheus and nginx-ingress	2021-10-05 19:15:58 -07:00
Dalton Hubble	cb72b261c7	Update Terraform provider poseidon/matchbox to v0.5+ * Relax version constraint to allow future minor version releases to be used without a corresponding Typhoon change	2021-09-29 23:41:44 -07:00
Dalton Hubble	209efd2f5b	Update Prometheus, Grafana, and kube-state-metrics	2021-09-29 23:39:10 -07:00
Dalton Hubble	5a1e455220	Update nginx-ingress from v1.0.0 to v1.0.1	2021-09-24 09:38:18 -07:00
Dalton Hubble	69f37c8b17	Update Prometheus from v2.29.2 to v2.30.0	2021-09-24 09:34:00 -07:00
Dalton Hubble	b30de949b8	Update Calico and Cilium CNI * Update Calico from v3.20.0 to v3.20.1 * Update Cilium from v1.10.3 to v1.10.4	2021-09-22 22:18:16 -07:00
Dalton Hubble	bb7f31822e	Update Kubernetes from v1.22.1 to v1.22.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1222	2021-09-15 19:56:24 -07:00
Dalton Hubble	eb29fb639b	Update nginx-ingress, Prometheus, and Grafana addons	2021-08-24 22:14:57 -07:00
Dalton Hubble	fcbdb50d93	Update Kubernetes from v1.22.0 to v1.22.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1221	2021-08-19 21:12:02 -07:00
Dalton Hubble	0d8ceae1d9	Add etcd v3.5.0 note to CHANGES	2021-08-11 09:24:43 -07:00
Dalton Hubble	c5cf803634	Update Grafana and kube-state-metrics addons	2021-08-10 22:17:16 -07:00
Dalton Hubble	cbef202eec	Update Prometheus discovery of kube components * Kubernetes v1.22.0 disabled kube-controller-manager insecure port, which was used internally for Prometheus metrics scraping * Configure Prometheus to discover and scrape endpoints for kube-scheduler and kube-controller-manager via the authenticated https ports, via bearer token * Change firewall ports to allow Prometheus (on worker nodes) to scrape kube-scheduler and kube-controller-manager targets that run on controller(s) with hostNetwork * Disable the insecure port on kube-scheduler	2021-08-10 21:25:19 -07:00
Dalton Hubble	0c99b909a9	Update nginx-ingress from v0.47.0 to v1.0.0-beta.1 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.0-beta.1	2021-08-07 12:46:00 -07:00
Dalton Hubble	739db3b35f	Update Grafana and node-exporter addons * https://github.com/grafana/grafana/releases/tag/v8.1.0 * https://github.com/prometheus/node_exporter/releases/tag/v1.2.1	2021-08-05 23:24:57 -07:00
Dalton Hubble	9bac641511	Update Kubernetes from v1.21.3 to v1.22.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1220	2021-08-04 22:09:19 -07:00
Dalton Hubble	f03045f0dc	Update Cilium for cgroups v2 support * On Fedora CoreOS, Cilium cross-node service IP load balancing stopped working for a time (first observable as CoreDNS pods located on worker nodes not being able to reach the kubernetes API service 10.3.0.1). This turned out to have two parts: * Fedora CoreOS switched to cgroups v2 by default. In our early testing with cgroups v2, Calico (default) was used. With the cgroups v2 change, SELinux policy denied some eBPF operations. Since fixed in all Fedora CoreOS channels * Cilium requires new mounts to support cgroups v2, which are added here * https://github.com/coreos/fedora-coreos-tracker/issues/292 * https://github.com/coreos/fedora-coreos-tracker/issues/881 * https://github.com/cilium/cilium/pull/16259	2021-07-24 10:36:47 -07:00
Dalton Hubble	b603bbde3d	Update Butane Config from v1.2.0 to v1.4.0 * Rename Fedora CoreOS Config (FCC) to Butane Config * Require any snippets customizations use version v1.4.0 * https://typhoon.psdn.io/advanced/customization/#hosts	2021-07-19 23:53:51 -07:00
Dalton Hubble	c734fa7b84	Update node-exporter from v1.1.2 to v1.2.0 * https://github.com/prometheus/node_exporter/releases/tag/v1.2.0	2021-07-18 15:26:44 -07:00
Dalton Hubble	fdade5b40c	Update poseidon/ct provider from v0.8.0 to v0.9.0 * Continue targeting Ignition v3.2.0 for some time	2021-07-18 09:05:02 -07:00
Dalton Hubble	171fd2c998	Update Kubernetes from v1.21.2 to v1.21.3 * https://github.com/kubernetes/kubernetes/releases/tag/v1.21.3	2021-07-17 18:22:24 -07:00
Dalton Hubble	545bd79624	Update Grafana from v8.0.4 to v8.0.6 * https://github.com/grafana/grafana/releases/tag/v8.0.6	2021-07-16 12:02:36 -07:00
Dalton Hubble	66e7354c8a	Change AWS default disk type from gp2 to gp3 * https://aws.amazon.com/about-aws/whats-new/2020/12/introducing-new-amazon-ebs-general-purpose-volumes-gp3/	2021-07-04 10:43:05 -07:00
Dalton Hubble	3a71b2ccb1	Update Cilium from v1.10.1 to v1.10.2 * https://github.com/cilium/cilium/releases/tag/v1.10.2	2021-07-04 10:11:21 -07:00
Dalton Hubble	c7e327417b	Update Prometheus and Grafana addons	2021-07-04 10:02:44 -07:00
Dalton Hubble	65ddd2419c	Add Known Issues with FCOS to CHANGES	2021-06-27 16:51:59 -07:00
Dalton Hubble	b0e9b1fa60	Update Prometheus and Grafana addons * https://github.com/prometheus/prometheus/releases/tag/v2.28.0 * https://github.com/grafana/grafana/releases/tag/v8.0.3	2021-06-27 14:46:43 -07:00
Dalton Hubble	485feb82c4	Update CoreDNS from v1.8.0 to v1.8.4 * https://coredns.io/2021/01/20/coredns-1.8.1-release/ * https://coredns.io/2021/02/23/coredns-1.8.2-release/ * https://coredns.io/2021/02/24/coredns-1.8.3-release/ * https://coredns.io/2021/05/28/coredns-1.8.4-release/	2021-06-23 23:31:25 -07:00
Dalton Hubble	0b276b6b7e	Update Kubernetes from v1.21.1 to v1.21.2 * https://github.com/kubernetes/kubernetes/releases/tag/v1.21.2	2021-06-17 16:15:20 -07:00
Dalton Hubble	e8513e58bb	Add support for Terraform v1.0.0 * https://github.com/hashicorp/terraform/releases/tag/v1.0.0	2021-06-17 13:32:56 -07:00
Dalton Hubble	30cfeec6c1	Update nginx-ingress from v0.46.0 to v0.47.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.47.0	2021-06-07 10:11:07 -07:00
Dalton Hubble	24e63bd134	Update Prometheus, Grafana, kube-state-metrics addons	2021-06-07 09:40:06 -07:00
Dalton Hubble	996bdd9112	Update Calico from v3.19.0 to v3.19.1 * https://docs.projectcalico.org/archive/v3.19/release-notes/	2021-06-02 14:51:15 -07:00
Dalton Hubble	9f0126a410	Fix typo in CHANGES.md	2021-05-25 21:16:53 -07:00
Dalton Hubble	966fd280b0	Update Cilium from v1.10.0-rc1 to v1.10.0 * https://github.com/cilium/cilium/releases/tag/v1.10.0	2021-05-24 11:16:51 -07:00
Dalton Hubble	e4e074c894	Update Cilium from v1.9.6 to v1.10.0-rc1 * Add multi-arch container images and arm64 support * https://github.com/cilium/cilium/releases/tag/v1.10.0-rc1	2021-05-14 14:24:52 -07:00
Dalton Hubble	d51da49925	Update docs for Kubernetes v1.21.1 and Terraform v0.15.x	2021-05-13 11:34:01 -07:00
Dalton Hubble	2076a779a3	Update Kubernetes from v1.21.0 to v1.21.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1211	2021-05-13 11:23:26 -07:00
Dalton Hubble	048094b256	Update etcd from v3.4.15 to v3.4.16 * https://github.com/etcd-io/etcd/blob/main/CHANGELOG-3.4.md	2021-05-13 10:53:04 -07:00
Dalton Hubble	75b063c586	Update Prometheus from v2.25.2 to v2.27.0 * Update Grafana from v7.5.4 to v7.5.6 * https://github.com/prometheus/prometheus/releases/tag/v2.27.0 * https://github.com/grafana/grafana/releases/tag/v7.5.6	2021-05-12 11:47:07 -07:00
Dalton Hubble	bc96443710	Update nginx-ingress from v0.45.0 to v0.46.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.46.0	2021-05-05 12:06:20 -07:00
Dalton Hubble	5f87eb3ec9	Update Fedora CoreOS Kubelet for cgroups v2 * Fedora CoreOS is beginning to switch from cgroups v1 to cgroups v2 by default, which changes the sysfs hierarchy * This will be needed when using a Fedora CoreOS image that enables cgroups v2 (`next` stream as of this writing) Rel: https://github.com/coreos/fedora-coreos-tracker/issues/292	2021-04-26 11:48:58 -07:00
Dalton Hubble	b152b9f973	Reduce the default disk_size from 40GB to 30GB * We're typically reducing the `disk_size` in real clusters since the space is underused. The default should be lower.	2021-04-26 11:43:26 -07:00
Dalton Hubble	9c842395a8	Update Cilium from v1.9.5 to v1.9.6 * https://github.com/cilium/cilium/releases/tag/v1.9.6	2021-04-26 10:55:23 -07:00
Dalton Hubble	e535ddd15a	Update Grafana from v7.5.3 to v7.5.4 * https://github.com/grafana/grafana/releases/tag/v7.5.4	2021-04-17 11:38:14 -07:00
Dalton Hubble	5752a8f041	Update kube-state-metrics from v2.0.0-rc.1 to v2.0.0 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0	2021-04-17 11:34:52 -07:00
Dalton Hubble	c11e23fc50	Fix minor docs issues and missing changelog links	2021-04-13 09:35:11 -07:00
Dalton Hubble	2eb1ac1b4d	Update nginx-ingress from v0.44.0 to v0.45.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.45.0	2021-04-12 00:18:47 -07:00
Dalton Hubble	cb2721ef7d	Update Grafana from v7.5.2 to v7.5.3 * https://github.com/grafana/grafana/releases/tag/v7.5.3	2021-04-12 00:17:22 -07:00
Dalton Hubble	fc06d28e13	Remove deprecated field on azurerm_lb_backend_address_pool * Remove the deprecated `resource_group_name` field from Azure `azurerm_lb_backend_address_pool` resources	2021-04-11 23:59:17 -07:00
Dalton Hubble	ebd9570ede	Update Fedora CoreOS Config version from v1.1.0 to v1.2.0 * Require [poseidon/ct](https://github.com/poseidon/terraform-provider-ct) Terraform provider v0.8+ * Require any [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customizations to update to v1.2.0 See upgrade [notes](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct)	2021-04-11 15:26:54 -07:00
Dalton Hubble	34e8db7aae	Update static Pod manifests for Kubernetes v1.21.0 * https://github.com/poseidon/terraform-render-bootstrap/pull/257	2021-04-11 15:05:46 -07:00
Dalton Hubble	084e8bea49	Allow custom initial node taints on worker pool nodes * Add `node_taints` variable to worker modules to set custom initial node taints on cloud platforms that support auto-scaling worker pools of heterogeneous nodes (i.e. AWS, Azure, GCP) * Worker pools could use custom `node_labels` to allow workloads to select among differentiated nodes, while custom `node_taints` allows a worker pool's nodes to be tainted as special to prevent scheduling, except by workloads that explicitly tolerate the taint * Expose `daemonset_tolerations` in AWS, Azure, and GCP kubernetes cluster modules, to determine whether `kube-system` components should tolerate the custom taint (advanced use covered in docs) Rel: #550, #663 Closes #429	2021-04-11 15:00:11 -07:00

969 Commits