typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2024-12-27 08:49:33 +01:00

Author	SHA1	Message	Date
Dalton Hubble	9bac641511	Update Kubernetes from v1.21.3 to v1.22.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1220	2021-08-04 22:09:19 -07:00
Dalton Hubble	f03045f0dc	Update Cilium for cgroups v2 support * On Fedora CoreOS, Cilium cross-node service IP load balancing stopped working for a time (first observable as CoreDNS pods located on worker nodes not being able to reach the kubernetes API service 10.3.0.1). This turned out to have two parts: * Fedora CoreOS switched to cgroups v2 by default. In our early testing with cgroups v2, Calico (default) was used. With the cgroups v2 change, SELinux policy denied some eBPF operations. This has since been fixed in all Fedora CoreOS channels * Cilium requires new mounts to support cgroups v2, which are added here * https://github.com/coreos/fedora-coreos-tracker/issues/292 * https://github.com/coreos/fedora-coreos-tracker/issues/881 * https://github.com/cilium/cilium/pull/16259	2021-07-24 10:36:47 -07:00
Dalton Hubble	b603bbde3d	Update Butane Config from v1.2.0 to v1.4.0 * Rename Fedora CoreOS Config (FCC) to Butane Config * Require any snippets customizations use version v1.4.0 * https://typhoon.psdn.io/advanced/customization/#hosts	2021-07-19 23:53:51 -07:00
Dalton Hubble	c734fa7b84	Update node-exporter from v1.1.2 to v1.2.0 * https://github.com/prometheus/node_exporter/releases/tag/v1.2.0	2021-07-18 15:26:44 -07:00
Dalton Hubble	fdade5b40c	Update poseidon/ct provider from v0.8.0 to v0.9.0 * Continue targeting Ignition v3.2.0 for some time	2021-07-18 09:05:02 -07:00
Dalton Hubble	171fd2c998	Update Kubernetes from v1.21.2 to v1.21.3 * https://github.com/kubernetes/kubernetes/releases/tag/v1.21.3	2021-07-17 18:22:24 -07:00
Dalton Hubble	545bd79624	Update Grafana from v8.0.4 to v8.0.6 * https://github.com/grafana/grafana/releases/tag/v8.0.6	2021-07-16 12:02:36 -07:00
Dalton Hubble	66e7354c8a	Change AWS default disk type from gp2 to gp3 * https://aws.amazon.com/about-aws/whats-new/2020/12/introducing-new-amazon-ebs-general-purpose-volumes-gp3/	2021-07-04 10:43:05 -07:00
Dalton Hubble	3a71b2ccb1	Update Cilium from v1.10.1 to v1.10.2 * https://github.com/cilium/cilium/releases/tag/v1.10.2	2021-07-04 10:11:21 -07:00
Dalton Hubble	c7e327417b	Update Prometheus and Grafana addons	2021-07-04 10:02:44 -07:00
Dalton Hubble	65ddd2419c	Add Known Issues with FCOS to CHANGES	2021-06-27 16:51:59 -07:00
Dalton Hubble	b0e9b1fa60	Update Prometheus and Grafana addons * https://github.com/prometheus/prometheus/releases/tag/v2.28.0 * https://github.com/grafana/grafana/releases/tag/v8.0.3	2021-06-27 14:46:43 -07:00
Dalton Hubble	485feb82c4	Update CoreDNS from v1.8.0 to v1.8.4 * https://coredns.io/2021/01/20/coredns-1.8.1-release/ * https://coredns.io/2021/02/23/coredns-1.8.2-release/ * https://coredns.io/2021/02/24/coredns-1.8.3-release/ * https://coredns.io/2021/05/28/coredns-1.8.4-release/	2021-06-23 23:31:25 -07:00
Dalton Hubble	0b276b6b7e	Update Kubernetes from v1.21.1 to v1.21.2 * https://github.com/kubernetes/kubernetes/releases/tag/v1.21.2	2021-06-17 16:15:20 -07:00
Dalton Hubble	e8513e58bb	Add support for Terraform v1.0.0 * https://github.com/hashicorp/terraform/releases/tag/v1.0.0	2021-06-17 13:32:56 -07:00
Dalton Hubble	30cfeec6c1	Update nginx-ingress from v0.46.0 to v0.47.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.47.0	2021-06-07 10:11:07 -07:00
Dalton Hubble	24e63bd134	Update Prometheus, Grafana, kube-state-metrics addons	2021-06-07 09:40:06 -07:00
Dalton Hubble	996bdd9112	Update Calico from v3.19.0 to v3.19.1 * https://docs.projectcalico.org/archive/v3.19/release-notes/	2021-06-02 14:51:15 -07:00
Dalton Hubble	9f0126a410	Fix typo in CHANGES.md	2021-05-25 21:16:53 -07:00
Dalton Hubble	966fd280b0	Update Cilium from v1.10.0-rc1 to v1.10.0 * https://github.com/cilium/cilium/releases/tag/v1.10.0	2021-05-24 11:16:51 -07:00
Dalton Hubble	e4e074c894	Update Cilium from v1.9.6 to v1.10.0-rc1 * Add multi-arch container images and arm64 support * https://github.com/cilium/cilium/releases/tag/v1.10.0-rc1	2021-05-14 14:24:52 -07:00
Dalton Hubble	d51da49925	Update docs for Kubernetes v1.21.1 and Terraform v0.15.x	2021-05-13 11:34:01 -07:00
Dalton Hubble	2076a779a3	Update Kubernetes from v1.21.0 to v1.21.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1211	2021-05-13 11:23:26 -07:00
Dalton Hubble	048094b256	Update etcd from v3.4.15 to v3.4.16 * https://github.com/etcd-io/etcd/blob/main/CHANGELOG-3.4.md	2021-05-13 10:53:04 -07:00
Dalton Hubble	75b063c586	Update Prometheus from v2.25.2 to v2.27.0 * Update Grafana from v7.5.4 to v7.5.6 * https://github.com/prometheus/prometheus/releases/tag/v2.27.0 * https://github.com/grafana/grafana/releases/tag/v7.5.6	2021-05-12 11:47:07 -07:00
Dalton Hubble	bc96443710	Update nginx-ingress from v0.45.0 to v0.46.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.46.0	2021-05-05 12:06:20 -07:00
Dalton Hubble	5f87eb3ec9	Update Fedora CoreOS Kubelet for cgroups v2 * Fedora CoreOS is beginning to switch from cgroups v1 to cgroups v2 by default, which changes the sysfs hierarchy * This will be needed when using a Fedora CoreOS image that enables cgroups v2 (`next` stream as of this writing) Rel: https://github.com/coreos/fedora-coreos-tracker/issues/292	2021-04-26 11:48:58 -07:00
Dalton Hubble	b152b9f973	Reduce the default disk_size from 40GB to 30GB * We're typically reducing the `disk_size` in real clusters since the space is underused. The default should be lower.	2021-04-26 11:43:26 -07:00
Dalton Hubble	9c842395a8	Update Cilium from v1.9.5 to v1.9.6 * https://github.com/cilium/cilium/releases/tag/v1.9.6	2021-04-26 10:55:23 -07:00
Dalton Hubble	e535ddd15a	Update Grafana from v7.5.3 to v7.5.4 * https://github.com/grafana/grafana/releases/tag/v7.5.4	2021-04-17 11:38:14 -07:00
Dalton Hubble	5752a8f041	Update kube-state-metrics from v2.0.0-rc.1 to v2.0.0 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0	2021-04-17 11:34:52 -07:00
Dalton Hubble	c11e23fc50	Fix minor docs issues and missing changelog links	2021-04-13 09:35:11 -07:00
Dalton Hubble	2eb1ac1b4d	Update nginx-ingress from v0.44.0 to v0.45.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.45.0	2021-04-12 00:18:47 -07:00
Dalton Hubble	cb2721ef7d	Update Grafana from v7.5.2 to v7.5.3 * https://github.com/grafana/grafana/releases/tag/v7.5.3	2021-04-12 00:17:22 -07:00
Dalton Hubble	fc06d28e13	Remove deprecated field on azurerm_lb_backend_address_pool * Remove the deprecated `resource_group_name` field from Azure `azurerm_lb_backend_address_pool` resources	2021-04-11 23:59:17 -07:00
Dalton Hubble	ebd9570ede	Update Fedora CoreOS Config version from v1.1.0 to v1.2.0 * Require [poseidon/ct](https://github.com/poseidon/terraform-provider-ct) Terraform provider v0.8+ * Require any [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customizations to update to v1.2.0 See upgrade [notes](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct)	2021-04-11 15:26:54 -07:00
Dalton Hubble	34e8db7aae	Update static Pod manifests for Kubernetes v1.21.0 * https://github.com/poseidon/terraform-render-bootstrap/pull/257	2021-04-11 15:05:46 -07:00
Dalton Hubble	084e8bea49	Allow custom initial node taints on worker pool nodes * Add `node_taints` variable to worker modules to set custom initial node taints on cloud platforms that support auto-scaling worker pools of heterogeneous nodes (i.e. AWS, Azure, GCP) * Worker pools could use custom `node_labels` to allow workloads to select among differentiated nodes, while custom `node_taints` allows a worker pool's nodes to be tainted as special to prevent scheduling, except by workloads that explicitly tolerate the taint * Expose `daemonset_tolerations` in AWS, Azure, and GCP kubernetes cluster modules, to determine whether `kube-system` components should tolerate the custom taint (advanced use covered in docs) Rel: #550, #663 Closes #429	2021-04-11 15:00:11 -07:00
Dalton Hubble	d73621c838	Update Kubernetes from v1.20.5 to v1.21.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1210	2021-04-08 21:44:31 -07:00
Dalton Hubble	1a6481df04	Update Grafana from v7.5.1 to v7.5.2 * https://github.com/grafana/grafana/releases/tag/v7.5.2	2021-04-04 18:20:02 -07:00
Dalton Hubble	7372d33af8	Update kube-state-metrics and Grafana * https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.1 * https://github.com/grafana/grafana/releases/tag/v7.5.1	2021-03-28 10:53:52 -07:00
Dalton Hubble	451ec771a8	Update Terraform providers and CHANGES for release	2021-03-23 08:45:57 -07:00
Dalton Hubble	597ca4acce	Update CoreDNS from v1.7.0 to v1.8.0 * https://github.com/poseidon/terraform-render-bootstrap/pull/254	2021-03-20 16:47:25 -07:00
Dalton Hubble	048f1f514e	Update Grafana from v7.4.3 to v7.4.5 * https://github.com/grafana/grafana/releases/tag/v7.4.5	2021-03-19 11:51:52 -07:00
Dalton Hubble	b825cd9afe	Update Prometheus from v2.25.1 to v2.25.2 * https://github.com/prometheus/prometheus/releases/tag/v2.25.2	2021-03-19 11:49:38 -07:00
Dalton Hubble	796149d122	Update Kubernetes from v1.20.4 to v1.20.5 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1205	2021-03-19 11:27:31 -07:00
Dalton Hubble	a66bccd590	Update Cilium from v1.9.4 to v1.9.5 * https://github.com/cilium/cilium/releases/tag/v1.9.5	2021-03-14 11:48:22 -07:00
Dalton Hubble	30b1edfcc6	Mark bootstrap token as sensitive in plan/apply * Mark the bootstrap token as sensitive, which is useful when Terraform is run in automated CI/CD systems to avoid showing the token * https://github.com/poseidon/terraform-render-bootstrap/pull/251	2021-03-14 11:32:35 -07:00
Dalton Hubble	a4afe06b64	Update Calico from v3.17.3 to v3.18.1 * https://docs.projectcalico.org/archive/v3.18/release-notes/	2021-03-14 10:35:24 -07:00
Dalton Hubble	4d58be0816	Update Prometheus from v2.25.0 to v2.25.1 * https://github.com/prometheus/prometheus/releases/tag/v2.25.1	2021-03-14 09:43:15 -07:00
Dalton Hubble	5bc1cd28c3	Switch kube-state-metrics image from quay to k8s.gcr.io * kube-state-metrics is continuing to publish container images to `k8s.gcr.io` instead of `quay.io` Rel: https://github.com/kubernetes/kube-state-metrics/issues/1409	2021-03-11 10:56:18 -08:00
Dalton Hubble	13fbac6c79	Update Grafana from v7.4.2 to v7.4.3 * https://github.com/grafana/grafana/releases/tag/v7.4.3	2021-03-05 17:19:54 -08:00
Dalton Hubble	a8fa4a9a06	Update node-exporter and kube-state-metrics * https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.0 * https://github.com/prometheus/node_exporter/releases/tag/v1.1.2	2021-03-05 17:13:45 -08:00
Dalton Hubble	a5c1a96df1	Update etcd from v3.4.14 to v3.4.15 * https://github.com/etcd-io/etcd/releases/tag/v3.4.15	2021-03-05 17:02:57 -08:00
Dalton Hubble	6a091e245e	Remove Flatcar Linux Edge `os_image` option * Flatcar Linux has not published an Edge channel image since April 2020 and recently removed mention of the channel from their documentation https://github.com/kinvolk/Flatcar/pull/345 * Users of Flatcar Linux Edge should move to the stable, beta, or alpha channel, barring any alternate advice from upstream Flatcar Linux	2021-02-20 16:09:54 -08:00
Dalton Hubble	ec389295fe	Update Grafana from v7.4.0 to v7.4.2 * https://github.com/grafana/grafana/releases/tag/v7.4.2	2021-02-19 00:18:39 -08:00
Dalton Hubble	3c807f3478	Update Prometheus from v2.24.1 to v2.25.0 * https://github.com/prometheus/prometheus/releases/tag/v2.25.0	2021-02-19 00:16:35 -08:00
Dalton Hubble	e76fe80b45	Update Kubernetes from v1.20.3 to v1.20.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1204	2021-02-19 00:02:07 -08:00
Dalton Hubble	32853aaa7b	Update Kubernetes from v1.20.2 to v1.20.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1203	2021-02-17 22:29:33 -08:00
Dalton Hubble	c32a54db40	Update node-exporter from v1.0.1 to v1.1.1 * https://github.com/prometheus/node_exporter/releases/tag/v1.1.1	2021-02-14 14:30:28 -08:00
Dalton Hubble	9671b1c734	Update flannel-cni from v0.4.1 to v0.4.2 * https://github.com/poseidon/flannel-cni/releases/tag/v0.4.2	2021-02-14 12:04:59 -08:00
Dalton Hubble	3b933e1ab3	Update Grafana from v7.3.7 to v7.4.0 * https://github.com/grafana/grafana/releases/tag/v7.4.0	2021-02-07 21:42:18 -08:00
Dalton Hubble	58d8f6f505	Update Prometheus from v2.24.0 to v2.24.1 * https://github.com/prometheus/prometheus/releases/tag/v2.24.1	2021-02-04 22:28:32 -08:00
Dalton Hubble	56853fe222	Update nginx-ingress from v0.43.0 to v0.44.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.44.0	2021-02-04 22:19:58 -08:00
Dalton Hubble	18165d8076	Update Calico from v3.17.1 to v3.17.2 * https://github.com/projectcalico/calico/releases/tag/v3.17.2	2021-02-04 22:03:51 -08:00
Dalton Hubble	50acf28ce5	Update Cilium from v1.9.3 to v1.9.4 * https://github.com/cilium/cilium/releases/tag/v1.9.4	2021-02-03 23:08:22 -08:00
Dalton Hubble	ab793eb842	Update Cilium from v1.9.2 to v1.9.3 * https://github.com/cilium/cilium/releases/tag/v1.9.3	2021-01-26 17:13:52 -08:00
Dalton Hubble	b74c958524	Update Cilium from v1.9.1 to v1.9.2 * https://github.com/cilium/cilium/releases/tag/v1.9.2	2021-01-20 22:06:45 -08:00
Dalton Hubble	11c434915f	Update Grafana from v7.3.6 to v7.3.7 * https://github.com/grafana/grafana/releases/tag/v7.3.7	2021-01-16 10:46:56 -08:00
Dalton Hubble	05f7df9e80	Update Kubernetes from v1.20.1 to v1.20.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1202	2021-01-13 17:46:51 -08:00
Dalton Hubble	4220b9ce18	Add support for Terraform v0.14.4+ * Support Terraform v0.13.x and v0.14.4+	2021-01-12 21:43:12 -08:00
Dalton Hubble	6a6af4aa16	Update Prometheus from v2.24.0-rc.0 to v2.24.0 * https://github.com/prometheus/prometheus/releases/tag/v2.24.0	2021-01-12 20:49:18 -08:00
Dalton Hubble	3dcd10f3b8	Update Prometheus from v2.23.0 to v2.24.0-rc.0 * https://github.com/prometheus/prometheus/releases/tag/v2.24.0-rc.0	2021-01-01 13:49:28 -08:00
Dalton Hubble	22503993b9	Update nginx-ingress from v0.41.2 to v0.43.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.43.0 * https://github.com/kubernetes/ingress-nginx/issues/6696	2021-01-01 13:44:45 -08:00
Dalton Hubble	cf3aa8885b	Update Prometheus rules and Grafana dashboards * Update Grafana from v7.3.5 to v7.3.6	2020-12-19 14:56:42 -08:00
Dalton Hubble	ba61a137db	Add notice about upstream Fedora CoreOS changes * Highlight that short-term, use of Fedora CoreOS will require non-RSA SSH keys or a workaround snippet	2020-12-19 14:10:42 -08:00
Dalton Hubble	646bdd78e4	Update Kubernetes from v1.20.0 to v1.20.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1201	2020-12-19 12:56:28 -08:00
Dalton Hubble	c163fbbbcd	Update docs and README for release	2020-12-12 12:31:35 -08:00
Dalton Hubble	dc7be431e0	Remove iSCSI mounts from Kubelet * Remove Kubelet `/etc/iscsi` and `iscsiadm` host mounts that were added on bare-metal, since these no longer work on either Fedora CoreOS or Flatcar Linux with newer `iscsiadm` * These special mounts on bare-metal date back to #350 which added them to provide a way to use iSCSI in Kubernetes v1.10 * Today, storage should be handled by external CSI providers which handle different storage systems, which doesn't rely on Kubelet storage utils Close #907	2020-12-12 11:41:02 -08:00
Dalton Hubble	86e0f806b3	Revert "Add support for Terraform v0.14.x" This reverts commit `968febb050`.	2020-12-11 00:47:57 -08:00
Dalton Hubble	96172ad269	Update Grafana from v7.3.4 to v7.3.5 * https://github.com/grafana/grafana/releases/tag/v7.3.5	2020-12-11 00:24:43 -08:00
Dalton Hubble	ee9ce3d0ab	Update Calico from v3.17.0 to v3.17.1 * https://github.com/projectcalico/calico/releases/tag/v3.17.1	2020-12-10 22:48:38 -08:00
Dalton Hubble	a8b8a9b454	Update Kubernetes from v1.20.0-rc.0 to v1.20.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1200	2020-12-08 18:28:13 -08:00
Dalton Hubble	968febb050	Add support for Terraform v0.14.x * Support Terraform v0.13.x and v0.14.x	2020-12-07 00:22:38 -08:00
Dalton Hubble	bee455f83a	Update Cilium from v1.9.0 to v1.9.1 * https://github.com/cilium/cilium/releases/tag/v1.9.1	2020-12-04 14:14:18 -08:00
Dalton Hubble	3e89ea1b4a	Promote Fedora CoreOS bare-metal to stable * Fedora CoreOS is a good choice for use on bare-metal	2020-12-04 14:02:55 -08:00
Dalton Hubble	e77dd6ecd4	Update Kubernetes from v1.19.4 to v1.20.0-rc.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1200-rc0	2020-12-03 16:01:28 -08:00
Dalton Hubble	804dfea0f9	Add kubeconfigs for kube-scheduler and kube-controller-manager * Generate TLS client certificates for `kube-scheduler` and `kube-controller-manager` with `system:kube-scheduler` and `system:kube-controller-manager` CNs * Template separate kubeconfigs for kube-scheduler and kube-controller-manager (`scheduler.conf` and `controller-manager.conf`). Rename admin for clarity * Before v1.16.0, Typhoon scheduled a self-hosted control plane, which allowed the steady-state kube-scheduler and kube-controller-manager to use a scoped ServiceAccount. With a static pod control plane, separate CN TLS client certificates are the nearest equivalent. * https://kubernetes.io/docs/setup/best-practices/certificates/ * Remove unused Kubelet certificate, TLS bootstrap is used instead	2020-12-01 22:02:15 -08:00
Dalton Hubble	8ba23f364c	Add TokenReview and TokenRequestProjection flags * Add kube-apiserver flags for TokenReview and TokenRequestProjection (beta, defaults on) to allow using Service Account Token Volume Projection to create and mount service account tokens tied to a Pod's lifecycle Rel: * https://github.com/poseidon/terraform-render-bootstrap/pull/231 * https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection	2020-12-01 20:02:33 -08:00
Dalton Hubble	f6025666eb	Update etcd from v3.4.12 to v3.4.14 * https://github.com/etcd-io/etcd/releases/tag/v3.4.14	2020-11-29 20:04:25 -08:00
Dalton Hubble	85eb502f19	Update Prometheus from v2.23.0-rc.0 to v2.23.0 * https://github.com/prometheus/prometheus/releases/tag/v2.23.0	2020-11-29 19:59:27 -08:00
Dalton Hubble	fa3184fb9c	Relax terraform-provider-ct version constraint * Allow terraform-provider-ct versions v0.6+ (e.g. v0.7.1). Before, only v0.6.x point updates were allowed * Update terraform-provider-ct to v0.7.1 in docs * READ the docs before updating terraform-provider-ct, as changing worker user-data is handled differently by different cloud platforms	2020-11-29 19:51:26 -08:00
Dalton Hubble	22565e57e0	Update kube-state-metrics from v2.0.0-alpha.2 to v2.0.0-alpha.3 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha.3	2020-11-25 14:30:11 -08:00
Dalton Hubble	026e1f3648	Update Grafana from v7.3.3 to v7.3.4 * https://github.com/grafana/grafana/releases/tag/v7.3.4	2020-11-25 14:25:15 -08:00
Dalton Hubble	ae548ce213	Update Calico from v3.16.5 to v3.17.0 * Enable Calico MTU auto-detection * Remove [workaround](https://github.com/poseidon/typhoon/pull/724) to Calico cni-plugin [issue](https://github.com/projectcalico/cni-plugin/issues/874) Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/230	2020-11-25 14:22:58 -08:00
Dalton Hubble	e826b49648	Update Matchbox profile to use initramfs and rootfs images * Fedora CoreOS stable (after Oct 6) ships separate initramfs and rootfs images, used as initrds * Update profiles to match the Matchbox examples, which have already switched to the new profile and to remove the unused kernel args * Requires Fedora CoreOS version which ships rootfs images (e.g. stable 32.20200923.3.0 or later) Rel: * https://github.com/coreos/fedora-coreos-tracker/issues/390#issuecomment-661986987 * `da0df01763 (diff-4541f7b7c174f6ae6270135942c1c65ed9e09ebe81239709f5a9fb34e858ddcf)` Supersedes https://github.com/poseidon/typhoon/pull/888	2020-11-25 14:13:39 -08:00
Dalton Hubble	fa8f68f50e	Fix Fedora CoreOS AWS AMI query in non-US regions * An `aws_ami` data source will fail a Terraform plan if no matching AMI is found, even if the AMI is not used. ARM64 images are only published to a few US regions, so the `aws_ami` data query could fail when creating Fedora CoreOS AWS clusters in non-US regions * Condition `aws_ami` on whether experimental arch `arm64` is chosen * Recent regression introduced in v1.19.4 https://github.com/poseidon/typhoon/pull/875 Closes https://github.com/poseidon/typhoon/issues/886	2020-11-25 11:32:05 -08:00
Dalton Hubble	ba8d972c76	Update Prometheus from v2.22.2 to v2.23.0-rc.0 * https://github.com/prometheus/prometheus/releases/tag/v2.23.0-rc.0	2020-11-24 10:54:42 -08:00
Dalton Hubble	c0347ca0c6	Set kubeconfig and assets_dist as sensitive * Mark `kubeconfig` and `assets_dist` as `sensitive` to prevent the Terraform CLI displaying these values, esp. for CI systems * In particular, external tools or tfvars style uses (not recommended) reportedly display all outputs and are improved by setting sensitive * For Terraform v0.14, outputs referencing sensitive fields must also be annotated as sensitive Closes https://github.com/poseidon/typhoon/issues/884	2020-11-23 11:41:55 -08:00
Dalton Hubble	5e4f5de271	Enable Network Load Balancer (NLB) dualstack * NLB subnets assigned both IPv4 and IPv6 addresses * NLB DNS name has both A and AAAA records * NLB to target node traffic is IPv4 (no change), no change to security groups needed * Ingresses exposed through the recommended Nginx Ingress Controller addon will be accessible via IPv4 or IPv6. No change is needed to the app's CNAME to NLB record Related: https://aws.amazon.com/about-aws/whats-new/2020/11/network-load-balancer-supports-ipv6/	2020-11-21 14:16:24 -08:00
Dalton Hubble	be28495d79	Update Prometheus from v2.22.1 to v2.22.2 * https://github.com/prometheus/prometheus/releases/tag/v2.22.2	2020-11-19 21:50:48 -08:00
Dalton Hubble	f1356fec24	Update Grafana from v7.3.2 to v7.3.3 * https://github.com/grafana/grafana/releases/tag/v7.3.3	2020-11-19 21:49:11 -08:00
Dalton Hubble	cc00afa4e1	Add Terraform v0.13 input variable validations * Support for migrating from Terraform v0.12.x to v0.13.x was added in v1.18.8 * Require Terraform v0.13+. Drop support for Terraform v0.12	2020-11-17 12:02:34 -08:00
Dalton Hubble	f5a83667e8	Update Grafana from v7.3.1 to v7.3.2 * https://github.com/grafana/grafana/releases/tag/v7.3.2	2020-11-14 13:30:30 -08:00
Dalton Hubble	a911367c2e	Update nginx-ingress from v0.41.0 to v0.41.2 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.41.2	2020-11-14 13:27:06 -08:00
Dalton Hubble	1b3a0f6ebc	Add experimental Fedora CoreOS arm64 support on AWS * Add experimental `arch` variable to Fedora CoreOS AWS, accepting amd64 (default) or arm64 to support native arm64/aarch64 clusters or mixed/hybrid clusters with a worker pool of arm64 workers * Add `daemonset_tolerations` variable to cluster module (experimental) * Add `node_taints` variable to workers module * Requires flannel CNI and experimental Poseidon-built arm64 Fedora CoreOS AMIs (published to us-east-1, us-east-2, and us-west-1) WARN: * Our AMIs are experimental, may be removed at any time, and will be removed when Fedora CoreOS publishes official arm64 AMIs. Do NOT use in production Related: * https://github.com/poseidon/typhoon/pull/682	2020-11-14 13:09:24 -08:00
Dalton Hubble	1113a22f61	Update Kubernetes from v1.19.3 to v1.19.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1194	2020-11-11 22:56:27 -08:00
Dalton Hubble	152c7d86bd	Change bootstrap.service container from rkt to docker * Use docker to run `bootstrap.service` container * Background https://github.com/poseidon/typhoon/pull/855	2020-11-11 22:26:05 -08:00
Dalton Hubble	79deb8a967	Update Cilium from v1.9.0-rc3 to v1.9.0 * https://github.com/cilium/cilium/releases/tag/v1.9.0	2020-11-10 23:42:41 -08:00
Dalton Hubble	f412f0d9f2	Update Calico from v3.16.4 to v3.16.5 * https://github.com/projectcalico/calico/releases/tag/v3.16.5	2020-11-10 22:58:19 -08:00
Dalton Hubble	133d325013	Update nginx-ingress from v0.40.2 to v0.41.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.41.0	2020-11-08 14:34:52 -08:00
Dalton Hubble	4b05c0180e	Update Grafana from v7.3.0 to v7.3.1 * https://github.com/grafana/grafana/releases/tag/v7.3.1	2020-11-08 14:13:39 -08:00
Dalton Hubble	f49ab3a6ee	Update Prometheus from v2.22.0 to v2.22.1 * https://github.com/prometheus/prometheus/releases/tag/v2.22.1	2020-11-08 14:12:24 -08:00
Dalton Hubble	0eef16b274	Improve and tidy Fedora CoreOS etcd-member.service * Allow a snippet with a systemd dropin to set an alternate image via `ETCD_IMAGE`, for consistency across Fedora CoreOS and Flatcar Linux * Drop comments about integrating system containers with systemd-notify	2020-11-08 11:49:56 -08:00
Dalton Hubble	ad1f59ce91	Change Flatcar etcd-member.service container from rkt to docker * Use docker to run the `etcd-member.service` container * Use env-file `/etc/etcd/etcd.env` like podman on FCOS * Background: https://github.com/poseidon/typhoon/pull/855	2020-11-03 16:42:18 -08:00
Dalton Hubble	82e5ac3e7c	Update Cilium from v1.8.5 to v1.9.0-rc3 * https://github.com/poseidon/terraform-render-bootstrap/pull/224	2020-11-03 10:29:07 -08:00
Dalton Hubble	a8f7880511	Update Cilium from v1.8.4 to v1.8.5 * https://github.com/cilium/cilium/releases/tag/v1.8.5	2020-10-29 00:50:18 -07:00
Dalton Hubble	cda5b93b09	Update kube-state-metrics from v2.0.0-alpha.1 to v2.0.0-alpha.2 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha.2	2020-10-28 18:49:40 -07:00
Dalton Hubble	3e9f5f34de	Update Grafana from v7.2.2 to v7.3.0 * https://github.com/grafana/grafana/releases/tag/v7.3.0	2020-10-28 17:46:26 -07:00
Dalton Hubble	893d139590	Update Calico from v3.16.3 to v3.16.4 * https://github.com/projectcalico/calico/releases/tag/v3.16.4	2020-10-26 00:50:40 -07:00
Dalton Hubble	fc62e51b2a	Update Grafana from v7.2.1 to v7.2.2 * https://github.com/grafana/grafana/releases/tag/v7.2.2	2020-10-22 00:14:04 -07:00
Dalton Hubble	e5ba3329eb	Remove bare-metal CoreOS Container Linux profiles * Remove Matchbox profiles for CoreOS Container Linux * Simplify the remaining Flatcar Linux profiles	2020-10-21 00:25:10 -07:00
Dalton Hubble	7c3f3ab6d0	Rename container-linux modules to flatcar-linux * CoreOS Container Linux was deprecated in v1.18.3 * Continue transitioning docs and modules from supporting both CoreOS and Flatcar "variants" of Container Linux to now supporting Flatcar Linux and equivalents Action Required: Update the Flatcar Linux modules `source` to replace `s/container-linux/flatcar-linux`. See docs for examples	2020-10-20 22:47:19 -07:00
Dalton Hubble	df17253e72	Fix delete node permission on Fedora CoreOS node shutdown * On cloud platforms, `delete-node.service` tries to delete the local node (not always possible depending on preemption time) * Since v1.18.3, kubelet TLS bootstrap generates a kubeconfig in `/var/lib/kubelet` which should be used with kubectl in the delete-node oneshot	2020-10-18 23:38:11 -07:00
Dalton Hubble	eda78db08e	Change Flatcar kubelet.service container from rkt to docker * Use docker to run the `kubelet.service` container * Update Kubelet mounts to match Fedora CoreOS * Remove unused `/etc/ssl/certs` mount (see https://github.com/poseidon/typhoon/pull/810) * Remove unused `/usr/share/ca-certificates` mount * Remove `/etc/resolv.conf` mount, Docker default is ok * Change `delete-node.service` to use docker instead of rkt and inline ExecStart, as was done on Fedora CoreOS * Fix permission denied on shutdown `delete-node`, caused by the kubeconfig mount changing with the introduction of node TLS bootstrap Background * podman, rkt, and runc daemonless container process runners provide advantages over the docker daemon for system containers. Docker requires workarounds for use in systemd units where the ExecStart must tail logs so systemd can monitor the daemonized container. https://github.com/moby/moby/issues/6791 * Why switch then? On Flatcar Linux, podman isn't shipped. rkt works, but isn't being developed while container standards continue to move forward. Typhoon has used runc for the Kubelet runner before in Fedora Atomic, but it's more low-level. So we're left with Docker, which is less than ideal, but shipped in Flatcar * Flatcar Linux appears to be shifting system components to use docker, which does provide some limited guards against breakages (e.g. Flatcar cannot enable docker live restore)	2020-10-18 23:24:45 -07:00
Dalton Hubble	afac46e39a	Remove asset_dir variable and optional asset writes * Originally, poseidon/terraform-render-bootstrap generated TLS certificates, manifests, and cluster "assets" written to local disk (`asset_dir`) during terraform apply cluster bootstrap * Typhoon v1.17.0 introduced bootstrapping using only Terraform state to store cluster assets, to avoid ever writing sensitive materials to disk and improve automated use-cases. `asset_dir` was changed to optional and defaulted to "" (no writes) * Typhoon v1.18.0 deprecated the `asset_dir` variable, removed docs, and announced it would be deleted in the future. * Add Terraform output `assets_dist` map * Remove the `asset_dir` variable Cluster assets are now stored in Terraform state only. Those who wish to write the assets to local files can do so explicitly. ``` resource local_file "assets" { for_each = module.yavin.assets_dist filename = "some-assets/${each.key}" content = each.value } ``` Related: * https://github.com/poseidon/typhoon/pull/595 * https://github.com/poseidon/typhoon/pull/678	2020-10-17 15:00:15 -07:00
Dalton Hubble	b1e680ac0c	Update recommended Terraform provider versions * Sync Terraform provider plugins with those used internally	2020-10-17 13:56:24 -07:00
Dalton Hubble	9fbfbdb854	Update Prometheus from v2.21.0 to v2.22.0 * https://github.com/prometheus/prometheus/releases/tag/v2.22.0	2020-10-17 12:38:25 -07:00
Dalton Hubble	511f5272f4	Update Calico from v3.15.3 to v3.16.3 * https://github.com/projectcalico/calico/releases/tag/v3.16.3 * https://github.com/poseidon/terraform-render-bootstrap/pull/212	2020-10-15 20:08:51 -07:00
Dalton Hubble	46ca5e8813	Update Kubernetes from v1.19.2 to v1.19.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1193	2020-10-14 20:47:49 -07:00
Dalton Hubble	394e496cc7	Update Grafana from v7.2.0 to v7.2.1 * https://github.com/grafana/grafana/releases/tag/v7.2.1	2020-10-11 13:21:25 -07:00
Dalton Hubble	7881f4bd86	Update kube-state-metrics from v1.9.7 to v2.0.0-alpha.1 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha * https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha.1	2020-10-11 12:35:43 -07:00
Dalton Hubble	d5b5b7cb02	Update nginx-ingress from v0.40.0 to v0.40.2 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.40.2	2020-10-06 23:52:15 -07:00
Dalton Hubble	b39a1d70da	Update nginx-ingress from v0.35.0 to v0.40.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.40.0	2020-10-02 01:00:35 -07:00
Dalton Hubble	901f7939b2	Update Cilium from v1.8.3 to v1.8.4 * https://github.com/cilium/cilium/releases/tag/v1.8.4	2020-10-02 00:24:26 -07:00
Dalton Hubble	d65085ce14	Update Grafana from v7.1.5 to v7.2.0 * https://github.com/grafana/grafana/releases/tag/v7.2.0	2020-09-24 20:58:32 -07:00
Dalton Hubble	343db5b578	Remove references to CoreOS Container Linux * CoreOS Container Linux was deprecated in v1.18.3 (May 2020) in favor of Fedora CoreOS and Flatcar Linux. CoreOS Container Linux references were kept to give folks more time to migrate, but AMIs have now been deleted. Time is up. Rel: https://coreos.com/os/eol/	2020-09-24 20:51:02 -07:00
Dalton Hubble	444363be2d	Update Kubernetes from v1.19.1 to v1.19.2 * Update flannel from v0.12.0 to v0.13.0-rc2 * Update flannel-cni from v0.4.0 to v0.4.1 * Update CNI plugins from v0.8.6 to v0.8.7	2020-09-16 20:05:54 -07:00
Dalton Hubble	e838d4dc3d	Refresh Prometheus rules/alerts and Grafana dashboards * Refresh upstream Prometheus rules/alerts and Grafana dashboards	2020-09-13 15:03:27 -07:00
Dalton Hubble	979c092ef6	Reduce apiserver metrics cardinality of non-core APIs * Reduce `apiserver_request_duration_seconds_count` cardinality by dropping series for non-core Kubernetes APIs. This is done to match `apiserver_request_duration_seconds_bucket` relabeling * These two relabels must be performed the same way to avoid affecting new SLO calculations (upcoming) * See https://github.com/kubernetes-monitoring/kubernetes-mixin/issues/498 Related: https://github.com/poseidon/typhoon/pull/596	2020-09-13 14:47:49 -07:00
Dalton Hubble	eb093af9ed	Drop Kubelet labelmap relabel for node_name * Originally, Kubelet and CAdvisor metrics used a labelmap relabel to add Kubernetes SD node labels onto timeseries * With https://github.com/poseidon/typhoon/pull/596 that relabel was dropped since node labels aren't usually that valuable. `__meta_kubernetes_node_name` was retained but the field name is empty * Favor just using Prometheus server-side `instance` in queries that require some node identifier for aggregation or debugging Fix https://github.com/poseidon/typhoon/issues/823	2020-09-12 19:40:00 -07:00
Dalton Hubble	36096f844d	Promote Cilium from experimental to GA * Cilium was added as an experimental CNI provider in June * Since then, I've been choosing it for an increasing number of clusters and scenarios.	2020-09-12 19:24:55 -07:00
Dalton Hubble	d236628e53	Update Prometheus from v2.20.0 to v2.21.0 * https://github.com/prometheus/prometheus/releases/tag/v2.21.0	2020-09-12 19:20:54 -07:00
Dalton Hubble	577b927a2b	Update Fedora CoreOS Config version from v1.0.0 to v1.1.0 * No notable changes in the config spec, just housekeeping * Require any snippets customization to update to v1.1.0. Version skew between the main config and snippets will show an error message * https://github.com/coreos/fcct/blob/master/docs/configuration-v1_1.md	2020-09-10 23:38:40 -07:00
Dalton Hubble	000c11edf6	Update IngressClass resources to networking.k8s.io/v1 * Kubernetes v1.19 graduated Ingress and IngressClass from networking.k8s.io/v1beta1 to networking.k8s.io/v1	2020-09-10 23:25:53 -07:00
Dalton Hubble	29b16c3fc0	Change seccomp annotations to seccompProfile * seccomp graduated to GA in Kubernetes v1.19. Support for seccomp alpha annotations will be removed in v1.22 * Replace seccomp annotations with the GA seccompProfile field in the PodTemplate securityContext * Switch profile from `docker/default` to `runtime/default` (no effective change, since docker is the runtime) * Verify with docker inspect SecurityOpt. Without the profile, you'd see `seccomp=unconfined` Related: https://github.com/poseidon/terraform-render-bootstrap/pull/215	2020-09-10 01:15:07 -07:00
Dalton Hubble	0c7a879bc4	Update Kubernetes from v1.19.0 to v1.19.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1191	2020-09-09 20:52:29 -07:00
Dalton Hubble	28ee693e6b	Update Cilium from v1.8.2 to v1.8.3 * https://github.com/cilium/cilium/releases/tag/v1.8.3	2020-09-07 21:10:27 -07:00
Dalton Hubble	8c7d95aefd	Update mkdocs-material from v5.5.9 to v5.5.11	2020-08-29 13:52:16 -07:00
Dalton Hubble	d45dfdbf91	Update nginx-ingress from v0.34.1 to v0.35.0 * Repo changed to k8s.gcr.io/ingress-nginx/controller * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.35.0	2020-08-29 13:38:28 -07:00
Dalton Hubble	8dd221a57c	Add fleetlock docs and links to addons * Add links to fleetlock for Fedora CoreOS reboot coordination * https://github.com/poseidon/fleetlock	2020-08-28 00:02:24 -07:00
Dalton Hubble	a504264e24	Update Grafana from v7.1.4 to v7.1.5 * https://github.com/grafana/grafana/releases/tag/v7.1.5	2020-08-27 08:52:07 -07:00
Dalton Hubble	88cf7273dc	Update Kubernetes from v1.18.8 to v1.19.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md	2020-08-27 08:50:01 -07:00
Dalton Hubble	58def65a09	Update Grafana from v7.1.3 to v7.1.4 * https://github.com/grafana/grafana/releases/tag/v7.1.4	2020-08-22 15:40:09 -07:00
Dalton Hubble	cd7fd29194	Update etcd from v3.4.10 to v3.4.12 * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md	2020-08-19 21:25:41 -07:00
Bo Huang	aafa38476a	Fix SELinux race condition on non-bootstrap controllers in multi-controller (#808) * Fix race condition for bootstrap-secrets SELinux context on non-bootstrap controllers in multi-controller FCOS clusters * On first boot from disk on non-bootstrap controllers, adding bootstrap-secrets races with kubelet.service starting, which can cause the secrets assets to have the wrong label until kubelet.service restarts (service, reboot, auto-update) * This can manifest as `kube-apiserver`, `kube-controller-manager`, and `kube-scheduler` pods crashlooping on spare controllers on first cluster creation	2020-08-19 21:18:10 -07:00
Dalton Hubble	9a07f1d30b	Update recommended Terraform provider versions * Sync Terraform provider plugin versions to those used internally * Update mkdocs-material from v5.5.1 to v5.5.6 * Fix minor details in docs	2020-08-14 10:05:52 -07:00
Dalton Hubble	c87db3ef37	Update Kubernetes from v1.18.6 to v1.18.8 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1188	2020-08-13 20:47:43 -07:00
Dalton Hubble	342380cfa4	Update Terraform migration guide SHA * Mention the first master branch SHA that introduced Terraform v0.13 forward compatibility * Link the migration guide on Github until a release is available and website docs are published	2020-08-13 00:36:47 -07:00
Dalton Hubble	5e70d7e2c8	Migrate from Terraform v0.12.x to v0.13.x * Recommend Terraform v0.13.x * Support automatic install of poseidon's provider plugins * Update tutorial docs for Terraform v0.13.x * Add migration guide for Terraform v0.13.x (best-effort) * Require Terraform v0.12.26+ (migration compatibility) * Require `terraform-provider-ct` v0.6.1 * Require `terraform-provider-matchbox` v0.4.1 * Require `terraform-provider-digitalocean` v1.20+ Related: * https://www.hashicorp.com/blog/announcing-hashicorp-terraform-0-13/ * https://www.terraform.io/upgrade-guides/0-13.html * https://registry.terraform.io/providers/poseidon/ct/latest * https://registry.terraform.io/providers/poseidon/matchbox/latest	2020-08-12 01:54:32 -07:00
Dalton Hubble	f6ce12766b	Allow terraform-provider-aws v3.0+ plugin * Typhoon AWS is compatible with terraform-provider-aws v3.x releases * Continue to allow v2.23+, no v3.x specific features are used * Set required provider versions in the worker module, since it can be used independently Related: * https://github.com/terraform-providers/terraform-provider-aws/releases/tag/v3.0.0	2020-08-09 12:39:26 -07:00
Dalton Hubble	e1d6ab2f24	Update Grafana from v7.1.1 to v7.1.3 * https://github.com/grafana/grafana/releases/tag/v7.1.3 * https://github.com/grafana/grafana/releases/tag/v7.1.2	2020-08-08 18:59:49 -07:00
Dalton Hubble	ccee5d3d89	Update from coreos/flannel-cni to poseidon/flannel-cni * Update CNI plugins from v0.6.0 to v0.8.6 to fix several CVEs * Update the base image to alpine:3.12 * Use `flannel-cni` as an init container and remove sleep * https://github.com/poseidon/terraform-render-bootstrap/pull/205 * https://github.com/poseidon/flannel-cni * https://quay.io/repository/poseidon/flannel-cni Background * Switch from github.com/coreos/flannel-cni v0.3.0 which was last published by me in 2017 and is no longer accessible to me to maintain or patch * Port to the poseidon/flannel-cni rewrite, which releases v0.4.0 to continue the prior release numbering	2020-08-02 15:13:15 -07:00
Dalton Hubble	78e6409bd0	Fix flannel support on Fedora CoreOS * Fedora CoreOS now ships systemd-udev's `default.link` while Flannel relies on being able to pick its own MAC address for the `flannel.1` link for tunneled traffic to reach cni0 on the destination side, without being dropped * This change first appeared in FCOS testing-devel 32.20200624.20.1 and is the behavior going forward in FCOS since it was added to align FCOS network naming / configs with the rest of Fedora and address issues related to the default being missing * Flatcar Linux (and Container Linux) has a specific flannel.link configuration builtin, so it was not affected * https://github.com/coreos/fedora-coreos-tracker/issues/574#issuecomment-665487296 Note: Typhoon's recommended and default CNI provider is Calico, unless `networking` is set to flannel directly.	2020-08-01 21:22:08 -07:00
Dalton Hubble	2aef42d4f6	Update Prometheus from v2.19.2 to v2.20.0 * https://github.com/prometheus/prometheus/releases/tag/v2.20.0	2020-07-25 16:37:28 -07:00
Dalton Hubble	b7d67757de	Update Grafana from v7.1.0 to v7.1.1 * https://github.com/grafana/grafana/releases/tag/v7.1.1	2020-07-25 16:33:40 -07:00
Dalton Hubble	cd0a28904e	Update Cilium from v1.8.1 to v1.8.2 * https://github.com/cilium/cilium/releases/tag/v1.8.2	2020-07-25 16:06:27 -07:00
Dalton Hubble	618f8b30fd	Update CoreDNS from v1.6.7 to v1.7.0 * https://coredns.io/2020/06/15/coredns-1.7.0-release/ * Update Grafana dashboard with revised metrics names	2020-07-25 15:51:31 -07:00
Dalton Hubble	f96e91f225	Update etcd from v3.4.9 to v3.4.10 * https://github.com/etcd-io/etcd/releases/tag/v3.4.10	2020-07-18 14:08:22 -07:00
Dalton Hubble	efd4a0319d	Update Grafana from v7.0.6 to v7.1.0 * https://github.com/grafana/grafana/releases/tag/v7.1.0	2020-07-18 13:54:56 -07:00
Dalton Hubble	5fba20d358	Update recommended Terraform provider versions * Sync Terraform provider plugin versions with those used internally	2020-07-18 13:19:25 -07:00
Dalton Hubble	a8d3d3bb12	Update ingress-nginx from v0.33.0 to v0.34.1 * Switch to ingress-nginx controller images from us.gcr.io (eu, asia can also be used if desired) * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.1 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.0	2020-07-15 22:43:49 -07:00
Dalton Hubble	dfd2a0ec23	Update Grafana from v7.0.5 to v7.0.6 * https://github.com/grafana/grafana/releases/tag/v7.0.6	2020-07-09 21:10:48 -07:00
Dalton Hubble	e3bf7d8f9b	Update Prometheus from v2.19.1 to v2.19.2 * https://github.com/prometheus/prometheus/releases/tag/v2.19.2	2020-07-09 21:08:55 -07:00
Dalton Hubble	49050320ce	Update Cilium from v1.8.0 to v1.8.1 * https://github.com/cilium/cilium/releases/tag/v1.8.1	2020-07-05 16:00:00 -07:00
Dalton Hubble	74e025c9e4	Update Grafana from v7.0.4 to v7.0.5 * https://github.com/grafana/grafana/releases/tag/v7.0.5	2020-07-05 15:49:34 -07:00
Dalton Hubble	df3f40bcce	Allow using Flatcar Linux edge on Azure * Set Kubelet cgroup driver to systemd when Flatcar Linux edge is chosen Note: Typhoon module status assumes use of the stable variant of an OS channel/stream. It's possible to use earlier variants and those are sometimes tested or developed against, but stable is the recommendation	2020-06-30 01:35:29 -07:00
Dalton Hubble	32886cfba1	Promote Fedora CoreOS on Google Cloud to stable status	2020-06-29 23:09:11 -07:00
Dalton Hubble	430d139a5b	Remove os_image variable on Google Cloud Fedora CoreOS * In v1.18.3, the `os_stream` variable was added to select a Fedora CoreOS image stream (stable, testing, next) on AWS and Google Cloud (which publish official streams) * Remove `os_image` variable deprecated in v1.18.3. Manually uploaded images are no longer needed	2020-06-29 22:57:11 -07:00
Dalton Hubble	7c6ab21b94	Isolate each DigitalOcean cluster in its own VPC * DigitalOcean introduced Virtual Private Cloud (VPC) support to match other clouds and enhance the prior "private networking" feature. Before, droplets belonging to different clusters (but residing in the same region) could reach one another (although Typhoon firewall rules prohibit this). Now, droplets in a VPC reside in their own network * https://www.digitalocean.com/docs/networking/vpc/ * Create droplet instances in a VPC per cluster. This matches the design of Typhoon AWS, Azure, and GCP. * Require `terraform-provider-digitalocean` v1.16.0+ (action required) * Output `vpc_id` for use with an attached DigitalOcean loadbalancer	2020-06-28 23:25:30 -07:00
Dalton Hubble	21178868db	Revert "Update Prometheus from v2.19.1 to v2.19.2" * Prometheus has not yet published v2.19.2 * This reverts commit `81b6f54169`.	2020-06-27 14:53:58 -07:00
Dalton Hubble	81b6f54169	Update Prometheus from v2.19.1 to v2.19.2 * https://github.com/prometheus/prometheus/releases/tag/v2.19.2	2020-06-27 14:34:30 -07:00
Dalton Hubble	7bce15975c	Update Kubernetes from v1.18.4 to v1.18.5 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1185	2020-06-27 13:52:18 -07:00
Dalton Hubble	1f83ae7dbb	Update Calico from v3.14.1 to v3.15.0 * https://docs.projectcalico.org/v3.15/release-notes/	2020-06-26 02:40:12 -07:00
Dalton Hubble	a79ad34ba3	Update Grafana from v7.0.3 to v7.0.4 * https://github.com/grafana/grafana/releases/tag/v7.0.4	2020-06-26 02:06:38 -07:00
Dalton Hubble	99a11442c7	Update Prometheus from v2.19.0 to v2.19.1 * https://github.com/prometheus/prometheus/releases/tag/v2.19.1	2020-06-26 02:01:58 -07:00
Dalton Hubble	37f00a3882	Reduce Calico MTU on Fedora CoreOS Azure * Change the Calico VXLAN interface MTU from 1450 to 1410 * VXLAN on Azure should support MTU 1450. However, there is history where performance measures have shown that 1410 is needed to have expected performance. Flatcar Linux has the same MTU 1410 override and note * FCOS 31.20200323.3.2 was known to perform fine with 1450, but now in 31.20200517.3.0 the right value seems to be 1410	2020-06-19 00:24:56 -07:00
Dalton Hubble	4cfafeaa07	Fix Kubelet starting before hostname set on FCOS AWS * Fedora CoreOS `kubelet.service` can start before the hostname is set. Kubelet reads the hostname to determine the node name to register. If the hostname was read as localhost, Kubelet will continue trying to register as localhost (problem) * This race manifests as a node that appears NotReady: the Kubelet is trying to register as localhost, while the host itself (by then) has an AWS provided hostname. Restarting kubelet.service is a manual fix so Kubelet re-reads the hostname * This race could only be shown on AWS, not on Google Cloud or Azure despite attempts. Bare-metal and DigitalOcean differ and use hostname-override (e.g. afterburn) so they're not affected * Wait for nodes to have a non-localhost hostname in the oneshot that awaits /etc/resolv.conf. Typhoon has no valid cases for a node hostname being localhost (not even single-node clusters) Related Openshift: https://github.com/openshift/machine-config-operator/pull/1813 Close https://github.com/poseidon/typhoon/issues/765	2020-06-19 00:19:54 -07:00
Dalton Hubble	90e23f5822	Rename controller node label and NoSchedule taint * Remove node label `node.kubernetes.io/master` from controller nodes * Use `node.kubernetes.io/controller` (present since v1.9.5, [#160](https://github.com/poseidon/typhoon/pull/160)) to node select controllers * Rename controller NoSchedule taint from `node-role.kubernetes.io/master` to `node-role.kubernetes.io/controller` * Tolerate the new taint name for workloads that may run on controller nodes and stop tolerating `node-role.kubernetes.io/master` taint	2020-06-19 00:12:13 -07:00
Dalton Hubble	c25c59058c	Update Kubernetes from v1.18.3 to v1.18.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1184	2020-06-17 19:53:19 -07:00
Dalton Hubble	bc9b808d44	Update nginx-ingress from v0.32.0 to v0.33.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-0.33.0	2020-06-16 18:44:40 -07:00
Dalton Hubble	4b0203fdb2	Fix typo in DigitalOcean docs title	2020-06-16 18:33:56 -07:00
Dalton Hubble	04520e447c	Update node-exporter from v1.0.0 to v1.0.1 * https://github.com/prometheus/node_exporter/releases/tag/v1.0.1	2020-06-16 17:57:09 -07:00
Dalton Hubble	413585681b	Remove unused Kubelet lock-file and exit-on-lock-contention * Kubelet `--lock-file` and `--exit-on-lock-contention` date back to usage of bootkube and at one point running Kubelet in a "self-hosted" style whereby an on-host Kubelet (rkt) started pods, but then a Kubelet DaemonSet was scheduled and able to take over (hence self-hosted). `lock-file` and `exit-on-lock-contention` flags supported this pivot. The pattern has been out of favor (in bootkube too) for years because of dueling Kubelet complexity * Typhoon runs Kubelet as a container via an on-host systemd unit using podman (Fedora CoreOS) or rkt (Flatcar Linux). In fact, Typhoon no longer uses bootkube or control plane pivot (let alone Kubelet pivot) and uses static pods since v1.16.0 * https://github.com/poseidon/typhoon/pull/536	2020-06-12 00:06:41 -07:00
Dalton Hubble	c9059d3fe9	Update Prometheus from v2.19.0-rc.0 to v2.19.0 * https://github.com/prometheus/prometheus/releases/tag/v2.19.0	2020-06-09 23:05:03 -07:00
Dalton Hubble	a287920169	Use strict mode for Container Linux Configs * Enable terraform-provider-ct `strict` mode for parsing Container Linux Configs and snippets * Fix Container Linux Config systemd unit syntax `enable` (old) to `enabled` * Align with Fedora CoreOS which uses strict mode already	2020-06-09 23:00:36 -07:00
Dalton Hubble	31d02b0221	Update Prometheus from v2.18.1 to v2.19.0-rc.0 * https://github.com/prometheus/prometheus/releases/tag/v2.19.0-rc.0	2020-06-05 00:16:45 -07:00
Dalton Hubble	8f875f80f5	Update Grafana from v7.0.1 to v7.0.3 * https://github.com/grafana/grafana/releases/tag/v7.0.2 * https://github.com/grafana/grafana/releases/tag/v7.0.3	2020-06-03 12:31:58 -07:00
Dalton Hubble	16c0b9152b	Update kube-state-metrics from v1.9.6 to v1.9.7 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.7	2020-06-03 11:35:10 -07:00
Dalton Hubble	20bfd69780	Change Kubelet container image publishing * Build Kubelet container images internally and publish to Quay and Dockerhub (new) as an alternative in case of registry outage or breach * Use our infra to provide single and multi-arch (default) Kubelet images for possible future use * Docs: Show how to use alternative Kubelet images via snippets and a systemd dropin (builds on #737) Changes: * Update docs with changes to Kubelet image building * If you prefer to trust images built by Quay/Dockerhub, automated image builds are still available with unique tags (albeit with some limitations): * Quay automated builds are tagged `build-{short_sha}` (limit: only amd64) * Dockerhub automated builds are tagged `build-{tag}` and `build-master` (limit: only amd64, no shas) Links: * Kubelet: https://github.com/poseidon/kubelet * Docs: https://typhoon.psdn.io/topics/security/#container-images * Registries: * quay.io/poseidon/kubelet * docker.io/psdn/kubelet	2020-05-30 23:34:23 -07:00
Dalton Hubble	ba44408b76	Update Calico from v3.14.0 to v3.14.1 * https://docs.projectcalico.org/v3.14/release-notes/	2020-05-30 22:08:37 -07:00
Dalton Hubble	187bb17d39	Update Grafana from v7.0.0 to v7.0.1 * https://github.com/grafana/grafana/releases/tag/v7.0.1	2020-05-27 21:35:24 -07:00
Dalton Hubble	abc31c3711	Update node-exporter from v1.0.0-rc.1 to v1.0.0 * https://github.com/prometheus/node_exporter/releases/tag/v1.0.0	2020-05-27 21:33:03 -07:00
Dalton Hubble	e72f916c8d	Update etcd from v3.4.8 to v3.4.9 * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md#v349-2020-05-20	2020-05-22 00:52:20 -07:00
Dalton Hubble	c52f9f8d08	Upgrade docs packages and refresh content * Promote DigitalOcean from alpha to beta for Fedora CoreOS and Flatcar Linux * Upgrade mkdocs-material and PyPI packages for docs * Replace docs mentions of Container Linux with Flatcar Linux and move docs/cl to docs/flatcar-linux * Deprecate CoreOS Container Linux support. It's still usable for some time, but start removing docs	2020-05-20 23:31:26 -07:00
Dalton Hubble	3bdddc452c	Update Grafana from v7.0.0-beta2 to v7.0.0 * https://grafana.com/docs/grafana/latest/guides/whats-new-in-v7-0/	2020-05-18 23:42:32 -07:00
Dalton Hubble	ff4187a1fb	Use new Azure subnet to set address_prefixes list * Update Azure subnet `address_prefix` to `address_prefixes` list * Fix warning that `address_prefix` is deprecated * Require `terraform-provider-azurerm` v2.8.0+ (action required) Rel: https://github.com/terraform-providers/terraform-provider-azurerm/pull/6493	2020-05-18 23:35:47 -07:00
Dalton Hubble	90edcd3d77	Update node-exporter from v1.0.0-rc.0 to v1.0.0-rc.1 * https://github.com/prometheus/node_exporter/releases/tag/v1.0.0-rc.1	2020-05-15 18:03:19 -07:00
Dalton Hubble	a927c7c790	Update kube-state-metrics from v1.9.5 to v1.9.6 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.6	2020-05-15 17:42:24 -07:00
Dalton Hubble	d952576d2f	Update Grafana from v7.0.0-beta3 to v7.0.0 * https://github.com/grafana/grafana/releases/tag/7.0.0	2020-05-15 17:38:59 -07:00
Dalton Hubble	70e389f37f	Restore use of Flatcar Linux Azure Marketplace image * Switch Flatcar Linux Azure to use the Marketplace image from Kinvolk (offer `flatcar-container-linux-free`) * Accepting Azure Marketplace terms is still necessary, update docs to show accepting the free offer rather than BYOL * Upstream Flatcar: https://github.com/flatcar-linux/Flatcar/issues/82 * Typhoon: https://github.com/poseidon/typhoon/issues/703	2020-05-13 22:50:24 -07:00
Dalton Hubble	01905b00bc	Support Fedora CoreOS OS image streams on AWS * Add `os_stream` variable to set the stream to stable (default), testing, or next * Remove unused os_image variable on Fedora CoreOS AWS	2020-05-13 21:45:12 -07:00
Dalton Hubble	f4194cd57a	Update Grafana from v7.0.0-beta2 to v7.0.0-beta3 * https://github.com/grafana/grafana/releases/tag/v7.0.0-beta3	2020-05-09 17:50:40 -07:00
Dalton Hubble	a2db4fa8c4	Update Calico from v3.13.3 to v3.14.0 * https://docs.projectcalico.org/v3.14/release-notes/	2020-05-09 16:05:30 -07:00
Dalton Hubble	358854e712	Fix Calico install-cni crash loop on Pod restarts * Set a consistent MCS level/range for Calico install-cni * Note: Rebooting a node was a workaround, because Kubelet relabels /etc/kubernetes(/cni/net.d) Background: * On SELinux enforcing systems, the Calico CNI install-cni container ran with default SELinux context and a random MCS pair. install-cni places CNI configs by first creating a temporary file and then moving it into place, which means the file MCS categories depend on the container's SELinux context. * calico-node Pod restarts create a new install-cni container with a different MCS pair that cannot access the earlier written file (it places configs every time), causing the init container to error and calico-node to crash loop * https://github.com/projectcalico/cni-plugin/issues/874 ``` mv: inter-device move failed: '/calico.conf.tmp' to '/host/etc/cni/net.d/10-calico.conflist'; unable to remove target: Permission denied Failed to mv files. This may be caused by selinux configuration on the host, or something else. ``` Note, this isn't a host SELinux configuration issue. Related: * https://github.com/poseidon/terraform-render-bootstrap/pull/186	2020-05-09 16:01:44 -07:00
Dalton Hubble	b5dabcea31	Use Fedora CoreOS image streams on Google Cloud * Add `os_stream` variable to set a Fedora CoreOS stream to `stable` (default), `testing`, or `next` * Deprecate `os_image` variable. Remove docs about uploading Fedora CoreOS images manually, this is no longer needed * https://docs.fedoraproject.org/en-US/fedora-coreos/update-streams/ Rel: https://github.com/coreos/fedora-coreos-docs/pull/70	2020-05-08 01:23:12 -07:00
Dalton Hubble	3f0a5d2715	Update Grafana from v7.0.0-beta1 to v7.0.0-beta2 * https://github.com/grafana/grafana/releases/tag/v7.0.0-beta2	2020-05-07 23:04:44 -07:00
Dalton Hubble	33173c0206	Update Prometheus from v2.18.0 to v2.18.1 * https://github.com/prometheus/prometheus/releases/tag/v2.18.1	2020-05-07 22:59:11 -07:00
Dalton Hubble	70f30d9c07	Update Prometheus from v2.18.0-rc.1 to v2.18.0 * https://github.com/prometheus/prometheus/releases/tag/v2.18.0	2020-05-05 22:31:11 -07:00
Dalton Hubble	6afc1643d9	Update nginx-ingress from v0.30.0 to v0.32.0 * Add support for IngressClass and RBAC authorization * Since our nginx ingress controller example uses the flag `--ingress-class=public`, add an IngressClass to go along with it Rel: https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-class	2020-05-03 23:24:19 -07:00
Dalton Hubble	e71e27e769	Update Prometheus from v2.17.2 to v2.18.0-rc.1 * https://github.com/prometheus/prometheus/releases/tag/v2.18.0-rc.1	2020-04-29 20:57:48 -07:00
Dalton Hubble	64035005d4	Update Grafana from v6.7.2 to v7.0.0-beta1 * https://github.com/grafana/grafana/releases/tag/v7.0.0-beta1	2020-04-29 20:53:30 -07:00
Dalton Hubble	fd044ee117	Enable Kubelet TLS bootstrap and NodeRestriction * Enable bootstrap token authentication on kube-apiserver * Generate the bootstrap.kubernetes.io/token Secret that may be used as a bootstrap token * Generate a bootstrap kubeconfig (with a bootstrap token) to be securely distributed to nodes. Each Kubelet will use the bootstrap kubeconfig to authenticate to kube-apiserver as `system:bootstrappers` and send a node-unique CSR for kube-controller-manager to automatically approve to issue a Kubelet certificate and kubeconfig (expires in 72 hours) * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the `system:node-bootstrapper` ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr nodeclient ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr selfnodeclient ClusterRole * Enable NodeRestriction admission controller to limit the scope of Node or Pod objects a Kubelet can modify to those of the node itself * Ability for a Kubelet to delete its Node object is retained as preemptible nodes or those in auto-scaling instance groups need to be able to remove themselves on shutdown. This need continues to have precedence over any risk of a node deleting itself maliciously Security notes: 1. Issued Kubelet certificates authenticate as user `system:node:NAME` and group `system:nodes` and are limited in their authorization to perform API operations by Node authorization and NodeRestriction admission. Previously, a Kubelet's authorization was broader. This is the primary security motivation. 2. The bootstrap kubeconfig credential has the same sensitivity as the previous generated TLS client-certificate kubeconfig. It must be distributed securely to nodes. Its compromise still allows an attacker to obtain a Kubelet kubeconfig 3. Bootstrapping Kubelet kubeconfigs with a limited lifetime offers a slight security improvement. * An attacker who obtains the kubeconfig can likely obtain the bootstrap kubeconfig as well, to obtain the ability to renew their access * A compromised bootstrap kubeconfig could plausibly be handled by replacing the bootstrap token Secret, distributing the token to new nodes, and expiration. Whereas a compromised TLS-client certificate kubeconfig can't be revoked (no CRL). However, replacing a bootstrap token can be impractical in real cluster environments, so the limited lifetime is mostly a theoretical benefit. * Cluster CSR objects are visible via kubectl which is nice 4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet clients have more identity information, which can improve the utility of audits and future features Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/ Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/185	2020-04-28 19:35:33 -07:00
Dalton Hubble	38a6bddd06	Update Calico from v3.13.1 to v3.13.3 * https://docs.projectcalico.org/v3.13/release-notes/	2020-04-23 23:58:02 -07:00
Dalton Hubble	84ed0a31c3	Update Prometheus from v2.17.1 to v2.17.2 * https://github.com/prometheus/prometheus/releases/tag/v2.17.2	2020-04-20 18:09:24 -07:00
Dalton Hubble	fcbee12334	Fix race condition creating DigitalOcean firewall rules * DigitalOcean firewall rules should reference Terraform tag resources rather than using tag strings. Otherwise, terraform apply can fail (needs rerun) if a tag has not yet been created	2020-04-19 16:55:02 -07:00
Dalton Hubble	feac94605a	Fix bootstrap mount to use shared volume SELinux label * Race: During initial bootstrap, static control plane pods could hang with Permission denied to bootstrap secrets. A manual fix involved restarting Kubelet, which relabeled mounts. The race had no effect on subsequent reboots. * bootstrap.service runs podman with a private unshared mount of /etc/kubernetes/bootstrap-secrets which uses an SELinux MCS label with a category pair. However, bootstrap-secrets should be shared as it's mounted by Docker pods kube-apiserver, kube-scheduler, and kube-controller-manager. Restarting Kubelet was a manual fix because Kubelet relabels all /etc/kubernetes * Fix bootstrap Pod to use the shared volume label, which leaves bootstrap-secrets files with SELinux level s0 without MCS * Also allow failed bootstrap.service to be re-applied. This was missing on bare-metal and AWS	2020-04-19 16:31:32 -07:00
Dalton Hubble	2b1b918b43	Revert Flatcar Linux Azure to manual upload images * Initial support for Flatcar Linux on Azure used the Flatcar Linux Azure Marketplace images (e.g. `flatcar-stable`) in https://github.com/poseidon/typhoon/pull/664 * Flatcar Linux Azure Marketplace images have some unresolved items https://github.com/poseidon/typhoon/issues/703 * Until the Marketplace items are resolved, revert to requiring Flatcar Linux's images be manually uploaded (like GCP and DigitalOcean)	2020-04-18 15:40:57 -07:00
Dalton Hubble	671eacb86e	Update Kubernetes from v1.18.1 to v1.18.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#changelog-since-v1181	2020-04-16 23:40:52 -07:00
Dalton Hubble	5c4a3f73d5	Add support for Fedora CoreOS on Azure * Add `azure/fedora-coreos/kubernetes` module	2020-04-12 16:35:49 -07:00
Dalton Hubble	76ab4c4c2a	Change `container-linux` module preference to Flatcar Linux * No change to Fedora CoreOS modules * For Container Linux AWS and Azure, change the `os_image` default from coreos-stable to flatcar-stable * For Container Linux GCP and DigitalOcean, change `os_image` to be required since users should upload a Flatcar Linux image and set the variable * For Container Linux bare-metal, recommend users change the `os_channel` to Flatcar Linux. No actual module change.	2020-04-11 14:52:30 -07:00
Dalton Hubble	1420700bc0	Update CHANGES for v1.18.1 release * Change order of modules in the README	2020-04-11 13:23:49 -07:00
Dalton Hubble	80538e2953	Add support for Fedora CoreOS on DigitalOcean * Add `digital-ocean/fedora-coreos/kubernetes` module * DigitalOcean custom uploaded images do not permit droplet IPv6 networking	2020-04-09 23:55:29 -07:00
Dalton Hubble	73af2f3b7c	Update Kubernetes from v1.18.0 to v1.18.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1181	2020-04-08 19:41:48 -07:00
Dalton Hubble	17ea547723	Update etcd from v3.4.5 to v3.4.7 * https://github.com/etcd-io/etcd/releases/tag/v3.4.7 * https://github.com/etcd-io/etcd/releases/tag/v3.4.6	2020-04-06 21:09:25 -07:00
Dalton Hubble	2b5dfece93	Update Grafana from v6.7.1 to v6.7.2 * https://github.com/grafana/grafana/releases/tag/v6.7.2	2020-04-04 13:13:19 -07:00
Dalton Hubble	d47d40b517	Refresh Prometheus rules/alerts and Grafana dashboards * Refresh upstream Prometheus rules and alerts and Grafana dashboards * Add Loki recording rules for convenience	2020-03-31 21:53:01 -07:00
Dalton Hubble	bbbaf949f9	Fix UDP outbound and clock sync timeouts on Azure workers * Add "lb" outbound rule for worker TCP _and_ UDP traffic * Fix Azure worker nodes clock synchronization being inactive due to timeouts reaching the CoreOS / Flatcar NTP pool * Fix Azure worker nodes not providing outbound UDP connectivity Background: Azure provides VMs outbound connectivity either by having a public IP or via an SNAT masquerade feature bundled with their virtual load balancing abstraction (in contrast with, say, a NAT gateway). Azure worker nodes have only a private IP, but are associated with the cluster load balancer's backend pool and ingress frontend IP. Outbound traffic uses SNAT with this frontend IP. A subtle detail with Azure SNAT seems to be that since both inbound lb_rule's are TCP only, outbound UDP traffic isn't SNAT'd (highlights the reasons Azure shouldn't have conflated inbound load balancing with outbound SNAT concepts). However, adding a separate outbound rule and disabling outbound SNAT on our ingress lb_rule's we can tell Azure to continue load balancing as before, and support outbound SNAT for worker traffic of both the TCP and UDP protocol. Fixes clock synchronization timeouts: ``` systemd-timesyncd[786]: Timed out waiting for reply from 45.79.36.123:123 (3.flatcar.pool.ntp.org) ``` Azure controller nodes have their own public IP, so controllers (and etcd) nodes have not had clock synchronization or outbound UDP issues	2020-03-31 21:00:16 -07:00
Dalton Hubble	135c6182b8	Update flannel from v0.11.0 to v0.12.0 * https://github.com/coreos/flannel/releases/tag/v0.12.0	2020-03-31 18:31:59 -07:00
Dalton Hubble	c53dc66d4a	Rename Container Linux snippets variable for consistency * Rename controller_clc_snippets to controller_snippets (cloud platforms) * Rename worker_clc_snippets to worker_snippets (cloud platforms) * Rename clc_snippets to snippets (bare-metal)	2020-03-31 18:25:51 -07:00
Dalton Hubble	9960972726	Fix bootstrap regression when networking="flannel" * Fix bootstrap error for missing `manifests-networking/crdyaml` when `networking = "flannel"` Cleanup manifest-networking directory left during bootstrap * Regressed in v1.18.0 changes for Calico https://github.com/poseidon/typhoon/pull/675	2020-03-31 18:21:59 -07:00
Dalton Hubble	bac5acb3bd	Change default kube-system DaemonSet tolerations * Change kube-proxy, flannel, and calico-node DaemonSet tolerations to tolerate `node.kubernetes.io/not-ready` and `node-role.kubernetes.io/master` (i.e. controllers) explicitly, rather than tolerating all taints * kube-system DaemonSets will no longer tolerate custom node taints by default. Instead, custom node taints must be enumerated to opt-in to scheduling/executing the kube-system DaemonSets * Consider setting the daemonset_tolerations variable of terraform-render-bootstrap at a later date Background: Tolerating all taints ruled out use-cases where certain nodes might legitimately need to keep kube-proxy or CNI networking disabled Related: https://github.com/poseidon/terraform-render-bootstrap/pull/179	2020-03-31 01:00:45 -07:00
Dalton Hubble	70bdc9ec94	Allow bootstrap re-apply for Fedora CoreOS GCP * Problem: Fedora CoreOS images are manually uploaded to GCP. When a cluster is created with a stale image, Zincati immediately checks for the latest stable image, fetches, and reboots. In practice, this can unfortunately occur exactly during the initial cluster bootstrap phase. * Recommended: Upload the latest Fedora CoreOS image regularly * Mitigation: Allow a failed bootstrap.service run (which won't touch the done ConditionPathExists) to be re-run by running `terraform apply` again. Add a known issue to CHANGES * Update docs to show the current Fedora CoreOS stable version to reduce likelihood users see this issue Longer term ideas: * Ideal: Fedora CoreOS publishes a stable channel. Instances will always boot with the latest image in a channel. The problem disappears since it works the same way AWS does * Timer: Consider some timer-based approach to have zincati delay any system reboots for the first ~30 min of a machine's life. Possibly just configured on the controller node https://github.com/coreos/zincati/pull/251 * External coordination: For Container Linux, locksmith filled a similar role and was disabled to allow CLUO to coordinate reboots. By running atop Kubernetes, it was not possible for the reboot to occur before cluster bootstrap * Rely on https://github.com/coreos/zincati/issues/115 to delay the reboot since bootstrap involves an SSH session * Use path-based activation of zincati on controllers and set that path at the end of the bootstrap process Rel: https://github.com/coreos/fedora-coreos-tracker/issues/239	2020-03-28 18:12:31 -07:00
Dalton Hubble	144bb9403c	Add support for Fedora CoreOS snippets * Refresh snippets customization docs * Requires terraform-provider-ct v0.5+	2020-03-28 16:15:04 -07:00
Dalton Hubble	5fca08064b	Fix Fedora CoreOS AMI to filter for stable images * Fix issue observed in us-east-1 where AMI filters chose the latest testing channel release, rather than the stable channel * Fedora CoreOS AMI filter selects the latest image with a matching name, x86_64, and hvm, excluding dev images. Add a filter for "Fedora CoreOS stable", which seems to be the only distinguishing metadata indicating the channel	2020-03-28 12:57:45 -07:00
Dalton Hubble	a1a5da6bc2	Add CoreOS Container Linux EOL recommendation to CHANGES * Recommend that users who have not yet tried Fedora CoreOS or Flatcar Linux do so. Likely, Container Linux will reach EOL and platform support / stability ratings will be in a mixed state. Nevertheless, folks should migrate by September.	2020-03-26 23:41:54 -07:00
Dalton Hubble	076b8e3c42	Update Prometheus from v2.17.0 to v2.17.1 * https://github.com/prometheus/prometheus/releases/tag/v2.17.1	2020-03-26 22:17:13 -07:00
Dalton Hubble	ef5f953e04	Set docker log driver to journald on Fedora CoreOS * Before Kubernetes v1.18.0, Kubelet only supported kubectl `--limit-bytes` with the Docker `json-file` log driver so the Fedora CoreOS default was overridden for conformance. See https://github.com/poseidon/typhoon/pull/642 * Kubelet v1.18+ implemented support for other docker log drivers, so the Fedora CoreOS default `journald` can be used again Rel: https://github.com/kubernetes/kubernetes/issues/86367	2020-03-26 22:06:45 -07:00
Dalton Hubble	f100a90d28	Update Kubernetes from v1.17.4 to v1.18.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md	2020-03-25 17:51:50 -07:00
Dalton Hubble	5d1e4ad333	Deprecate asset_dir variable and remove docs * Remove docs for the `asset_dir` variable and deprecate it in CHANGES. It will be removed in an upcoming release * Typhoon v1.17.0 introduced a new mechanism for managing and distributing generated assets that stopped relying on writing out to disk. `asset_dir` became optional and defaulted to being unset / off (recommended)	2020-03-25 00:00:01 -07:00

957 Commits