typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-10-03 14:24:37 +02:00

Author	SHA1	Message	Date
Dalton Hubble	4fd4a0f540	Move control plane static pod TLS assets to /etc/kubernetes/pki * Change control plane static pods to mount `/etc/kubernetes/pki`, instead of `/etc/kubernetes/bootstrap-secrets` to better reflect their purpose and match some loose conventions upstream * Place control plane and bootstrap TLS assets and kubeconfigs in `/etc/kubernetes/pki` * Mount to `/etc/kubernetes/pki` (rather than `/etc/kubernetes/secrets`) to match the host location (less surprise) Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/233	2020-12-02 23:26:42 -08:00
Dalton Hubble	804dfea0f9	Add kubeconfigs for kube-scheduler and kube-controller-manager * Generate TLS client certificates for `kube-scheduler` and `kube-controller-manager` with `system:kube-scheduler` and `system:kube-controller-manager` CNs * Template separate kubeconfigs for kube-scheduler and kube-controller-manager (`scheduler.conf` and `controller-manager.conf`). Rename admin for clarity * Before v1.16.0, Typhoon scheduled a self-hosted control plane, which allowed the steady-state kube-scheduler and kube-controller-manager to use a scoped ServiceAccount. With a static pod control plane, separate CN TLS client certificates are the nearest equivalent. * https://kubernetes.io/docs/setup/best-practices/certificates/ * Remove unused Kubelet certificate, TLS bootstrap is used instead	2020-12-01 22:02:15 -08:00
Dalton Hubble	8ba23f364c	Add TokenReview and TokenRequestProjection flags * Add kube-apiserver flags for TokenReview and TokenRequestProjection (beta, defaults on) to allow using Service Account Token Volume Projection to create and mount service account tokens tied to a Pod's lifecycle Rel: * https://github.com/poseidon/terraform-render-bootstrap/pull/231 * https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection	2020-12-01 20:02:33 -08:00
Dalton Hubble	f6025666eb	Update etcd from v3.4.12 to v3.4.14 * https://github.com/etcd-io/etcd/releases/tag/v3.4.14	2020-11-29 20:04:25 -08:00
Dalton Hubble	fa3184fb9c	Relax terraform-provider-ct version constraint * Allow terraform-provider-ct versions v0.6+ (e.g. v0.7.1). Before, only v0.6.x point updates were allowed * Update terraform-provider-ct to v0.7.1 in docs * READ the docs before updating terraform-provider-ct, as changing worker user-data is handled differently by different cloud platforms	2020-11-29 19:51:26 -08:00
Dalton Hubble	ae548ce213	Update Calico from v3.16.5 to v3.17.0 * Enable Calico MTU auto-detection * Remove [workaround](https://github.com/poseidon/typhoon/pull/724) to Calico cni-plugin [issue](https://github.com/projectcalico/cni-plugin/issues/874) Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/230	2020-11-25 14:22:58 -08:00
Dalton Hubble	c0347ca0c6	Set kubeconfig and assets_dist as sensitive * Mark `kubeconfig` and `assets_dist` as `sensitive` to prevent the Terraform CLI displaying these values, esp. for CI systems * In particular, external tools or tfvars style uses (not recommended) reportedly display all outputs and are improved by setting sensitive * For Terraform v0.14, outputs referencing sensitive fields must also be annotated as sensitive Closes https://github.com/poseidon/typhoon/issues/884	2020-11-23 11:41:55 -08:00
Dalton Hubble	cc00afa4e1	Add Terraform v0.13 input variable validations * Support for migrating from Terraform v0.12.x to v0.13.x was added in v1.18.8 * Require Terraform v0.13+. Drop support for Terraform v0.12	2020-11-17 12:02:34 -08:00
Dalton Hubble	1113a22f61	Update Kubernetes from v1.19.3 to v1.19.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1194	2020-11-11 22:56:27 -08:00
Dalton Hubble	79deb8a967	Update Cilium from v1.9.0-rc3 to v1.9.0 * https://github.com/cilium/cilium/releases/tag/v1.9.0	2020-11-10 23:42:41 -08:00
Dalton Hubble	f412f0d9f2	Update Calico from v3.16.4 to v3.16.5 * https://github.com/projectcalico/calico/releases/tag/v3.16.5	2020-11-10 22:58:19 -08:00
Dalton Hubble	0eef16b274	Improve and tidy Fedora CoreOS etcd-member.service * Allow a snippet with a systemd dropin to set an alternate image via `ETCD_IMAGE`, for consistency across Fedora CoreOS and Flatcar Linux * Drop comments about integrating system containers with systemd-notify	2020-11-08 11:49:56 -08:00
Dalton Hubble	82e5ac3e7c	Update Cilium from v1.8.5 to v1.9.0-rc3 * https://github.com/poseidon/terraform-render-bootstrap/pull/224	2020-11-03 10:29:07 -08:00
Dalton Hubble	a8f7880511	Update Cilium from v1.8.4 to v1.8.5 * https://github.com/cilium/cilium/releases/tag/v1.8.5	2020-10-29 00:50:18 -07:00
Dalton Hubble	893d139590	Update Calico from v3.16.3 to v3.16.4 * https://github.com/projectcalico/calico/releases/tag/v3.16.4	2020-10-26 00:50:40 -07:00
Dalton Hubble	7c3f3ab6d0	Rename container-linux modules to flatcar-linux * CoreOS Container Linux was deprecated in v1.18.3 * Continue transitioning docs and modules from supporting both CoreOS and Flatcar "variants" of Container Linux to now supporting Flatcar Linux and equivalents Action Required: Update the Flatcar Linux modules `source` to replace `s/container-linux/flatcar-linux`. See docs for examples	2020-10-20 22:47:19 -07:00
Dalton Hubble	a99a990d49	Remove unused Kubelet tls mounts * Kubelet trusts only the cluster CA certificate (and certificates in the Kubelet debian base image), there is no longer a need to mount the host's trusted certs * Similar change on Flatcar Linux in https://github.com/poseidon/typhoon/pull/855 Rel: https://github.com/poseidon/typhoon/pull/810	2020-10-18 23:48:21 -07:00
Dalton Hubble	df17253e72	Fix delete node permission on Fedora CoreOS node shutdown * On cloud platforms, `delete-node.service` tries to delete the local node (not always possible depending on preemption time) * Since v1.18.3, kubelet TLS bootstrap generates a kubeconfig in `/var/lib/kubelet` which should be used with kubectl in the delete-node oneshot	2020-10-18 23:38:11 -07:00
Dalton Hubble	afac46e39a	Remove asset_dir variable and optional asset writes * Originally, poseidon/terraform-render-bootstrap generated TLS certificates, manifests, and cluster "assets" written to local disk (`asset_dir`) during terraform apply cluster bootstrap * Typhoon v1.17.0 introduced bootstrapping using only Terraform state to store cluster assets, to avoid ever writing sensitive materials to disk and improve automated use-cases. `asset_dir` was changed to optional and defaulted to "" (no writes) * Typhoon v1.18.0 deprecated the `asset_dir` variable, removed docs, and announced it would be deleted in future. * Add Terraform output `assets_dist` map * Remove the `asset_dir` variable Cluster assets are now stored in Terraform state only. Those who wish to write the assets to local files can do so explicitly. ``` resource local_file "assets" { for_each = module.yavin.assets_dist filename = "some-assets/${each.key}" content = each.value } ``` Related: * https://github.com/poseidon/typhoon/pull/595 * https://github.com/poseidon/typhoon/pull/678	2020-10-17 15:00:15 -07:00
Dalton Hubble	511f5272f4	Update Calico from v3.15.3 to v3.16.3 * https://github.com/projectcalico/calico/releases/tag/v3.16.3 * https://github.com/poseidon/terraform-render-bootstrap/pull/212	2020-10-15 20:08:51 -07:00
Dalton Hubble	46ca5e8813	Update Kubernetes from v1.19.2 to v1.19.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1193	2020-10-14 20:47:49 -07:00
Dalton Hubble	901f7939b2	Update Cilium from v1.8.3 to v1.8.4 * https://github.com/cilium/cilium/releases/tag/v1.8.4	2020-10-02 00:24:26 -07:00
Dalton Hubble	444363be2d	Update Kubernetes from v1.19.1 to v1.19.2 * Update flannel from v0.12.0 to v0.13.0-rc2 * Update flannel-cni from v0.4.0 to v0.4.1 * Update CNI plugins from v0.8.6 to v0.8.7	2020-09-16 20:05:54 -07:00
Dalton Hubble	577b927a2b	Update Fedora CoreOS Config version from v1.0.0 to v1.1.0 * No notable changes in the config spec, just housekeeping * Require any snippets customization to update to v1.1.0. Version skew between the main config and snippets will show an error message * https://github.com/coreos/fcct/blob/master/docs/configuration-v1_1.md	2020-09-10 23:38:40 -07:00
Dalton Hubble	29b16c3fc0	Change seccomp annotations to seccompProfile * seccomp graduated to GA in Kubernetes v1.19. Support for seccomp alpha annotations will be removed in v1.22 * Replace seccomp annotations with the GA seccompProfile field in the PodTemplate securityContext * Switch profile from `docker/default` to `runtime/default` (no effective change, since docker is the runtime) * Verify with docker inspect SecurityOpt. Without the profile, you'd see `seccomp=unconfined` Related: https://github.com/poseidon/terraform-render-bootstrap/pull/215	2020-09-10 01:15:07 -07:00
Dalton Hubble	0c7a879bc4	Update Kubernetes from v1.19.0 to v1.19.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1191	2020-09-09 20:52:29 -07:00
Dalton Hubble	28ee693e6b	Update Cilium from v1.8.2 to v1.8.3 * https://github.com/cilium/cilium/releases/tag/v1.8.3	2020-09-07 21:10:27 -07:00
Dalton Hubble	88cf7273dc	Update Kubernetes from v1.18.8 to v1.19.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md	2020-08-27 08:50:01 -07:00
Dalton Hubble	cd7fd29194	Update etcd from v3.4.10 to v3.4.12 * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md	2020-08-19 21:25:41 -07:00
Bo Huang	aafa38476a	Fix SELinux race condition on non-bootstrap controllers in multi-controller (#808) * Fix race condition for bootstrap-secrets SELinux context on non-bootstrap controllers in multi-controller FCOS clusters * On first boot from disk on non-bootstrap controllers, adding bootstrap-secrets races with kubelet.service starting, which can cause the secrets assets to have the wrong label until kubelet.service restarts (service, reboot, auto-update) * This can manifest as `kube-apiserver`, `kube-controller-manager`, and `kube-scheduler` pods crashlooping on spare controllers on first cluster creation	2020-08-19 21:18:10 -07:00
Dalton Hubble	c87db3ef37	Update Kubernetes from v1.18.6 to v1.18.8 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1188	2020-08-13 20:47:43 -07:00
Dalton Hubble	5e70d7e2c8	Migrate from Terraform v0.12.x to v0.13.x * Recommend Terraform v0.13.x * Support automatic install of poseidon's provider plugins * Update tutorial docs for Terraform v0.13.x * Add migration guide for Terraform v0.13.x (best-effort) * Require Terraform v0.12.26+ (migration compatibility) * Require `terraform-provider-ct` v0.6.1 * Require `terraform-provider-matchbox` v0.4.1 * Require `terraform-provider-digitalocean` v1.20+ Related: * https://www.hashicorp.com/blog/announcing-hashicorp-terraform-0-13/ * https://www.terraform.io/upgrade-guides/0-13.html * https://registry.terraform.io/providers/poseidon/ct/latest * https://registry.terraform.io/providers/poseidon/matchbox/latest	2020-08-12 01:54:32 -07:00
Dalton Hubble	ccee5d3d89	Update from coreos/flannel-cni to poseidon/flannel-cni * Update CNI plugins from v0.6.0 to v0.8.6 to fix several CVEs * Update the base image to alpine:3.12 * Use `flannel-cni` as an init container and remove sleep * https://github.com/poseidon/terraform-render-bootstrap/pull/205 * https://github.com/poseidon/flannel-cni * https://quay.io/repository/poseidon/flannel-cni Background * Switch from github.com/coreos/flannel-cni v0.3.0 which was last published by me in 2017 and is no longer accessible to me to maintain or patch * Port to the poseidon/flannel-cni rewrite, which releases v0.4.0 to continue the prior release numbering	2020-08-02 15:13:15 -07:00
Dalton Hubble	78e6409bd0	Fix flannel support on Fedora CoreOS * Fedora CoreOS now ships systemd-udev's `default.link` while Flannel relies on being able to pick its own MAC address for the `flannel.1` link for tunneled traffic to reach cni0 on the destination side, without being dropped * This change first appeared in FCOS testing-devel 32.20200624.20.1 and is the behavior going forward in FCOS since it was added to align FCOS network naming / configs with the rest of Fedora and address issues related to the default being missing * Flatcar Linux (and Container Linux) has a specific flannel.link configuration builtin, so it was not affected * https://github.com/coreos/fedora-coreos-tracker/issues/574#issuecomment-665487296 Note: Typhoon's recommended and default CNI provider is Calico, unless `networking` is set to flannel directly.	2020-08-01 21:22:08 -07:00
Dalton Hubble	cd0a28904e	Update Cilium from v1.8.1 to v1.8.2 * https://github.com/cilium/cilium/releases/tag/v1.8.2	2020-07-25 16:06:27 -07:00
Dalton Hubble	618f8b30fd	Update CoreDNS from v1.6.7 to v1.7.0 * https://coredns.io/2020/06/15/coredns-1.7.0-release/ * Update Grafana dashboard with revised metrics names	2020-07-25 15:51:31 -07:00
Dalton Hubble	264d23a1b5	Declare etcd data directory permissions * Set etcd data directory /var/lib/etcd permissions to 700 * On Flatcar Linux, /var/lib/etcd is pre-existing and Ignition v2 doesn't overwrite the directory. Update the Container Linux config, but add the manual chmod workaround to bootstrap for Flatcar Linux users * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md#v3410-2020-07-16 * https://github.com/etcd-io/etcd/pull/11798	2020-07-25 15:48:27 -07:00
Dalton Hubble	f96e91f225	Update etcd from v3.4.9 to v3.4.10 * https://github.com/etcd-io/etcd/releases/tag/v3.4.10	2020-07-18 14:08:22 -07:00
Dalton Hubble	6df6bf904a	Show Cilium as a CNI provider option in docs * Start to show Cilium as a CNI option * https://github.com/cilium/cilium	2020-07-18 13:27:56 -07:00
Dalton Hubble	9ea6d2c245	Update Kubernetes from v1.18.5 to v1.18.6 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1186 * https://github.com/poseidon/terraform-render-bootstrap/pull/201	2020-07-15 22:05:57 -07:00
Dalton Hubble	49050320ce	Update Cilium from v1.8.0 to v1.8.1 * https://github.com/cilium/cilium/releases/tag/v1.8.1	2020-07-05 16:00:00 -07:00
Dalton Hubble	0ba2c1a4da	Fix terraform fmt in firewall rules	2020-06-29 23:04:54 -07:00
Dalton Hubble	7bce15975c	Update Kubernetes from v1.18.4 to v1.18.5 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1185	2020-06-27 13:52:18 -07:00
Dalton Hubble	1f83ae7dbb	Update Calico from v3.14.1 to v3.15.0 * https://docs.projectcalico.org/v3.15/release-notes/	2020-06-26 02:40:12 -07:00
Dalton Hubble	d27f367004	Update Cilium from v1.8.0-rc4 to v1.8.0 * https://github.com/cilium/cilium/releases/tag/v1.8.0	2020-06-22 22:26:49 -07:00
Dalton Hubble	e9c8520359	Add experimental Cilium CNI provider * Accept experimental CNI `networking` mode "cilium" * Run Cilium v1.8.0-rc4 with overlay vxlan tunnels and a minimal set of features. We're interested in: * IPAM: Divide pod_cidr into /24 subnets per node * CNI networking pod-to-pod, pod-to-external * BPF masquerade * NetworkPolicy as defined by Kubernetes (no L7 Policy) * Continue using kube-proxy with Cilium probe mode * Firewall changes: * Require UDP 8472 for vxlan (Linux kernel default) between nodes * Optional ICMP echo(8) between nodes for host reachability (health) * Optional TCP 4240 between nodes for endpoint reachability (health) Known Issues: * Containers with `hostPort` don't listen on all host addresses, these workloads must use `hostNetwork` for now https://github.com/cilium/cilium/issues/12116 * Erroneous warning on Fedora CoreOS https://github.com/cilium/cilium/issues/10256 Note: This is experimental. It is not listed in docs and may be changed or removed without a deprecation notice Related: * https://github.com/poseidon/terraform-render-bootstrap/pull/192 * https://github.com/cilium/cilium/issues/12217	2020-06-21 20:41:53 -07:00
Dalton Hubble	37f00a3882	Reduce Calico MTU on Fedora CoreOS Azure * Change the Calico VXLAN interface MTU from 1450 to 1410 * VXLAN on Azure should support MTU 1450. However, performance measurements have historically shown that 1410 is needed for expected performance. Flatcar Linux has the same MTU 1410 override and note * FCOS 31.20200323.3.2 was known to perform fine with 1450, but now in 31.20200517.3.0 the right value seems to be 1410	2020-06-19 00:24:56 -07:00
Dalton Hubble	90e23f5822	Rename controller node label and NoSchedule taint * Remove node label `node.kubernetes.io/master` from controller nodes * Use `node.kubernetes.io/controller` (present since v1.9.5, [#160](https://github.com/poseidon/typhoon/pull/160)) to node select controllers * Rename controller NoSchedule taint from `node-role.kubernetes.io/master` to `node-role.kubernetes.io/controller` * Tolerate the new taint name for workloads that may run on controller nodes and stop tolerating `node-role.kubernetes.io/master` taint	2020-06-19 00:12:13 -07:00
Dalton Hubble	c25c59058c	Update Kubernetes from v1.18.3 to v1.18.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1184	2020-06-17 19:53:19 -07:00
Dalton Hubble	413585681b	Remove unused Kubelet lock-file and exit-on-lock-contention * Kubelet `--lock-file` and `--exit-on-lock-contention` date back to usage of bootkube and at one point running Kubelet in a "self-hosted" style whereby an on-host Kubelet (rkt) started pods, but then a Kubelet DaemonSet was scheduled and able to take over (hence self-hosted). `lock-file` and `exit-on-lock-contention` flags supported this pivot. The pattern has been out of favor (in bootkube too) for years because of dueling Kubelet complexity * Typhoon runs Kubelet as a container via an on-host systemd unit using podman (Fedora CoreOS) or rkt (Flatcar Linux). In fact, Typhoon no longer uses bootkube or control plane pivot (let alone Kubelet pivot) and uses static pods since v1.16.0 * https://github.com/poseidon/typhoon/pull/536	2020-06-12 00:06:41 -07:00


69 Commits