* Fixes warning about use of deprecated field `key_algorithm` in
the `hashicorp/tls` provider. The key algorithm can now be inferred
directly from the private key so resources don't have to output
and pass around the algorithm
* Cilium (v1.8) was added to Typhoon in v1.18.5 in June 2020
and its become more impressive since then. Its currently the
leading CNI provider choice.
* Calico has grown complex, has lots of CRDs, masks its
management complexity with an operator (which we won't use),
doesn't provide multi-arch images, and hasn't been compatible
with Kubernetes v1.23 (with ipvs) for several releases.
* Both have CNCF conformance quirks (flannel used for conformance),
but that's not the main factor in choosing the default
* Add `arch` variable to Flatcar Linux AWS `kubernetes` and
`workers` modules. Accept `amd64` (default) or `arm64` to support
native arm64/aarch64 clusters or mixed/hybrid clusters with arm64
workers
* Requires `flannel` or `cilium` CNI
Similar to https://github.com/poseidon/typhoon/pull/875
* Terraform v1.1 changed the behavior of provisioners and
`remote-exec` in a way that breaks support for expansions
in commands (including file provisioner, where `destination`
is part of an `scp` command)
* Terraform will likely revert the change eventually, but I
suspect it will take a while
* Instead, we can stop relying on Terraform's expansion
behavior. `/home/core` is a suitable choice for `$HOME` on
both Flatcar Linux and Fedora CoreOS (harldink `/var/home/core`)
Rel: https://github.com/hashicorp/terraform/issues/30243
* Both Flatcar Linux and Fedora CoreOS use systemd-resolved,
but they setup /etc/resolv.conf symlinks differently
* Prefer using /run/systemd/resolve/resolv.conf directly, which
also updates to reflect runtime changes (e.g. resolvectl)
* Change `enable_aggregation` default from false to true
* These days, Kubernetes control plane components emit annoying
messages related to assumptions baked into the Kubernetes API
Aggregation Layer if you don't enable it. Further the conformance
tests force you to remember to enable it if you care about passing
those
* This change is motivated by eliminating annoyances, rather than
any enthusiasm for Kubernetes' aggregation features
Rel: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/
* Mount both /etc/ssl/certs and /etc/pki into control plane static
pods and kube-proxy, rather than choosing one based a variable
(set based on Flatcar Linux or Fedora CoreOS)
* Remove deprecated `--port` from `kube-scheduler` static Pod
* Update `null` provider to allow use of v3.1.x releases,
instead of being stuck on v2.1.2
* Update min versions in terraform-render-boostrap
https://github.com/poseidon/terraform-render-bootstrap/pull/287
* Document the recommended versions of Terraform cloud providers
* Remove `/sys/fs/cgroup/systemd` mount since Flatcar Linux
uses cgroups v2
* Flatcar Linux's `docker` switched from the `cgroupfs` to
`systemd` driver without notice
* Kubernetes v1.22.0 disabled kube-controller-manager insecure
port, which was used internally for Prometheus metrics scraping
* Configure Prometheus to discover and scrape endpoints for
kube-scheduler and kube-controller-manager via the authenticated
https ports, via bearer token
* Change firewall ports to allow Prometheus (on worker nodes)
to scrape kube-scheduler and kube-controller-manager targets
that run on controller(s) with hostNetwork
* Disable the insecure port on kube-scheduler
* On Fedora CoreOS, Cilium cross-node service IP load balancing
stopped working for a time (first observable as CoreDNS pods
located on worker nodes not being able to reach the kubernetes
API service 10.3.0.1). This turned out to have two parts:
* Fedora CoreOS switched to cgroups v2 by default. In our early
testing with cgroups v2, Calico (default) was used. With the
cgroups v2 change, SELinux policy denied some eBPF operations.
Since fixed in all Fedora CoreOS channels
* Cilium requires new mounts to support cgroups v2, which are
added here
* https://github.com/coreos/fedora-coreos-tracker/issues/292
* https://github.com/coreos/fedora-coreos-tracker/issues/881
* https://github.com/cilium/cilium/pull/16259
* Add `node_taints` variable to worker modules to set custom
initial node taints on cloud platforms that support auto-scaling
worker pools of heterogeneous nodes (i.e. AWS, Azure, GCP)
* Worker pools could use custom `node_labels` to allowed workloads
to select among differentiated nodes, while custom `node_taints`
allows a worker pool's nodes to be tainted as special to prevent
scheduling, except by workloads that explicitly tolerate the
taint
* Expose `daemonset_tolerations` in AWS, Azure, and GCP kubernetes
cluster modules, to determine whether `kube-system` components
should tolerate the custom taint (advanced use covered in docs)
Rel: #550, #663Closes#429
* Set the Kubelet `--provider-id` on AWS based on metadata from
Fedora CoreOS afterburn or Flatcar Linux coreos-metadata
* Based on https://github.com/poseidon/typhoon/pull/951
* Flatcar Linux has not published an Edge channel image since
April 2020 and recently removed mention of the channel from
their documentation https://github.com/kinvolk/Flatcar/pull/345
* Users of Flatcar Linux Edge should move to the stable, beta, or
alpha channel, barring any alternate advice from upstream Flatcar
Linux
* Change control plane static pods to mount `/etc/kubernetes/pki`,
instead of `/etc/kubernetes/bootstrap-secrets` to better reflect
their purpose and match some loose conventions upstream
* Place control plane and bootstrap TLS assets and kubeconfig's
in `/etc/kubernetes/pki`
* Mount to `/etc/kubernetes/pki` (rather than `/etc/kubernetes/secrets`)
to match the host location (less surprise)
Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/233
* Generate TLS client certificates for `kube-scheduler` and
`kube-controller-manager` with `system:kube-scheduler` and
`system:kube-controller-manager` CNs
* Template separate kubeconfigs for kube-scheduler and
kube-controller manager (`scheduler.conf` and
`controller-manager.conf`). Rename admin for clarity
* Before v1.16.0, Typhoon scheduled a self-hosted control
plane, which allowed the steady-state kube-scheduler and
kube-controller-manager to use a scoped ServiceAccount.
With a static pod control plane, separate CN TLS client
certificates are the nearest equiv.
* https://kubernetes.io/docs/setup/best-practices/certificates/
* Remove unused Kubelet certificate, TLS bootstrap is used
instead
* Allow terraform-provider-ct versions v0.6+ (e.g. v0.7.1)
Before, only v0.6.x point updates were allowed
* Update terraform-provider-ct to v0.7.1 in docs
* READ the docs before updating terraform-provider-ct,
as changing worker user-data is handled differently
by different cloud platforms