595 Commits

Author SHA1 Message Date
Dalton Hubble
0dc8740c77 Update Kubernetes from v1.26.0-rc.0 to v1.26.0-rc.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#v1260-rc1
2022-12-05 09:31:45 -08:00
Dalton Hubble
a9b12b6bca Update Kubernetes from v1.25.4 to v1.26.0-rc.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#v1260-rc0
2022-11-30 08:47:40 -08:00
Dalton Hubble
da76d32aba Migrate AWS launch configurations to launch templates
* Same features, but AWS will soon require launch templates
* Starting Dec 31, 2022 AWS will not add new instance types
(e.g. graviton 4) to launch configuration support

Rel: https://aws.amazon.com/blogs/compute/amazon-ec2-auto-scaling-will-no-longer-add-support-for-new-ec2-features-to-launch-configurations/
2022-11-30 00:26:03 -08:00
Dalton Hubble
a8990b3045 Fix flannel container image registry location
* https://github.com/poseidon/terraform-render-bootstrap/pull/336
2022-11-23 16:18:30 -08:00
Dalton Hubble
b4857c123e Update flannel from v0.15.1 to v0.20.1
* https://github.com/flannel-io/flannel/releases/tag/v0.20.1
2022-11-23 11:03:29 -08:00
Dalton Hubble
a193762eed Update etcd from v3.5.5 to v3.5.6
* https://github.com/etcd-io/etcd/releases/tag/v3.5.6
2022-11-23 10:59:17 -08:00
Dalton Hubble
adf33df99b Update Cilium from v1.12.3 to v1.12.4
* https://github.com/cilium/cilium/releases/tag/v1.12.4
2022-11-23 10:58:27 -08:00
Dalton Hubble
26dbc7e91d Update Kubernetes from v1.25.3 to v1.25.4
* Update Calico from v3.24.3 to v3.24.5
* Update Prometheus and Grafana addons
2022-11-10 09:42:21 -08:00
Dalton Hubble
937acc4b5a Re-enable Graceful Node Shutdown feature
* Kubelet GracefulNodeShutdown works, but only partially handles
gracefully stopping the Kubelet. The most noticeable drawback
is that Completed Pods are left around
* Use a project like poseidon/scuttle or a similar systemd unit
as a snippet to add drain and/or delete behaviors if desired
* This reverts commit 1786e34f33.

Rel:

* https://www.psdn.io/posts/kubelet-graceful-shutdown/
* https://github.com/poseidon/scuttle
2022-11-02 20:49:01 -07:00
Dalton Hubble
9b733d79c7 Update Calico from v3.24.2 to v3.24.3
* https://github.com/projectcalico/calico/releases/tag/v3.24.3
* Add patch to allow Kubelet kubeconfig to drain nodes if desired
in addition to just deleting them in shutdown integrations. See
https://github.com/poseidon/terraform-render-bootstrap/pull/330
2022-10-23 22:00:15 -07:00
Dalton Hubble
35a9e22b1f Update Calico from v3.24.1 to v3.24.2
* https://github.com/projectcalico/calico/releases/tag/v3.24.2
2022-10-20 09:28:19 -07:00
Dalton Hubble
0f38a6d405 Remove defunct delete-node.service from worker nodes
* delete-node.service used to remove nodes from the cluster on
shutdown, but it's been a long time since it last worked properly
* If there is still a desire for this concept, it can be added
with a custom snippet and with a better systemd unit
2022-10-20 08:43:48 -07:00
Dalton Hubble
a535581ef2 Remove unused Wants=network.target from etcd-member
* network.target is a passive unit that's not actually pulled
in by units requiring or wanting it; it's only used for shutdown
ordering
> "Services using the network should ... avoid any Wants=network.target or even Requires=network.target"

Rel: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
2022-10-20 08:32:55 -07:00
Dalton Hubble
3ff2d38fa5 Update Cilium from v1.12.2 to v1.12.3
* https://github.com/cilium/cilium/releases/tag/v1.12.3
2022-10-17 17:25:23 -07:00
Dalton Hubble
651151805d Update Kubernetes from v1.25.2 to v1.25.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#v1253
2022-10-13 21:02:39 -07:00
Dalton Hubble
3ee462a24c Update Kubernetes from v1.25.1 to v1.25.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#v1252
2022-09-22 08:15:30 -07:00
Dalton Hubble
74d4d56dbd Remove workaround for v1.25.0 ConfigMap rendering issue
* LocalStorageCapacityIsolationFSQuotaMonitoring was reverted back to
alpha in v1.25.1, so we don't need to explicitly disable it anymore

Rel: https://github.com/kubernetes/kubernetes/issues/112081
2022-09-19 09:10:24 -07:00
Dalton Hubble
5abe84b520 Update etcd from v3.5.4 to v3.5.5
* https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.5.md#v355
2022-09-15 09:01:45 -07:00
Dalton Hubble
951209d113 Update Cilium from v1.12.1 to v1.12.2
* https://github.com/cilium/cilium/releases/tag/v1.12.2
2022-09-15 08:28:37 -07:00
Dalton Hubble
09751cc0e8 Update Kubernetes from v1.25.0 to v1.25.1
* https://github.com/kubernetes/kubernetes/releases/tag/v1.25.1
2022-09-15 08:23:22 -07:00
Dalton Hubble
c14300f0be Update Calico from v3.23.3 to v3.24.1
* https://github.com/projectcalico/calico/releases/tag/v3.24.1
2022-09-14 08:09:38 -07:00
Dalton Hubble
1786e34f33 Revert Graceful Node Shutdown feature
* Disable Kubelet Graceful Node Shutdown on worker nodes (enabled in
Kubernetes v1.25.0 https://github.com/poseidon/typhoon/pull/1222)
* Graceful node shutdown allows 30s for critical pods to shut
down and 15s for regular pods to shut down before releasing the
inhibitor lock to allow the host to shut down
* Unfortunately, without further configuration options, regular
pods and the node are shut down at the same time, at the end of
the 45s period. In practice, enabling this feature leaves Error or
Completed pods in kube-apiserver state until manually cleaned up.
This feature is not ready for general use
* Fix issue where Error/Completed pods are accumulating whenever any
node restarts (or auto-updates), visible in kubectl get pods
* This issue wasn't apparent in initial testing and seems to only
affect non-critical pods (due to critical pods being killed earlier),
but it's very apparent on our real clusters

Rel: https://github.com/kubernetes/kubernetes/issues/110755
2022-09-10 14:58:44 -07:00
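
The timing described in the commit above (30s for critical pods, 15s for regular pods, a 45s period in total) can be sketched as simple arithmetic. This is only an illustrative model: the `ShutdownBudget` class and its field names are hypothetical and are not taken from Typhoon or the Kubelet. The 5s and 45s inhibitor limits come from the related commit 393a38deff below.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShutdownBudget:
    """Hypothetical model of the shutdown grace periods in the commit message."""

    total_s: int     # whole period the node shutdown is delayed
    critical_s: int  # portion of that period reserved for critical pods

    @property
    def regular_s(self) -> int:
        # Whatever is not reserved for critical pods is left for regular pods.
        return self.total_s - self.critical_s

    def fits_inhibitor(self, inhibit_delay_max_s: int) -> bool:
        # The inhibitor lock can only delay shutdown up to the configured maximum.
        return self.total_s <= inhibit_delay_max_s


budget = ShutdownBudget(total_s=45, critical_s=30)
print(budget.regular_s)           # time left for regular pods
print(budget.fits_inhibitor(5))   # old maximum inhibitor delay
print(budget.fits_inhibitor(45))  # raised maximum inhibitor delay
```

The model only covers the time budget. It does not capture the problem the revert addresses, namely that regular pods and the node end up being shut down together at the end of the period.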
Dalton Hubble
393a38deff Configure Graceful Node Shutdown and lengthen max inhibitor delay
* Configure Kubelet Graceful Node Shutdown to detect system shutdown
events and stop running containers gracefully when possible
* Allow up to 30s for critical pods to gracefully shutdown
* Allow up to 15s for regular pods to gracefully shutdown
* Node will be marked as NotReady promptly, instead of having to
wait for health checks
* Kubelet uses systemd inhibitor locks to delay shutdown for a limited
number of seconds
* Raise the default max inhibitor time from 5s to 45s

Verify systemd inhibitor locks are present:

```
sudo systemd-inhibit --list
WHO     UID USER PID  COMM    WHAT     WHY                                        MODE
kubelet 0   root 4581 kubelet shutdown Kubelet needs time to handle node shutdown delay
```
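
The check above could also be scripted. The following is a minimal sketch, assuming the listing has the eight columns shown in the sample and that only the WHY column contains spaces; the function name `has_kubelet_shutdown_lock` is hypothetical.

```python
def has_kubelet_shutdown_lock(listing: str) -> bool:
    """Return True if the listing shows a kubelet delay lock on shutdown."""
    for line in listing.splitlines():
        # WHO UID USER PID COMM WHAT, then the rest of the line (WHY + MODE).
        fields = line.split(None, 6)
        if len(fields) < 7 or fields[0] == "WHO":
            continue  # skip the header, blank lines, and any summary line
        who, what = fields[0], fields[5]
        # WHY may contain spaces, so MODE is taken as the last word on the line.
        mode = fields[6].rsplit(None, 1)[-1]
        # WHAT can list several colon-separated operations.
        if who == "kubelet" and "shutdown" in what.split(":") and mode == "delay":
            return True
    return False


SAMPLE = """\
WHO     UID USER PID  COMM    WHAT     WHY                                        MODE
kubelet 0   root 4581 kubelet shutdown Kubelet needs time to handle node shutdown delay
"""

print(has_kubelet_shutdown_lock(SAMPLE))
```

In practice the listing would come from running `sudo systemd-inhibit --list` on the node; here the sample output from the commit message is used instead.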

Tail journal logs and then shut down a node via systemctl reboot
or via the cloud console to watch container shutdown

Rel:

* https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/
* https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
* https://github.com/kubernetes/kubernetes/issues/107043
* https://github.com/coreos/fedora-coreos-tracker/issues/821
* https://www.freedesktop.org/software/systemd/man/systemd-inhibit.html
* https://github.com/kubernetes/kubernetes/blob/release-1.24/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go
* https://github.com/godbus/dbus/blob/master/conn.go
2022-08-28 10:37:33 -07:00
Dalton Hubble
76d92e9c2d Change podman log-driver from journald to k8s-file
* When podman runs the Kubelet container, logging to journald means
log lines are duplicated in the journal. journalctl -u kubelet shows
Kubelet's logs and the same log messages from podman. Using the
k8s-file driver alleviates this problem
* Fix Kubelet and etcd-member logs to be more readable and reduce
unnecessary Kubelet log volume
2022-08-27 17:15:22 -07:00
Dalton Hubble
275fc0f9e8 Disable LocalStorageCapacityIsolationFSQuotaMonitoring feature
* Kubernetes v1.25.0 moved the LocalStorageCapacityIsolationFSQuotaMonitoring
feature from alpha to beta, but it breaks Kubelet updating ConfigMaps in
Pods, as shown by conformance tests
* Kubernetes is rolling LocalStorageCapacityIsolationFSQuotaMonitoring back
to alpha so it's not enabled by default, but that will require a release
* Disable the feature gate directly as a workaround for now to make
Kubernetes v1.25.0 usable

```
FailedMount: MountVolume.SetUp failed for volume "configmap-volume" : requesting quota on existing directory /var/lib/kubelet/pods/f09fae17-ff16-4a05-aab3-7b897cb5b732/volumes/kubernetes.io~configmap/configmap-volume but different pod 673ad247-abf0-434e-99eb-1c3f57d7fdaa a4568e94-2b2d-438f-a4bd-c9edc814e478
```

Rel:

* https://github.com/kubernetes/kubernetes/pull/112076
* https://github.com/kubernetes/kubernetes/pull/107329
2022-08-27 09:49:35 -07:00
Dalton Hubble
3fb59a3289 Migrate most Kubelet flags to KubeletConfiguration file
* Add a KubeletConfiguration file to replace most Kubelet
flags, to prepare for upcoming changes
* Pass Kubelet the --config flag to specify the location of
the KubeletConfiguration
* Remove flags / configuration where it matches the defaults
  * Remove --cgroups-per-qos, defaults to true
  * Remove --container-runtime, defaults to remote
  * Remove --enforce-node-allocatable=pods, defaults to pods

Rel:

* https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
* https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
2022-08-27 09:28:15 -07:00
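
The "remove flags where they match the defaults" step in the commit above can be illustrated with a small sketch. The three defaults are the ones named in the commit message; the function name, the example flag set, and the `--config` path are hypothetical and are not taken from Typhoon.

```python
# Defaults as stated in the commit message; no other Kubelet flags are modeled.
KUBELET_DEFAULTS = {
    "--cgroups-per-qos": "true",
    "--container-runtime": "remote",
    "--enforce-node-allocatable": "pods",
}


def prune_default_flags(flags: dict[str, str]) -> dict[str, str]:
    """Drop every flag whose value equals its known default."""
    return {
        name: value
        for name, value in flags.items()
        if KUBELET_DEFAULTS.get(name) != value
    }


# Hypothetical flag set for illustration only.
before = {
    "--config": "/etc/kubernetes/kubelet.yaml",  # assumed path
    "--cgroups-per-qos": "true",
    "--container-runtime": "remote",
    "--enforce-node-allocatable": "pods",
}
after = prune_default_flags(before)
print(sorted(after))
```

A flag set to a non-default value is kept, since removing it would change behavior.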
Dalton Hubble
a31dbceac6 Update Kubernetes from v1.24.4 to v1.25.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md
2022-08-25 09:18:14 -07:00
Dalton Hubble
760b4cd5ee Update Kubernetes from v1.24.3 to v1.24.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1244
2022-08-17 20:09:30 -07:00
Dalton Hubble
fcd8ff2b17 Update Cilium from v1.12.0 to v1.12.1
* https://github.com/cilium/cilium/releases/tag/v1.12.1
2022-08-17 08:53:56 -07:00
Dalton Hubble
52427a4271 Refresh instances in autoscaling group when launch configuration changes
* Changes to worker launch configurations start an autoscaling group instance
refresh to replace instances
* Instance refresh creates surge instances, waits for a warm-up period, then
deletes old instances
* Changing worker_type, disk_*, worker_price, worker_target_groups, or Butane
worker_snippets on existing worker nodes will replace instances
* New AMIs or changing `os_stream` will be ignored, to allow Fedora CoreOS or
Flatcar Linux to keep themselves updated
* Previously, new launch configurations were made in the same way, but not
applied to instances unless manually replaced
2022-08-14 21:43:49 -07:00
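
The replacement rule in the commit above can be summarized as a small decision function. This is a hypothetical model of the rule as written in the commit message, not Typhoon's Terraform logic; the variable names are the ones listed in the message, with `disk_*` treated as a wildcard.

```python
from fnmatch import fnmatchcase

# Variables whose change replaces worker instances, per the commit message.
REPLACING_PATTERNS = (
    "worker_type",
    "disk_*",
    "worker_price",
    "worker_target_groups",
    "worker_snippets",
)


def triggers_instance_refresh(changed: set[str]) -> bool:
    """Return True if any changed variable is one that replaces instances."""
    return any(
        fnmatchcase(name, pattern)
        for name in changed
        for pattern in REPLACING_PATTERNS
    )


print(triggers_instance_refresh({"disk_iops"}))
print(triggers_instance_refresh({"os_stream"}))
```

Changes to `os_stream` match none of the patterns, which mirrors the commit's intent of letting Fedora CoreOS or Flatcar Linux keep themselves updated.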
Dalton Hubble
6facfca4ed Switch Kubernetes image registry from k8s.gcr.io to registry.k8s.io
* Announce: https://groups.google.com/g/kubernetes-sig-testing/c/U7b_im9vRrM

Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/319
2022-08-13 16:16:21 -07:00
Dalton Hubble
ed8c6a5aeb Upgrade CoreDNS from v1.8.5 to v1.9.3
Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/318
2022-08-13 15:43:03 -07:00
Dalton Hubble
e5d0e2d48b Rename Fedora CoreOS fcc directory to butane
* Align both Fedora CoreOS and Flatcar Linux keeping Butane
Configs in a directory called butane
2022-08-10 09:10:18 -07:00
Dalton Hubble
87a8278c9d Improve AWS autoscaling group and launch config names
* Rename launch configuration to use a name_prefix named after the
cluster and worker to improve identifiability
* Shorten AWS autoscaling group name to not include the launch config
id. Years ago this used to be needed to update the ASG but the AWS
provider detects changes to the launch configuration just fine
2022-08-08 20:46:08 -07:00
Dalton Hubble
93b7f2554e Remove ineffective iptables-legacy.stamp
* Typhoon Fedora CoreOS is already using iptables nf_tables since
F36. The file to pin to legacy iptables was renamed to
/etc/coreos/iptables-legacy.stamp
2022-08-08 20:27:21 -07:00
Dalton Hubble
62d47ad3f0 Update Cilium from v1.11.7 to v1.12.0
* https://github.com/cilium/cilium/releases/tag/v1.12.0
2022-08-08 19:59:03 -07:00
Dalton Hubble
4a469513dd Migrate Flatcar Linux from Ignition spec v2.3.0 to v3.3.0
* Requires poseidon/ct v0.11+ and Flatcar Linux 3185.0.0+ (action required)
* Previously, Flatcar Linux configs have been parsed as Container
Linux Configs to Ignition v2.2.0 specs by poseidon/ct
* Flatcar Linux starting in 3185.0.0 now supports Ignition v3.x specs
(which are rendered from Butane Configs, like Fedora CoreOS)
* poseidon/ct v0.11.0 adds support for the flatcar Butane Config
variant so that Flatcar Linux can use Ignition v3.x

Rel:

* [Flatcar Support](https://flatcar-linux.org/docs/latest/provisioning/ignition/specification/#ignition-v3)
* [poseidon/ct support](https://github.com/poseidon/terraform-provider-ct/pull/131)
2022-08-03 08:32:52 -07:00
Dalton Hubble
256b87812e Remove Terraform template provider dependency
* Use Terraform builtin templatefile functionality
* Remove dependency on deprecated Terraform template provider

Rel:

* https://registry.terraform.io/providers/hashicorp/template/2.2.0
* https://github.com/poseidon/terraform-render-bootstrap/pull/293
2022-08-02 18:15:03 -07:00
Dalton Hubble
c6794f1007 Update Calico from v3.23.1 to v3.23.3
* https://github.com/projectcalico/calico/releases/tag/v3.23.3
2022-07-30 18:15:33 -07:00
Dalton Hubble
f42b45451b Update Cilium from v1.11.6 to v1.11.7
* https://github.com/cilium/cilium/releases/tag/v1.11.7
2022-07-19 09:06:15 -07:00
Dalton Hubble
0db5f86110 Update Kubernetes from v1.24.2 to v1.24.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1243
2022-07-13 20:59:15 -07:00
Dalton Hubble
8398182956 Update Cilium and Calico CNI providers
* Update Cilium from v1.11.5 to v1.11.6
* Update Calico from v3.22.2 to v3.23.1
2022-06-18 19:29:01 -07:00
Dalton Hubble
6d6b48b201 Update Kubernetes from v1.24.1 to v1.24.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1242
2022-06-18 18:35:42 -07:00
Dalton Hubble
b8549a1e32 Update Cilium from v1.11.4 to v1.11.5
* https://github.com/poseidon/terraform-render-bootstrap/pull/309
2022-05-31 15:23:07 +01:00
Dalton Hubble
c5573199db Update Kubernetes from v1.24.0 to v1.24.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1241
2022-05-28 09:39:14 +01:00
Dalton Hubble
b0e0b132e4 Update Kubernetes from v1.23.6 to v1.24.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1240
2022-05-04 08:27:14 -07:00
Dalton Hubble
91b38bf3fd Update etcd from v3.5.2 to v3.5.4
* https://github.com/etcd-io/etcd/releases/tag/v3.5.4
2022-04-27 20:57:02 -07:00
James Harmison
9a4887d028 Add bind mounts for selinux to fcos kubelets
fixes #1123

Enables the use of CSI drivers with a StorageClass that lacks an explicit context mount option. In cases where the kubelet lacks mounts for `/etc/selinux` and `/sys/fs/selinux`, it is unable to set the `:Z` option for the CRI volume definition automatically. See [KEP 1710](https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1710-selinux-relabeling/README.md#volume-mounting) for more information on how SELinux is passed to the CRI by Kubelet.

Prior to this change, a not-explicitly-labelled mount would have an `unlabeled_t` SELinux type on the host. Following this change, the Kubelet and CRI work together to dynamically relabel mounts that lack an explicit context specification: every time such a mount is rebound to a pod, it is relabeled with SELinux type `container_file_t` and context labels that match the pod it is bound to. This enables applications running in containers to consume dynamically provisioned storage on SELinux enforcing systems without explicitly setting the context on the StorageClass or PersistentVolume.
2022-04-26 21:33:26 -07:00
Dalton Hubble
d7f55c4e46 Remove use of deprecated key_algorithm field in TLS assets
* Fixes warning about use of deprecated field `key_algorithm` in
the `hashicorp/tls` provider. The key algorithm can now be inferred
directly from the private key so resources don't have to output
and pass around the algorithm
2022-04-20 19:52:03 -07:00
Dalton Hubble
80c6e2e7e6 Update Kubernetes from v1.23.5 to v1.23.6
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1236
2022-04-20 19:39:05 -07:00
Dalton Hubble
2f7d2a92e0 Update Cilium and Calico CNI providers
* Update Cilium from v1.11.3 to v1.11.4
* Update Calico from v3.22.1 to v3.22.2
2022-04-19 08:28:52 -07:00
Dalton Hubble
2df1873b7f Update Cilium from v1.11.2 to v1.11.3
* https://github.com/cilium/cilium/releases/tag/v1.11.3
2022-04-01 16:44:30 -07:00
Dalton Hubble
5365ce8204 Mount /etc/machine-id from host into Kubelet
* Kubelet node's System UUID can be detected from the sysfs
filesystem without a host mount, but the /etc/machine-id mount is
needed if you need to distinguish between the host's machine-id
and SystemUUID
* On cloud platforms, MachineID and SystemUUID are identical,
but on bare-metal the two differ
2022-04-01 16:32:06 -07:00
Dalton Hubble
e61d4b92da Update Kubernetes from v1.23.4 to v1.23.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1235
2022-03-16 21:01:41 -07:00
Dalton Hubble
69770b4827 Update Calico from v3.21.2 to v3.22.1
* https://github.com/projectcalico/calico/releases/tag/v3.22.1
* Fix https://github.com/projectcalico/calico/issues/5011
2022-03-11 11:22:29 -08:00
Dalton Hubble
f797f97675 Update Cilium from v1.11.1 to v1.11.2
* https://github.com/cilium/cilium/releases/tag/v1.11.2
2022-03-11 10:08:24 -08:00
Dalton Hubble
9aa99f1996 Allow upgrading AWS Terraform provider to v4.x
* https://github.com/hashicorp/terraform-provider-aws/releases/tag/v4.0.0
2022-02-17 09:35:15 -08:00
Dalton Hubble
fc38ba45b1 Update Kubernetes from v1.23.3 to v1.23.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1234
2022-02-17 09:00:31 -08:00
Dalton Hubble
6c70d06937 Update etcd from v3.5.1 to v3.5.2
* https://github.com/etcd-io/etcd/releases/tag/v3.5.2
2022-02-07 08:10:17 -08:00
Dalton Hubble
cf4beeba34 Change default CNI provider from Calico to Cilium
* Cilium (v1.8) was added to Typhoon in v1.18.5 in June 2020
and it's become more impressive since then. It's currently the
leading CNI provider choice.
* Calico has grown complex, has lots of CRDs, masks its
management complexity with an operator (which we won't use),
doesn't provide multi-arch images, and hasn't been compatible
with Kubernetes v1.23 (with ipvs) for several releases.
* Both have CNCF conformance quirks (flannel used for conformance),
but that's not the main factor in choosing the default
2022-02-07 08:07:00 -08:00
Dalton Hubble
a527f73f5a Update Kubernetes from v1.23.2 to v1.23.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1233
2022-01-27 09:23:37 -08:00
Dalton Hubble
e274a451ff Update Kubernetes from v1.23.1 to v1.23.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1232
2022-01-19 17:59:49 -08:00
Dalton Hubble
2265ab5375 Remove Kubelet --network-plugin=cni flag
* Now that `docker-shim` is no longer used, the Kubelet flag
is no longer needed and will be removed in v1.24
2022-01-14 10:43:07 -08:00
Dalton Hubble
08ea9776f3 Mask docker.service to prevent socket activation
* Kubelet now uses `containerd` as the container runtime, but
`docker.service` still starts when `docker.sock` is probed because
the service is socket activated. Prevent this by masking the
`docker.service` unit
2022-01-14 10:31:47 -08:00
Dalton Hubble
2e8bc99164 Remove template provider usage from terraform-render-bootstrap
2022-01-14 10:27:24 -08:00
Dalton Hubble
b18b0a9f3d Remove unused ETCD_UNSUPPORTED_ARCH variable
* etcd used to require a special variable to use the arm64
container image, but this is no longer required
2022-01-14 10:25:45 -08:00
Dalton Hubble
beb9f1477a Add experimental Flatcar Linux arm64 support on AWS
* Add `arch` variable to Flatcar Linux AWS `kubernetes` and
`workers` modules. Accept `amd64` (default) or `arm64` to support
native arm64/aarch64 clusters or mixed/hybrid clusters with arm64
workers
* Requires `flannel` or `cilium` CNI

Similar to https://github.com/poseidon/typhoon/pull/875
2022-01-14 10:24:48 -08:00
Dalton Hubble
f544a9c71f Switch Fedora CoreOS from docker-shim to containerd
* Migrate from `docker-shim` to `containerd` in preparation
for Kubernetes v1.24.0 dropping `docker-shim` support
* Much consideration was given to the container runtime
choice. https://github.com/poseidon/typhoon/issues/899
provides relevant rationales
2022-01-13 09:17:29 -08:00
Dalton Hubble
6ed048eb65 Workaround Terraform v1.1 file provisioner regression
* Terraform v1.1 changed the behavior of provisioners and
`remote-exec` in a way that breaks support for expansions
in commands (including file provisioner, where `destination`
is part of an `scp` command)
* Terraform will likely revert the change eventually, but I
suspect it will take a while
* Instead, we can stop relying on Terraform's expansion
behavior. `/home/core` is a suitable choice for `$HOME` on
both Flatcar Linux and Fedora CoreOS (a link to `/var/home/core`)

Rel: https://github.com/hashicorp/terraform/issues/30243
2021-12-28 13:25:23 -08:00
Dalton Hubble
9e3807798f Update Kubernetes from v1.23.0 to v1.23.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1231
2021-12-20 08:36:19 -08:00
Dalton Hubble
ef9c6aa423 Switch Flatcar Linux to using containerd CRI
* Use containerd as the Kubernetes Container Runtime
2021-12-15 08:42:13 -08:00
Dalton Hubble
125008fbb3 Update Cilium from v1.10.5 to v1.11.0
* https://github.com/cilium/cilium/releases/tag/v1.11.0
2021-12-10 11:26:05 -08:00
Dalton Hubble
136107b448 Set Kubelet resolver config to /run/systemd/resolve/resolv.conf
* Both Flatcar Linux and Fedora CoreOS use systemd-resolved,
but they set up /etc/resolv.conf symlinks differently
* Prefer using /run/systemd/resolve/resolv.conf directly, which
also updates to reflect runtime changes (e.g. resolvectl)
2021-12-10 08:22:30 -08:00
Dalton Hubble
e97c1cc9e5 Enable Kubernetes aggregation by default
* Change `enable_aggregation` default from false to true
* These days, Kubernetes control plane components emit annoying
messages related to assumptions baked into the Kubernetes API
Aggregation Layer if you don't enable it. Further, the conformance
tests force you to remember to enable it if you care about passing
them
* This change is motivated by eliminating annoyances, rather than
any enthusiasm for Kubernetes' aggregation features

Rel: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/
2021-12-09 17:30:35 -08:00
Dalton Hubble
41f739891b Normalize CA certs mounts in static Pods and kube-proxy
* Mount both /etc/ssl/certs and /etc/pki into control plane static
pods and kube-proxy, rather than choosing one based on a variable
(set based on Flatcar Linux or Fedora CoreOS)
* Remove deprecated `--port` from `kube-scheduler` static Pod
2021-12-09 09:56:37 -08:00
Dalton Hubble
861021ee98 Update Kubernetes from v1.22.4 to v1.23.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1230
* With Calico, add missing caliconodestatuses CRD added in v3.21.0
https://github.com/poseidon/terraform-render-bootstrap/pull/289
2021-12-09 09:28:41 -08:00
Dalton Hubble
c1d28e6f61 Change default disk_iops on Flatcar Linux
* Same as #1073, but for Flatcar Linux on AWS as well
2021-12-07 16:52:55 -08:00
Dalton Hubble
a8fd21d250 Update minimum Terraform provider versions
* Update `null` provider to allow use of v3.1.x releases,
instead of being stuck on v2.1.2
* Update min versions in terraform-render-bootstrap
https://github.com/poseidon/terraform-render-bootstrap/pull/287
* Document the recommended versions of Terraform cloud providers
2021-12-07 16:26:34 -08:00
Dalton Hubble
9c626c9dbd Change default disk_iops from unset to 3000
* Since v1.21.3 switched controllers default disk type from
`gp2` to `gp3`, an iops diff has been shown (harmless, but
annoying)
* Controller nodes default to a 30GB `gp3` disk. `gp3` disks
do respect `iops` and the corresponding default is 3000
2021-12-07 15:44:09 -08:00
Dalton Hubble
85252dec6e Switch FCOS workers to official Fedora CoreOS AMIs
* Fix worker nodes to use official Fedora CoreOS AMIs,
instead of the older Poseidon built AMIs (now removed).
This should have been part of #1038, but was missed in
code review
* Poseidon built AMIs have been deleted (so I don't have
to keep paying to host them for people)
2021-12-07 15:31:47 -08:00
Dalton Hubble
93594292eb Update Kubernetes from v1.22.3 to v1.22.4
* Update flannel from v0.15.0 to v0.15.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1224
2021-11-17 19:53:32 -08:00
Dalton Hubble
94b2793e40 Update CoreDNS from v1.8.4 to v1.8.6
* https://coredns.io/2021/10/07/coredns-1.8.6-release/
2021-11-12 21:09:04 -08:00
Dalton Hubble
4fd43b39ad Fix Flatcar Linux docker driver and add cgroups v2
* Remove `/sys/fs/cgroup/systemd` mount since Flatcar Linux
uses cgroups v2
* Flatcar Linux's `docker` switched from the `cgroupfs` to
`systemd` driver without notice
2021-11-12 21:07:20 -08:00
Dalton Hubble
65083aca7d Update Calico and Flannel CNI providers
* Update Calico from v3.20.2 to v3.21.0
* Update Flannel from v0.14.0 to v0.15.0
2021-11-12 11:03:39 -08:00
Dalton Hubble
dd4a5a4e7e Update Kubernetes from v1.22.2 to v1.22.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1223
2021-10-28 10:11:06 -07:00
Dalton Hubble
af835f976f Update flannel from v0.13.0 to v0.14.0
* https://github.com/flannel-io/flannel/releases/tag/v0.14.0
2021-10-28 10:09:06 -07:00
Dalton Hubble
17dce49982 Update etcd from v3.5.0 to v3.5.1
* https://github.com/etcd-io/etcd/releases/tag/v3.5.1
2021-10-17 11:28:27 -07:00
Dalton Hubble
5744e10329 Update Cilium from v1.10.4 to v1.10.5
* https://github.com/cilium/cilium/releases/tag/v1.10.5
2021-10-17 11:26:59 -07:00
Dalton Hubble
443bd5a26b Add file to hold nodes on iptables-legacy
* Add `/etc/fedora-coreos/iptables-legacy.stamp` to declare
that `iptables-legacy` should be used instead of `iptables-nft`
(until support is added in future releases)
* https://github.com/coreos/fedora-coreos-tracker/issues/676
2021-10-11 20:30:49 -07:00
Dalton Hubble
f8162b9be3 Update Calico from v3.20.1 to v3.20.2
* Use Calico's iptables legacy vs nft auto-detection
2021-10-11 20:28:48 -07:00
Dalton Hubble
b30de949b8 Update Calico and Cilium CNI
* Update Calico from v3.20.0 to v3.20.1
* Update Cilium from v1.10.3 to v1.10.4
2021-09-22 22:18:16 -07:00
Dalton Hubble
bb7f31822e Update Kubernetes from v1.22.1 to v1.22.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1222
2021-09-15 19:56:24 -07:00
Anthony Rabbito
c6923b9ef3 Switch Fedora CoreOS to new ARM64 AMIs (#1038)
* Fedora CoreOS now publishes ARM64 AMIs
2021-09-12 11:49:13 -07:00
Dalton Hubble
fcbdb50d93 Update Kubernetes from v1.22.0 to v1.22.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1221
2021-08-19 21:12:02 -07:00
Dalton Hubble
cbef202eec Update Prometheus discovery of kube components
* Kubernetes v1.22.0 disabled kube-controller-manager insecure
port, which was used internally for Prometheus metrics scraping
* Configure Prometheus to discover and scrape endpoints for
kube-scheduler and kube-controller-manager via the authenticated
https ports, via bearer token
* Change firewall ports to allow Prometheus (on worker nodes)
to scrape kube-scheduler and kube-controller-manager targets
that run on controller(s) with hostNetwork
* Disable the insecure port on kube-scheduler
2021-08-10 21:25:19 -07:00
Dalton Hubble
1a5949824c Update etcd from v3.4.16 to v3.5.0
* Use multi-arch container image instead of a special
"-arm64" suffix on arm64
* https://github.com/etcd-io/etcd/releases/tag/v3.5.0
2021-08-04 22:10:07 -07:00
Dalton Hubble
9bac641511 Update Kubernetes from v1.21.3 to v1.22.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1220
2021-08-04 22:09:19 -07:00
Dalton Hubble
f03045f0dc Update Cilium for cgroups v2 support
* On Fedora CoreOS, Cilium cross-node service IP load balancing
stopped working for a time (first observable as CoreDNS pods
located on worker nodes not being able to reach the kubernetes
API service 10.3.0.1). This turned out to have two parts:
* Fedora CoreOS switched to cgroups v2 by default. In our early
testing with cgroups v2, Calico (default) was used. With the
cgroups v2 change, SELinux policy denied some eBPF operations.
Since fixed in all Fedora CoreOS channels
* Cilium requires new mounts to support cgroups v2, which are
added here

* https://github.com/coreos/fedora-coreos-tracker/issues/292
* https://github.com/coreos/fedora-coreos-tracker/issues/881
* https://github.com/cilium/cilium/pull/16259
2021-07-24 10:36:47 -07:00
Dalton Hubble
b603bbde3d Update Butane Config from v1.2.0 to v1.4.0
* Rename Fedora CoreOS Config (FCC) to Butane Config
* Require that any snippet customizations use version v1.4.0

* https://typhoon.psdn.io/advanced/customization/#hosts
2021-07-19 23:53:51 -07:00
Dalton Hubble
fdade5b40c Update poseidon/ct provider from v0.8.0 to v0.9.0
* Continue targeting Ignition v3.2.0 for some time
2021-07-18 09:05:02 -07:00