Commit Graph

803 Commits

Author SHA1 Message Date
Dalton Hubble
beb9f1477a Add experimental Flatcar Linux arm64 support on AWS
* Add `arch` variable to Flatcar Linux AWS `kubernetes` and
`workers` modules. Accept `amd64` (default) or `arm64` to support
native arm64/aarch64 clusters or mixed/hybrid clusters with arm64
workers
* Requires `flannel` or `cilium` CNI

Similar to https://github.com/poseidon/typhoon/pull/875
2022-01-14 10:24:48 -08:00
Dalton Hubble
f544a9c71f Switch Fedora CoreOS from docker-shim to containerd
* Migrate from `docker-shim` to `containerd` in preparation
for Kubernetes v1.24.0 dropping `docker-shim` support
* Much consideration was given to the container runtime
choice. https://github.com/poseidon/typhoon/issues/899
provides relevant rationales
2022-01-13 09:17:29 -08:00
Dalton Hubble
50215e373b Add Prometheus config for monitoring Kubernetes Ingress
* Allow Kubernetes Ingress resources to be probed via Blackbox
Exporter (if present) if annotated `prometheus.io/probe: "true"`
* Fix probes of Services via Blackbox Exporter. Require Blackbox
Exporter to be deployed in the same `monitoring` namespace, be
named `blackbox-exporter`, and use port 8080
2021-12-29 11:57:50 -08:00
Dalton Hubble
a9f9c59b91 Configure Prometheus to allow a custom scrape query param
* Set `prometheus.io/param` on a Kubernetes Service to scrape
the service endpoints and pass a custom query parameter
* For example, scrape Consul with `?format=prometheus`

```yaml
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '8500'
    prometheus.io/path: /v1/agent/metrics
    prometheus.io/param: format=prometheus
```
2021-12-29 11:47:10 -08:00
Dalton Hubble
6ed048eb65 Workaround Terraform v1.1 file provisioner regression
* Terraform v1.1 changed the behavior of provisioners and
`remote-exec` in a way that breaks support for expansions
in commands (including file provisioner, where `destination`
is part of an `scp` command)
* Terraform will likely revert the change eventually, but I
suspect it will take a while
* Instead, we can stop relying on Terraform's expansion
behavior. `/home/core` is a suitable choice for `$HOME` on
both Flatcar Linux and Fedora CoreOS (harldink `/var/home/core`)

Rel: https://github.com/hashicorp/terraform/issues/30243
2021-12-28 13:25:23 -08:00
Dalton Hubble
9e3807798f Update Kubernetes from v1.23.0 to v1.23.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1231
2021-12-20 08:36:19 -08:00
Dalton Hubble
ef9c6aa423 Switch Flatcar Linux to using containerd CRI
* Use containerd as the Kubernetes Container Runtime
2021-12-15 08:42:13 -08:00
Dalton Hubble
bb5e5811ec Update Prometheus and Grafana addons 2021-12-15 08:16:46 -08:00
Dalton Hubble
16aa997604 Fix Azure backend_address_pool_id deprecation warning
* Change to `backend_address_pool_ids` list
2021-12-14 10:26:08 -08:00
Dalton Hubble
43c6558aaf Update nginx-ingress and monitoring addons 2021-12-10 11:29:49 -08:00
Dalton Hubble
125008fbb3 Update Cilium from v1.10.5 to v1.11.0
* https://github.com/cilium/cilium/releases/tag/v1.11.0
2021-12-10 11:26:05 -08:00
Dalton Hubble
136107b448 Set Kubelet resolver config to /run/systemd/resolve/resolv.conf
* Both Flatcar Linux and Fedora CoreOS use systemd-resolved,
but they setup /etc/resolv.conf symlinks differently
* Prefer using /run/systemd/resolve/resolv.conf directly, which
also updates to reflect runtime changes (e.g. resolvectl)
2021-12-10 08:22:30 -08:00
Dalton Hubble
e97c1cc9e5 Enable Kubernetes aggregation by default
* Change `enable_aggregation` default from false to true
* These days, Kubernetes control plane components emit annoying
messages related to assumptions baked into the Kubernetes API
Aggregation Layer if you don't enable it. Further the conformance
tests force you to remember to enable it if you care about passing
those
* This change is motivated by eliminating annoyances, rather than
any enthusiasm for Kubernetes' aggregation features

Rel: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/
2021-12-09 17:30:35 -08:00
Dalton Hubble
41f739891b Normalize CA certs mounts in static Pods and kube-proxy
* Mount both /etc/ssl/certs and /etc/pki into control plane static
pods and kube-proxy, rather than choosing one based a variable
(set based on Flatcar Linux or Fedora CoreOS)
* Remove deprecated `--port` from `kube-scheduler` static Pod
2021-12-09 09:56:37 -08:00
Dalton Hubble
861021ee98 Update Kubernetes from v1.22.4 to v1.23.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1230
* With Calico, add missing caliconodestatuses CRD added in v3.21.0
https://github.com/poseidon/terraform-render-bootstrap/pull/289
2021-12-09 09:28:41 -08:00
Dalton Hubble
c1d28e6f61 Change default disk_iops on Flatcar Linux
* Same as #1073, but for Flatcar Linux on AWS as well
2021-12-07 16:52:55 -08:00
Dalton Hubble
9c626c9dbd Change default disk_iops from unset to 3000
* Since v1.21.3 switched controllers default disk type from
`gp2` to `gp3`, an iops diff has been shown (harmless, but
annoying)
* Controller nodes default to a 30GB `gp3` disk. `gp3` disks
do respect `iops` and the corresponding default is 3000
2021-12-07 15:44:09 -08:00
Dalton Hubble
5d7b6f611e Update nginx-ingess and Prometheus exporter addons 2021-11-21 09:28:17 -08:00
Dalton Hubble
93594292eb Update Kubernetes from v1.22.3 to v1.22.4
* Update flannel from v0.15.0 to v0.15.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1224
2021-11-17 19:53:32 -08:00
Dalton Hubble
94b2793e40 Update CoreDNS from v1.8.4 to v1.8.6
* https://coredns.io/2021/10/07/coredns-1.8.6-release/
2021-11-12 21:09:04 -08:00
Dalton Hubble
4fd43b39ad Fix Flatcar Linux docker driver and add cgroups v2
* Remove `/sys/fs/cgroup/systemd` mount since Flatcar Linux
uses cgroups v2
* Flatcar Linux's `docker` switched from the `cgroupfs` to
`systemd` driver without notice
2021-11-12 21:07:20 -08:00
Dalton Hubble
65083aca7d Update Calico and Flannel CNI providers
* Update Calico from v3.20.2 to v3.21.0
* Update Flannel from v0.14.0 to v0.15.0
2021-11-12 11:03:39 -08:00
Dalton Hubble
07db4c1143 Allow use of google Terraform provider v4.0+
* https://github.com/hashicorp/terraform-provider-google/releases/tag/v4.0.0
2021-11-11 10:17:58 -08:00
Dalton Hubble
b934a13605 Update Prometheus and Grafana addons 2021-11-07 17:00:40 -08:00
Dalton Hubble
cd005a0b27 Prepare for v1.22.3 release 2021-10-28 11:58:55 -07:00
Dalton Hubble
dd4a5a4e7e Update Kubernetes from v1.22.2 to v1.22.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1223
2021-10-28 10:11:06 -07:00
Dalton Hubble
af835f976f Update flannel from v0.13.0 to v0.14.0
* https://github.com/flannel-io/flannel/releases/tag/v0.14.0
2021-10-28 10:09:06 -07:00
Dalton Hubble
17dce49982 Update etcd from v3.5.0 to v3.5.1
* https://github.com/etcd-io/etcd/releases/tag/v3.5.1
2021-10-17 11:28:27 -07:00
Dalton Hubble
5744e10329 Update Cilium from v1.0.4 to v1.0.5
* https://github.com/cilium/cilium/releases/tag/v1.10.5
2021-10-17 11:26:59 -07:00
Dalton Hubble
20748536df Update nginx-ingress from v1.0.2 to v1.0.4
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.4
2021-10-17 11:17:43 -07:00
Dalton Hubble
f2e6256dd9 Update Prometheus, kube-state-metrics, and Grafana
* Update monitoring addons
2021-10-17 11:15:39 -07:00
Dalton Hubble
f8162b9be3 Update Calico from v3.20.1 to v3.20.2
* Use Calico's iptables legacy vs nft auto-detection
2021-10-11 20:28:48 -07:00
Dalton Hubble
15117fb95b Update Prometheus and nginx-ingress 2021-10-05 19:15:58 -07:00
Dalton Hubble
cb72b261c7 Update Terraform provider poseidon/matchbox to v0.5+
* Relax version constraint to allow future minor version
releases to be used without a corresponding Typhoon change
2021-09-29 23:41:44 -07:00
Dalton Hubble
209efd2f5b Update Prometheus, Grafana, and kube-state-metrics 2021-09-29 23:39:10 -07:00
Dalton Hubble
5a1e455220 Update nginx-ingress from v1.0.0 to v1.0.1 2021-09-24 09:38:18 -07:00
Dalton Hubble
69f37c8b17 Update Prometheus from v2.29.2 to v2.30.0 2021-09-24 09:34:00 -07:00
Dalton Hubble
b30de949b8 Update Calico and Cilium CNI
* Update Calico from v3.20.0 to v3.20.1
* Update Cilium from v1.10.3 to v1.10.4
2021-09-22 22:18:16 -07:00
Dalton Hubble
bb7f31822e Update Kubernetes from v1.22.1 to v1.22.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1222
2021-09-15 19:56:24 -07:00
Dalton Hubble
eb29fb639b Update nginx-ingress, Prometheus, and Grafana addons 2021-08-24 22:14:57 -07:00
Dalton Hubble
fcbdb50d93 Update Kubernetes from v1.22.0 to v1.22.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1221
2021-08-19 21:12:02 -07:00
Dalton Hubble
0d8ceae1d9 Add etcd v3.5.0 note to CHANGES 2021-08-11 09:24:43 -07:00
Dalton Hubble
c5cf803634 Update Grafana and kube-state-metrics addons 2021-08-10 22:17:16 -07:00
Dalton Hubble
cbef202eec Update Prometheus discovery of kube components
* Kubernetes v1.22.0 disabled kube-controller-manager insecure
port, which was used internally for Prometheus metrics scraping
* Configure Prometheus to discover and scrape endpoints for
kube-scheduler and kube-controller-manager via the authenticated
https ports, via bearer token
* Change firewall ports to allow Prometheus (on worker nodes)
to scrape kube-scheduler and kube-controller-manager targets
that run on controller(s) with hostNetwork
* Disable the insecure port on kube-scheduler
2021-08-10 21:25:19 -07:00
Dalton Hubble
0c99b909a9 Update nginx-ingress from v0.47.0 to v1.0.0-beta.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.0-beta.1
2021-08-07 12:46:00 -07:00
Dalton Hubble
739db3b35f Update Grafana and node-exporter addons
* https://github.com/grafana/grafana/releases/tag/v8.1.0
* https://github.com/prometheus/node_exporter/releases/tag/v1.2.1
2021-08-05 23:24:57 -07:00
Dalton Hubble
9bac641511 Update Kubernetes from v1.21.3 to v1.22.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1220
2021-08-04 22:09:19 -07:00
Dalton Hubble
f03045f0dc Update Cilium for cgroups v2 support
* On Fedora CoreOS, Cilium cross-node service IP load balancing
stopped working for a time (first observable as CoreDNS pods
located on worker nodes not being able to reach the kubernetes
API service 10.3.0.1). This turned out to have two parts:
* Fedora CoreOS switched to cgroups v2 by default. In our early
testing with cgroups v2, Calico (default) was used. With the
cgroups v2 change, SELinux policy denied some eBPF operations.
Since fixed in all Fedora CoreOS channels
* Cilium requires new mounts to support cgroups v2, which are
added here

* https://github.com/coreos/fedora-coreos-tracker/issues/292
* https://github.com/coreos/fedora-coreos-tracker/issues/881
* https://github.com/cilium/cilium/pull/16259
2021-07-24 10:36:47 -07:00
Dalton Hubble
b603bbde3d Update Butane Config from v1.2.0 to v1.4.0
* Rename Fedora CoreOS Config (FCC) to Butane Config
* Require any snippets customizations use version v1.4.0

* https://typhoon.psdn.io/advanced/customization/#hosts
2021-07-19 23:53:51 -07:00
Dalton Hubble
c734fa7b84 Update node-exporter from v1.1.2 to v1.2.0
* https://github.com/prometheus/node_exporter/releases/tag/v1.2.0
2021-07-18 15:26:44 -07:00