Commit Graph

954 Commits

Author SHA1 Message Date
Dalton Hubble
11e540000f Update CHANGES to reiterate Terraform Module Registry deprecation
* Terraform supports sourcing modules from either Git repos or from
their own hosted Terraform Module Registry, introduced a few years ago
* Typhoon docs have always shown using Git-based module sources, not
the Terraform Module Registry. For example, module usage should be
`source = "git::https://github.com/poseidon/typhoon/...` not
`source = poseidon/kubernetes/...`
* Typhoon published Flatcar Linux modules (CoreOS Container Linux at the time)
to Terraform Module Registry, but the approach has a number of drawbacks
for publishers and for users.
  * Terraform's Module Registry requires subtree mirroring Typhoon to special
  terraform-platform-kubernetes repos. This distorts Git history,
  requires special automation, and the registry's naming requirements
  don't allow us to publish our full matrix of modules (Fedora CoreOS
  and Flatcar Linux, across AWS, Azure, GCP, on-prem, and DigitalOcean)
  * Terraform's Module Registry only supports release versions (no commit SHAs
  or forks)
* Ultimately, the Terraform Module Registry limits user flexibility, has
tedious publishing constraints, and introduces centralization where the
current decentralized Git-based approach is simpler and more featureful

Note: This does not affect Terraform _Providers_ like `poseidon/matchbox`
or `poseidon/ct`. For Terraform providers, Terraform's centralized
platform eases provider plugin installation and provides value
2022-12-10 10:00:22 -08:00
Dalton Hubble
d6cbcf9f96 Update Kubernetes from v1.26.0-rc.1 to v1.26.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#v1260
2022-12-08 08:47:24 -08:00
Dalton Hubble
ce52a2cd35 Update Nginx Ingress and monitoring addon components
* Update ingress-nginx, Prometheus, node-exporter, and
  kube-state-metrics
2022-12-05 09:38:38 -08:00
Dalton Hubble
0dc8740c77 Update Kubernetes from v1.26.0-rc.0 to v1.26.0-rc.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#v1260-rc1
2022-12-05 09:31:45 -08:00
Dalton Hubble
d419c58ab1 Add Equinix to the sponsors list
* Thank you Equinix!
2022-11-30 00:30:39 -08:00
Dalton Hubble
da76d32aba Migrate AWS launch configurations to launch templates
* Same features, but AWS will soon require launch templates
* Starting Dec 31, 2022 AWS will not add new instance types
(e.g. graviton 4) to launch configuration support

Rel: https://aws.amazon.com/blogs/compute/amazon-ec2-auto-scaling-will-no-longer-add-support-for-new-ec2-features-to-launch-configurations/
2022-11-30 00:26:03 -08:00
Dalton Hubble
f597f7cda3 Update Prometheus and Grafana addons 2022-11-23 11:06:03 -08:00
Dalton Hubble
b4857c123e Update flannel from v0.15.1 to v0.20.1
* https://github.com/flannel-io/flannel/releases/tag/v0.20.1
2022-11-23 11:03:29 -08:00
Dalton Hubble
50bffaae8f Update etcd from v3.5.5 to v3.5.6 in CHANGES.md 2022-11-23 11:01:24 -08:00
Dalton Hubble
adf33df99b Update Cilium from v1.12.3 to v1.12.4
* https://github.com/cilium/cilium/releases/tag/v1.12.4
2022-11-23 10:58:27 -08:00
Dalton Hubble
29a005b7b4 Update CHANGELOG links 2022-11-17 07:55:58 -08:00
Dalton Hubble
6a521257d0 Link to new Mastodon accounts
* @typhoon@fosstodon.org will announce Typhoon releases, like the
@typhoon8s Twitter account does today
* @poseidon@fosstodon.org will announce Poseidon Labs news and
general projects, like the @poseidonlabs Twitter account does today
2022-11-10 09:48:30 -08:00
Dalton Hubble
26dbc7e91d Update Kubernetes from v1.25.3 to v1.25.4
* Update Calico from v3.24.3 to v3.24.5
* Update Prometheus and Grafana addons
2022-11-10 09:42:21 -08:00
Dalton Hubble
937acc4b5a Re-enable Graceful Node Shutdown feature
* Kubelet GracefulNodeShutdown works, but only partially handles
gracefully stopping the Kubelet. The most noticeable drawback
is that Completed Pods are left around
* Use a project like poseidon/scuttle or a similar systemd unit
as a snippet to add drain and/or delete behaviors if desired
* This reverts commit 1786e34f33.

Rel:

* https://www.psdn.io/posts/kubelet-graceful-shutdown/
* https://github.com/poseidon/scuttle
2022-11-02 20:49:01 -07:00
Dalton Hubble
9b733d79c7 Update Calico v3.24.2 to v3.24.3
* https://github.com/projectcalico/calico/releases/tag/v3.24.3
* Add patch to allow Kubelet kubeconfig to drain nodes if desired
in addition to just deleting them in shutdown integrations. See
https://github.com/poseidon/terraform-render-bootstrap/pull/330
2022-10-23 22:00:15 -07:00
Dalton Hubble
35a9e22b1f Update Calico from v3.24.1 to v3.24.2
* https://github.com/projectcalico/calico/releases/tag/v3.24.2
2022-10-20 09:28:19 -07:00
Dalton Hubble
0f38a6d405 Remove defunct delete-node.service from worker nodes
* delete-node.service used to be used to remove nodes from the
cluster on shutdown, but its long since it last worked properly
* If there is still a desire for this concept, it can be added
with a custom snippet and with a better systemd unit
2022-10-20 08:43:48 -07:00
Dalton Hubble
a535581ef2 Remove unused Wants=network.target from etcd-member
* network.target is a passive unit that's not actually pulled
in by units requiring or wanting it, its only used for shutdown
ordering
> "Services using the network should ... avoid any Wants=network.target or even Requires=network.target"

Rel: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
2022-10-20 08:32:55 -07:00
Dalton Hubble
08d13e7215 Improve release notes slightly with links 2022-10-20 08:30:30 -07:00
Dalton Hubble
3ff2d38fa5 Update Cilium from v1.12.2 to v1.12.3
* https://github.com/cilium/cilium/releases/tag/v1.12.3
2022-10-17 17:25:23 -07:00
Dalton Hubble
f04e1d25a8 Add Flatcar Linux ARM64 support on Azure
* Kinvolk now publishes Flatcar Linux images for ARM64
* For now, amd64 image must specify a plan while arm64 images
must NOT specify a plan due to how Kinvolk publishes.

Rel: https://github.com/flatcar/Flatcar/issues/872
2022-10-17 08:36:57 -07:00
Dalton Hubble
b68f8bb2a9 Switch Azure Fedora CoreOS default worker type
* Change default Azure worker_type from Standard_DS1_v2 to Standard_D2as_v5
  * Get 2 VCPU, 7 GiB, 12500Mbps (vs 1 VCPU, 3.5GiB, 750 Mbps)
  * Small increase in pay-as-you-go price ($53.29 -> $62.78)
  * Small increase in spot price ($5.64/mo -> $7.37/mo)
  * Change from Intel to AMD EPYC (`D2as_v5` cheaper than `D2s_v5`)

Rel:

* https://github.com/poseidon/typhoon/pull/1248
* https://learn.microsoft.com/en-us/azure/virtual-machines/dasv5-dadsv5-series#dasv5-series
* https://learn.microsoft.com/en-us/azure/virtual-machines/dv2-dsv2-series#dsv2-series
2022-10-13 21:23:57 -07:00
Dalton Hubble
651151805d Update Kubernetes v1.25.2 to v1.25.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#v1253
2022-10-13 21:02:39 -07:00
Dalton Hubble
8d2c8b8db6 Switch to Flatcar Azure gen2 images and change worker type
* Switch from Azure Hypervisor generation 1 to generation 2
* Change default Azure `worker_type` from Standard_DS1_v2 to Standard_D2as_v5
  * Get 2 VCPU, 7 GiB, 12500Mbps (vs 1 VCPU, 3.5GiB, 750 Mbps)
  * Small increase in pay-as-you-go price ($53.29 -> $62.78)
  * Small increase in spot price ($5.64/mo -> $7.37/mo)
  * Change from Intel to AMD EPYC (`D2as_v5` cheaper than `D2s_v5`)

Notes: Azure makes you accept terms for each plan:

```
az vm image terms accept --publish kinvolk --offer flatcar-container-linux-free --plan stable-gen2
```

Rel:

* https://learn.microsoft.com/en-us/azure/virtual-machines/dasv5-dadsv5-series#dasv5-series
* https://learn.microsoft.com/en-us/azure/virtual-machines/dv2-dsv2-series#dsv2-series
2022-10-13 09:57:52 -07:00
Dalton Hubble
b4c8b1729c Switch addons images from k8s.gcr.io to registry.k8s.io
* Switch addon manifests to use the new Kubernetes image registry

Rel:

* https://github.com/poseidon/typhoon/pull/1206
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#moved-container-registry-service-from-k8sgcrio-to-registryk8sio
2022-10-09 16:14:28 -07:00
Dalton Hubble
e82241169a Update Prometheus from v2.38.0 to v2.39.1
* https://github.com/prometheus/prometheus/releases/tag/v2.39.1
2022-10-09 16:12:35 -07:00
Dalton Hubble
3ee462a24c Update Kubernetes from v1.25.1 to v1.25.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#v1252
2022-09-22 08:15:30 -07:00
Dalton Hubble
558e293f78 Update Nginx Ingress and Grafana addons 2022-09-20 08:28:30 -07:00
Dalton Hubble
90782ea820 Remove workaround for preventing search . propagation
* Kubelet v1.25.1 has the fix https://github.com/kubernetes/kubernetes/pull/112157
2022-09-19 22:37:02 -07:00
Dalton Hubble
74d4d56dbd Remove workaround for v1.25.0 ConfigMap rendering issue
* LocalStorageCapacityIsolationFSQuotaMonitoring was reverted back to
alpha in v1.25.1, so we don't need to explicitly disable it anymore

Rel: https://github.com/kubernetes/kubernetes/issues/112081
2022-09-19 09:10:24 -07:00
Dalton Hubble
5abe84b520 Update etcd from v3.5.4 to v3.5.5
* https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.5.md#v355
2022-09-15 09:01:45 -07:00
Dalton Hubble
951209d113 Update Cilium from v1.12.1 to v1.12.2
* https://github.com/cilium/cilium/releases/tag/v1.12.2
2022-09-15 08:28:37 -07:00
Dalton Hubble
09751cc0e8 Update Kubernetes from v1.25.0 to v1.25.1
* https://github.com/kubernetes/kubernetes/releases/tag/v1.25.1
2022-09-15 08:23:22 -07:00
Dalton Hubble
c14300f0be Update Calico from v3.23.3 to v3.24.1
* https://github.com/projectcalico/calico/releases/tag/v3.24.1
2022-09-14 08:09:38 -07:00
Dalton Hubble
1786e34f33 Revert Graceful Node Shutdown feature
* Disable Kubelet Graceful Node Shutdown on worker nodes (enabled in
Kubernetes v1.25.0 https://github.com/poseidon/typhoon/pull/1222)
* Graceful node shutdown shutdown allows 30s for critical pods to
shutdown and 15s for regular pods to shutdown before releasing the
inhibitor lock to allow the host to shutdown
* Unfortunately, both pods and the node are shutdown at the same
time at the end of the 45s period without further configuration
options. As a result, regular pods and the node are shutdown at the
same time. In practice, enabling this feature leaves Error or Completed
pods in kube-apiserver state until manually cleaned up. This feature
is not ready for general use
* Fix issue where Error/Completed pods are accumulating whenever any
node restarts (or auto-updates), visible in kubectl get pods
* This issue wasn't apparent in initial testing and seems to only
affect non-critical pods (due to critical pods being killed earlier)
But its very apparent on our real clusters

Rel: https://github.com/kubernetes/kubernetes/issues/110755
2022-09-10 14:58:44 -07:00
Dalton Hubble
5f612c82e2 Update kube-state-metrics and Grafana addons 2022-09-01 08:58:32 -07:00
Dalton Hubble
e60a321185 Sync Terraform providers shown in docs 2022-09-01 08:07:15 -07:00
Dalton Hubble
4ad473cd3c Add workaround patch to strip "search ." from resolv.conf
* systemd adds "search ." to hosts /run/systemd/resolve/resolv.conf
on hosts with a fqdn hostname
* Kubelet v1.25 began propagating "search ." from the host node
into containers' `/etc/resolv.conf`
* musl-based DNS resolvers don't behave correctly when `search .`
is used in their `/etc/resolv.conf`. This breaks Alpine images
* Adapt the same workaround used by Openshift to strip the "search ."
* This only applies to bare-metal Typhoon nodes (where hostnames are
set to fqdn's), nodes on cloud platforms aren't affected in the
Typhoon configuration

Kubernetes tracking issue: https://github.com/kubernetes/kubernetes/issues/112135

Rel:

* https://github.com/systemd/systemd/pull/17201
* https://github.com/kubernetes/kubernetes/pull/109441
* https://github.com/coreos/fedora-coreos-tracker/issues/1287
* https://github.com/openshift/okd-machine-os/pull/159
2022-08-31 08:05:45 -07:00
Dalton Hubble
393a38deff Configure Graceful Node Shutdown and lengthen max inhibitor delay
* Configure Kubelet Graceful Node Shutdown to detect system shutdown
events and stop running containers gracefully when possible
* Allow up to 30s for critical pods to gracefully shutdown
* Allow up to 15s for regular pods to gracefully shutdown
* Node will be marked as NotReady promptly, instead of having to
wait for health checks
* Kubelet uses systemd inhibitor locks to delay shutdown for a limited
number of seconds
* Raise the default max inhibitor time from 5s to 45s

Verify systemd inhibitor locks are present:

```
sudo systemd-inhibit --list
WHO     UID USER PID  COMM    WHAT     WHY                                        MODE
kubelet 0   root 4581 kubelet shutdown Kubelet needs time to handle node shutdown delay
```

Tail journal logs and then shutdown a node via systemctl reboot
or via the cloud console to watch container shutdown

Rel:

* https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/
* https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
* https://github.com/kubernetes/kubernetes/issues/107043
* https://github.com/coreos/fedora-coreos-tracker/issues/821
* https://www.freedesktop.org/software/systemd/man/systemd-inhibit.html
* https://github.com/kubernetes/kubernetes/blob/release-1.24/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go
* https://github.com/godbus/dbus/blob/master/conn.go
2022-08-28 10:37:33 -07:00
Dalton Hubble
76d92e9c2d Change podman log-driver from journald to k8s-file
* When podman runs the Kubelet container, logging to journald means
log lines are duplicated in the journal. journalctl -u kubelet shows
Kubelet's logs and the same log messages from podman. Using the
k8s-file driver alleviates this problem
* Fix Kubelet and etcd-member logs to be more readable and reduce
unneccessary Kubelet log volume
2022-08-27 17:15:22 -07:00
Dalton Hubble
bf06412dfd Update Prometheus and Grafana addons 2022-08-21 08:56:00 -07:00
Dalton Hubble
0d27811265 Update recommended Terraform provider versions 2022-08-18 09:08:55 -07:00
Dalton Hubble
c13d060b38 Add docs for GCP MIG update and AWS instance refresh
* Document that worker instances are rolling replaced when
changes to their configuration are applied
2022-08-18 09:02:38 -07:00
Dalton Hubble
e87d5aabc3 Adjust Google Cloud worker health checks to use kube-proxy healthz
* Change the workers managed instance group to health check nodes
via HTTP probe of the kube-proxy port 10256 /healthz endpoints
* Advantages: kube-proxy is a lower value target (in case there
were bugs in firewalls) that Kubelet, its more representative than
health checking Kubelet (Kubelet must run AND kube-proxy Daemonset
must be healthy), and its already used by kube-proxy liveness probes
(better discoverability via kubectl or alerts on pods crashlooping)
* Another motivator is that GKE clusters also use kube-proxy port
10256 checks to assess node health
2022-08-17 20:50:52 -07:00
Dalton Hubble
760b4cd5ee Update Kubernetes from v1.24.3 to v1.24.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1244
2022-08-17 20:09:30 -07:00
Dalton Hubble
fcd8ff2b17 Update Cilium from v1.12.0 to v1.12.1
* https://github.com/cilium/cilium/releases/tag/v1.12.1
2022-08-17 08:53:56 -07:00
Dalton Hubble
52427a4271 Refresh instances in autoscaling group when launch configuration changes
* Changes to worker launch configurations start an autoscaling group instance
refresh to replace instances
* Instance refresh creates surge instances, waits for a warm-up period, then
deletes old instances
* Changing worker_type, disk_*, worker_price, worker_target_groups, or Butane
worker_snippets on existing worker nodes will replace instances
* New AMIs or changing `os_stream` will be ignored, to allow Fedora CoreOS or
Flatcar Linux to keep themselves updated
* Previously, new launch configurations were made in the same way, but not
applied to instances unless manually replaced
2022-08-14 21:43:49 -07:00
Dalton Hubble
20b76d6e00 Roll instance template changes to worker managed instance groups
* When a worker managed instance group's (MIG) instance template
changes (including machine type, disk size, or Butane snippets
but excluding new AMIs), use Google Cloud's rolling update features
to ensure instances match declared state
* Ignore new AMIs since Fedora CoreOS and Flatcar Linux nodes
already auto-update and reboot themselves
* Rolling updates will create surge instances, wait for health
checks, then delete old instances (0 unavilable instances)
* Instances are replaced to ensure new Ignition/Butane snippets
are respected
* Add managed instance group autohealing (i.e. health checks) to
ensure new instances' Kubelet is running

Renames

* Name apiserver and kubelet health checks consistently
* Rename MIG from `${var.name}-worker-group` to `${var.name}-worker`

Rel: https://cloud.google.com/compute/docs/instance-groups/rolling-out-updates-to-managed-instance-groups
2022-08-14 13:06:53 -07:00
Dalton Hubble
6facfca4ed Switch Kubernetes image registry from k8s.gcr.io to registry.k8s.io
* Announce: https://groups.google.com/g/kubernetes-sig-testing/c/U7b_im9vRrM

Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/319
2022-08-13 16:16:21 -07:00
Dalton Hubble
ed8c6a5aeb Upgrade CoreDNS from v1.8.5 to v1.9.3
Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/318
2022-08-13 15:43:03 -07:00
Dalton Hubble
b321b90a4f Update Grafana from v9.0.6 to v9.0.7 2022-08-13 15:39:44 -07:00
Dalton Hubble
679f8b878f Update Grafana from v9.0.5 to v9.0.6 2022-08-10 08:23:04 -07:00
Dalton Hubble
87a8278c9d Improve AWS autoscaling group and launch config names
* Rename launch configuration to use a name_prefix named after the
cluster and worker to improve identifiability
* Shorten AWS autoscaling group name to not include the launch config
id. Years ago this used to be needed to update the ASG but the AWS
provider detects changes to the launch configuration just fine
2022-08-08 20:46:08 -07:00
Dalton Hubble
93b7f2554e Remove ineffective iptables-legacy.stamp
* Typhoon Fedora CoreOS is already using iptables nf_tables since
F36. The file to pin to legacy iptables was renamed to
/etc/coreos/iptables-legacy.stamp
2022-08-08 20:27:21 -07:00
Dalton Hubble
62d47ad3f0 Update Cilium from v1.11.7 to v1.12.0
* https://github.com/cilium/cilium/releases/tag/v1.12.0
2022-08-08 19:59:03 -07:00
Dalton Hubble
6eb7861f96 Update Grafana liveness and readiness probes
* Use the liveness and readiness probes that Grafana recommends
* Update Grafana from v9.0.3 to v9.0.5
2022-08-08 09:22:44 -07:00
Dalton Hubble
4a469513dd Migrate Flatcar Linux from Ignition spec v2.3.0 to v3.3.0
* Requires poseidon v0.11+ and Flatcar Linux 3185.0.0+ (action required)
* Previously, Flatcar Linux configs have been parsed as Container
Linux Configs to Ignition v2.2.0 specs by poseidon/ct
* Flatcar Linux starting in 3185.0.0 now supports Ignition v3.x specs
(which are rendered from Butane Configs, like Fedora CoreOS)
* poseidon/ct v0.11.0 adds support for the flatcar Butane Config
variant so that Flatcar Linux can use Ignition v3.x

Rel:

* [Flatcar Support](https://flatcar-linux.org/docs/latest/provisioning/ignition/specification/#ignition-v3)
* [poseidon/ct support](https://github.com/poseidon/terraform-provider-ct/pull/131)
2022-08-03 08:32:52 -07:00
Dalton Hubble
47d8431fe0 Fix bug provisioning multi-controller clusters on Google Cloud
* Google Cloud Terraform provider resource google_dns_record_set's
name field provides the full domain name with a trailing ".". This
isn't a new behavior, Google has behaved this way as long as I can
remember
* etcd domain names are passed to the bootstrap module to generate
TLS certificates. What seems to be new(ish?) is that etcd peers
see example.foo and example.foo. as different domains during TLS
SANs validation. As a result, clusters with multiple controller
nodes fail to run etcd-member, which manifests as cluster provisioning
hanging. Single controller/master clusters (default) are unaffected
* Fix etcd-member.service error in multi-controller clusters:

```
"error":"x509: certificate is valid for conformance-etcd0.redacted.,
conform-etcd1.redacted., conform-etcd2.redacted., not conform-etcd1.redacted"}
```
2022-08-02 20:21:02 -07:00
Dalton Hubble
256b87812e Remove Terraform template provider dependency
* Use Terraform builtin templatefile functionality
* Remove dependency on deprecated Terraform template provider

Rel:

* https://registry.terraform.io/providers/hashicorp/template/2.2.0
* https://github.com/poseidon/terraform-render-bootstrap/pull/293
2022-08-02 18:15:03 -07:00
Dalton Hubble
c6794f1007 Update Calico from v3.23.1 to v3.23.3
* https://github.com/projectcalico/calico/releases/tag/v3.23.3
2022-07-30 18:15:33 -07:00
Dalton Hubble
7f445b0dba Add release note about master to main branch rename
* Update Terraform provider versions
2022-07-19 18:12:37 -07:00
Dalton Hubble
f42b45451b Update Cilium from v1.11.6 to v1.11.7
* https://github.com/cilium/cilium/releases/tag/v1.11.7
2022-07-19 09:06:15 -07:00
Dalton Hubble
767a653baa Update Prometheus, Grafana, and ingress-nginx addons
* Update ingress-nginx RBAC Role to include coordination.k8s.io leases
permissions that are required with ingress-nginx v1.3.0
2022-07-15 20:19:12 -07:00
Dalton Hubble
42bf82b325 Update Prometheus and Grafana addons
* Bump recommended Terraform provider versions
2022-07-02 11:28:34 -07:00
Dalton Hubble
07df0c2552 Add warning about Terraform AWS provider version
* Sync Terraform provider versions with those used internally
2022-06-23 21:31:20 -07:00
Dalton Hubble
8398182956 Update Cilium and Calico CNI providers
* Update Cilium from v1.11.5 to v1.11.6
* Update Calico from v3.22.2 to v3.23.1
2022-06-18 19:29:01 -07:00
Dalton Hubble
2a8915fee9 Update Prometheus, kube-state-metrics, and Grafana addons
* Update monitoring addons
2022-06-18 18:32:17 -07:00
Dalton Hubble
31c7f0ba0e Update nginx-ingress addon from v1.2.0 to v1.2.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.2.1
2022-05-31 16:37:57 +01:00
Dalton Hubble
b8549a1e32 Update Cilium from v1.11.4 to v1.11.5
* https://github.com/poseidon/terraform-render-bootstrap/pull/309
2022-05-31 15:23:07 +01:00
Dalton Hubble
8e8bf305c3 Update Prometheus and Grafana addons 2022-05-31 14:29:55 +01:00
Dalton Hubble
b0e0b132e4 Update Kubernetes from v1.23.6 to v1.24.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1240
2022-05-04 08:27:14 -07:00
Dalton Hubble
02f78fbd1a Update Grafana from v8.4.5 to v8.5.1 2022-05-02 08:19:41 -07:00
Dalton Hubble
a122867748 Update nginx-ingress, Prometheus, and Grafana addons
* Sync addons with versions used in Poseidon
2022-04-27 21:02:32 -07:00
Dalton Hubble
91b38bf3fd Update etcd from v3.5.2 to v3.5.4
* https://github.com/etcd-io/etcd/releases/tag/v3.5.4
2022-04-27 20:57:02 -07:00
Dalton Hubble
d7f55c4e46 Remove use of deprecated key_algorithm field in TLS assets
* Fixes warning about use of deprecated field `key_algorithm` in
the `hashicorp/tls` provider. The key algorithm can now be inferred
directly from the private key so resources don't have to output
and pass around the algorithm
2022-04-20 19:52:03 -07:00
Dalton Hubble
80c6e2e7e6 Update Kubernetes from v1.23.5 to v1.23.6
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1236
2022-04-20 19:39:05 -07:00
Dalton Hubble
fddd8ac69d Fix Flatcar Linux nodes on Google Cloud not ignoring image changes
* Add `boot_disk[0].initialize_params` to the ignored fields for the
controller nodes
* Nodes will auto-update, Terraform should not attempt to delete and
recreate nodes (especially controllers!). Lack of this ignore causes
Terraform to propose deleting controller nodes when Flatcar Linux
releases new images
* Matches the configuration on Typhoon Fedora CoreOS (which does not
have the issue)
2022-04-20 18:53:00 -07:00
Dalton Hubble
2f7d2a92e0 Update Cilium and Calico CNI providers
* Update Cilium from v1.11.3 to v1.11.4
* Update Calico from v3.22.1 to v3.22.2
2022-04-19 08:28:52 -07:00
Dalton Hubble
d91408258b Update nginx-ingress, Prometheus, and Grafana addons 2022-04-04 08:53:29 -07:00
Dalton Hubble
2df1873b7f Update Cilium from v1.11.2 to v1.11.3
* https://github.com/cilium/cilium/releases/tag/v1.11.3
2022-04-01 16:44:30 -07:00
Dalton Hubble
93ebfc7dd0 Allow upgrading Azure Terraform Provider to v3.x
* Change subnet references to source and destinations prefixes
(plural)
* Remove references to a resource group in some load balancing
components, which no longer require it (inferred)
* Rename `worker_address_prefix` output to `worker_address_prefixes`
2022-04-01 16:36:53 -07:00
Dalton Hubble
b47edca6be Refresh Prometheus rules and Grafana dashboards
* Update Prometheus rules and Grafana dashboards
* Add new networking dashboards
2022-03-19 17:08:00 -07:00
Dalton Hubble
e61d4b92da Update Kubernetes from v1.23.4 to v1.23.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1235
2022-03-16 21:01:41 -07:00
Dalton Hubble
dca745fa4a Update monitoring addon components
* Update Prometheus, kube-state-metrics, and Grafana
2022-03-11 11:50:16 -08:00
Dalton Hubble
661347fa71 Update nginx-ingress from v1.1.1 to v1.1.2
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.1.2
2022-03-11 11:42:33 -08:00
Dalton Hubble
69770b4827 Update Calico from v3.21.2 to v3.22.1
* https://github.com/projectcalico/calico/releases/tag/v3.22.1
* Fix https://github.com/projectcalico/calico/issues/5011
2022-03-11 11:22:29 -08:00
Dalton Hubble
f797f97675 Update Cilium from v1.11.1 to v1.11.2
* https://github.com/cilium/cilium/releases/tag/v1.11.2
2022-03-11 10:08:24 -08:00
Dalton Hubble
6cf40722de Revert kube-state-metrics upgrade
* kube-state-metrics:v2.4.0 isn't published, skip it
2022-02-21 19:57:47 -08:00
Dalton Hubble
c230cdec46 Update Grafana and kube-state-metrics addons 2022-02-21 19:36:16 -08:00
Dalton Hubble
9aa99f1996 Allow upgrading AWS Terraform provider to v4.x
* https://github.com/hashicorp/terraform-provider-aws/releases/tag/v4.0.0
2022-02-17 09:35:15 -08:00
Dalton Hubble
28a42238c4 Update nginx-ingress, Prometheus, and Grafana addons
* Align `nginx-ingress` `--controller-class` with `IngressClass`
to provide a better example (e.g. if extended to multiple ingress
controllers)
2022-02-17 08:58:29 -08:00
Dalton Hubble
6c70d06937 Update etcd from v3.5.1 to v3.5.2
* https://github.com/etcd-io/etcd/releases/tag/v3.5.2
2022-02-07 08:10:17 -08:00
Dalton Hubble
cf4beeba34 Change default CNI provider from Calico to Cilium
* Cilium (v1.8) was added to Typhoon in v1.18.5 in June 2020
and its become more impressive since then. Its currently the
leading CNI provider choice.
* Calico has grown complex, has lots of CRDs, masks its
management complexity with an operator (which we won't use),
doesn't provide multi-arch images, and hasn't been compatible
with Kubernetes v1.23 (with ipvs) for several releases.
* Both have CNCF conformance quirks (flannel used for conformance),
but that's not the main factor in choosing the default
2022-02-07 08:07:00 -08:00
Dalton Hubble
e06ee042ee Switch to using Flatcar Linux images on Google Cloud
* Use the official Kinvolk Flatcar Linux image on Google Cloud
* Change `os_image` from a custom image name to `flatcar-stable`
(default), `flatcar-beta`, or `flatcar-alpha` (**action required**)
* Change `os_image` from a required to an optional variable
* Promote Typhoon on Flatcar Linux / Google Cloud to stable
* Remove docs about needing to upload a Flatcar Linux image
manually on Google Cloud and drop support for custom images
2022-01-28 21:04:10 -08:00
Dalton Hubble
a527f73f5a Update Kubernetes from v1.23.2 to v1.23.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1233
2022-01-27 09:23:37 -08:00
Dalton Hubble
3da8c1575c Update nginx-ingress and Grafana addons 2022-01-19 21:09:21 -08:00
Dalton Hubble
dedd17d085 Upgrade to DigitalOcean Terraform provider v2.x
* Remove deprecated `private_networking` parameter
2022-01-19 18:32:17 -08:00
Dalton Hubble
e274a451ff Update Kubernetes from v1.23.1 to v1.23.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1232
2022-01-19 17:59:49 -08:00
Dalton Hubble
2265ab5375 Remove Kubelet --network-plugin=cni flag
* Now that `docker-shim` is no longer used, the Kubelet flag
is no longer needed and will be removed in v1.24
2022-01-14 10:43:07 -08:00
Dalton Hubble
08ea9776f3 Mask docker.service to prevent socket activation
* Kubelet now uses `containerd` as the container runtime, but
`docker.service` still starts when `docker.sock` is probed bc
the service is socket activated. Prevent this by masking the
`docker.service` unit
2022-01-14 10:31:47 -08:00