Commit Graph

552 Commits

Author SHA1 Message Date
Dalton Hubble
d29e6e3de1 Upgrade Cilium from v1.13.4 to v1.14.0
* https://github.com/poseidon/terraform-render-bootstrap/pull/360
* Also update flannel from v0.22.0 to v0.22.1
2023-07-30 09:36:23 -07:00
Dalton Hubble
0a6183f859 Update Kubernetes from v1.27.3 to v1.27.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#v1274
2023-07-21 08:00:50 -07:00
Dalton Hubble
9a28fe79a1 Upgrade Calico from v3.25.1 to v3.26.1
* Add new CRD bgpfilters and new ClusterRoles calico-cni-plugin

Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/358
2023-06-19 12:28:53 -07:00
Dalton Hubble
7255f82d71 Update Kubernetes fromv 1.27.2 to v1.27.3
* Update Cilium v1.13.3 to v1.13.4

Rel: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#v1273
2023-06-16 08:28:17 -07:00
Dalton Hubble
6f4b4cc508 Update Cilium from v1.13.2 to v1.13.3
* Also update flannel v0.21.2 to v0.22.0

Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/355
2023-06-11 19:59:10 -07:00
Dalton Hubble
2a5a43f3a4 Update etcd from v3.5.8 to v3.5.9
* https://github.com/etcd-io/etcd/releases/tag/v3.5.9
2023-06-11 19:28:23 -07:00
Dalton Hubble
8ebf31073c Update Kubernetes from v1.27.1 to v1.27.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#v1272
2023-05-21 14:02:49 -07:00
Bill ONeill
4ef1908299 Fix: extra kernel_args added to bare-metal workers 2023-04-28 08:07:54 -07:00
Bill ONeill
2272472d59 Omit -o flag to flatcar-install unless oem_type is defined 2023-04-25 19:02:30 -07:00
Dalton Hubble
fc444d25f8 Update poseidon/ct provider and Butane Config version
* Update Fedora CoreOS Butane configs from v1.4.0 to v1.5.0
* Require Fedora CoreOS Butane snippets update to v1.1.0
* Require poseidon/ct Terraform provider v0.13 or newer
* Use Ignition v3.4.0 spec for all node provisioning
2023-04-21 08:58:20 -07:00
Dalton Hubble
5feb4c63f7 Update Cilium from v1.13.1 to v1.13.2
* https://github.com/cilium/cilium/releases/tag/v1.13.2
2023-04-20 08:44:31 -07:00
Dalton Hubble
501e6d25e0 Update Kubernetes from v1.27.0 to v1.27.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#v1271
2023-04-15 23:16:51 -07:00
Dalton Hubble
1e76e1a200 Update etcd from v3.5.7 to v3.5.8
* https://github.com/etcd-io/etcd/releases/tag/v3.5.8
2023-04-15 22:54:31 -07:00
Dalton Hubble
4322857bec Update Kubernetes from v1.26.3 to v1.27.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#v1270
2023-04-15 22:49:12 -07:00
Lucas Resch
6bd2a1a528 Expose flatcar-install OEM parameter
By exposing this parameter it is possible to install OEM specific software
during the `flatcar-install` invocation.
2023-04-01 09:38:29 -07:00
Dalton Hubble
5f303212d2 Update Cilium to use an init container to install CNI plugins
* https://github.com/poseidon/terraform-render-bootstrap/pull/348
2023-03-29 10:35:21 -07:00
Dalton Hubble
3670ec7ed7 Update Kubernetes from v1.26.2 to v1.26.3
* Update Cilium from v1.13.0 to v1.13.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#v1263
2023-03-21 18:18:19 -07:00
Dalton Hubble
2b3cd451d2 Update Cilium from v1.12.6 to v1.13.0
* https://github.com/cilium/cilium/releases/tag/v1.13.0
2023-03-14 11:16:14 -07:00
Dalton Hubble
76ebc08fd2 Update Kubernetes from v1.26.1 to v1.26.2
* https://github.com/poseidon/terraform-render-bootstrap/pull/345
2023-03-01 17:13:16 -08:00
Dalton Hubble
86e8484e0a Change bare-metal workers variable to optional
* To accompany the restructure of the bare-metal modules to
allow discrete workers to be defined and attached to a cluster
(#1295), the `workers` variable (older way, used for defining
homogeneous workers inline) should be optional and default
to an empty list
* Add docs covering inline vs discrete metal workers

Fix #1301
2023-03-01 14:37:47 -08:00
Dalton Hubble
f3c327007d Update flannel from v0.20.2 to v0.21.1
* https://github.com/flannel-io/flannel/releases/tag/v0.21.1
2023-02-09 09:56:25 -08:00
Dalton Hubble
406fb444f0 Update Cilium from v1.12.5 to v1.12.6
* https://github.com/cilium/cilium/releases/tag/v1.12.6
2023-02-09 09:45:40 -08:00
Dalton Hubble
1caea3388c Restructure bare-metal module to use a worker submodule
* Add an internal `worker` module to the bare-metal module, to
allow individual bare-metal machines to be defined and joined
to an existing bare-metal cluster. This is similar to the "worker
pools" modules for adding sets of nodes to cloud (AWS, GCP, Azure)
clusters, but on metal, each piece of hardware is potentially
unique

New: Using the new `worker` module, a Kubernetes cluster can be defined
without any `workers` (i.e. just a control-plane). Use the `worker`
module to define each piece machine that should join the bare-metal
cluster and customize it in detail. This style is quite flexible and
suited for clusters with hardware that varies quite a bit.

```tf
module "mercury" {
  source = "git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes?ref=v1.26.2"

  # bare-metal
  cluster_name            = "mercury"
  matchbox_http_endpoint  = "http://matchbox.example.com"
  os_channel              = "flatcar-stable"
  os_version              = "2345.3.1"

  # configuration
  k8s_domain_name    = "node1.example.com"
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."

  # machines
  controllers = [{
    name   = "node1"
    mac    = "52:54:00:a1:9c:ae"
    domain = "node1.example.com"
  }]
}
```

```tf
module "mercury-node1" {
  source = "git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes/worker?ref=v1.26.2"

  cluster_name = "mercury"

  # bare-metal
  matchbox_http_endpoint  = "http://matchbox.example.com"
  os_channel              = "flatcar-stable"
  os_version              = "2345.3.1"

  # configuration
  name               = "node2"
  mac                = "52:54:00:b2:2f:86"
  domain             = "node2.example.com"
  kubeconfig         = module.mercury.kubeconfig
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."

  # optional
  snippets       = []
  node_labels    = []
  node_tains     = []
  install_disk   = "/dev/vda"
  cached_install = false
}
```

For clusters with fairly similar hardware, you may continue to
define `workers` directly within the cluster definition. This
reduces some repetition, but is not quite as flexible.

```tf
module "mercury" {
  source = "git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes?ref=v1.26.1"

  # bare-metal
  cluster_name            = "mercury"
  matchbox_http_endpoint  = "http://matchbox.example.com"
  os_channel              = "flatcar-stable"
  os_version              = "2345.3.1"

  # configuration
  k8s_domain_name    = "node1.example.com"
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."

  # machines
  controllers = [{
    name   = "node1"
    mac    = "52:54:00:a1:9c:ae"
    domain = "node1.example.com"
  }]
  workers = [
    {
      name   = "node2",
      mac    = "52:54:00:b2:2f:86"
      domain = "node2.example.com"
    },
    {
      name   = "node3",
      mac    = "52:54:00:c3:61:77"
      domain = "node3.example.com"
    }
  ]
}
```

Optional variables `snippets`, `worker_node_labels`, and
`worker_node_taints` are still defined as a map from machine name
to a list of snippets, labels, or taints respectively to allow some
degree of per-machine customization. However, fields like
`install_disk`, `kernel_args`, `cached_install` and future options
will not be designed this way. Instead, if your machines vary it
is recommended to use the new `worker` module to define each node
2023-02-09 08:29:28 -08:00
Dalton Hubble
a205922d06 Update Calico from v3.24.5 to v3.25.0
* https://github.com/poseidon/terraform-render-bootstrap/pull/342
2023-01-24 08:29:08 -08:00
Dalton Hubble
b5ba65d4c2 Update etcd from v3.5.6 to v3.5.7
* https://github.com/etcd-io/etcd/releases/tag/v3.5.7
2023-01-24 08:29:08 -08:00
Dalton Hubble
f2bf5ac3fb Update Kubernetes from v1.26.0 to v1.26.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#v1261
2023-01-19 08:27:56 -08:00
Dalton Hubble
0afe9d65ed Update Cilium from v1.12.4 to v1.12.5
* https://github.com/cilium/cilium/releases/tag/v1.12.5
2022-12-21 08:13:35 -08:00
Dalton Hubble
d6cbcf9f96 Update Kubernetes from v1.26.0-rc.1 to v1.26.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#v1260
2022-12-08 08:47:24 -08:00
Dalton Hubble
0dc8740c77 Update Kubernetes from v1.26.0-rc.0 to v1.26.0-rc.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#v1260-rc1
2022-12-05 09:31:45 -08:00
Dalton Hubble
a9b12b6bca Update Kubernetes from v1.25.4 to v1.26.0-rc.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#v1260-rc0
2022-11-30 08:47:40 -08:00
Dalton Hubble
a8990b3045 Fix flannel container image registry location
* https://github.com/poseidon/terraform-render-bootstrap/pull/336
2022-11-23 16:18:30 -08:00
Dalton Hubble
b4857c123e Update flannel from v0.15.1 to v0.20.1
* https://github.com/flannel-io/flannel/releases/tag/v0.20.1
2022-11-23 11:03:29 -08:00
Dalton Hubble
a193762eed Update etcd from v3.5.5 to v3.5.6
* https://github.com/etcd-io/etcd/releases/tag/v3.5.6
2022-11-23 10:59:17 -08:00
Dalton Hubble
adf33df99b Update Cilium from v1.12.3 to v1.12.4
* https://github.com/cilium/cilium/releases/tag/v1.12.4
2022-11-23 10:58:27 -08:00
Dalton Hubble
26dbc7e91d Update Kubernetes from v1.25.3 to v1.25.4
* Update Calico from v3.24.3 to v3.24.5
* Update Prometheus and Grafana addons
2022-11-10 09:42:21 -08:00
Dalton Hubble
937acc4b5a Re-enable Graceful Node Shutdown feature
* Kubelet GracefulNodeShutdown works, but only partially handles
gracefully stopping the Kubelet. The most noticeable drawback
is that Completed Pods are left around
* Use a project like poseidon/scuttle or a similar systemd unit
as a snippet to add drain and/or delete behaviors if desired
* This reverts commit 1786e34f33.

Rel:

* https://www.psdn.io/posts/kubelet-graceful-shutdown/
* https://github.com/poseidon/scuttle
2022-11-02 20:49:01 -07:00
Dalton Hubble
9b733d79c7 Update Calico v3.24.2 to v3.24.3
* https://github.com/projectcalico/calico/releases/tag/v3.24.3
* Add patch to allow Kubelet kubeconfig to drain nodes if desired
in addition to just deleting them in shutdown integrations. See
https://github.com/poseidon/terraform-render-bootstrap/pull/330
2022-10-23 22:00:15 -07:00
Dalton Hubble
35a9e22b1f Update Calico from v3.24.1 to v3.24.2
* https://github.com/projectcalico/calico/releases/tag/v3.24.2
2022-10-20 09:28:19 -07:00
Dalton Hubble
a535581ef2 Remove unused Wants=network.target from etcd-member
* network.target is a passive unit that's not actually pulled
in by units requiring or wanting it, its only used for shutdown
ordering
> "Services using the network should ... avoid any Wants=network.target or even Requires=network.target"

Rel: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
2022-10-20 08:32:55 -07:00
Dalton Hubble
3ff2d38fa5 Update Cilium from v1.12.2 to v1.12.3
* https://github.com/cilium/cilium/releases/tag/v1.12.3
2022-10-17 17:25:23 -07:00
Dalton Hubble
651151805d Update Kubernetes v1.25.2 to v1.25.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#v1253
2022-10-13 21:02:39 -07:00
Dalton Hubble
3ee462a24c Update Kubernetes from v1.25.1 to v1.25.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#v1252
2022-09-22 08:15:30 -07:00
Dalton Hubble
90782ea820 Remove workaround for preventing search . propagation
* Kubelet v1.25.1 has the fix https://github.com/kubernetes/kubernetes/pull/112157
2022-09-19 22:37:02 -07:00
Dalton Hubble
74d4d56dbd Remove workaround for v1.25.0 ConfigMap rendering issue
* LocalStorageCapacityIsolationFSQuotaMonitoring was reverted back to
alpha in v1.25.1, so we don't need to explicitly disable it anymore

Rel: https://github.com/kubernetes/kubernetes/issues/112081
2022-09-19 09:10:24 -07:00
Dalton Hubble
5abe84b520 Update etcd from v3.5.4 to v3.5.5
* https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.5.md#v355
2022-09-15 09:01:45 -07:00
Dalton Hubble
951209d113 Update Cilium from v1.12.1 to v1.12.2
* https://github.com/cilium/cilium/releases/tag/v1.12.2
2022-09-15 08:28:37 -07:00
Dalton Hubble
09751cc0e8 Update Kubernetes from v1.25.0 to v1.25.1
* https://github.com/kubernetes/kubernetes/releases/tag/v1.25.1
2022-09-15 08:23:22 -07:00
Dalton Hubble
c14300f0be Update Calico from v3.23.3 to v3.24.1
* https://github.com/projectcalico/calico/releases/tag/v3.24.1
2022-09-14 08:09:38 -07:00
Dalton Hubble
1786e34f33 Revert Graceful Node Shutdown feature
* Disable Kubelet Graceful Node Shutdown on worker nodes (enabled in
Kubernetes v1.25.0 https://github.com/poseidon/typhoon/pull/1222)
* Graceful node shutdown shutdown allows 30s for critical pods to
shutdown and 15s for regular pods to shutdown before releasing the
inhibitor lock to allow the host to shutdown
* Unfortunately, both pods and the node are shutdown at the same
time at the end of the 45s period without further configuration
options. As a result, regular pods and the node are shutdown at the
same time. In practice, enabling this feature leaves Error or Completed
pods in kube-apiserver state until manually cleaned up. This feature
is not ready for general use
* Fix issue where Error/Completed pods are accumulating whenever any
node restarts (or auto-updates), visible in kubectl get pods
* This issue wasn't apparent in initial testing and seems to only
affect non-critical pods (due to critical pods being killed earlier)
But its very apparent on our real clusters

Rel: https://github.com/kubernetes/kubernetes/issues/110755
2022-09-10 14:58:44 -07:00
Dalton Hubble
4ad473cd3c Add workaround patch to strip "search ." from resolv.conf
* systemd adds "search ." to hosts /run/systemd/resolve/resolv.conf
on hosts with a fqdn hostname
* Kubelet v1.25 began propagating "search ." from the host node
into containers' `/etc/resolv.conf`
* musl-based DNS resolvers don't behave correctly when `search .`
is used in their `/etc/resolv.conf`. This breaks Alpine images
* Adapt the same workaround used by Openshift to strip the "search ."
* This only applies to bare-metal Typhoon nodes (where hostnames are
set to fqdn's), nodes on cloud platforms aren't affected in the
Typhoon configuration

Kubernetes tracking issue: https://github.com/kubernetes/kubernetes/issues/112135

Rel:

* https://github.com/systemd/systemd/pull/17201
* https://github.com/kubernetes/kubernetes/pull/109441
* https://github.com/coreos/fedora-coreos-tracker/issues/1287
* https://github.com/openshift/okd-machine-os/pull/159
2022-08-31 08:05:45 -07:00