typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2024-12-26 16:39:34 +01:00

Author	SHA1	Message	Date
Dalton Hubble	af27661432	Configure controller and worker node architecture separately * On platforms that support ARM64 instances, configure controller and worker node host architectures separately * For example, you can run arm64 controllers and amd64 workers * Add `controller_arch` and `worker_arch` variables * Remove `arch` variable	2024-08-02 15:04:57 -07:00
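
A minimal sketch of the split architecture variables, matching the "arm64 controllers and amd64 workers" example in the commit above. The module name `cluster` is illustrative, the `source` and other required inputs are omitted, and the string values `"arm64"` and `"amd64"` are assumed from the architecture names in the commit message.

```hcl
module "cluster" {
  # source and other required inputs omitted (fragment only)

  # replaces the removed single `arch` variable
  controller_arch = "arm64" # assumed value format
  worker_arch     = "amd64" # assumed value format
}
```
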
Dalton Hubble	672bbad10b	Generate Azure Virtual Network IPv6 ULA space at random * Private IPv6 address space should be assigned randomly within an organization per https://datatracker.ietf.org/doc/html/rfc4193	2024-07-20 11:01:50 -07:00
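
One way to pick a random ULA prefix in Terraform, shown only as a sketch and not necessarily how this commit does it: RFC 4193 prefixes have the form `fd` + 40-bit Global ID, giving a /48. The resource and local names below are hypothetical, and the Global ID comes from the `hashicorp/random` provider rather than the sample algorithm described in the RFC.

```hcl
terraform {
  required_providers {
    random = {
      source = "hashicorp/random"
    }
  }
}

# 5 random bytes = the 40-bit Global ID from RFC 4193
resource "random_id" "ula_global_id" {
  byte_length = 5
}

locals {
  # fd00::/8 prefix + 40-bit Global ID => a /48 ULA prefix
  # (the hex attribute is 10 hex characters for 5 bytes)
  ula_prefix = format(
    "fd%s:%s:%s::/48",
    substr(random_id.ula_global_id.hex, 0, 2),
    substr(random_id.ula_global_id.hex, 2, 4),
    substr(random_id.ula_global_id.hex, 6, 4)
  )
}

output "ula_prefix" {
  value = local.ula_prefix
}
```

Because `random_id` is stored in state, the prefix stays stable across later applies unless the resource is replaced.
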
Dalton Hubble	0d10d180f8	Change worker node pools from uniform to flexible orchestration mode * Use flexible orchestration mode. Azure has started to recommend this mode because it allows interacting with VMSS instances like regular VMs via the CLI or via the Azure Portal * Add options to allow worker nodes to use ephemeral local disks * Add `controller_disk_type` and `controller_disk_size` variables * Add `worker_disk_type`, `worker_disk_size`, and `worker_ephemeral_disk` variables	2024-07-14 11:58:15 -07:00
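
The new disk variables might be set as in the fragment below. This is a sketch under assumptions: the value types (string, number in GB, bool) and the specific values are illustrative, with `Premium_LRS` and `Standard_LRS` being Azure managed disk storage types. The `source` and other required inputs are omitted.

```hcl
module "cluster" {
  # source and other required inputs omitted (fragment only)

  controller_disk_type = "Premium_LRS" # assumed value
  controller_disk_size = 30            # assumed to be in GB

  worker_disk_type      = "Standard_LRS" # assumed value
  worker_disk_size      = 30             # assumed to be in GB
  worker_ephemeral_disk = true           # assumed to be a bool
}
```
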
Dalton Hubble	24b7f31c55	Rename Azure cluster region variable to location * Rename the region variable to location to align with Azure platform conventions, where resources are created within an Azure location, which are themselves part of broader geographical regions	2024-07-09 07:56:58 -07:00
Dalton Hubble	48d4973957	Add IPv6 support for Typhoon Azure clusters * Define a dual-stack virtual network with both IPv4 and IPv6 private address space. Change `host_cidr` variable (string) to a `network_cidr` variable (object) with "ipv4" and "ipv6" fields that list CIDR strings. * Define dual-stack controller and worker subnets. Disable Azure default outbound access (a deprecated fallback mechanism) * Enable dual-stack load balancing to Kubernetes Ingress by adding a public IPv6 frontend IP and LB rule to the load balancer. * Enable worker outbound IPv6 connectivity through load balancer SNAT by adding an IPv6 frontend IP and outbound rule * Configure controller nodes with a public IPv6 address to provide direct outbound IPv6 connectivity * Add an IPv6 worker backend pool. Azure requires separate IPv4 and IPv6 backend pools, though the health probe can be shared * Extend network security group rules for IPv6 source/destinations Checklist: Access to controller and worker nodes via IPv6 addresses: * SSH access to controller nodes via public IPv6 address * SSH access to worker nodes via (private) IPv6 address (via controller) Outbound IPv6 connectivity from controller and worker nodes: ``` nc -6 -zv ipv6.google.com 80 Ncat: Version 7.94 ( https://nmap.org/ncat ) Ncat: Connected to [2607:f8b0:4001:c16::66]:80. Ncat: 0 bytes sent, 0 bytes received in 0.02 seconds. ``` Serving Ingress traffic via IPv4 or IPv6 just requires setting up A and AAAA records and running the ingress controller with `hostNetwork: true`, since hostPort only forwards IPv4 traffic	2024-07-09 07:55:00 -07:00
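
The change from `host_cidr` (string) to `network_cidr` (object) described above could look like the following fragment. The CIDR values are placeholders chosen for illustration, and the `source` and other required inputs are omitted.

```hcl
module "cluster" {
  # source and other required inputs omitted (fragment only)

  # before (string): host_cidr = "10.0.0.0/16"
  # after (object with lists of CIDR strings):
  network_cidr = {
    ipv4 = ["10.0.0.0/16"]         # placeholder IPv4 private range
    ipv6 = ["fd12:3456:789a::/48"] # placeholder IPv6 ULA prefix
  }
}
```
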
Dalton Hubble	b3c384fbc0	Introduce the component system for managing pre-installed addons * Previously: Typhoon provisions clusters with kube-system components like CoreDNS, kube-proxy, and a chosen CNI provider (among flannel, Calico, or Cilium) pre-installed. This is convenient since clusters come with "batteries included". But it also means upgrading these components is generally done in lock-step, by upgrading to a new Typhoon / Kubernetes release * It can be valuable to manage these components with a separate plan/apply process or through automations and deploy systems. For example, this allows managing CoreDNS separately from the cluster's lifecycle. * These "components" will continue to be pre-installed by default, but a new `components` variable allows them to be disabled and managed as "addons", components you apply after cluster creation and manage on a rolling basis. For some of these, we may provide Terraform modules to aid in managing these components. ``` module "cluster" { # defaults components = { enable = true coredns = { enable = true } kube_proxy = { enable = true } # Only the CNI set in var.networking will be installed flannel = { enable = true } calico = { enable = true } cilium = { enable = true } } } ``` An earlier variable `install_container_networking = true/false` has been removed, since it can now be achieved with this more extensible and general components mechanism by setting the chosen networking provider enable field to false.	2024-05-19 16:33:57 -07:00
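
As a sketch of the replacement for `install_container_networking = false`, the fragment below keeps the defaults listed in the commit and only flips the chosen CNI's `enable` field. It assumes `networking = "cilium"` is the selected provider. The `source` and other required inputs are omitted.

```hcl
module "cluster" {
  # source and other required inputs omitted (fragment only)

  networking = "cilium" # assumed chosen CNI provider

  components = {
    enable     = true
    coredns    = { enable = true }
    kube_proxy = { enable = true }
    flannel    = { enable = true }
    calico     = { enable = true }
    # skip the pre-installed Cilium so it can be managed as an addon
    cilium = { enable = false }
  }
}
```

With the CNI disabled, expect nodes to stay NotReady until a self-managed networking provider is applied.
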
Dalton Hubble	d08cd317d9	Allow CoreDNS and kube-proxy to be optional components * Allow for more minimal base cluster setups, that manage CoreDNS or kube-proxy as applications, with rolling updates, or deploy systems. Or in the case of kube-proxy, it's becoming more common to not install it and instead use Cilium * Add a `components` pass-through variable to configure pre-installed components like kube-proxy and CoreDNS. These components can be disabled (individually or together) to allow for managing components with separate plan/apply processes or automations * terraform-render-bootstrap manifest assets are now structured as manifests/{coredns,kube-proxy,network} so adapt the controller layout scripts accordingly * This is similar to some changes in v1.29.2 that allowed for the container networking provider manifests to be skipped Related: https://github.com/poseidon/typhoon/pull/1419, https://github.com/poseidon/typhoon/pull/1421	2024-05-12 21:20:27 -07:00
Dalton Hubble	2325a503e1	Add an `install_container_networking` variable (default `true`) * When `true`, the chosen container `networking` provider is installed during cluster bootstrap * Set `false` to self-manage the container networking provider. This allows flannel, Calico, or Cilium to be managed via Terraform (like any other Kubernetes resources). Nodes will be NotReady until you apply the self-managed container networking provider. This may become the default in future.	2024-02-24 18:49:38 -08:00
Dalton Hubble	0ce8dfbb95	Workaround to allow use of ed25519 keys on Azure * Allow passing a dummy RSA key to Azure to satisfy its obtuse requirements (recommend deleting the corresponding private key) * Then `ssh_authorized_key` can be used to provide Fedora CoreOS or Flatcar Linux with a modern ed25519 public key to set in the authorized_keys via Ignition	2023-09-17 23:21:42 +02:00
Dalton Hubble	f04e1d25a8	Add Flatcar Linux ARM64 support on Azure * Kinvolk now publishes Flatcar Linux images for ARM64 * For now, amd64 image must specify a plan while arm64 images must NOT specify a plan due to how Kinvolk publishes. Rel: https://github.com/flatcar/Flatcar/issues/872	2022-10-17 08:36:57 -07:00
Dalton Hubble	8d2c8b8db6	Switch to Flatcar Azure gen2 images and change worker type * Switch from Azure Hypervisor generation 1 to generation 2 * Change default Azure `worker_type` from Standard_DS1_v2 to Standard_D2as_v5 * Get 2 VCPU, 7 GiB, 12500Mbps (vs 1 VCPU, 3.5GiB, 750 Mbps) * Small increase in pay-as-you-go price ($53.29 -> $62.78) * Small increase in spot price ($5.64/mo -> $7.37/mo) * Change from Intel to AMD EPYC (`D2as_v5` cheaper than `D2s_v5`) Notes: Azure makes you accept terms for each plan: ``` az vm image terms accept --publish kinvolk --offer flatcar-container-linux-free --plan stable-gen2 ``` Rel: * https://learn.microsoft.com/en-us/azure/virtual-machines/dasv5-dadsv5-series#dasv5-series * https://learn.microsoft.com/en-us/azure/virtual-machines/dv2-dsv2-series#dsv2-series	2022-10-13 09:57:52 -07:00
Dalton Hubble	cf4beeba34	Change default CNI provider from Calico to Cilium * Cilium (v1.8) was added to Typhoon in v1.18.5 in June 2020 and it's become more impressive since then. It's currently the leading CNI provider choice. * Calico has grown complex, has lots of CRDs, masks its management complexity with an operator (which we won't use), doesn't provide multi-arch images, and hasn't been compatible with Kubernetes v1.23 (with ipvs) for several releases. * Both have CNCF conformance quirks (flannel used for conformance), but that's not the main factor in choosing the default	2022-02-07 08:07:00 -08:00
Dalton Hubble	e97c1cc9e5	Enable Kubernetes aggregation by default * Change `enable_aggregation` default from false to true * These days, Kubernetes control plane components emit annoying messages related to assumptions baked into the Kubernetes API Aggregation Layer if you don't enable it. Further the conformance tests force you to remember to enable it if you care about passing those * This change is motivated by eliminating annoyances, rather than any enthusiasm for Kubernetes' aggregation features Rel: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/	2021-12-09 17:30:35 -08:00
Dalton Hubble	b152b9f973	Reduce the default disk_size from 40GB to 30GB * We're typically reducing the `disk_size` in real clusters since the space is underused. The default should be lower.	2021-04-26 11:43:26 -07:00
Dalton Hubble	084e8bea49	Allow custom initial node taints on worker pool nodes * Add `node_taints` variable to worker modules to set custom initial node taints on cloud platforms that support auto-scaling worker pools of heterogeneous nodes (i.e. AWS, Azure, GCP) * Worker pools could use custom `node_labels` to allow workloads to select among differentiated nodes, while custom `node_taints` allows a worker pool's nodes to be tainted as special to prevent scheduling, except by workloads that explicitly tolerate the taint * Expose `daemonset_tolerations` in AWS, Azure, and GCP kubernetes cluster modules, to determine whether `kube-system` components should tolerate the custom taint (advanced use covered in docs) Rel: #550, #663 Closes #429	2021-04-11 15:00:11 -07:00
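
A hedged sketch of how the three variables could fit together. The module names `gpu_pool` and `cluster` are hypothetical, and the value shapes are assumptions: lists of strings using Kubernetes `key=value` labels, `key=value:effect` taints, and taint keys for tolerations. The `source` and other required inputs are omitted.

```hcl
# Worker pool module: label and taint the pool's nodes at registration
module "gpu_pool" {
  # source and other required worker pool inputs omitted (fragment only)

  node_labels = ["role=gpu"]            # assumed "key=value" format
  node_taints = ["role=gpu:NoSchedule"] # assumed "key=value:effect" format
}

# Cluster module: let kube-system DaemonSets tolerate the custom taint
module "cluster" {
  # source and other required inputs omitted (fragment only)

  daemonset_tolerations = ["role"] # assumed to be a list of taint keys
}
```

Workloads meant for the pool would then select `role=gpu` and carry a matching toleration, while untolerating workloads are kept off those nodes.
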
Dalton Hubble	6a091e245e	Remove Flatcar Linux Edge `os_image` option * Flatcar Linux has not published an Edge channel image since April 2020 and recently removed mention of the channel from their documentation https://github.com/kinvolk/Flatcar/pull/345 * Users of Flatcar Linux Edge should move to the stable, beta, or alpha channel, barring any alternate advice from upstream Flatcar Linux	2021-02-20 16:09:54 -08:00
Dalton Hubble	9f94ab6bcc	Rerun terraform fmt for recent variables	2020-11-21 14:20:36 -08:00
Dalton Hubble	cc00afa4e1	Add Terraform v0.13 input variable validations * Support for migrating from Terraform v0.12.x to v0.13.x was added in v1.18.8 * Require Terraform v0.13+. Drop support for Terraform v0.12	2020-11-17 12:02:34 -08:00
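
For context, a Terraform v0.13 input variable validation has the shape below. This is an illustrative example, not the repository's actual rule: the allowed `os_image` values and the error text are assumptions.

```hcl
terraform {
  # custom variable validation is generally available from v0.13
  required_version = ">= 0.13.0"
}

variable "os_image" {
  type        = string
  description = "Flatcar Linux channel image (illustrative values)"
  default     = "flatcar-stable"

  validation {
    # the condition may only reference the variable being validated
    condition     = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
    error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
  }
}
```

An invalid value then fails at `terraform plan` with the error message, instead of surfacing later as a provider error.
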
Dalton Hubble	7c3f3ab6d0	Rename container-linux modules to flatcar-linux * CoreOS Container Linux was deprecated in v1.18.3 * Continue transitioning docs and modules from supporting both CoreOS and Flatcar "variants" of Container Linux to now supporting Flatcar Linux and equivalents Action Required: Update the Flatcar Linux modules `source` to replace `s/container-linux/flatcar-linux`. See docs for examples	2020-10-20 22:47:19 -07:00

19 Commits