During the "real" first boot (the install boot), we need to run Butane
config to manipulate disks, so add an `install_snippets` variable for
that purpose.
These snippets are added to the `install.yaml` Butane configuration
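
As a rough illustration of the kind of snippet this enables, the sketch below defines a Butane document that wipes a secondary disk and creates a single partition on it. The `fcos` variant, the device path `/dev/sdb`, the `data` label, and the `install_disk_snippet` local name are all assumptions made for illustration. The type of `install_snippets` is not stated here, so how the string is passed to the variable is left open.

```tf
locals {
  # Hypothetical Butane snippet; variant, device path, and label are assumptions
  install_disk_snippet = <<-EOT
    variant: fcos
    version: 1.5.0
    storage:
      disks:
        - device: /dev/sdb
          wipe_table: true
          partitions:
            - label: data
              number: 1
              size_mib: 0 # 0 makes the partition as large as possible
  EOT
}
```
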
* On platforms that support ARM64 instances, configure controller
and worker node host architectures separately
* For example, you can run arm64 controllers and amd64 workers
* Add `controller_arch` and `worker_arch` variables
* Remove `arch` variable
* Add `controller_disk_type`, `controller_disk_size`, and `controller_disk_iops`
variables
* Add `worker_disk_type`, `worker_disk_size`, and `worker_disk_iops` variables
and fix propagation to worker nodes
* Remove `disk_type`, `disk_size`, and `disk_iops` variables
* Add `controller_cpu_credits` and `worker_cpu_credits` variables to set CPU
pricing mode for burstable instance types
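
A minimal sketch of a cluster module block using the new per-role variables might look like the following. Only the variable names and the arm64/amd64 split come from the notes above; the disk type, size, IOPS, and CPU credit values are assumptions chosen for illustration, and other required arguments (such as `source` and instance types matching each architecture) are omitted.

```tf
module "cluster" {
  # ... source and other required arguments omitted

  # Host architecture is set per node role (replaces `arch`)
  controller_arch = "arm64"
  worker_arch     = "amd64"

  # Disk settings are set per node role
  # (replace `disk_type`, `disk_size`, and `disk_iops`)
  # Values below are illustrative assumptions
  controller_disk_type = "gp3"
  controller_disk_size = 30
  controller_disk_iops = 3000
  worker_disk_type     = "gp3"
  worker_disk_size     = 30
  worker_disk_iops     = 3000

  # CPU pricing mode for burstable instance types (assumed values)
  controller_cpu_credits = "standard"
  worker_cpu_credits     = "standard"
}
```
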
* Use flexible orchestration mode. Azure has started to recommend this
mode because it allows interacting with VMSS instances like regular VMs
via the CLI or via the Azure Portal
* Add options to allow worker nodes to use ephemeral local disks
* Add `controller_disk_type` and `controller_disk_size` variables
* Add `worker_disk_type`, `worker_disk_size`, and `worker_ephemeral_disk` variables
* Consolidate load balancer frontend IPs to just the minimal IPv4
and IPv6 addresses that are needed per load balancer. apiserver and
ingress use separate ports, so there is not a true need for a separate
public IPv4 address just for apiserver
* Some might prefer a separate IP just because it slightly hides the
apiserver, but these are public hosted endpoints that can be discovered
* Reduce the cost of an Azure cluster since IPv4 public IPs are billed
($3.60/mo/cluster)
* Rename the `region` variable to `location` to align with Azure
platform conventions, where resources are created within an
Azure location, which is itself part of a broader geographical
region
* Define a dual-stack virtual network with both IPv4 and IPv6 private
address space. Change `host_cidr` variable (string) to a `network_cidr`
variable (object) with "ipv4" and "ipv6" fields that list CIDR strings.
* Define dual-stack controller and worker subnets. Disable Azure
default outbound access (a deprecated fallback mechanism)
* Enable dual-stack load balancing to Kubernetes Ingress by adding
a public IPv6 frontend IP and LB rule to the load balancer.
* Enable worker outbound IPv6 connectivity through load balancer
SNAT by adding an IPv6 frontend IP and outbound rule
* Configure controller nodes with a public IPv6 address to provide
direct outbound IPv6 connectivity
* Add an IPv6 worker backend pool. Azure requires separate IPv4 and
IPv6 backend pools, though the health probe can be shared
* Extend network security group rules for IPv6 source/destinations
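
For the variable changes in the Azure notes above, a module block might be adjusted roughly as follows. The `location` and `network_cidr` names and the `ipv4`/`ipv6` fields come from the notes; the location name and the CIDR ranges are assumed values for illustration only, and other required arguments are omitted.

```tf
module "cluster" {
  # ... source and other required arguments omitted

  # Renamed from `region` (value is an assumption)
  location = "centralus"

  # Replaces `host_cidr` (string); each field lists CIDR strings
  # The ranges below are illustrative assumptions
  network_cidr = {
    ipv4 = ["10.0.0.0/16"]
    ipv6 = ["fd00:1234:5678::/48"]
  }
}
```
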
Checklist:
Access to controller and worker nodes via IPv6 addresses:
* SSH access to controller nodes via public IPv6 address
* SSH access to worker nodes via (private) IPv6 address (via
controller)
Outbound IPv6 connectivity from controller and worker nodes:
```
nc -6 -zv ipv6.google.com 80
Ncat: Version 7.94 ( https://nmap.org/ncat )
Ncat: Connected to [2607:f8b0:4001:c16::66]:80.
Ncat: 0 bytes sent, 0 bytes received in 0.02 seconds.
```
Serving Ingress traffic via IPv4 or IPv6 just requires setting
up A and AAAA records and running the ingress controller with
`hostNetwork: true`, since hostPort only forwards IPv4 traffic
* Previously: Typhoon provisions clusters with kube-system components
like CoreDNS, kube-proxy, and a chosen CNI provider (among flannel,
Calico, or Cilium) pre-installed. This is convenient since clusters
come with "batteries included". But it also means upgrading these
components is generally done in lock-step, by upgrading to a new
Typhoon / Kubernetes release
* It can be valuable to manage these components with a separate
plan/apply process or through automations and deploy systems. For
example, this allows managing CoreDNS separately from the cluster's
lifecycle.
* These "components" will continue to be pre-installed by default,
but a new `components` variable allows them to be disabled and
managed as "addons", components you apply after cluster creation
and manage on a rolling basis. For some of these, we may provide
Terraform modules to aid in managing these components.
```
module "cluster" {
  # defaults
  components = {
    enable = true
    coredns = {
      enable = true
    }
    kube_proxy = {
      enable = true
    }
    # Only the CNI set in var.networking will be installed
    flannel = {
      enable = true
    }
    calico = {
      enable = true
    }
    cilium = {
      enable = true
    }
  }
}
```
An earlier variable `install_container_networking = true/false` has
been removed, since the same result can now be achieved with this more
extensible and general components mechanism by setting the chosen
networking provider's `enable` field to `false`.
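
As a sketch of that replacement, assuming `var.networking` is set to `cilium`, the block below keeps the other defaults and only turns off the chosen provider. Every field is spelled out to mirror the defaults shown earlier, so the example does not depend on whether fields may be omitted; other required arguments are left out.

```tf
module "cluster" {
  # ... source and other required arguments omitted
  networking = "cilium"

  components = {
    enable = true
    coredns = {
      enable = true
    }
    kube_proxy = {
      enable = true
    }
    flannel = {
      enable = true
    }
    calico = {
      enable = true
    }
    # Skip pre-installing the chosen CNI so it can be applied
    # and managed separately after cluster creation
    cilium = {
      enable = false
    }
  }
}
```
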
* Output the network security group name and address prefixes
for controller nodes, to allow adding custom network security
rules that apply specifically to controller nodes
* Add firewall or security rules to allow node-to-node traffic
on ports 9962-9965 for Cilium and Hubble metrics. Cilium runs
with host network, so these require cloud firewall changes
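
For a cluster that manages its own cloud firewall, an equivalent rule could be sketched as below, using AWS as one example platform. The resource name and the `aws_security_group.nodes` reference are hypothetical, and TCP is assumed as the protocol for the metrics endpoints.

```tf
# Hypothetical rule: allow node-to-node Cilium and Hubble metrics traffic
resource "aws_security_group_rule" "cilium-metrics" {
  security_group_id = aws_security_group.nodes.id # assumed security group

  type      = "ingress"
  protocol  = "tcp"
  from_port = 9962
  to_port   = 9965
  self      = true # only from members of the same security group
}
```
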
* Allow for more minimal base cluster setups that manage CoreDNS or
kube-proxy as applications, with rolling updates or deploy systems.
In the case of kube-proxy, it's becoming more common to not install
it and instead use Cilium
* Add a `components` pass-through variable to configure pre-installed
components like kube-proxy and CoreDNS. These components can be
disabled (individually or together) to allow for managing components
with separate plan/apply processes or automations
* terraform-render-bootstrap manifest assets are now structured as
`manifests/{coredns,kube-proxy,network}`, so adapt the controller
layout scripts accordingly
* This is similar to some changes in v1.29.2 that allowed for the
container networking provider manifests to be skipped
Related: https://github.com/poseidon/typhoon/pull/1419, https://github.com/poseidon/typhoon/pull/1421