Compare commits

...

36 Commits

Author SHA1 Message Date
7d2d8e16e5 google: Use regional instance templates for workers
* Use regional instance templates for the worker node regional
managed instance groups. Regional instance templates are kept in
the associated region, whereas the older "global" instance templates
were kept in a particular region (regardless of where the MIG region)
so outages in a region X could affect clusters in a region Y which
is undesired
2024-08-27 21:35:02 -07:00
be9ba51269 Bump mkdocs-material from 9.5.32 to v9.5.33 2024-08-23 21:51:36 -07:00
9a2448f711 Remove upper bound on azurerm provider version
* Allow folks to start upgrading to azurerm provider v4.0.0,
don't set an upper bound on versions going forward
2024-08-23 21:51:29 -07:00
3412060c3c Use Cilium kube-proxy replacement when Cilium CNI is used
* When using the Cilium component, disable bootstrapping the
kube-proxy DaemonSet. Instead, configure Cilium to provide its
kube-proxy replacement with BPF
* Update the self-managed Cilium component to use kube-proxy
replacement as well
2024-08-23 12:33:32 -07:00
808b8a948f aws: Switch EC2 instances to use resource-based hostnames
* Use EC2 resource-based hostnames instead of IP-based hostnames. The Amazon
DNS server can resolve A and AAAA queries to IPv4 and IPv6 node addresses
* For example, nodes used to be named like `ip-10-11-12-13.us-east-1.compute.internal`
but going forward use the instance id `i-0123456789abcdef.us-east-1.compute.internal`
* Tag controller node EBS volumes with a name based on the controller node name
2024-08-22 20:02:53 -07:00
effa13c141 Fix flannel-cni container image
* Close #1496
2024-08-22 19:26:19 -07:00
b8645f3ec2 Bump mkdocs-material from 9.5.31 to v9.5.32 2024-08-22 10:36:50 -07:00
10be34daa2 Update Kubernetes from v1.30.4 to v1.31.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.31.md#v1310
2024-08-17 08:32:35 -07:00
1cb49e1267 Bump quay.io/cilium/cilium image from v1.16.0 to v1.16.1 2024-08-16 08:31:11 -07:00
d79f94f4f5 Bump quay.io/cilium/operator-generic image from v1.16.0 to v1.16.1 2024-08-16 08:31:01 -07:00
320d76c934 Update Kubernetes from v1.30.3 to v1.30.4
* Update Cilium from v1.16.0 to v1.16.1
2024-08-16 08:27:07 -07:00
2daa23be50 Update default Cilium and CoreDNS components
* Update the CoreDNS and Cilium versons used by default when
folks aren't managing the components themselves
2024-08-05 08:47:06 -07:00
6e2daded02 Remove some seldom used variables and set reasonable
* Set reasonable values and remove some variable clutter
* enable_reporting is only used with Calico and we can just default
to false, I doubt anyone uses Calico and cares much about reporting
metrics to upstream Calico
2024-08-02 20:45:37 -07:00
83f1bd2373 Update ARM64 cluster and hybrid cluster docs
* Typhoon now supports arbitrary combinations of controller, worker,
and worker pool architectures so we can drop the specific details of
full-cluster vs hybrid cluster. Just pick the architecture for each
group of nodes accordingly.
* However, if a custom node taint is set, continue to configure the
cluster's daemonsets accordingly with `daemonset_tolerations`
2024-08-02 20:34:23 -07:00
67e5ecf6f2 Bump mkdocs-material from 9.5.30 to v9.5.31 2024-08-02 16:46:36 -07:00
0120b9f38d Remove the cluster_domain_suffix variable
* Drop support for `cluster_domain_suffix` customization and
always use `cluster.local`. Many components in the Kubernetes
ecosystem assume this default suffix and its very rare to be
setting a special value here these days
* Cleanup a few variables that are seldom used
2024-08-02 15:05:25 -07:00
af27661432 Configure controller and worker node architecture separately
* On platforms that support ARM64 instances, configure controller
and worker node host architectures separately
* For example, you can run arm64 controllers and amd64 workers
* Add `controller_arch` and `worker_arch` variables
* Remove `arch` variable
2024-08-02 15:04:57 -07:00
516786d7bb google: Configure controller and worker disk sizes
* Add `controller_disk_size` and `worker_disk_size` variables
* Remove `disk_size` variable
2024-08-02 13:07:41 -07:00
1104b4bf28 AWS: Add CPU pricing mode and controller/worker disk variables
* Add `controller_disk_type`, `controller_disk_size`, and `controller_disk_iops`
variables
* Add `worker_disk_type`, `worker_disk_size`, and `worker_disk_iops` variables
and fix propagation to worker nodes
* Remove `disk_type`, `disk_size`, and `disk_iops` variables
* Add `controller_cpu_credits` and `worker_cpu_credits` variables to set CPU
pricing mode for burstable instance types
2024-07-31 15:02:28 -07:00
39b5079bc3 Bump registry.k8s.io/coredns/coredns image from v1.11.1 to v1.11.3 2024-07-31 13:28:30 -07:00
858d665d9b Bump quay.io/cilium/cilium image from v1.15.7 to v1.16.0 2024-07-28 11:59:07 -07:00
8cea37cdd9 Bump quay.io/cilium/operator-generic image from v1.15.7 to v1.16.0 2024-07-28 11:58:58 -07:00
4251ca937a Bump pymdown-extensions from 10.8.1 to v10.9 2024-07-28 11:58:50 -07:00
329987187b Bump mkdocs-material from 9.5.29 to v9.5.30 2024-07-26 09:37:41 -07:00
d046026511 Fix incorrect terraform-render-bootstrap SHA 2024-07-25 21:41:54 -07:00
0669d44026 Update Kubernetes from v1.30.2 to v1.30.3
* Update builtin Cilium manifests from v1.15.6 to v1.15.7
* Update builtin flannel manifests from v0.25.4 to v0.25.5
2024-07-20 11:04:32 -07:00
672bbad10b Generate Azure Virtual Network IPv6 ULA space at random
* Private IPv6 address space should be assigned randomly within
an organization per https://datatracker.ietf.org/doc/html/rfc4193
2024-07-20 11:01:50 -07:00
be0e516974 Bump mkdocs-material from 9.5.28 to v9.5.29 2024-07-20 10:44:04 -07:00
6a61afcd3b Bump docker.io/flannel/flannel image from v0.25.4 to v0.25.5 2024-07-20 10:36:12 -07:00
ca1f897b35 Bump quay.io/cilium/cilium image from v1.15.6 to v1.15.7 2024-07-14 13:42:35 -07:00
d4514db00c Bump quay.io/cilium/operator-generic image from v1.15.6 to v1.15.7 2024-07-14 13:42:26 -07:00
0d10d180f8 Change worker node pools from uniform to flexible orchestration mode
* Use flexible orchestration mode. Azure has started to recommend this
mode because it allows interacting with VMSS instances like regular VMs
via the CLI or via the Azure Portal
* Add options to allow workers nodes to use ephemeral local disks
  * Add `controller_disk_type` and `controller_disk_size` variables
  * Add `worker_disk_type`, `worker_disk_size`, and `worker_ephemeral_disk` variables
2024-07-14 11:58:15 -07:00
a4fab61066 Remove an IPv4 address from Azure clusters
* Consolidate load balancer frontend IPs to just the minimal IPv4
and IPv6 addresses that are needed per load balancer. apiserver and
ingress use separate ports, so there is not a true need for a separate
public IPv4 address just for apiserver
* Some might prefer a separate IP just because it slightly hides the
apiserver, but these are public hosted endpoints that can be discovered
* Reduce the cost of an Azure cluster since IPv4 public IPs are billed
($3.60/mo/cluster)
2024-07-10 22:29:43 -07:00
24b7f31c55 Rename Azure cluster region variable to location
* Rename the region variable to location to align with Azure
platform conventions, where resources are created within an
Azure location, which are themselves part of broader geographical
regions
2024-07-09 07:56:58 -07:00
48d4973957 Add IPv6 support for Typhoon Azure clusters
* Define a dual-stack virtual network with both IPv4 and IPv6 private
address space. Change `host_cidr` variable (string) to a `network_cidr`
variable (object) with "ipv4" and "ipv6" fields that list CIDR strings.
* Define dual-stack controller and worker subnets. Disable Azure
default outbound access (a deprecated fallback mechanism)
* Enable dual-stack load balancing to Kubernetes Ingress by adding
a public IPv6 frontend IP and LB rule to the load balancer.
* Enable worker outbound IPv6 connectivity through load balancer
SNAT by adding an IPv6 frontend IP and outbound rule
* Configure controller nodes with a public IPv6 address to provide
direct outbound IPv6 connectivity
* Add an IPv6 worker backend pool. Azure requires separate IPv4 and
IPv6 backend pools, though the health probe can be shared
* Extend network security group rules for IPv6 source/destinations

Checklist:

Access to controller and worker nodes via IPv6 addresses:

  * SSH access to controller nodes via public IPv6 address
  * SSH access to worker nodes via (private) IPv6 address (via
    controller)

Outbound IPv6 connectivity from controller and worker nodes:

```
nc -6 -zv ipv6.google.com 80
Ncat: Version 7.94 ( https://nmap.org/ncat )
Ncat: Connected to [2607:f8b0:4001:c16::66]:80.
Ncat: 0 bytes sent, 0 bytes received in 0.02 seconds.
```

Serve Ingress traffic via IPv4 or IPv6 just requires setting
up A and AAAA records and running the ingress controller with
`hostNetwork: true` since, hostPort only forwards IPv4 traffic
2024-07-09 07:55:00 -07:00
3483ed8bd5 Bump mkdocs-material from 9.5.27 to v9.5.28 2024-07-03 22:23:23 -07:00
132 changed files with 2163 additions and 1714 deletions

View File

@ -4,6 +4,121 @@ Notable changes between versions.
## Latest
## v1.31.0
* Kubernetes [v1.31.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.31.md#v1310)
* Use Cilium kube-proxy replacement mode when `cilium` networking is chosen ([#1501](https://github.com/poseidon/typhoon/pull/1501))
* Fix invalid flannel-cni container image for those using `flannel` networking ([#1497](https://github.com/poseidon/typhoon/pull/1497))
### AWS
* Use EC2 resource-based hostnames instead of IP-based hostnames ([#1499](https://github.com/poseidon/typhoon/pull/1499))
* The Amazon DNS server can resolve A and AAAA queries to IPv4 and IPv6 node addresses
* Tag controller node EBS volumes with a name based on the controller node name
### Google
* Use `google_compute_region_instance_template` instead of `google_compute_instance_template`
* Google's regional instance template metadata is kept in the associated region for greater resiliency. The "global" instance templates were kept in a single region
## v1.30.4
* Kubernetes [v1.30.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.30.md#v1304)
* Update Cilium from v1.15.7 to [v1.16.1](https://github.com/cilium/cilium/releases/tag/v1.16.1)
* Update CoreDNS from v1.11.1 to v1.11.3
* Remove `enable_aggregation` variable for Kubernetes Aggregation Layer, always set to true
* Remove `cluster_domain_suffix` variable, always use "cluster.local"
* Remove `enable_reporting` variable for analytics, always set to false
## v1.30.3
* Kubernetes [v1.30.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.30.md#v1303)
* Update Cilium from v1.15.6 to [v1.15.7](https://github.com/cilium/cilium/releases/tag/v1.15.7)
* Update flannel from v0.25.4 to [v0.25.5](https://github.com/flannel-io/flannel/releases/tag/v0.25.5)
### AWS
* Configure controller and worker disks ([#1482](https://github.com/poseidon/typhoon/pull/1482))
* Add `controller_disk_type`, `controller_disk_size`, and `controller_disk_iops` variables
* Add `worker_disk_type`, `worker_disk_size`, and `worker_disk_iops` variables
* Remove `disk_type`, `disk_size`, and `disk_iops` variables
* Fix propagating settings to worker disks, previously ignored
* Configure CPU pricing model for burstable instance types ([#1482](https://github.com/poseidon/typhoon/pull/1482))
* Add `controller_cpu_credits` and `worker_cpu_credits` variables (`standard` or `unlimited`)
* Configure controller or worker instance architecture ([#1485](https://github.com/poseidon/typhoon/pull/1485))
* Add `controller_arch` and `worker_arch` variables (`amd64` or `arm64`)
* Remove `arch` variable
```diff
module "cluster" {
...
- arch = "amd64"
- disk_type = "gp3"
- disk_size = 30
- disk_iops = 3000
+ controller_arch = "amd64"
+ controller_disk_size = 15
+ controller_cpu_credits = "standard"
+ worker_arch = "amd64"
+ worker_disk_size = 22
+ worker_cpu_credits = "unlimited"
}
```
### Azure
* Configure the virtual network and subnets with IPv6 private address space
* Change `host_cidr` variable (string) to a `network_cidr` object with `ipv4` and `ipv6` fields that list CIDR strings. Leave the variable unset to use the defaults. (**breaking**)
* Add support for dual-stack Kubernetes Ingress Load Balancing
* Add a public IPv6 frontend, 80/443 rules, and a worker-ipv6 backend pool
* Change the `controller_address_prefixes` output from a list of strings to an object with `ipv4` and `ipv6` fields. Most Azure resources can't accept a mix, so these are split out (**breaking**)
* Change the `worker_address_prefixes` output from a list of strings to an object with `ipv4` and `ipv6` fields. Most Azure resources can't accept a mix, so these are split out (**breaking**)
* Change the `backend_address_pool_id` output (and worker module input) from a string to an object with `ipv4` and `ipv6` fields that list ids (**breaking**)
* Configure nodes to have outbound IPv6 internet connectivity (analogous to IPv4 SNAT)
* Configure controller nodes to have a public IPv6 address
* Configure worker nodes to use outbound rules and the load balancer for SNAT
* Extend network security rules to allow IPv6 traffic, analogous to IPv4
* Rename `region` variable to `location` to align with Azure platform conventions ([#1469](https://github.com/poseidon/typhoon/pull/1469))
* Change worker pools from uniform to flexible orchestration mode ([#1473](https://github.com/poseidon/typhoon/pull/1473))
* Add options to allow workers nodes to use ephemeral local disks ([#1473](https://github.com/poseidon/typhoon/pull/1473))
* Add `controller_disk_type` and `controller_disk_size` variables
* Add `worker_disk_type`, `worker_disk_size`, and `worker_ephemeral_disk` variables
* Reduce the number of public IPv4 addresses needed for the Azure load balancer ([#1470](https://github.com/poseidon/typhoon/pull/1470))
* Configure controller or worker instance architecture for Flatcar Linux ([#1485](https://github.com/poseidon/typhoon/pull/1485))
* Add `controller_arch` and `worker_arch` variables (`amd64` or `arm64`)
* Remove `arch` variable
```diff
module "cluster" {
...
- region = "centralus"
+ location = "centralus"
# optional
- host_cidr = "10.0.0.0/16"
+ network_cidr = {
+ ipv4 = ["10.0.0.0/16"]
+ }
# instances
+ controller_disk_type = "StandardSSD_LRS"
+ worker_ephemeral_disk = true
}
```
### Google Cloud
* Allow configuring controller and worker disks ([#1486](https://github.com/poseidon/typhoon/pull/1486))
* Add `controller_disk_size` and `worker_disk_size` variables
* Remove `disk_size` variable
## v1.30.2
* Kubernetes [v1.30.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.30.md#v1302)
* Update CoreDNS from v1.9.4 to v1.11.1
* Update Cilium from v1.15.5 to [v1.15.6](https://github.com/cilium/cilium/releases/tag/v1.15.6)
* Update flannel from v0.25.1 to [v0.25.4](https://github.com/flannel-io/flannel/releases/tag/v0.25.4)
## v1.30.1
* Kubernetes [v1.30.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.30.md#v1301)

View File

@ -18,7 +18,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.30.2 (upstream)
* Kubernetes v1.31.0 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/flatcar-linux/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
@ -78,7 +78,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.31.0"
# Google Cloud
cluster_name = "yavin"
@ -117,9 +117,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
$ kubectl get nodes
NAME ROLES STATUS AGE VERSION
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.30.2
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.30.2
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.30.2
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.31.0
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.31.0
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.31.0
```
List the pods.
@ -127,9 +127,10 @@ List the pods.
```
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-node-1cs8z 2/2 Running 0 6m
kube-system calico-node-d1l5b 2/2 Running 0 6m
kube-system calico-node-sp9ps 2/2 Running 0 6m
kube-system cilium-1cs8z 1/1 Running 0 6m
kube-system cilium-d1l5b 1/1 Running 0 6m
kube-system cilium-sp9ps 1/1 Running 0 6m
kube-system cilium-operator-68d778b448-g744f 1/1 Running 0 6m
kube-system coredns-1187388186-zj5dl 1/1 Running 0 6m
kube-system coredns-1187388186-dkh3o 1/1 Running 0 6m
kube-system kube-apiserver-controller-0 1/1 Running 0 6m

View File

@ -128,8 +128,8 @@ resource "kubernetes_config_map" "cilium" {
enable-bpf-masquerade = "true"
# kube-proxy
kube-proxy-replacement = "false"
kube-proxy-replacement-healthz-bind-address = ""
kube-proxy-replacement = "true"
kube-proxy-replacement-healthz-bind-address = ":10256"
enable-session-affinity = "true"
# ClusterIPs from host namespace

View File

@ -61,7 +61,7 @@ resource "kubernetes_daemonset" "cilium" {
# https://github.com/cilium/cilium/pull/24075
init_container {
name = "install-cni"
image = "quay.io/cilium/cilium:v1.15.6"
image = "quay.io/cilium/cilium:v1.16.1"
command = ["/install-plugin.sh"]
security_context {
allow_privilege_escalation = true
@ -80,7 +80,7 @@ resource "kubernetes_daemonset" "cilium" {
# We use nsenter command with host's cgroup and mount namespaces enabled.
init_container {
name = "mount-cgroup"
image = "quay.io/cilium/cilium:v1.15.6"
image = "quay.io/cilium/cilium:v1.16.1"
command = [
"sh",
"-ec",
@ -115,7 +115,7 @@ resource "kubernetes_daemonset" "cilium" {
init_container {
name = "clean-cilium-state"
image = "quay.io/cilium/cilium:v1.15.6"
image = "quay.io/cilium/cilium:v1.16.1"
command = ["/init-container.sh"]
security_context {
allow_privilege_escalation = true
@ -139,7 +139,7 @@ resource "kubernetes_daemonset" "cilium" {
container {
name = "cilium-agent"
image = "quay.io/cilium/cilium:v1.15.6"
image = "quay.io/cilium/cilium:v1.16.1"
command = ["cilium-agent"]
args = [
"--config-dir=/tmp/cilium/config-map"

View File

@ -58,7 +58,7 @@ resource "kubernetes_deployment" "operator" {
enable_service_links = false
container {
name = "cilium-operator"
image = "quay.io/cilium/operator-generic:v1.15.6"
image = "quay.io/cilium/operator-generic:v1.16.1"
command = ["cilium-operator-generic"]
args = [
"--config-dir=/tmp/cilium/config-map",

View File

@ -77,7 +77,7 @@ resource "kubernetes_deployment" "coredns" {
}
container {
name = "coredns"
image = "registry.k8s.io/coredns/coredns:v1.11.1"
image = "registry.k8s.io/coredns/coredns:v1.11.3"
args = ["-conf", "/etc/coredns/Corefile"]
port {
name = "dns"

View File

@ -73,7 +73,7 @@ resource "kubernetes_daemonset" "flannel" {
container {
name = "flannel"
image = "docker.io/flannel/flannel:v0.25.4"
image = "docker.io/flannel/flannel:v0.25.5"
command = [
"/opt/bin/flanneld",
"--ip-masq",

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.30.2 (upstream)
* Kubernetes v1.31.0 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/fedora-coreos/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -19,7 +19,7 @@ data "aws_ami" "fedora-coreos" {
}
data "aws_ami" "fedora-coreos-arm" {
count = var.arch == "arm64" ? 1 : 0
count = var.controller_arch == "arm64" ? 1 : 0
most_recent = true
owners = ["125523088429"]

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=886f501bf7b624fc12acac83449b81d0dc8b8849"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=1ddecb1cef65c9715ed66b6c335634bc51f59613"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -9,9 +9,6 @@ module "bootstrap" {
network_mtu = var.network_mtu
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
daemonset_tolerations = var.daemonset_tolerations
components = var.components
}

View File

@ -57,7 +57,7 @@ systemd:
After=afterburn.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
EnvironmentFile=/run/metadata/afterburn
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -116,7 +116,7 @@ systemd:
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
quay.io/poseidon/kubelet:v1.30.2
quay.io/poseidon/kubelet:v1.31.0
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -149,7 +149,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -20,18 +20,18 @@ resource "aws_instance" "controllers" {
tags = {
Name = "${var.cluster_name}-controller-${count.index}"
}
instance_type = var.controller_type
ami = var.arch == "arm64" ? data.aws_ami.fedora-coreos-arm[0].image_id : data.aws_ami.fedora-coreos.image_id
user_data = data.ct_config.controllers.*.rendered[count.index]
ami = var.controller_arch == "arm64" ? data.aws_ami.fedora-coreos-arm[0].image_id : data.aws_ami.fedora-coreos.image_id
# storage
root_block_device {
volume_type = var.disk_type
volume_size = var.disk_size
iops = var.disk_iops
volume_type = var.controller_disk_type
volume_size = var.controller_disk_size
iops = var.controller_disk_iops
encrypted = true
tags = {}
tags = {
Name = "${var.cluster_name}-controller-${count.index}"
}
}
# network
@ -39,6 +39,14 @@ resource "aws_instance" "controllers" {
subnet_id = element(aws_subnet.public.*.id, count.index)
vpc_security_group_ids = [aws_security_group.controller.id]
# boot
user_data = data.ct_config.controllers.*.rendered[count.index]
# cost
credit_specification {
cpu_credits = var.controller_cpu_credits
}
lifecycle {
ignore_changes = [
ami,
@ -61,7 +69,6 @@ data "ct_config" "controllers" {
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
})
strict = true
snippets = var.controller_snippets

View File

@ -47,17 +47,25 @@ resource "aws_route" "egress-ipv6" {
resource "aws_subnet" "public" {
count = length(data.aws_availability_zones.all.names)
vpc_id = aws_vpc.network.id
availability_zone = data.aws_availability_zones.all.names[count.index]
cidr_block = cidrsubnet(var.host_cidr, 4, count.index)
ipv6_cidr_block = cidrsubnet(aws_vpc.network.ipv6_cidr_block, 8, count.index)
map_public_ip_on_launch = true
assign_ipv6_address_on_creation = true
tags = {
"Name" = "${var.cluster_name}-public-${count.index}"
}
vpc_id = aws_vpc.network.id
availability_zone = data.aws_availability_zones.all.names[count.index]
# IPv4 and IPv6 CIDR blocks
cidr_block = cidrsubnet(var.host_cidr, 4, count.index)
ipv6_cidr_block = cidrsubnet(aws_vpc.network.ipv6_cidr_block, 8, count.index)
# Assign IPv4 and IPv6 addresses to instances
map_public_ip_on_launch = true
assign_ipv6_address_on_creation = true
# Hostnames assigned to instances
# resource-name: <ec2-instance-id>.region.compute.internal
private_dns_hostname_type_on_launch = "resource-name"
enable_resource_name_dns_a_record_on_launch = true
enable_resource_name_dns_aaaa_record_on_launch = true
}
resource "aws_route_table_association" "public" {

View File

@ -17,30 +17,6 @@ variable "dns_zone_id" {
# instances
variable "controller_count" {
type = number
description = "Number of controllers (i.e. masters)"
default = 1
}
variable "worker_count" {
type = number
description = "Number of workers"
default = 1
}
variable "controller_type" {
type = string
description = "EC2 instance type for controllers"
default = "t3.small"
}
variable "worker_type" {
type = string
description = "EC2 instance type for workers"
default = "t3.small"
}
variable "os_stream" {
type = string
description = "Fedora CoreOS image stream for instances (e.g. stable, testing, next)"
@ -52,24 +28,78 @@ variable "os_stream" {
}
}
variable "disk_size" {
variable "controller_count" {
type = number
description = "Number of controllers (i.e. masters)"
default = 1
}
variable "controller_type" {
type = string
description = "EC2 instance type for controllers"
default = "t3.small"
}
variable "controller_disk_size" {
type = number
description = "Size of the EBS volume in GB"
default = 30
}
variable "disk_type" {
variable "controller_disk_type" {
type = string
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
default = "gp3"
}
variable "disk_iops" {
variable "controller_disk_iops" {
type = number
description = "IOPS of the EBS volume (e.g. 3000)"
default = 3000
}
variable "controller_cpu_credits" {
type = string
description = "CPU credits mode (if using a burstable instance type)"
default = null
}
variable "worker_count" {
type = number
description = "Number of workers"
default = 1
}
variable "worker_type" {
type = string
description = "EC2 instance type for workers"
default = "t3.small"
}
variable "worker_disk_size" {
type = number
description = "Size of the EBS volume in GB"
default = 30
}
variable "worker_disk_type" {
type = string
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
default = "gp3"
}
variable "worker_disk_iops" {
type = number
description = "IOPS of the EBS volume (e.g. 3000)"
default = 3000
}
variable "worker_cpu_credits" {
type = string
description = "CPU credits mode (if using a burstable instance type)"
default = null
}
variable "worker_price" {
type = number
description = "Spot price in USD for worker instances or 0 to use on-demand instances"
@ -134,40 +164,31 @@ EOD
default = "10.3.0.0/16"
}
variable "enable_reporting" {
type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = false
}
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
variable "worker_node_labels" {
type = list(string)
description = "List of initial worker node labels"
default = []
}
# unofficial, undocumented, unsupported
# advanced
variable "cluster_domain_suffix" {
variable "controller_arch" {
type = string
description = "Queries for domains with the suffix will be answered by CoreDNS. Default is cluster.local (e.g. foo.default.svc.cluster.local)"
default = "cluster.local"
description = "Controller node(s) architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = contains(["amd64", "arm64"], var.controller_arch)
error_message = "The controller_arch must be amd64 or arm64."
}
}
variable "arch" {
variable "worker_arch" {
type = string
description = "Container architecture (amd64 or arm64)"
description = "Worker node(s) architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = var.arch == "amd64" || var.arch == "arm64"
error_message = "The arch must be amd64 or arm64."
condition = contains(["amd64", "arm64"], var.worker_arch)
error_message = "The worker_arch must be amd64 or arm64."
}
}

View File

@ -6,20 +6,24 @@ module "workers" {
vpc_id = aws_vpc.network.id
subnet_ids = aws_subnet.public.*.id
security_groups = [aws_security_group.worker.id]
worker_count = var.worker_count
instance_type = var.worker_type
os_stream = var.os_stream
arch = var.arch
disk_size = var.disk_size
spot_price = var.worker_price
target_groups = var.worker_target_groups
# instances
os_stream = var.os_stream
worker_count = var.worker_count
instance_type = var.worker_type
arch = var.worker_arch
disk_type = var.worker_disk_type
disk_size = var.worker_disk_size
disk_iops = var.worker_disk_iops
cpu_credits = var.worker_cpu_credits
spot_price = var.worker_price
target_groups = var.worker_target_groups
# configuration
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
snippets = var.worker_snippets
node_labels = var.worker_node_labels
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
service_cidr = var.service_cidr
snippets = var.worker_snippets
node_labels = var.worker_node_labels
}

View File

@ -29,7 +29,7 @@ systemd:
After=afterburn.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
EnvironmentFile=/run/metadata/afterburn
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -104,7 +104,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -69,6 +69,12 @@ variable "spot_price" {
default = 0
}
variable "cpu_credits" {
type = string
description = "CPU burst credits mode (if applicable)"
default = null
}
variable "target_groups" {
type = list(string)
description = "Additional target group ARNs to which instances should be added"
@ -102,12 +108,6 @@ EOD
default = "10.3.0.0/16"
}
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}
variable "node_labels" {
type = list(string)
description = "List of initial node labels"
@ -120,15 +120,14 @@ variable "node_taints" {
default = []
}
# unofficial, undocumented, unsupported
# advanced
variable "arch" {
type = string
description = "Container architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = var.arch == "amd64" || var.arch == "arm64"
condition = contains(["amd64", "arm64"], var.arch)
error_message = "The arch must be amd64 or arm64."
}
}

View File

@ -3,16 +3,14 @@ resource "aws_autoscaling_group" "workers" {
name = "${var.name}-worker"
# count
desired_capacity = var.worker_count
min_size = var.worker_count
max_size = var.worker_count + 2
default_cooldown = 30
health_check_grace_period = 30
desired_capacity = var.worker_count
min_size = var.worker_count
max_size = var.worker_count + 2
# network
vpc_zone_identifier = var.subnet_ids
# template
# instance template
launch_template {
id = aws_launch_template.worker.id
version = aws_launch_template.worker.latest_version
@ -32,6 +30,11 @@ resource "aws_autoscaling_group" "workers" {
min_healthy_percentage = 90
}
}
# Grace period before checking new instance's health
health_check_grace_period = 30
# Cooldown period between scaling activities
default_cooldown = 30
lifecycle {
# override the default destroy and replace update behavior
@ -56,11 +59,6 @@ resource "aws_launch_template" "worker" {
name_prefix = "${var.name}-worker"
image_id = local.ami_id
instance_type = var.instance_type
monitoring {
enabled = false
}
user_data = sensitive(base64encode(data.ct_config.worker.rendered))
# storage
ebs_optimized = true
@ -76,14 +74,26 @@ resource "aws_launch_template" "worker" {
}
# network
vpc_security_group_ids = var.security_groups
network_interfaces {
associate_public_ip_address = true
security_groups = var.security_groups
}
# boot
user_data = sensitive(base64encode(data.ct_config.worker.rendered))
# metadata
metadata_options {
http_tokens = "optional"
}
monitoring {
enabled = false
}
# spot
# cost
credit_specification {
cpu_credits = var.cpu_credits
}
dynamic "instance_market_options" {
for_each = var.spot_price > 0 ? [1] : []
content {
@ -107,7 +117,6 @@ data "ct_config" "worker" {
kubeconfig = indent(10, var.kubeconfig)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
node_taints = join(",", var.node_taints)
})

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.30.2 (upstream)
* Kubernetes v1.31.0 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/flatcar-linux/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,7 +1,7 @@
locals {
# Pick a Flatcar Linux AMI
# flatcar-stable -> Flatcar Linux AMI
ami_id = var.arch == "arm64" ? data.aws_ami.flatcar-arm64[0].image_id : data.aws_ami.flatcar.image_id
ami_id = var.controller_arch == "arm64" ? data.aws_ami.flatcar-arm64[0].image_id : data.aws_ami.flatcar.image_id
channel = split("-", var.os_image)[1]
}
@ -26,7 +26,7 @@ data "aws_ami" "flatcar" {
}
data "aws_ami" "flatcar-arm64" {
count = var.arch == "arm64" ? 1 : 0
count = var.controller_arch == "arm64" ? 1 : 0
most_recent = true
owners = ["075585003325"]

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=886f501bf7b624fc12acac83449b81d0dc8b8849"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=1ddecb1cef65c9715ed66b6c335634bc51f59613"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -9,9 +9,6 @@ module "bootstrap" {
network_mtu = var.network_mtu
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
daemonset_tolerations = var.daemonset_tolerations
components = var.components
}

View File

@ -58,7 +58,7 @@ systemd:
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -109,7 +109,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStart=/usr/bin/docker run \
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
-v /opt/bootstrap/assets:/assets:ro \
@ -148,7 +148,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -20,19 +20,18 @@ resource "aws_instance" "controllers" {
tags = {
Name = "${var.cluster_name}-controller-${count.index}"
}
instance_type = var.controller_type
ami = local.ami_id
user_data = data.ct_config.controllers.*.rendered[count.index]
ami = local.ami_id
# storage
root_block_device {
volume_type = var.disk_type
volume_size = var.disk_size
iops = var.disk_iops
volume_type = var.controller_disk_type
volume_size = var.controller_disk_size
iops = var.controller_disk_iops
encrypted = true
tags = {}
tags = {
Name = "${var.cluster_name}-controller-${count.index}"
}
}
# network
@ -40,6 +39,14 @@ resource "aws_instance" "controllers" {
subnet_id = element(aws_subnet.public.*.id, count.index)
vpc_security_group_ids = [aws_security_group.controller.id]
# boot
user_data = data.ct_config.controllers.*.rendered[count.index]
# cost
credit_specification {
cpu_credits = var.controller_cpu_credits
}
lifecycle {
ignore_changes = [
ami,
@ -62,7 +69,6 @@ data "ct_config" "controllers" {
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
})
strict = true
snippets = var.controller_snippets

View File

@ -47,17 +47,25 @@ resource "aws_route" "egress-ipv6" {
resource "aws_subnet" "public" {
count = length(data.aws_availability_zones.all.names)
vpc_id = aws_vpc.network.id
availability_zone = data.aws_availability_zones.all.names[count.index]
cidr_block = cidrsubnet(var.host_cidr, 4, count.index)
ipv6_cidr_block = cidrsubnet(aws_vpc.network.ipv6_cidr_block, 8, count.index)
map_public_ip_on_launch = true
assign_ipv6_address_on_creation = true
tags = {
"Name" = "${var.cluster_name}-public-${count.index}"
}
vpc_id = aws_vpc.network.id
availability_zone = data.aws_availability_zones.all.names[count.index]
# IPv4 and IPv6 CIDR blocks
cidr_block = cidrsubnet(var.host_cidr, 4, count.index)
ipv6_cidr_block = cidrsubnet(aws_vpc.network.ipv6_cidr_block, 8, count.index)
# Assign IPv4 and IPv6 addresses to instances
map_public_ip_on_launch = true
assign_ipv6_address_on_creation = true
# Hostnames assigned to instances
# resource-name: <ec2-instance-id>.region.compute.internal
private_dns_hostname_type_on_launch = "resource-name"
enable_resource_name_dns_a_record_on_launch = true
enable_resource_name_dns_aaaa_record_on_launch = true
}
resource "aws_route_table_association" "public" {

View File

@ -17,30 +17,6 @@ variable "dns_zone_id" {
# instances
variable "controller_count" {
type = number
description = "Number of controllers (i.e. masters)"
default = 1
}
variable "worker_count" {
type = number
description = "Number of workers"
default = 1
}
variable "controller_type" {
type = string
description = "EC2 instance type for controllers"
default = "t3.small"
}
variable "worker_type" {
type = string
description = "EC2 instance type for workers"
default = "t3.small"
}
variable "os_image" {
type = string
description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
@ -52,24 +28,78 @@ variable "os_image" {
}
}
variable "disk_size" {
variable "controller_count" {
type = number
description = "Number of controllers (i.e. masters)"
default = 1
}
variable "controller_type" {
type = string
description = "EC2 instance type for controllers"
default = "t3.small"
}
variable "controller_disk_size" {
type = number
description = "Size of the EBS volume in GB"
default = 30
}
variable "disk_type" {
variable "controller_disk_type" {
type = string
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
default = "gp3"
}
variable "disk_iops" {
variable "controller_disk_iops" {
type = number
description = "IOPS of the EBS volume (e.g. 3000)"
default = 3000
}
variable "controller_cpu_credits" {
type = string
description = "CPU credits mode (if using a burstable instance type)"
default = null
}
variable "worker_count" {
type = number
description = "Number of workers"
default = 1
}
variable "worker_type" {
type = string
description = "EC2 instance type for workers"
default = "t3.small"
}
variable "worker_disk_size" {
type = number
description = "Size of the EBS volume in GB"
default = 30
}
variable "worker_disk_type" {
type = string
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
default = "gp3"
}
variable "worker_disk_iops" {
type = number
description = "IOPS of the EBS volume (e.g. 3000)"
default = 3000
}
variable "worker_cpu_credits" {
type = string
description = "CPU credits mode (if using a burstable instance type)"
default = null
}
variable "worker_price" {
type = number
description = "Spot price in USD for worker instances or 0 to use on-demand instances"
@ -134,40 +164,31 @@ EOD
default = "10.3.0.0/16"
}
variable "enable_reporting" {
type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = false
}
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
variable "worker_node_labels" {
type = list(string)
description = "List of initial worker node labels"
default = []
}
# unofficial, undocumented, unsupported
# advanced
variable "cluster_domain_suffix" {
variable "controller_arch" {
type = string
description = "Queries for domains with the suffix will be answered by CoreDNS. Default is cluster.local (e.g. foo.default.svc.cluster.local)"
default = "cluster.local"
description = "Controller node(s) architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = contains(["amd64", "arm64"], var.controller_arch)
error_message = "The controller_arch must be amd64 or arm64."
}
}
variable "arch" {
variable "worker_arch" {
type = string
description = "Container architecture (amd64 or arm64)"
description = "Worker node(s) architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = var.arch == "amd64" || var.arch == "arm64"
error_message = "The arch must be amd64 or arm64."
condition = contains(["amd64", "arm64"], var.worker_arch)
error_message = "The worker_arch must be amd64 or arm64."
}
}

View File

@ -6,20 +6,23 @@ module "workers" {
vpc_id = aws_vpc.network.id
subnet_ids = aws_subnet.public.*.id
security_groups = [aws_security_group.worker.id]
worker_count = var.worker_count
instance_type = var.worker_type
os_image = var.os_image
arch = var.arch
disk_size = var.disk_size
spot_price = var.worker_price
target_groups = var.worker_target_groups
# instances
os_image = var.os_image
worker_count = var.worker_count
instance_type = var.worker_type
arch = var.worker_arch
disk_type = var.worker_disk_type
disk_size = var.worker_disk_size
disk_iops = var.worker_disk_iops
spot_price = var.worker_price
target_groups = var.worker_target_groups
# configuration
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
snippets = var.worker_snippets
node_labels = var.worker_node_labels
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
service_cidr = var.service_cidr
snippets = var.worker_snippets
node_labels = var.worker_node_labels
}

View File

@ -30,7 +30,7 @@ systemd:
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -103,7 +103,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -69,6 +69,12 @@ variable "spot_price" {
default = 0
}
variable "cpu_credits" {
type = string
description = "CPU burst credits mode (if applicable)"
default = null
}
variable "target_groups" {
type = list(string)
description = "Additional target group ARNs to which instances should be added"
@ -102,12 +108,6 @@ EOD
default = "10.3.0.0/16"
}
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}
variable "node_labels" {
type = list(string)
description = "List of initial node labels"
@ -128,7 +128,7 @@ variable "arch" {
default = "amd64"
validation {
condition = var.arch == "amd64" || var.arch == "arm64"
condition = contains(["amd64", "arm64"], var.arch)
error_message = "The arch must be amd64 or arm64."
}
}

View File

@ -3,16 +3,14 @@ resource "aws_autoscaling_group" "workers" {
name = "${var.name}-worker"
# count
desired_capacity = var.worker_count
min_size = var.worker_count
max_size = var.worker_count + 2
default_cooldown = 30
health_check_grace_period = 30
desired_capacity = var.worker_count
min_size = var.worker_count
max_size = var.worker_count + 2
# network
vpc_zone_identifier = var.subnet_ids
# template
# instance template
launch_template {
id = aws_launch_template.worker.id
version = aws_launch_template.worker.latest_version
@ -32,6 +30,10 @@ resource "aws_autoscaling_group" "workers" {
min_healthy_percentage = 90
}
}
# Grace period before checking new instance's health
health_check_grace_period = 30
# Cooldown period between scaling activities
default_cooldown = 30
lifecycle {
# override the default destroy and replace update behavior
@ -56,11 +58,6 @@ resource "aws_launch_template" "worker" {
name_prefix = "${var.name}-worker"
image_id = local.ami_id
instance_type = var.instance_type
monitoring {
enabled = false
}
user_data = sensitive(base64encode(data.ct_config.worker.rendered))
# storage
ebs_optimized = true
@ -76,14 +73,26 @@ resource "aws_launch_template" "worker" {
}
# network
vpc_security_group_ids = var.security_groups
network_interfaces {
associate_public_ip_address = true
security_groups = var.security_groups
}
# boot
user_data = sensitive(base64encode(data.ct_config.worker.rendered))
# metadata
metadata_options {
http_tokens = "optional"
}
monitoring {
enabled = false
}
# spot
# cost
credit_specification {
cpu_credits = var.cpu_credits
}
dynamic "instance_market_options" {
for_each = var.spot_price > 0 ? [1] : []
content {
@ -107,7 +116,6 @@ data "ct_config" "worker" {
kubeconfig = indent(10, var.kubeconfig)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
node_taints = join(",", var.node_taints)
})

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.30.2 (upstream)
* Kubernetes v1.31.0 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot priority](https://typhoon.psdn.io/fedora-coreos/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=886f501bf7b624fc12acac83449b81d0dc8b8849"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=1ddecb1cef65c9715ed66b6c335634bc51f59613"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -14,9 +14,6 @@ module "bootstrap" {
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
daemonset_tolerations = var.daemonset_tolerations
components = var.components
}

View File

@ -54,7 +54,7 @@ systemd:
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -111,7 +111,7 @@ systemd:
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
quay.io/poseidon/kubelet:v1.30.2
quay.io/poseidon/kubelet:v1.31.0
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -144,7 +144,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -8,26 +8,23 @@ locals {
# Discrete DNS records for each controller's private IPv4 for etcd usage
resource "azurerm_dns_a_record" "etcds" {
count = var.controller_count
resource_group_name = var.dns_zone_group
count = var.controller_count
# DNS Zone name where record should be created
zone_name = var.dns_zone
zone_name = var.dns_zone
resource_group_name = var.dns_zone_group
# DNS record
name = format("%s-etcd%d", var.cluster_name, count.index)
ttl = 300
# private IPv4 address for etcd
records = [azurerm_network_interface.controllers.*.private_ip_address[count.index]]
records = [azurerm_network_interface.controllers[count.index].private_ip_address]
}
# Controller availability set to spread controllers
resource "azurerm_availability_set" "controllers" {
resource_group_name = azurerm_resource_group.cluster.name
name = "${var.cluster_name}-controllers"
location = var.region
resource_group_name = azurerm_resource_group.cluster.name
location = var.location
platform_fault_domain_count = 2
platform_update_domain_count = 4
managed = true
@ -35,31 +32,35 @@ resource "azurerm_availability_set" "controllers" {
# Controller instances
resource "azurerm_linux_virtual_machine" "controllers" {
count = var.controller_count
resource_group_name = azurerm_resource_group.cluster.name
count = var.controller_count
name = "${var.cluster_name}-controller-${count.index}"
location = var.region
resource_group_name = azurerm_resource_group.cluster.name
location = var.location
availability_set_id = azurerm_availability_set.controllers.id
size = var.controller_type
custom_data = base64encode(data.ct_config.controllers.*.rendered[count.index])
size = var.controller_type
# storage
source_image_id = var.os_image
os_disk {
name = "${var.cluster_name}-controller-${count.index}"
storage_account_type = var.controller_disk_type
disk_size_gb = var.controller_disk_size
caching = "None"
disk_size_gb = var.disk_size
storage_account_type = "Premium_LRS"
}
# network
network_interface_ids = [
azurerm_network_interface.controllers.*.id[count.index]
azurerm_network_interface.controllers[count.index].id
]
# Azure requires setting admin_ssh_key, though Ignition custom_data handles it too
# boot
custom_data = base64encode(data.ct_config.controllers[count.index].rendered)
boot_diagnostics {
# defaults to a managed storage account
}
# Azure requires an RSA admin_ssh_key
admin_username = "core"
admin_ssh_key {
username = "core"
@ -74,31 +75,52 @@ resource "azurerm_linux_virtual_machine" "controllers" {
}
}
# Controller public IPv4 addresses
resource "azurerm_public_ip" "controllers" {
count = var.controller_count
resource_group_name = azurerm_resource_group.cluster.name
# Controller node public IPv4 addresses
resource "azurerm_public_ip" "controllers-ipv4" {
count = var.controller_count
name = "${var.cluster_name}-controller-${count.index}"
location = azurerm_resource_group.cluster.location
sku = "Standard"
allocation_method = "Static"
name = "${var.cluster_name}-controller-${count.index}-ipv4"
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location
ip_version = "IPv4"
sku = "Standard"
allocation_method = "Static"
}
# Controller NICs with public and private IPv4
resource "azurerm_network_interface" "controllers" {
count = var.controller_count
resource_group_name = azurerm_resource_group.cluster.name
# Controller node public IPv6 addresses
resource "azurerm_public_ip" "controllers-ipv6" {
count = var.controller_count
name = "${var.cluster_name}-controller-${count.index}"
location = azurerm_resource_group.cluster.location
name = "${var.cluster_name}-controller-${count.index}-ipv6"
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location
ip_version = "IPv6"
sku = "Standard"
allocation_method = "Static"
}
# Controllers' network interfaces
resource "azurerm_network_interface" "controllers" {
count = var.controller_count
name = "${var.cluster_name}-controller-${count.index}"
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location
ip_configuration {
name = "ip0"
name = "ipv4"
primary = true
subnet_id = azurerm_subnet.controller.id
private_ip_address_allocation = "Dynamic"
# instance public IPv4
public_ip_address_id = azurerm_public_ip.controllers.*.id[count.index]
private_ip_address_version = "IPv4"
public_ip_address_id = azurerm_public_ip.controllers-ipv4[count.index].id
}
ip_configuration {
name = "ipv6"
subnet_id = azurerm_subnet.controller.id
private_ip_address_allocation = "Dynamic"
private_ip_address_version = "IPv6"
public_ip_address_id = azurerm_public_ip.controllers-ipv6[count.index].id
}
}
@ -111,12 +133,20 @@ resource "azurerm_network_interface_security_group_association" "controllers" {
}
# Associate controller network interface with controller backend address pool
resource "azurerm_network_interface_backend_address_pool_association" "controllers" {
resource "azurerm_network_interface_backend_address_pool_association" "controllers-ipv4" {
count = var.controller_count
network_interface_id = azurerm_network_interface.controllers[count.index].id
ip_configuration_name = "ip0"
backend_address_pool_id = azurerm_lb_backend_address_pool.controller.id
ip_configuration_name = "ipv4"
backend_address_pool_id = azurerm_lb_backend_address_pool.controller-ipv4.id
}
resource "azurerm_network_interface_backend_address_pool_association" "controllers-ipv6" {
count = var.controller_count
network_interface_id = azurerm_network_interface.controllers[count.index].id
ip_configuration_name = "ipv6"
backend_address_pool_id = azurerm_lb_backend_address_pool.controller-ipv6.id
}
# Fedora CoreOS controllers
@ -133,7 +163,6 @@ data "ct_config" "controllers" {
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
})
strict = true
snippets = var.controller_snippets

View File

@ -1,116 +1,164 @@
# DNS record for the apiserver load balancer
# DNS A record for the apiserver load balancer
resource "azurerm_dns_a_record" "apiserver" {
resource_group_name = var.dns_zone_group
# DNS Zone name where record should be created
zone_name = var.dns_zone
zone_name = var.dns_zone
resource_group_name = var.dns_zone_group
# DNS record
name = var.cluster_name
ttl = 300
# IPv4 address of apiserver load balancer
records = [azurerm_public_ip.apiserver-ipv4.ip_address]
records = [azurerm_public_ip.frontend-ipv4.ip_address]
}
# Static IPv4 address for the apiserver frontend
resource "azurerm_public_ip" "apiserver-ipv4" {
resource_group_name = azurerm_resource_group.cluster.name
name = "${var.cluster_name}-apiserver-ipv4"
location = var.region
sku = "Standard"
allocation_method = "Static"
# DNS AAAA record for the apiserver load balancer
resource "azurerm_dns_aaaa_record" "apiserver" {
# DNS Zone name where record should be created
zone_name = var.dns_zone
resource_group_name = var.dns_zone_group
# DNS record
name = var.cluster_name
ttl = 300
# IPv4 address of apiserver load balancer
records = [azurerm_public_ip.frontend-ipv6.ip_address]
}
# Static IPv4 address for the ingress frontend
resource "azurerm_public_ip" "ingress-ipv4" {
# Static IPv4 address for the load balancer
resource "azurerm_public_ip" "frontend-ipv4" {
name = "${var.cluster_name}-frontend-ipv4"
resource_group_name = azurerm_resource_group.cluster.name
location = var.location
ip_version = "IPv4"
sku = "Standard"
allocation_method = "Static"
}
name = "${var.cluster_name}-ingress-ipv4"
location = var.region
sku = "Standard"
allocation_method = "Static"
# Static IPv6 address for the load balancer
resource "azurerm_public_ip" "frontend-ipv6" {
name = "${var.cluster_name}-frontend-ipv6"
resource_group_name = azurerm_resource_group.cluster.name
location = var.location
ip_version = "IPv6"
sku = "Standard"
allocation_method = "Static"
}
# Network Load Balancer for apiservers and ingress
resource "azurerm_lb" "cluster" {
name = var.cluster_name
resource_group_name = azurerm_resource_group.cluster.name
name = var.cluster_name
location = var.region
sku = "Standard"
location = var.location
sku = "Standard"
frontend_ip_configuration {
name = "apiserver"
public_ip_address_id = azurerm_public_ip.apiserver-ipv4.id
name = "frontend-ipv4"
public_ip_address_id = azurerm_public_ip.frontend-ipv4.id
}
frontend_ip_configuration {
name = "ingress"
public_ip_address_id = azurerm_public_ip.ingress-ipv4.id
name = "frontend-ipv6"
public_ip_address_id = azurerm_public_ip.frontend-ipv6.id
}
}
resource "azurerm_lb_rule" "apiserver" {
name = "apiserver"
resource "azurerm_lb_rule" "apiserver-ipv4" {
name = "apiserver-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "apiserver"
frontend_ip_configuration_name = "frontend-ipv4"
disable_outbound_snat = true
protocol = "Tcp"
frontend_port = 6443
backend_port = 6443
backend_address_pool_ids = [azurerm_lb_backend_address_pool.controller.id]
backend_address_pool_ids = [azurerm_lb_backend_address_pool.controller-ipv4.id]
probe_id = azurerm_lb_probe.apiserver.id
}
resource "azurerm_lb_rule" "ingress-http" {
name = "ingress-http"
resource "azurerm_lb_rule" "apiserver-ipv6" {
name = "apiserver-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress"
frontend_ip_configuration_name = "frontend-ipv6"
disable_outbound_snat = true
protocol = "Tcp"
frontend_port = 6443
backend_port = 6443
backend_address_pool_ids = [azurerm_lb_backend_address_pool.controller-ipv6.id]
probe_id = azurerm_lb_probe.apiserver.id
}
resource "azurerm_lb_rule" "ingress-http-ipv4" {
name = "ingress-http-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "frontend-ipv4"
disable_outbound_snat = true
protocol = "Tcp"
frontend_port = 80
backend_port = 80
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker.id]
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv4.id]
probe_id = azurerm_lb_probe.ingress.id
}
resource "azurerm_lb_rule" "ingress-https" {
name = "ingress-https"
resource "azurerm_lb_rule" "ingress-https-ipv4" {
name = "ingress-https-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress"
frontend_ip_configuration_name = "frontend-ipv4"
disable_outbound_snat = true
protocol = "Tcp"
frontend_port = 443
backend_port = 443
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker.id]
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv4.id]
probe_id = azurerm_lb_probe.ingress.id
}
# Worker outbound TCP/UDP SNAT
resource "azurerm_lb_outbound_rule" "worker-outbound" {
name = "worker"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration {
name = "ingress"
}
resource "azurerm_lb_rule" "ingress-http-ipv6" {
name = "ingress-http-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "frontend-ipv6"
disable_outbound_snat = true
protocol = "All"
backend_address_pool_id = azurerm_lb_backend_address_pool.worker.id
protocol = "Tcp"
frontend_port = 80
backend_port = 80
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv6.id]
probe_id = azurerm_lb_probe.ingress.id
}
resource "azurerm_lb_rule" "ingress-https-ipv6" {
name = "ingress-https-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "frontend-ipv6"
disable_outbound_snat = true
protocol = "Tcp"
frontend_port = 443
backend_port = 443
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv6.id]
probe_id = azurerm_lb_probe.ingress.id
}
# Backend Address Pools
# Address pool of controllers
resource "azurerm_lb_backend_address_pool" "controller" {
name = "controller"
resource "azurerm_lb_backend_address_pool" "controller-ipv4" {
name = "controller-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
}
resource "azurerm_lb_backend_address_pool" "controller-ipv6" {
name = "controller-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
}
# Address pool of workers
resource "azurerm_lb_backend_address_pool" "worker" {
name = "worker"
resource "azurerm_lb_backend_address_pool" "worker-ipv4" {
name = "worker-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
}
resource "azurerm_lb_backend_address_pool" "worker-ipv6" {
name = "worker-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
}
@ -122,10 +170,8 @@ resource "azurerm_lb_probe" "apiserver" {
loadbalancer_id = azurerm_lb.cluster.id
protocol = "Tcp"
port = 6443
# unhealthy threshold
number_of_probes = 3
number_of_probes = 3
interval_in_seconds = 5
}
@ -136,10 +182,29 @@ resource "azurerm_lb_probe" "ingress" {
protocol = "Http"
port = 10254
request_path = "/healthz"
# unhealthy threshold
number_of_probes = 3
number_of_probes = 3
interval_in_seconds = 5
}
# Outbound SNAT
resource "azurerm_lb_outbound_rule" "outbound-ipv4" {
name = "outbound-ipv4"
protocol = "All"
loadbalancer_id = azurerm_lb.cluster.id
backend_address_pool_id = azurerm_lb_backend_address_pool.worker-ipv4.id
frontend_ip_configuration {
name = "frontend-ipv4"
}
}
resource "azurerm_lb_outbound_rule" "outbound-ipv6" {
name = "outbound-ipv6"
protocol = "All"
loadbalancer_id = azurerm_lb.cluster.id
backend_address_pool_id = azurerm_lb_backend_address_pool.worker-ipv6.id
frontend_ip_configuration {
name = "frontend-ipv6"
}
}

View File

@ -0,0 +1,6 @@
locals {
backend_address_pool_ids = {
ipv4 = [azurerm_lb_backend_address_pool.worker-ipv4.id]
ipv6 = [azurerm_lb_backend_address_pool.worker-ipv6.id]
}
}

View File

@ -1,27 +1,64 @@
# Choose an IPv6 ULA subnet at random
# https://datatracker.ietf.org/doc/html/rfc4193
resource "random_id" "ula-netnum" {
byte_length = 5 # 40 bits
}
locals {
# fd00::/8 -> shift 40 -> 2^40 possible /48 subnets
ula-range = cidrsubnet("fd00::/8", 40, random_id.ula-netnum.dec)
network_cidr = {
ipv4 = var.network_cidr.ipv4
ipv6 = length(var.network_cidr.ipv6) > 0 ? var.network_cidr.ipv6 : [local.ula-range]
}
# Subdivide the virtual network into subnets
# - controllers use netnum 0
# - workers use netnum 1
controller_subnets = {
ipv4 = [for i, cidr in local.network_cidr.ipv4 : cidrsubnet(cidr, 1, 0)]
ipv6 = [for i, cidr in local.network_cidr.ipv6 : cidrsubnet(cidr, 16, 0)]
}
worker_subnets = {
ipv4 = [for i, cidr in local.network_cidr.ipv4 : cidrsubnet(cidr, 1, 1)]
ipv6 = [for i, cidr in local.network_cidr.ipv6 : cidrsubnet(cidr, 16, 1)]
}
cluster_subnets = {
ipv4 = concat(local.controller_subnets.ipv4, local.worker_subnets.ipv4)
ipv6 = concat(local.controller_subnets.ipv6, local.worker_subnets.ipv6)
}
}
# Organize cluster into a resource group
resource "azurerm_resource_group" "cluster" {
name = var.cluster_name
location = var.region
location = var.location
}
resource "azurerm_virtual_network" "network" {
name = var.cluster_name
resource_group_name = azurerm_resource_group.cluster.name
name = var.cluster_name
location = azurerm_resource_group.cluster.location
address_space = [var.host_cidr]
location = azurerm_resource_group.cluster.location
address_space = concat(
local.network_cidr.ipv4,
local.network_cidr.ipv6
)
}
# Subnets - separate subnets for controller and workers because Azure
# network security groups are based on IPv4 CIDR rather than instance
# tags like GCP or security group membership like AWS
# Subnets - separate subnets for controllers and workers because Azure
# network security groups are oriented around address prefixes rather
# than instance tags (GCP) or security group membership (AWS)
resource "azurerm_subnet" "controller" {
resource_group_name = azurerm_resource_group.cluster.name
name = "controller"
resource_group_name = azurerm_resource_group.cluster.name
virtual_network_name = azurerm_virtual_network.network.name
address_prefixes = [cidrsubnet(var.host_cidr, 1, 0)]
address_prefixes = concat(
local.controller_subnets.ipv4,
local.controller_subnets.ipv6,
)
default_outbound_access_enabled = false
}
resource "azurerm_subnet_network_security_group_association" "controller" {
@ -30,11 +67,14 @@ resource "azurerm_subnet_network_security_group_association" "controller" {
}
resource "azurerm_subnet" "worker" {
resource_group_name = azurerm_resource_group.cluster.name
name = "worker"
resource_group_name = azurerm_resource_group.cluster.name
virtual_network_name = azurerm_virtual_network.network.name
address_prefixes = [cidrsubnet(var.host_cidr, 1, 1)]
address_prefixes = concat(
local.worker_subnets.ipv4,
local.worker_subnets.ipv6,
)
default_outbound_access_enabled = false
}
resource "azurerm_subnet_network_security_group_association" "worker" {

View File

@ -6,13 +6,18 @@ output "kubeconfig-admin" {
# Outputs for Kubernetes Ingress
output "ingress_static_ipv4" {
value = azurerm_public_ip.ingress-ipv4.ip_address
value = azurerm_public_ip.frontend-ipv4.ip_address
description = "IPv4 address of the load balancer for distributing traffic to Ingress controllers"
}
output "ingress_static_ipv6" {
value = azurerm_public_ip.frontend-ipv6.ip_address
description = "IPv6 address of the load balancer for distributing traffic to Ingress controllers"
}
# Outputs for worker pools
output "region" {
output "location" {
value = azurerm_resource_group.cluster.location
}
@ -51,12 +56,12 @@ output "worker_security_group_name" {
output "controller_address_prefixes" {
description = "Controller network subnet CIDR addresses (for source/destination)"
value = azurerm_subnet.controller.address_prefixes
value = local.controller_subnets
}
output "worker_address_prefixes" {
description = "Worker network subnet CIDR addresses (for source/destination)"
value = azurerm_subnet.worker.address_prefixes
value = local.worker_subnets
}
# Outputs for custom load balancing
@ -66,9 +71,12 @@ output "loadbalancer_id" {
value = azurerm_lb.cluster.id
}
output "backend_address_pool_id" {
description = "ID of the worker backend address pool"
value = azurerm_lb_backend_address_pool.worker.id
output "backend_address_pool_ids" {
description = "IDs of the worker backend address pools"
value = {
ipv4 = [azurerm_lb_backend_address_pool.worker-ipv4.id]
ipv6 = [azurerm_lb_backend_address_pool.worker-ipv6.id]
}
}
# Outputs for debug

View File

@ -1,214 +1,223 @@
# Controller security group
resource "azurerm_network_security_group" "controller" {
name = "${var.cluster_name}-controller"
resource_group_name = azurerm_resource_group.cluster.name
name = "${var.cluster_name}-controller"
location = azurerm_resource_group.cluster.location
location = azurerm_resource_group.cluster.location
}
resource "azurerm_network_security_rule" "controller-icmp" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-icmp"
name = "allow-icmp-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "1995"
priority = 1995 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-ssh" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-ssh"
name = "allow-ssh-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2000"
priority = 2000 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "*"
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-etcd" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-etcd"
name = "allow-etcd-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2005"
priority = 2005 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2379-2380"
source_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.controller_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Allow Prometheus to scrape etcd metrics
resource "azurerm_network_security_rule" "controller-etcd-metrics" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-etcd-metrics"
name = "allow-etcd-metrics-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2010"
priority = 2010 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2381"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Allow Prometheus to scrape kube-proxy metrics
resource "azurerm_network_security_rule" "controller-kube-proxy" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-kube-proxy-metrics"
name = "allow-kube-proxy-metrics-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2011"
priority = 2012 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10249"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Allow Prometheus to scrape kube-scheduler and kube-controller-manager metrics
resource "azurerm_network_security_rule" "controller-kube-metrics" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-kube-metrics"
name = "allow-kube-metrics-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2012"
priority = 2014 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10257-10259"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-apiserver" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-apiserver"
name = "allow-apiserver-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2015"
priority = 2016 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "6443"
source_address_prefix = "*"
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-cilium-health" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
for_each = var.networking == "cilium" ? local.controller_subnets : {}
name = "allow-cilium-health"
name = "allow-cilium-health-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2018"
priority = 2018 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-cilium-metrics" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
for_each = var.networking == "cilium" ? local.controller_subnets : {}
name = "allow-cilium-metrics"
name = "allow-cilium-metrics-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2019"
priority = 2035 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9962-9965"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-vxlan"
name = "allow-vxlan-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2020"
priority = 2020 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "4789"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-linux-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-linux-vxlan"
name = "allow-linux-vxlan-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2021"
priority = 2022 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "controller-node-exporter" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-node-exporter"
name = "allow-node-exporter-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2025"
priority = 2025 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Allow apiserver to access kubelet's for exec, log, port-forward
resource "azurerm_network_security_rule" "controller-kubelet" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-kubelet"
name = "allow-kubelet-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2030"
priority = 2030 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound
@ -247,182 +256,189 @@ resource "azurerm_network_security_rule" "controller-deny-all" {
# Worker security group
resource "azurerm_network_security_group" "worker" {
name = "${var.cluster_name}-worker"
resource_group_name = azurerm_resource_group.cluster.name
name = "${var.cluster_name}-worker"
location = azurerm_resource_group.cluster.location
location = azurerm_resource_group.cluster.location
}
resource "azurerm_network_security_rule" "worker-icmp" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-icmp"
name = "allow-icmp-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "1995"
priority = 1995 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-ssh" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-ssh"
name = "allow-ssh-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2000"
priority = 2000 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.controller_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-http" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-http"
name = "allow-http-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2005"
priority = 2005 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "80"
source_address_prefix = "*"
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-https" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-https"
name = "allow-https-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2010"
priority = 2010 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "443"
source_address_prefix = "*"
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-cilium-health" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
for_each = var.networking == "cilium" ? local.worker_subnets : {}
name = "allow-cilium-health"
name = "allow-cilium-health-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2013"
priority = 2012 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-cilium-metrics" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
for_each = var.networking == "cilium" ? local.worker_subnets : {}
name = "allow-cilium-metrics"
name = "allow-cilium-metrics-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2014"
priority = 2014 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9962-9965"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-vxlan"
name = "allow-vxlan-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2015"
priority = 2016 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "4789"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-linux-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-linux-vxlan"
name = "allow-linux-vxlan-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2016"
priority = 2018 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "worker-node-exporter" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-node-exporter"
name = "allow-node-exporter-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2020"
priority = 2020 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
# Allow Prometheus to scrape kube-proxy
resource "azurerm_network_security_rule" "worker-kube-proxy" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-kube-proxy"
name = "allow-kube-proxy-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2024"
priority = 2024 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10249"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
# Allow apiserver to access kubelet's for exec, log, port-forward
resource "azurerm_network_security_rule" "worker-kubelet" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-kubelet"
name = "allow-kubelet-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2025"
priority = 2026 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound

View File

@ -18,7 +18,7 @@ resource "null_resource" "copy-controller-secrets" {
connection {
type = "ssh"
host = azurerm_public_ip.controllers.*.ip_address[count.index]
host = azurerm_public_ip.controllers-ipv4[count.index].ip_address
user = "core"
timeout = "15m"
}
@ -45,7 +45,7 @@ resource "null_resource" "bootstrap" {
connection {
type = "ssh"
host = azurerm_public_ip.controllers.*.ip_address[0]
host = azurerm_public_ip.controllers-ipv4[0].ip_address
user = "core"
timeout = "15m"
}

View File

@ -5,9 +5,9 @@ variable "cluster_name" {
# Azure
variable "region" {
variable "location" {
type = string
description = "Azure Region (e.g. centralus , see `az account list-locations --output table`)"
description = "Azure location (e.g. centralus , see `az account list-locations --output table`)"
}
variable "dns_zone" {
@ -22,41 +22,65 @@ variable "dns_zone_group" {
# instances
variable "os_image" {
type = string
description = "Fedora CoreOS image for instances"
}
variable "controller_count" {
type = number
description = "Number of controllers (i.e. masters)"
default = 1
}
variable "worker_count" {
type = number
description = "Number of workers"
default = 1
}
variable "controller_type" {
type = string
description = "Machine type for controllers (see `az vm list-skus --location centralus`)"
default = "Standard_B2s"
}
variable "controller_disk_type" {
type = string
description = "Type of managed disk for controller node(s)"
default = "Premium_LRS"
}
variable "controller_disk_size" {
type = number
description = "Size of the managed disk in GB for controller node(s)"
default = 30
}
variable "worker_count" {
type = number
description = "Number of workers"
default = 1
}
variable "worker_type" {
type = string
description = "Machine type for workers (see `az vm list-skus --location centralus`)"
default = "Standard_D2as_v5"
}
variable "os_image" {
variable "worker_disk_type" {
type = string
description = "Fedora CoreOS image for instances"
description = "Type of managed disk for worker nodes"
default = "Standard_LRS"
}
variable "disk_size" {
variable "worker_disk_size" {
type = number
description = "Size of the disk in GB"
description = "Size of the managed disk in GB for worker nodes"
default = 30
}
variable "worker_ephemeral_disk" {
type = bool
description = "Use ephemeral local disk instead of managed disk (requires vm_type with local storage)"
default = false
}
variable "worker_priority" {
type = string
description = "Set worker priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time."
@ -94,10 +118,15 @@ variable "networking" {
default = "cilium"
}
variable "host_cidr" {
type = string
description = "CIDR IPv4 range to assign to instances"
default = "10.0.0.0/16"
variable "network_cidr" {
type = object({
ipv4 = list(string)
ipv6 = optional(list(string), [])
})
description = "Virtual network CIDR ranges"
default = {
ipv4 = ["10.0.0.0/16"]
}
}
variable "pod_cidr" {
@ -115,31 +144,13 @@ EOD
default = "10.3.0.0/16"
}
variable "enable_reporting" {
type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = false
}
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
variable "worker_node_labels" {
type = list(string)
description = "List of initial worker node labels"
default = []
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}
# advanced
variable "daemonset_tolerations" {
type = list(string)

View File

@ -3,7 +3,7 @@
terraform {
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
azurerm = ">= 2.8, < 4.0"
azurerm = ">= 2.8"
null = ">= 2.1"
ct = {
source = "poseidon/ct"

View File

@ -3,23 +3,26 @@ module "workers" {
name = var.cluster_name
# Azure
resource_group_name = azurerm_resource_group.cluster.name
region = azurerm_resource_group.cluster.location
subnet_id = azurerm_subnet.worker.id
security_group_id = azurerm_network_security_group.worker.id
backend_address_pool_id = azurerm_lb_backend_address_pool.worker.id
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location
subnet_id = azurerm_subnet.worker.id
security_group_id = azurerm_network_security_group.worker.id
backend_address_pool_ids = local.backend_address_pool_ids
worker_count = var.worker_count
vm_type = var.worker_type
os_image = var.os_image
priority = var.worker_priority
# instances
os_image = var.os_image
worker_count = var.worker_count
vm_type = var.worker_type
disk_type = var.worker_disk_type
disk_size = var.worker_disk_size
ephemeral_disk = var.worker_ephemeral_disk
priority = var.worker_priority
# configuration
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
azure_authorized_key = var.azure_authorized_key
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
snippets = var.worker_snippets
node_labels = var.worker_node_labels
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
azure_authorized_key = var.azure_authorized_key
service_cidr = var.service_cidr
snippets = var.worker_snippets
node_labels = var.worker_node_labels
}

View File

@ -26,7 +26,7 @@ systemd:
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -99,7 +99,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -5,9 +5,9 @@ variable "name" {
# Azure
variable "region" {
variable "location" {
type = string
description = "Must be set to the Azure Region of cluster"
description = "Must be set to the Azure location of cluster"
}
variable "resource_group_name" {
@ -25,9 +25,12 @@ variable "security_group_id" {
description = "Must be set to the `worker_security_group_id` output by cluster"
}
variable "backend_address_pool_id" {
type = string
description = "Must be set to the `worker_backend_address_pool_id` output by cluster"
variable "backend_address_pool_ids" {
type = object({
ipv4 = list(string)
ipv6 = list(string)
})
description = "Must be set to the `backend_address_pool_ids` output by cluster"
}
# instances
@ -49,6 +52,24 @@ variable "os_image" {
description = "Fedora CoreOS image for instances"
}
variable "disk_type" {
type = string
description = "Type of managed disk"
default = "Standard_LRS"
}
variable "disk_size" {
type = number
description = "Size of the managed disk in GB"
default = 30
}
variable "ephemeral_disk" {
type = bool
description = "Use ephemeral local disk instead of managed disk (requires vm_type with local storage)"
default = false
}
variable "priority" {
type = string
description = "Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be evicted at any time."
@ -99,12 +120,3 @@ variable "node_taints" {
description = "List of initial node taints"
default = []
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}

View File

@ -3,7 +3,7 @@
terraform {
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
azurerm = ">= 2.8, < 4.0"
azurerm = ">= 2.8"
ct = {
source = "poseidon/ct"
version = "~> 0.13"

View File

@ -3,30 +3,29 @@ locals {
}
# Workers scale set
resource "azurerm_linux_virtual_machine_scale_set" "workers" {
resource_group_name = var.resource_group_name
name = "${var.name}-worker"
location = var.region
sku = var.vm_type
instances = var.worker_count
# instance name prefix for instances in the set
computer_name_prefix = "${var.name}-worker"
single_placement_group = false
custom_data = base64encode(data.ct_config.worker.rendered)
resource "azurerm_orchestrated_virtual_machine_scale_set" "workers" {
name = "${var.name}-worker"
resource_group_name = var.resource_group_name
location = var.location
platform_fault_domain_count = 1
sku_name = var.vm_type
instances = var.worker_count
# storage
source_image_id = var.os_image
encryption_at_host_enabled = true
source_image_id = var.os_image
os_disk {
storage_account_type = "Standard_LRS"
caching = "ReadWrite"
}
# Azure requires setting admin_ssh_key, though Ignition custom_data handles it too
admin_username = "core"
admin_ssh_key {
username = "core"
public_key = var.azure_authorized_key
storage_account_type = var.disk_type
disk_size_gb = var.disk_size
caching = "ReadOnly"
# Optionally, use the ephemeral disk of the instance type (support varies)
dynamic "diff_disk_settings" {
for_each = var.ephemeral_disk ? [1] : []
content {
option = "Local"
placement = "ResourceDisk"
}
}
}
# network
@ -36,41 +35,46 @@ resource "azurerm_linux_virtual_machine_scale_set" "workers" {
network_security_group_id = var.security_group_id
ip_configuration {
name = "ip0"
name = "ipv4"
version = "IPv4"
primary = true
subnet_id = var.subnet_id
# backend address pool to which the NIC should be added
load_balancer_backend_address_pool_ids = [var.backend_address_pool_id]
load_balancer_backend_address_pool_ids = var.backend_address_pool_ids.ipv4
}
ip_configuration {
name = "ipv6"
version = "IPv6"
subnet_id = var.subnet_id
# backend address pool to which the NIC should be added
load_balancer_backend_address_pool_ids = var.backend_address_pool_ids.ipv6
}
}
# boot
user_data_base64 = base64encode(data.ct_config.worker.rendered)
boot_diagnostics {
# defaults to a managed storage account
}
# Azure requires an RSA admin_ssh_key
os_profile {
linux_configuration {
admin_username = "core"
admin_ssh_key {
username = "core"
public_key = local.azure_authorized_key
}
computer_name_prefix = "${var.name}-worker"
}
}
# lifecycle
upgrade_mode = "Manual"
# eviction policy may only be set when priority is Spot
priority = var.priority
eviction_policy = var.priority == "Spot" ? "Delete" : null
}
# Scale up or down to maintain desired number, tolerating deallocations.
resource "azurerm_monitor_autoscale_setting" "workers" {
resource_group_name = var.resource_group_name
name = "${var.name}-maintain-desired"
location = var.region
# autoscale
enabled = true
target_resource_id = azurerm_linux_virtual_machine_scale_set.workers.id
profile {
name = "default"
capacity {
minimum = var.worker_count
default = var.worker_count
maximum = var.worker_count
}
termination_notification {
enabled = true
}
}
@ -80,7 +84,6 @@ data "ct_config" "worker" {
kubeconfig = indent(10, var.kubeconfig)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
node_taints = join(",", var.node_taints)
})

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.30.2 (upstream)
* Kubernetes v1.31.0 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/flatcar-linux/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=886f501bf7b624fc12acac83449b81d0dc8b8849"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=1ddecb1cef65c9715ed66b6c335634bc51f59613"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -14,9 +14,6 @@ module "bootstrap" {
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
daemonset_tolerations = var.daemonset_tolerations
components = var.components
}

View File

@ -56,7 +56,7 @@ systemd:
After=docker.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -105,7 +105,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStart=/usr/bin/docker run \
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
-v /opt/bootstrap/assets:/assets:ro \
@ -144,7 +144,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -1,25 +1,9 @@
# Discrete DNS records for each controller's private IPv4 for etcd usage
resource "azurerm_dns_a_record" "etcds" {
count = var.controller_count
resource_group_name = var.dns_zone_group
# DNS Zone name where record should be created
zone_name = var.dns_zone
# DNS record
name = format("%s-etcd%d", var.cluster_name, count.index)
ttl = 300
# private IPv4 address for etcd
records = [azurerm_network_interface.controllers.*.private_ip_address[count.index]]
}
locals {
# Container Linux derivative
# flatcar-stable -> Flatcar Linux Stable
channel = split("-", var.os_image)[1]
offer_suffix = var.arch == "arm64" ? "corevm" : "free"
urn = var.arch == "arm64" ? local.channel : "${local.channel}-gen2"
offer_suffix = var.controller_arch == "arm64" ? "corevm" : "free"
urn = var.controller_arch == "arm64" ? local.channel : "${local.channel}-gen2"
# Typhoon ssh_authorized_key supports RSA or a newer formats (e.g. ed25519).
# However, Azure requires an older RSA key to pass validations. To use a
@ -28,12 +12,25 @@ locals {
azure_authorized_key = var.azure_authorized_key == "" ? var.ssh_authorized_key : var.azure_authorized_key
}
# Discrete DNS records for each controller's private IPv4 for etcd usage
resource "azurerm_dns_a_record" "etcds" {
count = var.controller_count
# DNS Zone name where record should be created
zone_name = var.dns_zone
resource_group_name = var.dns_zone_group
# DNS record
name = format("%s-etcd%d", var.cluster_name, count.index)
ttl = 300
# private IPv4 address for etcd
records = [azurerm_network_interface.controllers[count.index].private_ip_address]
}
# Controller availability set to spread controllers
resource "azurerm_availability_set" "controllers" {
resource_group_name = azurerm_resource_group.cluster.name
name = "${var.cluster_name}-controllers"
location = var.region
resource_group_name = azurerm_resource_group.cluster.name
location = var.location
platform_fault_domain_count = 2
platform_update_domain_count = 4
managed = true
@ -41,25 +38,20 @@ resource "azurerm_availability_set" "controllers" {
# Controller instances
resource "azurerm_linux_virtual_machine" "controllers" {
count = var.controller_count
resource_group_name = azurerm_resource_group.cluster.name
count = var.controller_count
name = "${var.cluster_name}-controller-${count.index}"
location = var.region
resource_group_name = azurerm_resource_group.cluster.name
location = var.location
availability_set_id = azurerm_availability_set.controllers.id
size = var.controller_type
custom_data = base64encode(data.ct_config.controllers.*.rendered[count.index])
boot_diagnostics {
# defaults to a managed storage account
}
size = var.controller_type
# storage
os_disk {
name = "${var.cluster_name}-controller-${count.index}"
storage_account_type = var.controller_disk_type
disk_size_gb = var.controller_disk_size
caching = "None"
disk_size_gb = var.disk_size
storage_account_type = "Premium_LRS"
}
# Flatcar Container Linux
@ -71,7 +63,7 @@ resource "azurerm_linux_virtual_machine" "controllers" {
}
dynamic "plan" {
for_each = var.arch == "arm64" ? [] : [1]
for_each = var.controller_arch == "arm64" ? [] : [1]
content {
publisher = "kinvolk"
product = "flatcar-container-linux-${local.offer_suffix}"
@ -84,7 +76,13 @@ resource "azurerm_linux_virtual_machine" "controllers" {
azurerm_network_interface.controllers[count.index].id
]
# Azure requires setting admin_ssh_key, though Ignition custom_data handles it too
# boot
custom_data = base64encode(data.ct_config.controllers[count.index].rendered)
boot_diagnostics {
# defaults to a managed storage account
}
# Azure requires an RSA admin_ssh_key
admin_username = "core"
admin_ssh_key {
username = "core"
@ -99,31 +97,52 @@ resource "azurerm_linux_virtual_machine" "controllers" {
}
}
# Controller public IPv4 addresses
resource "azurerm_public_ip" "controllers" {
count = var.controller_count
resource_group_name = azurerm_resource_group.cluster.name
# Controller node public IPv4 addresses
resource "azurerm_public_ip" "controllers-ipv4" {
count = var.controller_count
name = "${var.cluster_name}-controller-${count.index}"
location = azurerm_resource_group.cluster.location
sku = "Standard"
allocation_method = "Static"
name = "${var.cluster_name}-controller-${count.index}-ipv4"
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location
ip_version = "IPv4"
sku = "Standard"
allocation_method = "Static"
}
# Controller NICs with public and private IPv4
resource "azurerm_network_interface" "controllers" {
count = var.controller_count
resource_group_name = azurerm_resource_group.cluster.name
# Controller node public IPv6 addresses
resource "azurerm_public_ip" "controllers-ipv6" {
count = var.controller_count
name = "${var.cluster_name}-controller-${count.index}"
location = azurerm_resource_group.cluster.location
name = "${var.cluster_name}-controller-${count.index}-ipv6"
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location
ip_version = "IPv6"
sku = "Standard"
allocation_method = "Static"
}
# Controllers' network interfaces
resource "azurerm_network_interface" "controllers" {
count = var.controller_count
name = "${var.cluster_name}-controller-${count.index}"
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location
ip_configuration {
name = "ip0"
name = "ipv4"
primary = true
subnet_id = azurerm_subnet.controller.id
private_ip_address_allocation = "Dynamic"
# instance public IPv4
public_ip_address_id = azurerm_public_ip.controllers.*.id[count.index]
private_ip_address_version = "IPv4"
public_ip_address_id = azurerm_public_ip.controllers-ipv4[count.index].id
}
ip_configuration {
name = "ipv6"
subnet_id = azurerm_subnet.controller.id
private_ip_address_allocation = "Dynamic"
private_ip_address_version = "IPv6"
public_ip_address_id = azurerm_public_ip.controllers-ipv6[count.index].id
}
}
@ -135,13 +154,21 @@ resource "azurerm_network_interface_security_group_association" "controllers" {
network_security_group_id = azurerm_network_security_group.controller.id
}
# Associate controller network interface with controller backend address pool
resource "azurerm_network_interface_backend_address_pool_association" "controllers" {
# Associate controller network interface with controller backend address pools
resource "azurerm_network_interface_backend_address_pool_association" "controllers-ipv4" {
count = var.controller_count
network_interface_id = azurerm_network_interface.controllers[count.index].id
ip_configuration_name = "ip0"
backend_address_pool_id = azurerm_lb_backend_address_pool.controller.id
ip_configuration_name = "ipv4"
backend_address_pool_id = azurerm_lb_backend_address_pool.controller-ipv4.id
}
resource "azurerm_network_interface_backend_address_pool_association" "controllers-ipv6" {
count = var.controller_count
network_interface_id = azurerm_network_interface.controllers[count.index].id
ip_configuration_name = "ipv6"
backend_address_pool_id = azurerm_lb_backend_address_pool.controller-ipv6.id
}
# Flatcar Linux controllers
@ -158,7 +185,6 @@ data "ct_config" "controllers" {
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
})
strict = true
snippets = var.controller_snippets

View File

@ -1,116 +1,164 @@
# DNS record for the apiserver load balancer
# DNS A record for the apiserver load balancer
resource "azurerm_dns_a_record" "apiserver" {
resource_group_name = var.dns_zone_group
# DNS Zone name where record should be created
zone_name = var.dns_zone
zone_name = var.dns_zone
resource_group_name = var.dns_zone_group
# DNS record
name = var.cluster_name
ttl = 300
# IPv4 address of apiserver load balancer
records = [azurerm_public_ip.apiserver-ipv4.ip_address]
records = [azurerm_public_ip.frontend-ipv4.ip_address]
}
# Static IPv4 address for the apiserver frontend
resource "azurerm_public_ip" "apiserver-ipv4" {
resource_group_name = azurerm_resource_group.cluster.name
name = "${var.cluster_name}-apiserver-ipv4"
location = var.region
sku = "Standard"
allocation_method = "Static"
# DNS AAAA record for the apiserver load balancer
resource "azurerm_dns_aaaa_record" "apiserver" {
# DNS Zone name where record should be created
zone_name = var.dns_zone
resource_group_name = var.dns_zone_group
# DNS record
name = var.cluster_name
ttl = 300
# IPv6 address of apiserver load balancer
records = [azurerm_public_ip.frontend-ipv6.ip_address]
}
# Static IPv4 address for the ingress frontend
resource "azurerm_public_ip" "ingress-ipv4" {
# Static IPv4 address for the load balancer
resource "azurerm_public_ip" "frontend-ipv4" {
name = "${var.cluster_name}-frontend-ipv4"
resource_group_name = azurerm_resource_group.cluster.name
location = var.location
ip_version = "IPv4"
sku = "Standard"
allocation_method = "Static"
}
name = "${var.cluster_name}-ingress-ipv4"
location = var.region
sku = "Standard"
allocation_method = "Static"
# Static IPv6 address for the load balancer
resource "azurerm_public_ip" "frontend-ipv6" {
name = "${var.cluster_name}-ingress-ipv6"
resource_group_name = azurerm_resource_group.cluster.name
location = var.location
ip_version = "IPv6"
sku = "Standard"
allocation_method = "Static"
}
# Network Load Balancer for apiservers and ingress
resource "azurerm_lb" "cluster" {
name = var.cluster_name
resource_group_name = azurerm_resource_group.cluster.name
name = var.cluster_name
location = var.region
sku = "Standard"
location = var.location
sku = "Standard"
frontend_ip_configuration {
name = "apiserver"
public_ip_address_id = azurerm_public_ip.apiserver-ipv4.id
name = "frontend-ipv4"
public_ip_address_id = azurerm_public_ip.frontend-ipv4.id
}
frontend_ip_configuration {
name = "ingress"
public_ip_address_id = azurerm_public_ip.ingress-ipv4.id
name = "frontend-ipv6"
public_ip_address_id = azurerm_public_ip.frontend-ipv6.id
}
}
resource "azurerm_lb_rule" "apiserver" {
name = "apiserver"
resource "azurerm_lb_rule" "apiserver-ipv4" {
name = "apiserver-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "apiserver"
frontend_ip_configuration_name = "frontend-ipv4"
disable_outbound_snat = true
protocol = "Tcp"
frontend_port = 6443
backend_port = 6443
backend_address_pool_ids = [azurerm_lb_backend_address_pool.controller.id]
backend_address_pool_ids = [azurerm_lb_backend_address_pool.controller-ipv4.id]
probe_id = azurerm_lb_probe.apiserver.id
}
resource "azurerm_lb_rule" "ingress-http" {
name = "ingress-http"
resource "azurerm_lb_rule" "apiserver-ipv6" {
name = "apiserver-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress"
frontend_ip_configuration_name = "frontend-ipv6"
disable_outbound_snat = true
protocol = "Tcp"
frontend_port = 6443
backend_port = 6443
backend_address_pool_ids = [azurerm_lb_backend_address_pool.controller-ipv6.id]
probe_id = azurerm_lb_probe.apiserver.id
}
resource "azurerm_lb_rule" "ingress-http-ipv4" {
name = "ingress-http-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "frontend-ipv4"
disable_outbound_snat = true
protocol = "Tcp"
frontend_port = 80
backend_port = 80
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker.id]
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv4.id]
probe_id = azurerm_lb_probe.ingress.id
}
resource "azurerm_lb_rule" "ingress-https" {
name = "ingress-https"
resource "azurerm_lb_rule" "ingress-https-ipv4" {
name = "ingress-https-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress"
frontend_ip_configuration_name = "frontend-ipv4"
disable_outbound_snat = true
protocol = "Tcp"
frontend_port = 443
backend_port = 443
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker.id]
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv4.id]
probe_id = azurerm_lb_probe.ingress.id
}
# Worker outbound TCP/UDP SNAT
resource "azurerm_lb_outbound_rule" "worker-outbound" {
name = "worker"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration {
name = "ingress"
}
resource "azurerm_lb_rule" "ingress-http-ipv6" {
name = "ingress-http-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "frontend-ipv6"
disable_outbound_snat = true
protocol = "All"
backend_address_pool_id = azurerm_lb_backend_address_pool.worker.id
protocol = "Tcp"
frontend_port = 80
backend_port = 80
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv6.id]
probe_id = azurerm_lb_probe.ingress.id
}
resource "azurerm_lb_rule" "ingress-https-ipv6" {
name = "ingress-https-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "frontend-ipv6"
disable_outbound_snat = true
protocol = "Tcp"
frontend_port = 443
backend_port = 443
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv6.id]
probe_id = azurerm_lb_probe.ingress.id
}
# Backend Address Pools
# Address pool of controllers
resource "azurerm_lb_backend_address_pool" "controller" {
name = "controller"
resource "azurerm_lb_backend_address_pool" "controller-ipv4" {
name = "controller-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
}
# Address pool of workers
resource "azurerm_lb_backend_address_pool" "worker" {
name = "worker"
resource "azurerm_lb_backend_address_pool" "controller-ipv6" {
name = "controller-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
}
# Address pools for workers
resource "azurerm_lb_backend_address_pool" "worker-ipv4" {
name = "worker-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
}
resource "azurerm_lb_backend_address_pool" "worker-ipv6" {
name = "worker-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
}
@ -122,10 +170,8 @@ resource "azurerm_lb_probe" "apiserver" {
loadbalancer_id = azurerm_lb.cluster.id
protocol = "Tcp"
port = 6443
# unhealthy threshold
number_of_probes = 3
number_of_probes = 3
interval_in_seconds = 5
}
@ -136,10 +182,29 @@ resource "azurerm_lb_probe" "ingress" {
protocol = "Http"
port = 10254
request_path = "/healthz"
# unhealthy threshold
number_of_probes = 3
number_of_probes = 3
interval_in_seconds = 5
}
# Outbound SNAT
resource "azurerm_lb_outbound_rule" "outbound-ipv4" {
name = "outbound-ipv4"
protocol = "All"
loadbalancer_id = azurerm_lb.cluster.id
backend_address_pool_id = azurerm_lb_backend_address_pool.worker-ipv4.id
frontend_ip_configuration {
name = "frontend-ipv4"
}
}
resource "azurerm_lb_outbound_rule" "outbound-ipv6" {
name = "outbound-ipv6"
protocol = "All"
loadbalancer_id = azurerm_lb.cluster.id
backend_address_pool_id = azurerm_lb_backend_address_pool.worker-ipv6.id
frontend_ip_configuration {
name = "frontend-ipv6"
}
}

View File

@ -0,0 +1,6 @@
locals {
backend_address_pool_ids = {
ipv4 = [azurerm_lb_backend_address_pool.worker-ipv4.id]
ipv6 = [azurerm_lb_backend_address_pool.worker-ipv6.id]
}
}

View File

@ -1,27 +1,63 @@
# Choose an IPv6 ULA subnet at random
# https://datatracker.ietf.org/doc/html/rfc4193
resource "random_id" "ula-netnum" {
byte_length = 5 # 40 bits
}
locals {
# fd00::/8 -> shift 40 -> 2^40 possible /48 subnets
ula-range = cidrsubnet("fd00::/8", 40, random_id.ula-netnum.dec)
network_cidr = {
ipv4 = var.network_cidr.ipv4
ipv6 = length(var.network_cidr.ipv6) > 0 ? var.network_cidr.ipv6 : [local.ula-range]
}
# Subdivide the virtual network into subnets
# - controllers use netnum 0
# - workers use netnum 1
controller_subnets = {
ipv4 = [for i, cidr in local.network_cidr.ipv4 : cidrsubnet(cidr, 1, 0)]
ipv6 = [for i, cidr in local.network_cidr.ipv6 : cidrsubnet(cidr, 16, 0)]
}
worker_subnets = {
ipv4 = [for i, cidr in local.network_cidr.ipv4 : cidrsubnet(cidr, 1, 1)]
ipv6 = [for i, cidr in local.network_cidr.ipv6 : cidrsubnet(cidr, 16, 1)]
}
cluster_subnets = {
ipv4 = concat(local.controller_subnets.ipv4, local.worker_subnets.ipv4)
ipv6 = concat(local.controller_subnets.ipv6, local.worker_subnets.ipv6)
}
}
# Organize cluster into a resource group
resource "azurerm_resource_group" "cluster" {
name = var.cluster_name
location = var.region
location = var.location
}
resource "azurerm_virtual_network" "network" {
name = var.cluster_name
resource_group_name = azurerm_resource_group.cluster.name
name = var.cluster_name
location = azurerm_resource_group.cluster.location
address_space = [var.host_cidr]
location = azurerm_resource_group.cluster.location
address_space = concat(
local.network_cidr.ipv4,
local.network_cidr.ipv6
)
}
# Subnets - separate subnets for controller and workers because Azure
# network security groups are based on IPv4 CIDR rather than instance
# tags like GCP or security group membership like AWS
# Subnets - separate subnets for controllers and workers because Azure
# network security groups are oriented around address prefixes rather
# than instance tags (GCP) or security group membership (AWS)
resource "azurerm_subnet" "controller" {
resource_group_name = azurerm_resource_group.cluster.name
name = "controller"
resource_group_name = azurerm_resource_group.cluster.name
virtual_network_name = azurerm_virtual_network.network.name
address_prefixes = [cidrsubnet(var.host_cidr, 1, 0)]
address_prefixes = concat(
local.controller_subnets.ipv4,
local.controller_subnets.ipv6,
)
default_outbound_access_enabled = false
}
resource "azurerm_subnet_network_security_group_association" "controller" {
@ -30,11 +66,14 @@ resource "azurerm_subnet_network_security_group_association" "controller" {
}
resource "azurerm_subnet" "worker" {
resource_group_name = azurerm_resource_group.cluster.name
name = "worker"
resource_group_name = azurerm_resource_group.cluster.name
virtual_network_name = azurerm_virtual_network.network.name
address_prefixes = [cidrsubnet(var.host_cidr, 1, 1)]
address_prefixes = concat(
local.worker_subnets.ipv4,
local.worker_subnets.ipv6,
)
default_outbound_access_enabled = false
}
resource "azurerm_subnet_network_security_group_association" "worker" {

View File

@ -6,13 +6,18 @@ output "kubeconfig-admin" {
# Outputs for Kubernetes Ingress
output "ingress_static_ipv4" {
value = azurerm_public_ip.ingress-ipv4.ip_address
value = azurerm_public_ip.frontend-ipv4.ip_address
description = "IPv4 address of the load balancer for distributing traffic to Ingress controllers"
}
output "ingress_static_ipv6" {
value = azurerm_public_ip.frontend-ipv6.ip_address
description = "IPv6 address of the load balancer for distributing traffic to Ingress controllers"
}
# Outputs for worker pools
output "region" {
output "location" {
value = azurerm_resource_group.cluster.location
}
@ -51,12 +56,12 @@ output "worker_security_group_name" {
output "controller_address_prefixes" {
description = "Controller network subnet CIDR addresses (for source/destination)"
value = azurerm_subnet.controller.address_prefixes
value = local.controller_subnets
}
output "worker_address_prefixes" {
description = "Worker network subnet CIDR addresses (for source/destination)"
value = azurerm_subnet.worker.address_prefixes
value = local.worker_subnets
}
# Outputs for custom load balancing
@ -66,9 +71,12 @@ output "loadbalancer_id" {
value = azurerm_lb.cluster.id
}
output "backend_address_pool_id" {
description = "ID of the worker backend address pool"
value = azurerm_lb_backend_address_pool.worker.id
output "backend_address_pool_ids" {
description = "IDs of the worker backend address pools"
value = {
ipv4 = [azurerm_lb_backend_address_pool.worker-ipv4.id]
ipv6 = [azurerm_lb_backend_address_pool.worker-ipv6.id]
}
}
# Outputs for debug

View File

@ -1,214 +1,223 @@
# Controller security group
resource "azurerm_network_security_group" "controller" {
name = "${var.cluster_name}-controller"
resource_group_name = azurerm_resource_group.cluster.name
name = "${var.cluster_name}-controller"
location = azurerm_resource_group.cluster.location
location = azurerm_resource_group.cluster.location
}
resource "azurerm_network_security_rule" "controller-icmp" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-icmp"
name = "allow-icmp-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "1995"
priority = 1995 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-ssh" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-ssh"
name = "allow-ssh-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2000"
priority = 2000 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "*"
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-etcd" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-etcd"
name = "allow-etcd-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2005"
priority = 2005 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2379-2380"
source_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.controller_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Allow Prometheus to scrape etcd metrics
resource "azurerm_network_security_rule" "controller-etcd-metrics" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-etcd-metrics"
name = "allow-etcd-metrics-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2010"
priority = 2010 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2381"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Allow Prometheus to scrape kube-proxy metrics
resource "azurerm_network_security_rule" "controller-kube-proxy" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-kube-proxy-metrics"
name = "allow-kube-proxy-metrics-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2011"
priority = 2012 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10249"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Allow Prometheus to scrape kube-scheduler and kube-controller-manager metrics
resource "azurerm_network_security_rule" "controller-kube-metrics" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-kube-metrics"
name = "allow-kube-metrics-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2012"
priority = 2014 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10257-10259"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-apiserver" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-apiserver"
name = "allow-apiserver-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2015"
priority = 2016 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "6443"
source_address_prefix = "*"
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-cilium-health" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
for_each = var.networking == "cilium" ? local.controller_subnets : {}
name = "allow-cilium-health"
name = "allow-cilium-health-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2018"
priority = 2018 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-cilium-metrics" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
for_each = var.networking == "cilium" ? local.controller_subnets : {}
name = "allow-cilium-metrics"
name = "allow-cilium-metrics-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2019"
priority = 2035 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9962-9965"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-vxlan"
name = "allow-vxlan-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2020"
priority = 2020 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "4789"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
resource "azurerm_network_security_rule" "controller-linux-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-linux-vxlan"
name = "allow-linux-vxlan-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2021"
priority = 2022 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "controller-node-exporter" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-node-exporter"
name = "allow-node-exporter-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2025"
priority = 2025 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Allow apiserver to access kubelet's for exec, log, port-forward
resource "azurerm_network_security_rule" "controller-kubelet" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.controller_subnets
name = "allow-kubelet"
name = "allow-kubelet-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2030"
priority = 2030 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.controller_subnets[each.key]
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound
@ -247,182 +256,189 @@ resource "azurerm_network_security_rule" "controller-deny-all" {
# Worker security group
resource "azurerm_network_security_group" "worker" {
name = "${var.cluster_name}-worker"
resource_group_name = azurerm_resource_group.cluster.name
name = "${var.cluster_name}-worker"
location = azurerm_resource_group.cluster.location
location = azurerm_resource_group.cluster.location
}
resource "azurerm_network_security_rule" "worker-icmp" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-icmp"
name = "allow-icmp-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "1995"
priority = 1995 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-ssh" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-ssh"
name = "allow-ssh-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2000"
priority = 2000 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.controller_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-http" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-http"
name = "allow-http-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2005"
priority = 2005 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "80"
source_address_prefix = "*"
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-https" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-https"
name = "allow-https-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2010"
priority = 2010 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "443"
source_address_prefix = "*"
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-cilium-health" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
for_each = var.networking == "cilium" ? local.worker_subnets : {}
name = "allow-cilium-health"
name = "allow-cilium-health-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2013"
priority = 2012 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-cilium-metrics" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
for_each = var.networking == "cilium" ? local.worker_subnets : {}
name = "allow-cilium-metrics"
name = "allow-cilium-metrics-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2014"
priority = 2014 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9962-9965"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-vxlan"
name = "allow-vxlan-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2015"
priority = 2016 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "4789"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
resource "azurerm_network_security_rule" "worker-linux-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-linux-vxlan"
name = "allow-linux-vxlan-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2016"
priority = 2018 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "worker-node-exporter" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-node-exporter"
name = "allow-node-exporter-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2020"
priority = 2020 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
# Allow Prometheus to scrape kube-proxy
resource "azurerm_network_security_rule" "worker-kube-proxy" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-kube-proxy"
name = "allow-kube-proxy-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2024"
priority = 2024 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10249"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.worker_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
# Allow apiserver to access kubelet's for exec, log, port-forward
resource "azurerm_network_security_rule" "worker-kubelet" {
resource_group_name = azurerm_resource_group.cluster.name
for_each = local.worker_subnets
name = "allow-kubelet"
name = "allow-kubelet-${each.key}"
resource_group_name = azurerm_resource_group.cluster.name
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2025"
priority = 2026 + (each.key == "ipv4" ? 0 : 1)
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
source_address_prefixes = local.cluster_subnets[each.key]
destination_address_prefixes = local.worker_subnets[each.key]
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound

View File

@ -18,7 +18,7 @@ resource "null_resource" "copy-controller-secrets" {
connection {
type = "ssh"
host = azurerm_public_ip.controllers.*.ip_address[count.index]
host = azurerm_public_ip.controllers-ipv4[count.index].ip_address
user = "core"
timeout = "15m"
}
@ -45,7 +45,7 @@ resource "null_resource" "bootstrap" {
connection {
type = "ssh"
host = azurerm_public_ip.controllers.*.ip_address[0]
host = azurerm_public_ip.controllers-ipv4[0].ip_address
user = "core"
timeout = "15m"
}

View File

@ -5,9 +5,9 @@ variable "cluster_name" {
# Azure
variable "region" {
variable "location" {
type = string
description = "Azure Region (e.g. centralus , see `az account list-locations --output table`)"
description = "Azure location (e.g. centralus , see `az account list-locations --output table`)"
}
variable "dns_zone" {
@ -22,30 +22,6 @@ variable "dns_zone_group" {
# instances
variable "controller_count" {
type = number
description = "Number of controllers (i.e. masters)"
default = 1
}
variable "worker_count" {
type = number
description = "Number of workers"
default = 1
}
variable "controller_type" {
type = string
description = "Machine type for controllers (see `az vm list-skus --location centralus`)"
default = "Standard_B2s"
}
variable "worker_type" {
type = string
description = "Machine type for workers (see `az vm list-skus --location centralus`)"
default = "Standard_D2as_v5"
}
variable "os_image" {
type = string
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
@ -57,12 +33,60 @@ variable "os_image" {
}
}
variable "disk_size" {
variable "controller_count" {
type = number
description = "Size of the disk in GB"
description = "Number of controllers (i.e. masters)"
default = 1
}
variable "controller_type" {
type = string
description = "Machine type for controllers (see `az vm list-skus --location centralus`)"
default = "Standard_B2s"
}
variable "controller_disk_type" {
type = string
description = "Type of managed disk for controller node(s)"
default = "Premium_LRS"
}
variable "controller_disk_size" {
type = number
description = "Size of the managed disk in GB for controller node(s)"
default = 30
}
variable "worker_count" {
type = number
description = "Number of workers"
default = 1
}
variable "worker_type" {
type = string
description = "Machine type for workers (see `az vm list-skus --location centralus`)"
default = "Standard_D2as_v5"
}
variable "worker_disk_type" {
type = string
description = "Type of managed disk for worker nodes"
default = "Standard_LRS"
}
variable "worker_disk_size" {
type = number
description = "Size of the managed disk in GB for worker nodes"
default = 30
}
variable "worker_ephemeral_disk" {
type = bool
description = "Use ephemeral local disk instead of managed disk (requires vm_type with local storage)"
default = false
}
variable "worker_priority" {
type = string
description = "Set worker priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time."
@ -100,10 +124,15 @@ variable "networking" {
default = "cilium"
}
variable "host_cidr" {
type = string
description = "CIDR IPv4 range to assign to instances"
default = "10.0.0.0/16"
variable "network_cidr" {
type = object({
ipv4 = list(string)
ipv6 = optional(list(string), [])
})
description = "Virtual network CIDR ranges"
default = {
ipv4 = ["10.0.0.0/16"]
}
}
variable "pod_cidr" {
@ -121,32 +150,31 @@ EOD
default = "10.3.0.0/16"
}
variable "enable_reporting" {
type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = false
}
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
variable "worker_node_labels" {
type = list(string)
description = "List of initial worker node labels"
default = []
}
variable "arch" {
type = string
description = "Container architecture (amd64 or arm64)"
default = "amd64"
# advanced
variable "controller_arch" {
type = string
description = "Controller node(s) architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = var.arch == "amd64" || var.arch == "arm64"
error_message = "The arch must be amd64 or arm64."
condition = contains(["amd64", "arm64"], var.controller_arch)
error_message = "The controller_arch must be amd64 or arm64."
}
}
variable "worker_arch" {
type = string
description = "Worker node(s) architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = contains(["amd64", "arm64"], var.worker_arch)
error_message = "The worker_arch must be amd64 or arm64."
}
}
@ -156,14 +184,6 @@ variable "daemonset_tolerations" {
default = []
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}
variable "components" {
description = "Configure pre-installed cluster components"
# Component configs are passed through to terraform-render-bootstrap,

View File

@ -3,7 +3,7 @@
terraform {
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
azurerm = ">= 2.8, < 4.0"
azurerm = ">= 2.8"
null = ">= 2.1"
ct = {
source = "poseidon/ct"

View File

@ -3,24 +3,26 @@ module "workers" {
name = var.cluster_name
# Azure
resource_group_name = azurerm_resource_group.cluster.name
region = azurerm_resource_group.cluster.location
subnet_id = azurerm_subnet.worker.id
security_group_id = azurerm_network_security_group.worker.id
backend_address_pool_id = azurerm_lb_backend_address_pool.worker.id
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location
subnet_id = azurerm_subnet.worker.id
security_group_id = azurerm_network_security_group.worker.id
backend_address_pool_ids = local.backend_address_pool_ids
worker_count = var.worker_count
vm_type = var.worker_type
os_image = var.os_image
priority = var.worker_priority
worker_count = var.worker_count
vm_type = var.worker_type
os_image = var.os_image
disk_type = var.worker_disk_type
disk_size = var.worker_disk_size
ephemeral_disk = var.worker_ephemeral_disk
priority = var.worker_priority
# configuration
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
azure_authorized_key = var.azure_authorized_key
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
snippets = var.worker_snippets
node_labels = var.worker_node_labels
arch = var.arch
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
azure_authorized_key = var.azure_authorized_key
service_cidr = var.service_cidr
snippets = var.worker_snippets
node_labels = var.worker_node_labels
arch = var.worker_arch
}

View File

@ -28,7 +28,7 @@ systemd:
After=docker.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -99,7 +99,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -5,9 +5,9 @@ variable "name" {
# Azure
variable "region" {
variable "location" {
type = string
description = "Must be set to the Azure Region of cluster"
description = "Must be set to the Azure location of cluster"
}
variable "resource_group_name" {
@ -25,9 +25,12 @@ variable "security_group_id" {
description = "Must be set to the `worker_security_group_id` output by cluster"
}
variable "backend_address_pool_id" {
type = string
description = "Must be set to the `worker_backend_address_pool_id` output by cluster"
variable "backend_address_pool_ids" {
type = object({
ipv4 = list(string)
ipv6 = list(string)
})
description = "Must be set to the `backend_address_pool_ids` output by cluster"
}
# instances
@ -55,6 +58,24 @@ variable "os_image" {
}
}
variable "disk_type" {
type = string
description = "Type of managed disk"
default = "Standard_LRS"
}
variable "disk_size" {
type = number
description = "Size of the managed disk in GB"
default = 30
}
variable "ephemeral_disk" {
type = bool
description = "Use ephemeral local disk instead of managed disk (requires vm_type with local storage)"
default = false
}
variable "priority" {
type = string
description = "Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be evicted at any time."
@ -116,12 +137,3 @@ variable "arch" {
error_message = "The arch must be amd64 or arm64."
}
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}

View File

@ -3,7 +3,7 @@
terraform {
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
azurerm = ">= 2.8, < 4.0"
azurerm = ">= 2.8"
ct = {
source = "poseidon/ct"
version = "~> 0.13"

View File

@ -8,25 +8,28 @@ locals {
}
# Workers scale set
resource "azurerm_linux_virtual_machine_scale_set" "workers" {
resource_group_name = var.resource_group_name
name = "${var.name}-worker"
location = var.region
sku = var.vm_type
instances = var.worker_count
# instance name prefix for instances in the set
computer_name_prefix = "${var.name}-worker"
single_placement_group = false
custom_data = base64encode(data.ct_config.worker.rendered)
boot_diagnostics {
# defaults to a managed storage account
}
resource "azurerm_orchestrated_virtual_machine_scale_set" "workers" {
name = "${var.name}-worker"
resource_group_name = var.resource_group_name
location = var.location
platform_fault_domain_count = 1
sku_name = var.vm_type
instances = var.worker_count
# storage
encryption_at_host_enabled = true
os_disk {
storage_account_type = "Standard_LRS"
caching = "ReadWrite"
storage_account_type = var.disk_type
disk_size_gb = var.disk_size
caching = "ReadOnly"
# Optionally, use the ephemeral disk of the instance type (support varies)
dynamic "diff_disk_settings" {
for_each = var.ephemeral_disk ? [1] : []
content {
option = "Local"
placement = "ResourceDisk"
}
}
}
# Flatcar Container Linux
@ -46,13 +49,6 @@ resource "azurerm_linux_virtual_machine_scale_set" "workers" {
}
}
# Azure requires setting admin_ssh_key, though Ignition custom_data handles it too
admin_username = "core"
admin_ssh_key {
username = "core"
public_key = local.azure_authorized_key
}
# network
network_interface {
name = "nic0"
@ -60,17 +56,41 @@ resource "azurerm_linux_virtual_machine_scale_set" "workers" {
network_security_group_id = var.security_group_id
ip_configuration {
name = "ip0"
name = "ipv4"
version = "IPv4"
primary = true
subnet_id = var.subnet_id
# backend address pool to which the NIC should be added
load_balancer_backend_address_pool_ids = [var.backend_address_pool_id]
load_balancer_backend_address_pool_ids = var.backend_address_pool_ids.ipv4
}
ip_configuration {
name = "ipv6"
version = "IPv6"
subnet_id = var.subnet_id
# backend address pool to which the NIC should be added
load_balancer_backend_address_pool_ids = var.backend_address_pool_ids.ipv6
}
}
# boot
user_data_base64 = base64encode(data.ct_config.worker.rendered)
boot_diagnostics {
# defaults to a managed storage account
}
# Azure requires an RSA admin_ssh_key
os_profile {
linux_configuration {
admin_username = "core"
admin_ssh_key {
username = "core"
public_key = local.azure_authorized_key
}
computer_name_prefix = "${var.name}-worker"
}
}
# lifecycle
upgrade_mode = "Manual"
# eviction policy may only be set when priority is Spot
priority = var.priority
eviction_policy = var.priority == "Spot" ? "Delete" : null
@ -79,35 +99,12 @@ resource "azurerm_linux_virtual_machine_scale_set" "workers" {
}
}
# Scale up or down to maintain desired number, tolerating deallocations.
resource "azurerm_monitor_autoscale_setting" "workers" {
resource_group_name = var.resource_group_name
name = "${var.name}-maintain-desired"
location = var.region
# autoscale
enabled = true
target_resource_id = azurerm_linux_virtual_machine_scale_set.workers.id
profile {
name = "default"
capacity {
minimum = var.worker_count
default = var.worker_count
maximum = var.worker_count
}
}
}
# Flatcar Linux worker
data "ct_config" "worker" {
content = templatefile("${path.module}/butane/worker.yaml", {
kubeconfig = indent(10, var.kubeconfig)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
node_taints = join(",", var.node_taints)
})

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.30.2 (upstream)
* Kubernetes v1.31.0 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=886f501bf7b624fc12acac83449b81d0dc8b8849"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=1ddecb1cef65c9715ed66b6c335634bc51f59613"
cluster_name = var.cluster_name
api_servers = [var.k8s_domain_name]
@ -10,9 +10,6 @@ module "bootstrap" {
network_ip_autodetection_method = var.network_ip_autodetection_method
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
components = var.components
}

View File

@ -53,7 +53,7 @@ systemd:
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -113,7 +113,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStartPre=-/usr/bin/podman rm bootstrap
ExecStart=/usr/bin/podman run --name bootstrap \
--network host \
@ -154,7 +154,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -59,7 +59,6 @@ data "ct_config" "controllers" {
etcd_name = var.controllers.*.name[count.index]
etcd_initial_cluster = join(",", formatlist("%s=https://%s:2380", var.controllers.*.name, var.controllers.*.domain))
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
cluster_domain_suffix = var.cluster_domain_suffix
ssh_authorized_key = var.ssh_authorized_key
})
strict = true

View File

@ -139,25 +139,7 @@ variable "kernel_args" {
default = []
}
variable "enable_reporting" {
type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = false
}
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}
# advanced
variable "components" {
description = "Configure pre-installed cluster components"

View File

@ -25,7 +25,7 @@ systemd:
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -108,7 +108,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -53,7 +53,6 @@ data "ct_config" "worker" {
domain_name = var.domain
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
node_taints = join(",", var.node_taints)
})

View File

@ -103,9 +103,3 @@ The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for
EOD
default = "10.3.0.0/16"
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}

View File

@ -15,13 +15,12 @@ module "workers" {
domain = var.workers[count.index].domain
# configuration
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = lookup(var.worker_node_labels, var.workers[count.index].name, [])
node_taints = lookup(var.worker_node_taints, var.workers[count.index].name, [])
snippets = lookup(var.snippets, var.workers[count.index].name, [])
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
service_cidr = var.service_cidr
node_labels = lookup(var.worker_node_labels, var.workers[count.index].name, [])
node_taints = lookup(var.worker_node_taints, var.workers[count.index].name, [])
snippets = lookup(var.snippets, var.workers[count.index].name, [])
# optional
cached_install = var.cached_install

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.30.2 (upstream)
* Kubernetes v1.31.0 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=886f501bf7b624fc12acac83449b81d0dc8b8849"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=1ddecb1cef65c9715ed66b6c335634bc51f59613"
cluster_name = var.cluster_name
api_servers = [var.k8s_domain_name]
@ -10,9 +10,6 @@ module "bootstrap" {
network_ip_autodetection_method = var.network_ip_autodetection_method
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
components = var.components
}

View File

@ -64,7 +64,7 @@ systemd:
After=docker.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -114,7 +114,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStart=/usr/bin/docker run \
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
-v /opt/bootstrap/assets:/assets:ro \
@ -155,7 +155,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -88,7 +88,6 @@ data "ct_config" "controllers" {
etcd_name = var.controllers.*.name[count.index]
etcd_initial_cluster = join(",", formatlist("%s=https://%s:2380", var.controllers.*.name, var.controllers.*.domain))
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
cluster_domain_suffix = var.cluster_domain_suffix
ssh_authorized_key = var.ssh_authorized_key
})
strict = true

View File

@ -144,18 +144,6 @@ variable "kernel_args" {
default = []
}
variable "enable_reporting" {
type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = false
}
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
variable "oem_type" {
type = string
description = <<EOD
@ -167,13 +155,7 @@ EOD
default = ""
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}
# advanced
variable "components" {
description = "Configure pre-installed cluster components"

View File

@ -36,7 +36,7 @@ systemd:
After=docker.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -113,7 +113,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -79,7 +79,6 @@ data "ct_config" "worker" {
domain_name = var.domain
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
node_taints = join(",", var.node_taints)
})

View File

@ -114,13 +114,3 @@ The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for
EOD
default = "10.3.0.0/16"
}
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}

View File

@ -15,13 +15,12 @@ module "workers" {
domain = var.workers[count.index].domain
# configuration
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = lookup(var.worker_node_labels, var.workers[count.index].name, [])
node_taints = lookup(var.worker_node_taints, var.workers[count.index].name, [])
snippets = lookup(var.snippets, var.workers[count.index].name, [])
kubeconfig = module.bootstrap.kubeconfig-kubelet
ssh_authorized_key = var.ssh_authorized_key
service_cidr = var.service_cidr
node_labels = lookup(var.worker_node_labels, var.workers[count.index].name, [])
node_taints = lookup(var.worker_node_taints, var.workers[count.index].name, [])
snippets = lookup(var.snippets, var.workers[count.index].name, [])
# optional
download_protocol = var.download_protocol

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.30.2 (upstream)
* Kubernetes v1.31.0 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=886f501bf7b624fc12acac83449b81d0dc8b8849"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=1ddecb1cef65c9715ed66b6c335634bc51f59613"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -11,11 +11,8 @@ module "bootstrap" {
network_encapsulation = "vxlan"
network_mtu = "1450"
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
components = var.components
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
components = var.components
}

View File

@ -55,7 +55,7 @@ systemd:
After=afterburn.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
EnvironmentFile=/run/metadata/afterburn
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -123,7 +123,7 @@ systemd:
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
quay.io/poseidon/kubelet:v1.30.2
quay.io/poseidon/kubelet:v1.31.0
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -151,7 +151,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -28,7 +28,7 @@ systemd:
After=afterburn.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
EnvironmentFile=/run/metadata/afterburn
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -104,7 +104,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -74,7 +74,6 @@ data "ct_config" "controllers" {
for i in range(var.controller_count) : "etcd${i}=https://${var.cluster_name}-etcd${i}.${var.dns_zone}:2380"
])
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
})
strict = true
snippets = var.controller_snippets

View File

@ -86,25 +86,7 @@ EOD
default = "10.3.0.0/16"
}
variable "enable_reporting" {
type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = false
}
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}
# advanced
variable "components" {
description = "Configure pre-installed cluster components"

View File

@ -62,7 +62,6 @@ resource "digitalocean_tag" "workers" {
data "ct_config" "worker" {
content = templatefile("${path.module}/butane/worker.yaml", {
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
})
strict = true
snippets = var.worker_snippets

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.30.2 (upstream)
* Kubernetes v1.31.0 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=886f501bf7b624fc12acac83449b81d0dc8b8849"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=1ddecb1cef65c9715ed66b6c335634bc51f59613"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -11,11 +11,8 @@ module "bootstrap" {
network_encapsulation = "vxlan"
network_mtu = "1450"
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
components = var.components
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
components = var.components
}

View File

@ -66,7 +66,7 @@ systemd:
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -117,7 +117,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
ExecStart=/usr/bin/docker run \
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
-v /opt/bootstrap/assets:/assets:ro \
@ -153,7 +153,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -38,7 +38,7 @@ systemd:
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.30.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.31.0
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -103,7 +103,7 @@ storage:
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
clusterDomain: cluster.local
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s

View File

@ -79,7 +79,6 @@ data "ct_config" "controllers" {
for i in range(var.controller_count) : "etcd${i}=https://${var.cluster_name}-etcd${i}.${var.dns_zone}:2380"
])
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
})
strict = true
snippets = var.controller_snippets

View File

@ -86,25 +86,7 @@ EOD
default = "10.3.0.0/16"
}
variable "enable_reporting" {
type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = false
}
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}
# advanced
variable "components" {
description = "Configure pre-installed cluster components"

View File

@ -60,7 +60,6 @@ resource "digitalocean_tag" "workers" {
data "ct_config" "worker" {
content = templatefile("${path.module}/butane/worker.yaml", {
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
})
strict = true
snippets = var.worker_snippets

View File

@ -37,7 +37,7 @@ resource "google_dns_record_set" "some-application" {
## Azure
On Azure, a load balancer distributes traffic across a backend address pool of worker nodes running an Ingress controller deployment. Security group rules allow traffic to ports 80 and 443. Health probes ensure only workers with a healthy Ingress controller receive traffic.
On Azure, an Azure Load Balancer distributes IPv4/IPv6 traffic across backend address pools of worker nodes running an Ingress controller deployment. Security group rules allow traffic to ports 80 and 443. Health probes ensure only workers with a healthy Ingress controller receive traffic.
Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, and namespace.
@ -53,10 +53,10 @@ app2.example.com -> 11.22.33.44
app3.example.com -> 11.22.33.44
```
Find the load balancer's IPv4 address with the Azure console or use the Typhoon module's output `ingress_static_ipv4`. For example, you might use Terraform to manage a Google Cloud DNS record:
Find the load balancer's addresses with the Azure console or use the Typhoon module's outputs `ingress_static_ipv4` or `ingress_static_ipv6`. For example, you might use Terraform to manage a Google Cloud DNS record:
```tf
resource "google_dns_record_set" "some-application" {
resource "google_dns_record_set" "app-record-a" {
# DNS zone name
managed_zone = "example-zone"
@ -66,6 +66,17 @@ resource "google_dns_record_set" "some-application" {
ttl = 300
rrdatas = [module.ramius.ingress_static_ipv4]
}
resource "google_dns_record_set" "app-record-aaaa" {
# DNS zone name
managed_zone = "example-zone"
# DNS record
name = "app.example.com."
type = "AAAA"
ttl = 300
rrdatas = [module.ramius.ingress_static_ipv6]
}
```
## Bare-Metal

View File

@ -1,13 +1,11 @@
# ARM64
Typhoon supports ARM64 Kubernetes clusters with ARM64 controller and worker nodes (full-cluster) or adding worker pools of ARM64 nodes to clusters with an x86/amd64 control plane for a hybdrid (mixed-arch) cluster.
Typhoon ARM64 clusters (full-cluster or mixed-arch) are available on:
Typhoon supports Kubernetes clusters with ARM64 controller or worker nodes on several platforms:
* AWS with Fedora CoreOS or Flatcar Linux
* Azure with Flatcar Linux
## Cluster
## AWS
Create a cluster on AWS with ARM64 controller and worker nodes. Container workloads must be `arm64` compatible and use `arm64` (or multi-arch) container images.
@ -15,24 +13,23 @@ Create a cluster on AWS with ARM64 controller and worker nodes. Container worklo
```tf
module "gravitas" {
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.31.0"
# AWS
cluster_name = "gravitas"
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
# instances
controller_type = "t4g.small"
controller_arch = "arm64"
worker_count = 2
worker_type = "t4g.small"
worker_arch = "arm64"
worker_price = "0.0168"
# configuration
ssh_authorized_key = "ssh-ed25519 AAAAB3Nz..."
# optional
arch = "arm64"
networking = "cilium"
worker_count = 2
worker_price = "0.0168"
controller_type = "t4g.small"
worker_type = "t4g.small"
}
```
@ -40,24 +37,23 @@ Create a cluster on AWS with ARM64 controller and worker nodes. Container worklo
```tf
module "gravitas" {
source = "git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes?ref=v1.31.0"
# AWS
cluster_name = "gravitas"
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
# instances
controller_type = "t4g.small"
controller_arch = "arm64"
worker_count = 2
worker_type = "t4g.small"
worker_arch = "arm64"
worker_price = "0.0168"
# configuration
ssh_authorized_key = "ssh-ed25519 AAAAB3Nz..."
# optional
arch = "arm64"
networking = "cilium"
worker_count = 2
worker_price = "0.0168"
controller_type = "t4g.small"
worker_type = "t4g.small"
}
```
@ -66,118 +62,9 @@ Verify the cluster has only arm64 (`aarch64`) nodes. For Flatcar Linux, describe
```
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-21-119 Ready <none> 77s v1.30.2 10.0.21.119 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8
ip-10-0-32-166 Ready <none> 80s v1.30.2 10.0.32.166 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8
ip-10-0-5-79 Ready <none> 77s v1.30.2 10.0.5.79 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8
```
## Hybrid
Create a hybrid/mixed arch cluster by defining an AWS cluster. Then define a [worker pool](worker-pools.md#aws) with ARM64 workers. Optional taints are added to aid in scheduling.
=== "FCOS Cluster"
```tf
module "gravitas" {
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.30.2"
# AWS
cluster_name = "gravitas"
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
# configuration
ssh_authorized_key = "ssh-ed25519 AAAAB3Nz..."
# optional
networking = "cilium"
worker_count = 2
worker_price = "0.021"
daemonset_tolerations = ["arch"] # important
}
```
=== "Flatcar Cluster"
```tf
module "gravitas" {
source = "git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes?ref=v1.30.2"
# AWS
cluster_name = "gravitas"
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
# configuration
ssh_authorized_key = "ssh-ed25519 AAAAB3Nz..."
# optional
networking = "cilium"
worker_count = 2
worker_price = "0.021"
daemonset_tolerations = ["arch"] # important
}
```
=== "FCOS ARM64 Workers"
```tf
module "gravitas-arm64" {
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.30.2"
# AWS
vpc_id = module.gravitas.vpc_id
subnet_ids = module.gravitas.subnet_ids
security_groups = module.gravitas.worker_security_groups
# configuration
name = "gravitas-arm64"
kubeconfig = module.gravitas.kubeconfig
ssh_authorized_key = var.ssh_authorized_key
# optional
arch = "arm64"
instance_type = "t4g.small"
spot_price = "0.0168"
node_taints = ["arch=arm64:NoSchedule"]
}
```
=== "Flatcar ARM64 Workers"
```tf
module "gravitas-arm64" {
source = "git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes/workers?ref=v1.30.2"
# AWS
vpc_id = module.gravitas.vpc_id
subnet_ids = module.gravitas.subnet_ids
security_groups = module.gravitas.worker_security_groups
# configuration
name = "gravitas-arm64"
kubeconfig = module.gravitas.kubeconfig
ssh_authorized_key = var.ssh_authorized_key
# optional
arch = "arm64"
instance_type = "t4g.small"
spot_price = "0.0168"
node_taints = ["arch=arm64:NoSchedule"]
}
```
Verify amd64 (x86_64) and arm64 (aarch64) nodes are present.
```
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-1-73 Ready <none> 111m v1.30.2 10.0.1.73 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8
ip-10-0-22-79... Ready <none> 111m v1.30.2 10.0.22.79 <none> Flatcar Container Linux by Kinvolk 3033.2.0 (Oklo) 5.10.84-flatcar containerd://1.5.8
ip-10-0-24-130 Ready <none> 111m v1.30.2 10.0.24.130 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8
ip-10-0-39-19 Ready <none> 111m v1.30.2 10.0.39.19 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8
ip-10-0-21-119 Ready <none> 77s v1.31.0 10.0.21.119 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8
ip-10-0-32-166 Ready <none> 80s v1.31.0 10.0.32.166 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8
ip-10-0-5-79 Ready <none> 77s v1.31.0 10.0.5.79 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8
```
## Azure
@ -186,22 +73,136 @@ Create a cluster on Azure with ARM64 controller and worker nodes. Container work
```tf
module "ramius" {
source = "git::https://github.com/poseidon/typhoon//azure/flatcar-linux/kubernetes?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//azure/flatcar-linux/kubernetes?ref=v1.31.0"
# Azure
cluster_name = "ramius"
region = "centralus"
location = "centralus"
dns_zone = "azure.example.com"
dns_zone_group = "example-group"
# instances
controller_arch = "arm64"
controller_type = "Standard_B2pls_v5"
worker_count = 2
controller_arch = "arm64"
worker_type = "Standard_D2pls_v5"
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# optional
arch = "arm64"
controller_type = "Standard_D2pls_v5"
worker_type = "Standard_D2pls_v5"
worker_count = 2
host_cidr = "10.0.0.0/20"
}
```
## Hybrid
Create a hybrid/mixed arch cluster by defining a cluster where [worker pool(s)](worker-pools.md#aws) have a different instance type architecture than controllers or other workers. Taints are added to aid in scheduling.
Here's an AWS example,
=== "FCOS Cluster"
```tf
module "gravitas" {
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.31.0"
# AWS
cluster_name = "gravitas"
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
# instances
worker_count = 2
worker_arch = "arm64"
worker_type = "t4g.medium"
worker_price = "0.021"
# configuration
daemonset_tolerations = ["arch"] # important
networking = "cilium"
ssh_authorized_key = "ssh-ed25519 AAAAB3Nz..."
}
```
=== "Flatcar Cluster"
```tf
module "gravitas" {
source = "git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes?ref=v1.31.0"
# AWS
cluster_name = "gravitas"
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
# instances
worker_count = 2
worker_arch = "arm64"
worker_type = "t4g.medium"
worker_price = "0.021"
# configuration
daemonset_tolerations = ["arch"] # important
networking = "cilium"
ssh_authorized_key = "ssh-ed25519 AAAAB3Nz..."
}
```
=== "FCOS ARM64 Workers"
```tf
module "gravitas-arm64" {
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.31.0"
# AWS
vpc_id = module.gravitas.vpc_id
subnet_ids = module.gravitas.subnet_ids
security_groups = module.gravitas.worker_security_groups
# instances
arch = "arm64"
instance_type = "t4g.small"
spot_price = "0.0168"
# configuration
name = "gravitas-arm64"
kubeconfig = module.gravitas.kubeconfig
node_taints = ["arch=arm64:NoSchedule"]
ssh_authorized_key = var.ssh_authorized_key
}
```
=== "Flatcar ARM64 Workers"
```tf
module "gravitas-arm64" {
source = "git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes/workers?ref=v1.31.0"
# AWS
vpc_id = module.gravitas.vpc_id
subnet_ids = module.gravitas.subnet_ids
security_groups = module.gravitas.worker_security_groups
# instances
arch = "arm64"
instance_type = "t4g.small"
spot_price = "0.0168"
# configuration
name = "gravitas-arm64"
kubeconfig = module.gravitas.kubeconfig
node_taints = ["arch=arm64:NoSchedule"]
ssh_authorized_key = var.ssh_authorized_key
}
```
Verify amd64 (x86_64) and arm64 (aarch64) nodes are present.
```
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-1-73 Ready <none> 111m v1.31.0 10.0.1.73 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8
ip-10-0-22-79... Ready <none> 111m v1.31.0 10.0.22.79 <none> Flatcar Container Linux by Kinvolk 3033.2.0 (Oklo) 5.10.84-flatcar containerd://1.5.8
ip-10-0-24-130 Ready <none> 111m v1.31.0 10.0.24.130 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8
ip-10-0-39-19 Ready <none> 111m v1.31.0 10.0.39.19 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8
```

View File

@ -36,7 +36,7 @@ Add custom initial worker node labels to default workers or worker pool nodes to
```tf
module "yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.31.0"
# Google Cloud
cluster_name = "yavin"
@ -57,7 +57,7 @@ Add custom initial worker node labels to default workers or worker pool nodes to
```tf
module "yavin-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.31.0"
# Google Cloud
cluster_name = "yavin"
@ -89,7 +89,7 @@ Add custom initial taints on worker pool nodes to indicate a node is unique and
```tf
module "yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.31.0"
# Google Cloud
cluster_name = "yavin"
@ -110,7 +110,7 @@ Add custom initial taints on worker pool nodes to indicate a node is unique and
```tf
module "yavin-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.31.0"
# Google Cloud
cluster_name = "yavin"

View File

@ -19,7 +19,7 @@ Create a cluster following the AWS [tutorial](../flatcar-linux/aws.md#cluster).
```tf
module "tempest-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.31.0"
# AWS
vpc_id = module.tempest.vpc_id
@ -42,7 +42,7 @@ Create a cluster following the AWS [tutorial](../flatcar-linux/aws.md#cluster).
```tf
module "tempest-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes/workers?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes/workers?ref=v1.31.0"
# AWS
vpc_id = module.tempest.vpc_id
@ -111,14 +111,14 @@ Create a cluster following the Azure [tutorial](../flatcar-linux/azure.md#cluste
```tf
module "ramius-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes/workers?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes/workers?ref=v1.31.0"
# Azure
region = module.ramius.region
resource_group_name = module.ramius.resource_group_name
subnet_id = module.ramius.subnet_id
security_group_id = module.ramius.security_group_id
backend_address_pool_id = module.ramius.backend_address_pool_id
location = module.ramius.location
resource_group_name = module.ramius.resource_group_name
subnet_id = module.ramius.subnet_id
security_group_id = module.ramius.security_group_id
backend_address_pool_ids = module.ramius.backend_address_pool_ids
# configuration
name = "ramius-spot"
@ -127,7 +127,7 @@ Create a cluster following the Azure [tutorial](../flatcar-linux/azure.md#cluste
# optional
worker_count = 2
vm_type = "Standard_F4"
vm_type = "Standard_D2as_v5"
priority = "Spot"
os_image = "/subscriptions/some/path/Microsoft.Compute/images/fedora-coreos-31.20200323.3.2"
}
@ -137,14 +137,14 @@ Create a cluster following the Azure [tutorial](../flatcar-linux/azure.md#cluste
```tf
module "ramius-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//azure/flatcar-linux/kubernetes/workers?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//azure/flatcar-linux/kubernetes/workers?ref=v1.31.0"
# Azure
region = module.ramius.region
resource_group_name = module.ramius.resource_group_name
subnet_id = module.ramius.subnet_id
security_group_id = module.ramius.security_group_id
backend_address_pool_id = module.ramius.backend_address_pool_id
location = module.ramius.location
resource_group_name = module.ramius.resource_group_name
subnet_id = module.ramius.subnet_id
security_group_id = module.ramius.security_group_id
backend_address_pool_ids = module.ramius.backend_address_pool_ids
# configuration
name = "ramius-spot"
@ -153,7 +153,7 @@ Create a cluster following the Azure [tutorial](../flatcar-linux/azure.md#cluste
# optional
worker_count = 2
vm_type = "Standard_F4"
vm_type = "Standard_D2as_v5"
priority = "Spot"
os_image = "flatcar-beta"
}
@ -180,7 +180,7 @@ The Azure internal `workers` module supports a number of [variables](https://git
| resource_group_name | Must be set to `resource_group_name` output by cluster | module.cluster.resource_group_name |
| subnet_id | Must be set to `subnet_id` output by cluster | module.cluster.subnet_id |
| security_group_id | Must be set to `security_group_id` output by cluster | module.cluster.security_group_id |
| backend_address_pool_id | Must be set to `backend_address_pool_id` output by cluster | module.cluster.backend_address_pool_id |
| backend_address_pool_ids | Must be set to `backend_address_pool_ids` output by cluster | module.cluster.backend_address_pool_ids |
| kubeconfig | Must be set to `kubeconfig` output by cluster | module.cluster.kubeconfig |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-ed25519 AAAAB3NZ..." |
@ -207,7 +207,7 @@ Create a cluster following the Google Cloud [tutorial](../flatcar-linux/google-c
```tf
module "yavin-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.31.0"
# Google Cloud
region = "europe-west2"
@ -231,7 +231,7 @@ Create a cluster following the Google Cloud [tutorial](../flatcar-linux/google-c
```tf
module "yavin-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/flatcar-linux/kubernetes/workers?ref=v1.30.2"
source = "git::https://github.com/poseidon/typhoon//google-cloud/flatcar-linux/kubernetes/workers?ref=v1.31.0"
# Google Cloud
region = "europe-west2"
@ -262,11 +262,11 @@ Verify a managed instance group of workers joins the cluster within a few minute
```
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.30.2
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.30.2
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.30.2
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.30.2
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.30.2
yavin-controller-0.c.example-com.internal Ready 6m v1.31.0
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.31.0
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.31.0
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.31.0
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.31.0
```
### Variables

View File

@ -10,9 +10,9 @@ A load balancer distributes IPv4 TCP/6443 traffic across a backend address pool
### HTTP/HTTPS Ingress
A load balancer distributes IPv4 TCP/80 and TCP/443 traffic across a backend address pool of workers with a healthy Ingress controller.
An Azure Load Balancer distributes IPv4/IPv6 TCP/80 and TCP/443 traffic across backend address pools of workers with a healthy Ingress controller.
The Azure LB IPv4 address is output as `ingress_static_ipv4` for use in DNS A records. See [Ingress on Azure](/addons/ingress/#azure).
The load balancer addresses are output as `ingress_static_ipv4` and `ingress_static_ipv6` for use in DNS A and AAAA records. See [Ingress on Azure](/addons/ingress/#azure).
### TCP/UDP Services
@ -21,27 +21,25 @@ Load balance TCP/UDP applications by adding rules to the Azure LB (output). A ru
```tf
# Forward traffic to the worker backend address pool
resource "azurerm_lb_rule" "some-app-tcp" {
resource_group_name = module.ramius.resource_group_name
name = "some-app-tcp"
resource_group_name = module.ramius.resource_group_name
loadbalancer_id = module.ramius.loadbalancer_id
frontend_ip_configuration_name = "ingress"
frontend_ip_configuration_name = "ingress-ipv4"
protocol = "Tcp"
frontend_port = 3333
backend_port = 30333
backend_address_pool_id = module.ramius.backend_address_pool_id
probe_id = azurerm_lb_probe.some-app.id
protocol = "Tcp"
frontend_port = 3333
backend_port = 30333
backend_address_pool_ids = module.ramius.backend_address_pool_ids.ipv4
probe_id = azurerm_lb_probe.some-app.id
}
# Health check some-app
resource "azurerm_lb_probe" "some-app" {
name = "some-app"
resource_group_name = module.ramius.resource_group_name
name = "some-app"
loadbalancer_id = module.ramius.loadbalancer_id
protocol = "Tcp"
port = 30333
loadbalancer_id = module.ramius.loadbalancer_id
protocol = "Tcp"
port = 30333
}
```
@ -51,9 +49,8 @@ Add firewall rules to the worker security group.
```tf
resource "azurerm_network_security_rule" "some-app" {
resource_group_name = module.ramius.resource_group_name
name = "some-app"
resource_group_name = module.ramius.resource_group_name
network_security_group_name = module.ramius.worker_security_group_name
priority = "3001"
access = "Allow"
@ -62,7 +59,7 @@ resource "azurerm_network_security_rule" "some-app" {
source_port_range = "*"
destination_port_range = "30333"
source_address_prefix = "*"
destination_address_prefixes = module.ramius.worker_address_prefixes
destination_address_prefixes = module.ramius.worker_address_prefixes.ipv4
}
```
@ -72,6 +69,6 @@ Azure does not provide public IPv6 addresses at the standard SKU.
| IPv6 Feature | Supported |
|-------------------------|-----------|
| Node IPv6 address | No |
| Node Outbound IPv6 | No |
| Kubernetes Ingress IPv6 | No |
| Node IPv6 address | Yes |
| Node Outbound IPv6 | Yes |
| Kubernetes Ingress IPv6 | Yes |

Some files were not shown because too many files have changed in this diff Show More