mirror of https://github.com/puppetmaster/typhoon.git synced 2024-12-25 18:49:33 +01:00


2024-10-08 21:28:29 -07:00


Typhoon

Notable changes between versions.

Latest

v1.31.1

Kubernetes v1.31.1
Update flannel from v0.25.5 to v0.25.6

Google

Add controller_disk_type and worker_disk_type variables (#1513)
Add explicit region field to regional worker instance templates (#1524)
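
A minimal sketch of setting the new disk type variables. The values `pd-ssd` and `pd-balanced` are Google Compute Engine persistent disk type names chosen for illustration; the module's defaults and accepted values are not stated in these notes.

```hcl
module "cluster" {
  # source and required variables omitted

  # Illustrative values (assumed, not the module defaults)
  controller_disk_type = "pd-ssd"
  worker_disk_type     = "pd-balanced"
}
```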

v1.31.0

Kubernetes v1.31.0
Use Cilium kube-proxy replacement mode when cilium networking is chosen (#1501)
Fix invalid flannel-cni container image for those using flannel networking (#1497)

AWS

Use EC2 resource-based hostnames instead of IP-based hostnames (#1499)
- The Amazon DNS server can resolve A and AAAA queries to IPv4 and IPv6 node addresses
Tag controller node EBS volumes with a name based on the controller node name

Google

Use google_compute_region_instance_template instead of google_compute_instance_template
- Google's regional instance template metadata is kept in the associated region for greater resiliency. The "global" instance templates were kept in a single region
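
For readers unfamiliar with the regional resource, a rough standalone sketch of a `google_compute_region_instance_template` follows. Every name and value here is hypothetical and this is not Typhoon's actual worker template; it only shows the resource with an explicit `region`.

```hcl
# Hypothetical example; names, image, and machine type are assumptions.
resource "google_compute_region_instance_template" "worker" {
  name_prefix  = "example-worker-"
  region       = "us-central1" # set explicitly instead of inheriting the provider's region
  machine_type = "e2-medium"

  disk {
    source_image = "debian-cloud/debian-12"
    boot         = true
    auto_delete  = true
  }

  network_interface {
    network = "default"
  }

  lifecycle {
    # Create the replacement template before the old one is destroyed
    create_before_destroy = true
  }
}
```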

v1.30.4

Kubernetes v1.30.4
Update Cilium from v1.15.7 to v1.16.1
Update CoreDNS from v1.11.1 to v1.11.3
Remove enable_aggregation variable for Kubernetes Aggregation Layer, always set to true
Remove cluster_domain_suffix variable, always use "cluster.local"
Remove enable_reporting variable for analytics, always set to false
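
A sketch of the cleanup this implies for an existing cluster definition. The commented-out lines are the removed variables; any module block that still sets them would need those lines deleted.

```hcl
module "cluster" {
  # source and required variables omitted

  # Removed in v1.30.4; delete these lines if present:
  # enable_aggregation    = true
  # cluster_domain_suffix = "cluster.local"
  # enable_reporting      = false
}
```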

v1.30.3

Kubernetes v1.30.3
Update Cilium from v1.15.6 to v1.15.7
Update flannel from v0.25.4 to v0.25.5

AWS

Configure controller and worker disks (#1482)
- Add controller_disk_type, controller_disk_size, and controller_disk_iops variables
- Add worker_disk_type, worker_disk_size, and worker_disk_iops variables
- Remove disk_type, disk_size, and disk_iops variables
- Fix propagating settings to worker disks, previously ignored
Configure CPU pricing model for burstable instance types (#1482)
- Add controller_cpu_credits and worker_cpu_credits variables (standard or unlimited)
Configure controller or worker instance architecture (#1485)
- Add controller_arch and worker_arch variables (amd64 or arm64)
- Remove arch variable

module "cluster" {
  ...
- arch      = "amd64"
- disk_type = "gp3"
- disk_size = 30
- disk_iops = 3000

+ controller_arch        = "amd64"
+ controller_disk_size   = 15
+ controller_cpu_credits = "standard"
+ worker_arch            = "amd64"
+ worker_disk_size       = 22
+ worker_cpu_credits     = "unlimited"
}

Azure

Configure the virtual network and subnets with IPv6 private address space
- Change host_cidr variable (string) to a network_cidr object with ipv4 and ipv6 fields that list CIDR strings. Leave the variable unset to use the defaults. (breaking)
Add support for dual-stack Kubernetes Ingress Load Balancing
- Add a public IPv6 frontend, 80/443 rules, and a worker-ipv6 backend pool
- Change the controller_address_prefixes output from a list of strings to an object with ipv4 and ipv6 fields. Most Azure resources can't accept a mix, so these are split out (breaking)
- Change the worker_address_prefixes output from a list of strings to an object with ipv4 and ipv6 fields. Most Azure resources can't accept a mix, so these are split out (breaking)
- Change the backend_address_pool_id output (and worker module input) from a string to an object with ipv4 and ipv6 fields that list ids (breaking)
Configure nodes to have outbound IPv6 internet connectivity (analogous to IPv4 SNAT)
- Configure controller nodes to have a public IPv6 address
- Configure worker nodes to use outbound rules and the load balancer for SNAT
Extend network security rules to allow IPv6 traffic, analogous to IPv4
Rename region variable to location to align with Azure platform conventions (#1469)
Change worker pools from uniform to flexible orchestration mode (#1473)
Add options to allow worker nodes to use ephemeral local disks (#1473)
- Add controller_disk_type and controller_disk_size variables
- Add worker_disk_type, worker_disk_size, and worker_ephemeral_disk variables
Reduce the number of public IPv4 addresses needed for the Azure load balancer (#1470)
Configure controller or worker instance architecture for Flatcar Linux (#1485)
- Add controller_arch and worker_arch variables (amd64 or arm64)
- Remove arch variable

module "cluster" {
  ...
- region = "centralus"
+ location = "centralus"
  # optional
- host_cidr = "10.0.0.0/16"
+ network_cidr = {
+   ipv4 = ["10.0.0.0/16"]
+ }

  # instances
+ controller_disk_type = "StandardSSD_LRS"
+ worker_ephemeral_disk = true
}

Google Cloud

Allow configuring controller and worker disks (#1486)
- Add controller_disk_size and worker_disk_size variables
- Remove disk_size variable
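
A minimal sketch of the equivalent change on Google Cloud, in the same spirit as the AWS and Azure examples above. The sizes are illustrative assumptions.

```hcl
module "cluster" {
  # source and required variables omitted

  # disk_size = 30          # removed; delete this line if present
  controller_disk_size = 30 # illustrative sizes in GB (assumed)
  worker_disk_size     = 30
}
```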

v1.30.2

Kubernetes v1.30.2
Update CoreDNS from v1.9.4 to v1.11.1
Update Cilium from v1.15.5 to v1.15.6
Update flannel from v0.25.1 to v0.25.4

v1.30.1

Kubernetes v1.30.1
Add firewall rules and security group rules for Cilium and Hubble metrics (#1449)
Update Cilium from v1.15.3 to v1.15.5
Update flannel from v0.24.4 to v0.25.1
Introduce components variable to enable/disable/configure pre-installed components (#1453)
Add Terraform modules for coredns, cilium, and flannel components

Azure

Add controller_security_group_name output for adding custom security rules (#1450)
Add controller_address_prefixes output for adding custom security rules (#1450)
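
One way these outputs might be used is sketched below with the `azurerm_network_security_rule` resource. The rule name, resource group, priority, port, and source range are all hypothetical. Note that this assumes `controller_address_prefixes` is a list of strings, as in this release; in releases where the output is an object with ipv4 and ipv6 fields, the reference would need adjusting.

```hcl
# Hypothetical custom rule; only the two module outputs come from these notes.
resource "azurerm_network_security_rule" "controller-custom" {
  name                         = "allow-custom-port"      # hypothetical
  resource_group_name          = "example-resource-group" # hypothetical; the group holding the cluster
  network_security_group_name  = module.cluster.controller_security_group_name
  priority                     = 2500
  access                       = "Allow"
  direction                    = "Inbound"
  protocol                     = "Tcp"
  source_port_range            = "*"
  destination_port_range       = "9100"
  source_address_prefix        = "10.0.0.0/16"
  destination_address_prefixes = module.cluster.controller_address_prefixes
}
```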

v1.30.0

Kubernetes v1.30.0
Update etcd from v3.5.12 to v3.5.13
Update Cilium from v1.15.2 to v1.15.3
Update Calico from v3.27.2 to v3.27.3

v1.29.3

Kubernetes v1.29.3
Update Cilium from v1.15.1 to v1.15.2
Update flannel from v0.24.2 to v0.24.4

v1.29.2

Kubernetes v1.29.2
Update etcd from v3.5.10 to v3.5.12
Update Cilium from v1.14.3 to v1.15.1
Update Calico from v3.26.3 to v3.27.2
- Fix upstream incompatibility with Fedora CoreOS (calico#8372)
Update flannel from v0.22.2 to v0.24.2
Add an install_container_networking variable (default true) (#1421)
- When true, the chosen container networking provider is installed during cluster bootstrap
- Set false to self-manage the container networking provider. This allows flannel, Calico, or Cilium to be managed via Terraform (like any other Kubernetes resources). Nodes will be NotReady until you apply the self-managed container networking provider. This may become the default in the future.
- Continue to set networking to one of the three supported container networking providers. Most require custom firewall / security policies be present across nodes so they have some infra tie-ins.
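
A minimal sketch of opting out of the bundled install while still declaring the provider. The choice of `cilium` is only an example.

```hcl
module "cluster" {
  # source and required variables omitted

  # Still name the provider so matching firewall / security rules are created
  networking = "cilium"

  # Skip installing it during bootstrap; nodes stay NotReady until
  # the provider is applied separately
  install_container_networking = false
}
```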

v1.29.1

Kubernetes v1.29.1

AWS

Continue to support AWS IMDSv1 (#1412)

Known Issues

Calico and Fedora CoreOS cannot be used together currently (calico#8372)

v1.29.0

Kubernetes v1.29.0

Known Issues

Calico and Fedora CoreOS cannot be used together currently (calico#8372)

v1.28.4

Kubernetes v1.28.4

v1.28.3

Kubernetes v1.28.3
Update etcd from v3.5.9 to v3.5.10
Update Cilium from v1.14.2 to v1.14.3
Work around problems in Cilium v1.14's partial kube-proxy implementation (#365)
Update Calico from v3.26.1 to v3.26.3

Google Cloud

Allow upgrading Google Cloud Terraform provider to v5.x

v1.28.2

Kubernetes v1.28.2
Update Cilium from v1.14.1 to v1.14.2

Azure

Add optional azure_authorized_key variable
- Azure obtusely inspects public keys, requires RSA keys, and forbids more secure key formats (e.g. ed25519)
- Allow passing a dummy RSA key via azure_authorized_key (delete the private key) to satisfy Azure validations, then the usual ssh_authorized_key variable can use newer formats (e.g. ed25519)
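
A sketch of how the two variables might be combined. Both key strings are truncated placeholders, not real keys.

```hcl
module "cluster" {
  # source and required variables omitted

  # Throwaway RSA public key that only satisfies Azure's validation
  # (its private key can be deleted). Placeholder value.
  azure_authorized_key = "ssh-rsa AAAAB3Nz..."

  # Key actually used for SSH access. Placeholder value.
  ssh_authorized_key = "ssh-ed25519 AAAAC3Nz..."
}
```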

v1.28.1

Kubernetes v1.28.1

v1.28.0

Kubernetes v1.28.0
Update Cilium from v1.13.4 to v1.14.1
Update flannel from v0.22.0 to v0.22.2

v1.27.4

Kubernetes v1.27.4

v1.27.3

Kubernetes v1.27.3
Update etcd from v3.5.7 to v3.5.9
Update Cilium from v1.13.2 to v1.13.4
Update Calico from v3.25.1 to v3.26.1
Update flannel from v0.21.2 to v0.22.0

AWS

Allow upgrading AWS Terraform provider to v5.x (#1353)

Azure

Enable boot diagnostics for controller and worker VMs (#1351)

v1.27.2

Kubernetes v1.27.2

Fedora CoreOS

Update Butane Config version from v1.4.0 to v1.5.0
- Require any custom Butane snippets to be updated to v1.5.0
Require Fedora CoreOS 37.20230303.3.0 or newer (with ignition v2.15)
Require poseidon/ct v0.13+ (action required)

v1.27.1

Kubernetes v1.27.1
Update etcd from v3.5.7 to v3.5.8
Update Cilium from v1.13.1 to v1.13.2
Update Calico from v3.25.0 to v3.25.1

v1.26.3

Kubernetes v1.26.3
Update Cilium from v1.12.6 to v1.13.1

Bare-Metal

Add oem_type variable for Flatcar Linux (#1302)

v1.26.2

Kubernetes v1.26.2
Update Cilium from v1.12.5 to v1.12.6
Update flannel from v0.20.2 to v0.21.2

Bare-Metal

Add a worker module to allow customizing individual worker nodes (#1295)

Known Issues

Fedora CoreOS issue fix is progressing through channels

v1.26.1

Kubernetes v1.26.1
Update etcd from v3.5.6 to v3.5.7
Update Cilium from v1.12.4 to v1.12.5
Update Calico from v3.24.5 to v3.25.0
Update CoreDNS from v1.9.3 to v1.9.4

v1.26.0

Kubernetes v1.26.0
Update etcd from v3.5.5 to v3.5.6
Update Cilium from v1.12.3 to v1.12.4
Update flannel from v0.15.1 to v0.20.2
Reminder: Modules are no longer published to the Terraform Module Registry (#1282)
- See #1282 and v1.25.4 for details

AWS

Migrate AWS launch configurations to launch templates (#1275)
- Starting Dec 31, 2022 AWS won't add new instance types/families to launch configurations

Addons

Update ingress-nginx from v1.3.1 to v1.5.1
Update Prometheus from v2.40.1 to v2.40.5
Update node-exporter from v1.3.1 to v1.5.0
Update kube-state-metrics from v2.6.0 to v2.7.0
Update Grafana from v9.2.4 to v9.3.1

v1.25.4

Kubernetes v1.25.4
Update Calico from v3.24.1 to v3.24.5
Allow Kubelet kubeconfig to drain nodes, if desired (#330)
Re-enable Kubelet Graceful Node Shutdown (#1261)
- Introduce companion project poseidon/scuttle
Link to new Mastodon account for release announcements
- @typhoon@fosstodon.org
- @poseidon@fosstodon.org
Deprecate publishing to the Terraform Module Registry
- Typhoon docs have always shown using Git-based module sources, not the Terraform Module Registry
- Module usage should be source = "git::https://github.com/poseidon/typhoon/... not source = poseidon/kubernetes/...
- Terraform's Module Registry requires subtree mirroring typhoon to special terraform-platform-kubernetes repos, only supports release versions (no commit SHAs or forks), and has only ever contained Flatcar Linux modules (not Fedora CoreOS) for historical reasons
- Note, this does not affect Terraform Providers like poseidon/matchbox or poseidon/ct, the registry works well for providers

Fedora CoreOS

Remove unused Wants=network.target from etcd-member.service (#1254)

Cloud

Remove defunct delete-node.service from worker node configurations (#1256)

Addons

Update Prometheus from v2.39.1 to v2.40.1
Update Grafana from v9.1.7 to v9.2.4

v1.25.3

Kubernetes v1.25.3
Switch Kubernetes registry from k8s.gcr.io to registry.k8s.io for addons (#1246)
Update Cilium from v1.12.2 to v1.12.3 (#1253)

Azure

Change default Azure worker_type from Standard_DS1_v2 to Standard_D2as_v5 (#1248)
- Get 2 vCPU, 7 GiB, 12500 Mbps (vs 1 vCPU, 3.5 GiB, 750 Mbps)
- Small increase in pay-as-you-go price ($53.29 -> $62.78)
- Small increase in spot price ($5.64/mo -> $7.37/mo)
- Change from Intel to AMD EPYC (D2as_v5 cheaper than D2s_v5)

Flatcar Linux

Add Flatcar Linux ARM64 support on Azure (docs, #1251)
Switch from Azure Hypervisor gen1 to gen2 (action required) (#1248)
- Run az vm image terms accept --publisher kinvolk --offer flatcar-container-linux-free --plan stable-gen2

Docs

Remove old docs note about not supporting ARM64 with Calico
- Typhoon supports ARM64 with cilium, calico, and flannel

Addons

Update Prometheus from v2.38.0 to v2.39.1
Update Grafana from v9.1.6 to v9.1.7

v1.25.2

Kubernetes v1.25.2 was skipped since there were minimal changes upstream.

v1.25.1

Kubernetes v1.25.1
Update etcd from v3.5.4 to v3.5.5
Update Cilium from v1.12.1 to v1.12.2
Update Calico from v3.23.3 to v3.24.1
Revert Kubelet Graceful Node Shutdown on worker nodes (#1227)
- Fix issue where non-critical pods are left in Error/Completed state on node shutdown
Remove the workaround that disabled a feature gate for kubernetes/kubernetes#112081
- Kubernetes reverted LocalStorageCapacityIsolationFSQuotaMonitoring back to alpha
Remove workaround for preventing search . propagation in kubernetes/kubernetes#112135
- Upstream Kubernetes fix

Addons

Update kube-state-metrics from v2.5.0 to v2.6.0
Update ingress-nginx from v1.3.0 to v1.3.1
Update Grafana from v9.1.0 to v9.1.6

v1.25.0

Kubernetes v1.25.0
- Disable LocalStorageCapacityIsolationFSQuotaMonitoring feature gate (#1220, fixes kubernetes#112081)
- Add workaround to revert adding "search ." to containers' /etc/resolv.conf (#1224, fixes kubernetes#112135)
Migrate most Kubelet flags to KubeletConfiguration file (#1219)
Configure Kubelet Graceful Node Shutdown (#1222)
- Allow up to 30s for critical pods to gracefully shutdown on node shutdown
- Allow up to 15s for regular pods to gracefully shutdown on node shutdown
- Mark node NotReady promptly on node shutdown
- Lengthen systemd inhibitor lock max delay from 5s to 45s

Fedora CoreOS

Change Podman log-driver from journald to k8s-file (#1221)
- Fix etcd-member and Kubelet systemd service log lines appearing twice in journal logs

v1.24.4

Kubernetes v1.24.4
Update CoreDNS from v1.8.6 to v1.9.3
Update Cilium from v1.11.7 to v1.12.1
Update Calico from v3.23.1 to v3.23.3
Switch Kubernetes registry from k8s.gcr.io to registry.k8s.io (#1206)
Remove use of deprecated Terraform template provider (#1194)

Fedora CoreOS

Remove ineffective /etc/fedora-coreos/iptables-legacy.stamp (#1201)
- Typhoon already uses iptables v1.8.7 (nf_tables) since FCOS 36
- Staying on legacy iptables required a file in /etc/coreos instead

Flatcar Linux

Migrate Flatcar Linux from Ignition spec v2.3.0 to v3.3.0 (#1196) (action required)
- Flatcar Linux 3185.0.0+ supports Ignition v3.x specs (which are rendered from Butane Configs, like Fedora CoreOS)
- poseidon/ct v0.11.0 supports the flatcar Butane Config variant
- Require poseidon/ct v0.11+ and Flatcar Linux 3185.0.0+
Please modify any Flatcar Linux snippets to use the Butane Config format (action required)

variant: flatcar
version: 1.0.0
...

AWS

Refresh instances in autoscaling group when launch configuration changes (#1208) (docs, important)
- Worker launch configuration changes start an autoscaling group instance refresh to replace instances
- Instance refresh creates surge instances, waits for a warm-up period, then deletes old instances
- Changing worker_type, disk_*, worker_price, worker_target_groups, or Butane worker_snippets on existing worker nodes will replace instances
- New AMIs or changing os_stream will be ignored, to allow Fedora CoreOS or Flatcar Linux to keep themselves updated
- Previously, new launch configurations were made in the same way, but not applied to instances unless manually replaced
Rename worker autoscaling group ${cluster_name}-worker (#1202)
- Rename launch configuration ${cluster_name}-worker instead of a random id
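
The instance refresh behavior described above can be sketched with the AWS provider's `instance_refresh` block. This is a standalone illustration, not Typhoon's actual configuration; the variables, names, sizes, warm-up period, and health percentage are all assumptions.

```hcl
# Hypothetical inputs for the sketch
variable "subnet_ids" { type = list(string) }
variable "ami_id" { type = string }

resource "aws_launch_configuration" "worker" {
  name_prefix   = "example-worker"
  image_id      = var.ami_id
  instance_type = "t3.small"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "workers" {
  name                 = "example-worker"
  min_size             = 2
  max_size             = 2
  desired_capacity     = 2
  vpc_zone_identifier  = var.subnet_ids
  launch_configuration = aws_launch_configuration.worker.name

  # A launch configuration change starts a rolling replacement of instances
  instance_refresh {
    strategy = "Rolling"
    preferences {
      instance_warmup        = 120 # seconds to wait before counting a new instance (assumed)
      min_healthy_percentage = 90  # assumed threshold
    }
  }
}
```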

Google

Roll instance template changes to worker managed instance groups (#1207) (docs, important)
- Worker instance template changes roll out by gradually replacing instances
- Automatic rollouts create surge instances, wait for health checks, then delete old instances (0 unavailable instances)
- Changing worker_type, disk_size, worker_preemptible, or Butane worker_snippets on existing worker nodes will replace instances
- New compute images or changing os_stream will be ignored, to allow Fedora CoreOS or Flatcar Linux to keep themselves updated
- Previously, new instance templates were made in the same way, but not applied to instances unless manually replaced
Add health checks to worker managed instance groups (i.e. "autohealing") (#1207)
- Use health checks to probe kube-proxy every 30s
- Replace worker nodes that fail the health check 6 times (3min)
Name kube-apiserver and worker health checks consistently (#1207)
- Use name ${cluster_name}-apiserver-health and ${cluster_name}-worker-health
Rename managed instance group from ${cluster_name}-worker-group to ${cluster_name}-worker (#1207)
Fix bug provisioning clusters with multiple controller nodes (#1195)
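
The worker health check timing described above (probe every 30s, replace after 6 failures) might look roughly like the sketch below. The resource name, timeout, and healthy threshold are hypothetical, and the port and path are kube-proxy's default healthz endpoint, assumed here rather than taken from Typhoon's source.

```hcl
# Hypothetical health check; 30s interval x 6 failures = 3 minutes
resource "google_compute_health_check" "worker" {
  name = "example-worker-health"

  check_interval_sec  = 30
  timeout_sec         = 20 # assumed; must not exceed the interval
  healthy_threshold   = 1  # assumed
  unhealthy_threshold = 6

  http_health_check {
    port         = 10256      # kube-proxy default healthz port (assumption)
    request_path = "/healthz"
  }
}
```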

Addons

Update Prometheus from v2.37.0 to v2.38.0
Update Grafana from v9.0.3 to v9.1.0

v1.24.3

Kubernetes v1.24.3
Update Cilium from v1.11.6 to v1.11.7

Addons

Update ingress-nginx from v1.2.1 to v1.3.0
Update Prometheus from v2.36.1 to v2.37.0
Update Grafana from v8.5.6 to v9.0.3

Notes

Poseidon repos will soon change their default branch from master to main

v1.24.2

Kubernetes v1.24.2
Update Cilium from v1.11.5 to v1.11.6
Update Calico from v3.22.2 to v3.23.1

Addons

Update Prometheus from v2.36.0 to v2.36.1
Update Grafana from v8.5.3 to v8.5.6
Update kube-state-metrics from v2.4.2 to v2.5.0

Known Issues

Skip AWS Terraform provider v4.17.0 to v4.19.0, which had a regression affecting workers joining (#1173)

v1.24.1

Kubernetes v1.24.1
Update Cilium from v1.11.4 to v1.11.5

Addons

Update Prometheus from v2.35.0 to v2.36.0
Update Grafana from v8.5.1 to v8.5.3
Update nginx-ingress from v1.2.0 to v1.2.1

v1.24.0

Kubernetes v1.24.0
Update etcd from v3.5.2 to v3.5.4
Add Kubelet mounts to enable relabeling workload volumes (#1152)
- StorageClasses no longer require explicit SELinux mount contexts

Addons

Update nginx-ingress from v1.1.3 to v1.2.0
Update Prometheus from v2.34.0 to v2.35.0
Update Grafana from v8.4.5 to v8.5.1

v1.23.6

Kubernetes v1.23.6
Update Cilium from v1.11.2 to v1.11.4
Rename Cilium DaemonSet from cilium-agent to cilium to match Cilium CLI tools (#303)
Update Calico from v3.22.1 to v3.22.2
Mount /etc/machine-id from host into Kubelet (#1143)
Remove deprecated use of key_algorithm in hashicorp/tls resources

Azure

Allow upgrading Azure Terraform provider to v3.x (#1144)
Rename worker_address_prefix output to worker_address_prefixes

Google Cloud

Fix issue on Flatcar Linux with controller nodes not ignoring os image changes (#1149)
- Nodes will auto-update, Terraform should not attempt to delete/recreate them

Addons

Update nginx-ingress from v1.1.2 to v1.1.3
Update Prometheus from v2.33.5 to v2.34.0
Update Grafana from v8.4.4 to v8.4.5

v1.23.5

Kubernetes v1.23.5
Update Cilium from v1.11.1 to v1.11.2
Update Calico from v3.21.2 to v3.22.1
- Fix calico#5011, broken since v1.23.0

Addons

Refresh Prometheus rules and Grafana dashboards (#1136)
Update nginx-ingress from v1.1.1 to v1.1.2
Update Prometheus from v2.33.3 to v2.33.5
Update Grafana from v8.4.1 to v8.4.3
Update kube-state-metrics from v2.3.0 to v2.4.2

v1.23.4

Kubernetes v1.23.4
Update etcd from v3.5.1 to v3.5.2
Change default CNI networking provider from calico to cilium (#1114)

AWS

Allow upgrading AWS Terraform Provider to v4.x

Addons

Align nginx-ingress --controller-class with IngressClass
- Watch only public IngressClass objects, better example
Update Prometheus from v2.32.1 to v2.33.3
Update Grafana from v8.3.6 to v8.4.1

v1.23.3

Kubernetes v1.23.3

Flatcar Linux

Google Cloud

Switch to using official Kinvolk Flatcar Linux images
Promote Typhoon on Flatcar Linux / Google Cloud to stable
Change os_image to flatcar-stable, flatcar-beta, or flatcar-alpha (action required)
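
A minimal sketch of the required change, picking the stable channel as an example.

```hcl
module "cluster" {
  # source and required variables omitted

  # One of flatcar-stable, flatcar-beta, or flatcar-alpha
  os_image = "flatcar-stable"
}
```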

v1.23.2

Kubernetes v1.23.2
Update Cilium from v1.11.0 to v1.11.1
Remove Kubelet flag --network-plugin. Unused since docker-shim isn't used (#1106)

Fedora CoreOS

Switch Kubernetes Container Runtime from docker to containerd (#1101)
Mask docker.service to prevent it from being socket activated (#1105)

Flatcar Linux

AWS

Add experimental Flatcar Linux ARM64 support (docs, #1102)
- Add arch variable to AWS kubernetes and workers modules
- Allow arm64 full-cluster or mixed/hybrid cluster with arm64 workers
- Requires flannel or cilium CNI provider
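
A sketch of a full arm64 cluster using the arch variable. The `controller_type` variable name and the `t4g.small` instance types are assumptions for illustration (ARM-based instance types are needed on nodes that run arm64).

```hcl
module "cluster" {
  # source and required variables omitted

  arch       = "arm64"
  networking = "cilium" # flannel or cilium, per the requirement above

  # Illustrative ARM instance types (assumed)
  controller_type = "t4g.small"
  worker_type     = "t4g.small"
}
```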

DigitalOcean

Upgrade DigitalOcean Terraform provider to v2.x (#1109)

Addons

Update nginx-ingress from v1.1.0 to v1.1.1
Update Grafana from v8.3.3 to v8.3.4

v1.23.1

Kubernetes v1.23.1
Work around Terraform v1.1 regression in file provisioner (#1093)

Flatcar Linux

Switch Kubernetes Container Runtime from docker to containerd (#1087)

Addons

Configure Prometheus to allow a custom scrape query parameter (#1095)
Configure Prometheus to probe Kubernetes Ingress via blackbox-exporter (#1096)
Fix Prometheus Service probes to use blackbox-exporter, not blackbox (#1096)

v1.23.0

Kubernetes v1.23.0
Normalize CA cert mounts in static Pods and kube-proxy (#1078)
Set Kubelet resolver config to /run/systemd/resolve/resolv.conf (#1082)
Update Cilium from v1.10.5 to v1.11.0 (#1083)
With Calico, add missing caliconodestatuses CRD (#289)
Change enable_aggregation default to true (#279)
Remove deprecated --port from kube-scheduler (#1078)

AWS

Change controller node default disk_iops to 3000 (#1073)

Azure

Fix warning about deprecated backend_address_pool_id (#1086)

Fedora CoreOS

Fix Fedora CoreOS ARM64 workers to use official Fedora CoreOS AMIs (#1072)
- Should have been changed alongside controller AMIs in (#1038)
- Old Poseidon built ARM64 AMIs have been deleted

Addons

Update nginx-ingress from v1.0.5 to v1.1.0
Update Prometheus from v2.31.1 to v2.32.0
Update kube-state-metrics from v2.2.4 to v2.3.0
Update node-exporter from v1.3.0 to v1.3.1
Update Grafana from v8.2.4 to v8.3.3

Known Issues

Calico does not yet support Kubernetes v1.23.0, use flannel or cilium (calico#5011)

v1.22.4

Kubernetes v1.22.4
Update CoreDNS from v1.8.4 to v1.8.6
Update Calico from v3.20.2 to v3.21.0
Update flannel from v0.14.0 to v0.15.1

Google

Allow use of Terraform provider google v4.0+

Flatcar Linux

Change Kubelet mounts for cgroups v2 (#1064)
Update cgroup driver from cgroupfs to systemd (Flatcar Linux changed default) (#1064)

Addons

Update Prometheus from v2.30.3 to v2.31.1
Update node-exporter from v1.2.2 to v1.3.0
Update kube-state-metrics from v2.2.3 to v2.2.4
Update Grafana from v8.2.1 to v8.2.4
Update nginx-ingress from v1.0.4 to v1.0.5

v1.22.3

Kubernetes v1.22.3
Update etcd from v3.5.0 to v3.5.1
Update Cilium from v1.10.4 to v1.10.5
Update Calico from v3.20.1 to v3.20.2
- Use Calico's iptables legacy vs nft auto-detection
Update flannel from v0.13.0 to v0.14.0

Bare-Metal

Require Terraform provider poseidon/matchbox v0.5+ (#1048)

Addons

Update nginx-ingress from v1.0.0 to v1.0.4
Update Prometheus from v2.29.2 to v2.30.3
Update kube-state-metrics from v2.2.0 to v2.2.3
Update Grafana from v8.1.2 to v8.2.1

v1.22.2

Kubernetes v1.22.2
Update Cilium from v1.10.3 to v1.10.4
Update Calico from v3.20.0 to v3.20.1
Fix access to ClusterIP services with Cilium (#276)

Fedora CoreOS

Use Fedora CoreOS ARM64 AMIs (#1038)

Addons

Update Prometheus from v2.29.1 to v2.29.2
Update kube-state-metrics from v2.1.1 to v2.2.0

v1.22.1

Kubernetes v1.22.1
Update Calico from v3.19.1 to v3.20.0

Addons

Update nginx-ingress from v1.0.0-beta.1 to v1.0.0
Update Prometheus from v2.28.1 to v2.29.1
Update Grafana from v8.1.1 to v8.1.2

v1.22.0

Kubernetes v1.22.0
Update etcd from v3.4.16 to v3.5.0
Switch kube-controller-manager and kube-scheduler to use secure port only
- Update Prometheus config to discover endpoints and use a bearer token to scrape

Fedora CoreOS

Add Cilium cgroups v2 support on Fedora CoreOS
Update Butane Config version from v1.2.0 to v1.4.0
- Rename Fedora CoreOS Config to Butane Config
- Require any snippet customizations to be updated to v1.4.0

Addons

Update nginx-ingress from v0.47.0 to v1.0.0-beta.1
Update node-exporter from v1.2.0 to v1.2.2
Update kube-state-metrics from v2.1.0 to v2.1.1
Update Grafana from v8.0.6 to v8.1.1

v1.21.3

Kubernetes v1.21.3
Update Cilium from v1.10.1 to v1.10.3
Require poseidon/ct Terraform provider v0.9+ (notes)

AWS

Change default disk type from gp2 to gp3 (#1012)

Addons

Update Prometheus from v2.28.0 to v2.28.1
Update node-exporter from v1.1.2 to v1.2.0
Update Grafana from v8.0.3 to v8.0.6

Known Issues

Cilium with recent Fedora CoreOS will have networking issues (fedora-coreos#881) (fixed in v1.21.4)

v1.21.2

Kubernetes v1.21.2
Add Terraform v1.0.x support (#974)
- Continue to support Terraform v0.13.x, v0.14.4+, and v0.15.x
Update CoreDNS from v1.8.0 to v1.8.4
Update Cilium from v1.9.6 to v1.10.1
Update Calico from v3.19.0 to v3.19.1

Addons

Update kube-state-metrics from v2.0.0 to v2.1.0
Update Prometheus from v2.27.0 to v2.28.0
Update Grafana from v7.5.6 to v8.0.3
Update nginx-ingress from v0.46.0 to v0.47.0

Fedora CoreOS

AWS

Extend experimental Fedora CoreOS arm64 support with Cilium
- CNI provider may now be flannel or cilium (new)

Bare-Metal

Work around systemd path unit issue fedora-coreos-tracker/#861

DigitalOcean

Work around systemd path unit issue fedora-coreos-tracker/#861

Known Issues

Cilium with recent Fedora CoreOS will have networking issues (fedora-coreos#881) (fixed in v1.21.4)

v1.21.1

Kubernetes v1.21.1
Add Terraform v0.15.x support (#974)
- Continue to support Terraform v0.13.x and v0.14.4+
Update etcd from v3.4.15 to v3.4.16
Update Cilium from v1.9.5 to v1.9.6
Update Calico from v3.18.1 to v3.19.0

AWS

Reduce the default disk_size from 40GB to 30GB (#983)

Azure

Reduce the default disk_size from 40GB to 30GB (#983)

Google Cloud

Reduce the default disk_size from 40GB to 30GB (#983)

Fedora CoreOS

Update Kubelet mounts for cgroups v2 (#978)

Addons

Update kube-state-metrics from v2.0.0-rc.1 to v2.0.0
Update Prometheus from v2.25.2 to v2.27.0
Update Grafana from v7.5.3 to v7.5.6
Update nginx-ingress from v0.45.0 to v0.46.0

v1.21.0

Kubernetes v1.21.0
- Enable tokencleaner controller (#969)
- Enable kube-scheduler and kube-controller-manager separate authn/z kubeconfig
- Change CNI config location from /etc/kubernetes/cni/net.d to /etc/cni/net.d (#965)
- Change kube-controller-manager to mount /var/lib/kubelet/volumeplugins directly
- Remove unused cloud-provider flags
Update Fedora CoreOS Config version from v1.1.0 to v1.2.0 (#970)
- Require poseidon/ct Terraform provider v0.8+ (notes)
- Require any snippet customizations to be updated to v1.2.0

AWS

Allow setting custom initial node taints on worker pools (#968)
- Add node_taints variable to internal workers pool module to set initial node taints
- Add daemonset_tolerations so kube-system DaemonSets can tolerate custom taints

Azure

Allow setting custom initial node taints on worker pools (#968)
- Add node_taints variable to internal workers pool module to set initial node taints
- Add daemonset_tolerations so kube-system DaemonSets can tolerate custom taints
Remove deprecated azurerm_lb_backend_address_pool field resource_group_name (#972)

Google Cloud

Allow setting custom initial node taints on worker pools (#968)
- Add node_taints variable to internal workers pool module to set initial node taints
- Add daemonset_tolerations so kube-system DaemonSets can tolerate custom taints
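
A sketch of how the two variables might fit together on any of the three platforms. The module name `gpu-pool`, the taint string, and the value formats (a list of `key=value:effect` strings for taints, a list of taint keys for tolerations) are assumptions, not confirmed by these notes.

```hcl
module "cluster" {
  # source and required variables omitted

  # Taint keys that kube-system DaemonSets should tolerate (assumed format)
  daemonset_tolerations = ["role"]
}

# Hypothetical worker pool using the internal workers module
module "gpu-pool" {
  # source and required variables omitted

  # Initial taints applied when nodes register (assumed format)
  node_taints = ["role=gpu:NoSchedule"]
}
```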

Addons

Update nginx-ingress from v0.44.0 to v0.45.0
Update kube-state-metrics from v2.0.0-rc.0 to v2.0.0-rc.1
Update Grafana from v7.4.5 to v7.5.3

v1.20.5

Kubernetes v1.20.5
Update etcd from v3.4.14 to v3.4.15
Update Cilium from v1.9.4 to v1.9.5
Update Calico from v3.17.3 to v3.18.1
Update CoreDNS from v1.7.0 to v1.8.0
Mark bootstrap token as sensitive in Terraform plans (#949)

Fedora CoreOS

Set Kubelet provider-id (#951)

Flatcar Linux

AWS

Set Kubelet provider-id (#951)
Remove os_image option flatcar-edge (#943)

Azure

Remove os_image option flatcar-edge (#943)

Bare-Metal

Remove os_channel option flatcar-edge (#943)

Addons

Update Prometheus from v2.25.0 to v2.25.2
Update kube-state-metrics from v2.0.0-alpha.3 to v2.0.0-rc.0
- Switch image from quay.io to k8s.gcr.io (#946)
Update node-exporter from v1.1.1 to v1.1.2
Update Grafana from v7.4.2 to v7.4.5

v1.20.4

Kubernetes v1.20.4
Update Cilium from v1.9.1 to v1.9.4
Update Calico from v3.17.1 to v3.17.3
Update flannel-cni from v0.4.1 to v0.4.2

Addons

Update nginx-ingress from v0.43.0 to v0.44.0
Update Prometheus from v2.24.0 to v2.25.0
- Update node-exporter from v1.0.1 to v1.1.1
Update Grafana from v7.3.7 to v7.4.2

v1.20.2

Kubernetes v1.20.2
Support Terraform v0.13.x and v0.14.4+ (#924)

Addons

Update nginx-ingress from v0.41.2 to v0.43.0
Update Prometheus from v2.23.0 to v2.24.0
Update Grafana from v7.3.6 to v7.3.7

v1.20.1

Kubernetes v1.20.1

Fedora CoreOS

Fedora CoreOS 33 has stronger crypto defaults (notice, #915)
- Use a non-RSA SSH key or add the workaround provided in upstream Fedora docs as a snippet (action required)

Addons

Update Grafana from v7.3.5 to v7.3.6

v1.20.0

Kubernetes v1.20.0
Add input variable validations (#880)
- Require Terraform v0.13+ (migration guide)
Set output sensitive to suppress console display for some cases (#885)
Add service account token volume projection (#897)
Scope kube-scheduler and kube-controller-manager permissions (#898)
Update etcd from v3.4.12 to v3.4.14
Update Calico from v3.16.5 to v3.17.1 (#890)
- Enable Calico MTU auto-detection
- Remove workaround to Calico cni-plugin issue
Update Cilium from v1.9.0 to v1.9.1
Relax terraform-provider-ct version constraint to v0.6+ (#893)
- Allow upgrading terraform-provider-ct to v0.7.x (warn)

AWS

Enable Network Load Balancer (NLB) dualstack (#883)
- NLB subnets assigned both IPv4 and IPv6 addresses
- NLB DNS name has both A and AAAA records
- NLB to target node traffic is IPv4 (no change)
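
For reference, dualstack on a Network Load Balancer is expressed through the AWS provider's `ip_address_type` argument. The sketch below is standalone, with a hypothetical name and subnet variable; it is not Typhoon's actual resource.

```hcl
# Hypothetical input; the subnets need both IPv4 and IPv6 CIDR blocks
variable "subnet_ids" { type = list(string) }

resource "aws_lb" "nlb" {
  name               = "example-nlb"
  load_balancer_type = "network"
  internal           = false
  subnets            = var.subnet_ids

  # Serve both A and AAAA records for the NLB DNS name
  ip_address_type = "dualstack"
}
```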

Bare-Metal

Remove iSCSI /etc/iscsi and iscsiadm mounts from Kubelet (#912)

Fedora CoreOS

AWS

Fix AMI query, which could fail in some regions (#887)

Bare-Metal

Promote Fedora CoreOS to stable
Use initramfs and rootfs images as initrd's (#889)
- Requires Fedora CoreOS version with rootfs images (e.g. 32.20200923.3.0+)

Addons

Update Prometheus from v2.22.2 to v2.23.0
Update kube-state-metrics from v2.0.0-alpha.2 to v2.0.0-alpha.3
Update Grafana from v7.3.2 to v7.3.5

v1.19.4

Kubernetes v1.19.4
Update Cilium from v1.8.4 to v1.9.0
Update Calico from v3.16.3 to v3.16.5
Remove asset_dir variable (defaulted off in v1.17.0, deprecated in v1.18.0)

Fedora CoreOS

Improve etcd-member.service systemd unit (#868)
- Allow a snippet with a systemd dropin to set an alternate image (e.g. mirror)
Fix local node delete oneshot on node shutdown (#856)

AWS

Add experimental Fedora CoreOS arm64 support (docs, #875)
- Allow arm64 full-cluster or mixed/hybrid cluster with worker pools
- Add arch variable to cluster module
- Add daemonset_tolerations variable to cluster module
- Add node_taints variable to workers module
- Requires flannel CNI provider and use of experimental AMI (see docs)

Flatcar Linux

Rename container-linux modules to flatcar-linux (#858) (action required)
Change on-host system containers from rkt to docker
- Change etcd-member.service container runner from rkt to docker (#867)
- Change kubelet.service container runner from rkt-fly to docker (#855)
- Change bootstrap.service container runner from rkt to docker (#873)
- Change delete-node.service to use docker and an inline ExecStart (#855)
Fix local node delete oneshot on node shutdown (#855)
Remove CoreOS Container Linux Matchbox profiles (#859)

Addons

Update nginx-ingress from v0.40.2 to v0.41.2
Update Prometheus from v2.22.0 to v2.22.1
Update kube-state-metrics from v2.0.0-alpha.1 to v2.0.0-alpha.2
Update Grafana from v7.2.1 to v7.3.2

v1.19.3

Kubernetes v1.19.3
Update Cilium from v1.8.3 to v1.8.4
Update Calico from v3.15.3 to v3.16.3 (#851)
Update flannel from v0.13.0-rc2 to v0.13.0 (#219)

Flatcar Linux

Remove references to CoreOS Container Linux (#839)
- Fix error querying for coreos AMI on AWS (#838)

Addons

Update nginx-ingress from v0.35.0 to v0.40.2
Update Grafana from v7.1.5 to v7.2.1
Update Prometheus from v2.21.0 to v2.22.0
- Update kube-state-metrics from v1.9.7 to v2.0.0-alpha.1

v1.19.2

Kubernetes v1.19.2
Update flannel from v0.12.0 to v0.13.0-rc2 (#216)
- Update flannel-cni from v0.4.0 to v0.4.1
- Update CNI plugins from v0.8.6 to v0.8.7

Addons

Refresh Prometheus rules/alerts and Grafana dashboards (#831)
Reduce apiserver metrics cardinality for non-core APIs (#830)

v1.19.1

Kubernetes v1.19.1
- Change control plane seccomp annotations to GA seccompProfile (#822)
Update Cilium from v1.8.2 to v1.8.3
- Promote Cilium from experimental to general availability (#827)
Update Calico from v3.15.2 to v3.15.3

Fedora CoreOS

Update Fedora CoreOS Config version from v1.0.0 to v1.1.0
- Require any snippets customizations to update to v1.1.0

Addons

Update IngressClass resources to networking.k8s.io/v1 (#824)
Update Prometheus from v2.20.0 to v2.21.0
- Remove Kubernetes node name labelmap relabel_config from etcd, Kubelet, and CAdvisor scrape config (#828)

v1.19.0

Kubernetes v1.19.0
Update etcd from v3.4.10 to v3.4.12
Update Calico from v3.15.1 to v3.15.2

Fedora CoreOS

Fix race condition during bootstrap of multi-controller clusters (#808)
- Fix SELinux label of bootstrap-secrets on non-bootstrap controllers

Addons

Introduce fleetlock for Fedora CoreOS reboot coordination (#814)
Update nginx-ingress from v0.34.1 to v0.35.0
- Repository changed to k8s.gcr.io/ingress-nginx/controller
Update Grafana from v7.1.3 to v7.1.5

v1.18.8

Kubernetes v1.18.8
Migrate from Terraform v0.12.x to v0.13.x (#804) (action required)
- Recommend Terraform v0.13.x (migration guide)
- Support automatic install of poseidon's provider plugins (poseidon/ct, poseidon/matchbox)
- Require Terraform v0.12.26+ (migration compatibility)
- Require terraform-provider-ct v0.6.1
- Require terraform-provider-matchbox v0.4.1
Update etcd from v3.4.9 to v3.4.10
Update CoreDNS from v1.6.7 to v1.7.0
Update Cilium from v1.8.1 to v1.8.2
Update coreos/flannel-cni to poseidon/flannel-cni (#798)
- Update CNI plugins and fix CVEs with Flannel CNI (non-default)
- Transition to a poseidon maintained container image
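
For the Terraform v0.13.x migration items above, automatic plugin install relies on declaring provider sources. A minimal sketch, assuming the provider local names ct and matchbox and pinning the exact versions listed above (a looser constraint may also be acceptable):

```hcl
terraform {
  required_providers {
    # Source addresses as named in the notes (poseidon/ct, poseidon/matchbox).
    ct = {
      source  = "poseidon/ct"
      version = "0.6.1"
    }
    # Only needed for bare-metal clusters.
    matchbox = {
      source  = "poseidon/matchbox"
      version = "0.4.1"
    }
  }
}
```

With a block like this in place, terraform init can fetch the plugins instead of relying on manually installed binaries.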

AWS

Allow terraform-provider-aws v3.0+ (#803)
- Recommend updating terraform-provider-aws to v3.0+
- Continue to allow v2.23+, no v3.x specific features are used

DigitalOcean

Require terraform-provider-digitalocean v1.21+ for Terraform v0.13.x (unenforced)
Require terraform-provider-digitalocean v1.20+ for Terraform v0.12.x

Fedora CoreOS

Fix support for Flannel with Fedora CoreOS (#795)
- Configure flannel.1 link to select its own MAC address to solve flannel pod-to-pod traffic drops starting with default link changes in Fedora CoreOS 32.20200629.3.0 (details)

Addons

Update Prometheus from v2.19.2 to v2.20.0
Update Grafana from v7.0.6 to v7.1.3

v1.18.6

Kubernetes v1.18.6
Update Calico from v3.15.0 to v3.15.1
Update Cilium from v1.8.0 to v1.8.1

Addons

Update nginx-ingress from v0.33.0 to v0.34.1
- ingress-nginx will publish images only to gcr.io
Update Prometheus from v2.19.1 to v2.19.2
Update Grafana from v7.0.4 to v7.0.6

v1.18.5

Kubernetes v1.18.5
Add Cilium v1.8.0 as a (experimental) CNI provider option (#760)
- Set networking to "cilium" to enable
Update Calico from v3.14.1 to v3.15.0
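
The Cilium option above is selected through the networking variable. A minimal fragment, assuming a hypothetical module name and source path, with other required inputs omitted:

```hcl
module "example" {
  # Assumed source path and ref; substitute the module you actually use.
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.5"

  # Other required inputs omitted.

  networking = "cilium" # experimental in this release, per the notes
}
```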

DigitalOcean

Isolate each cluster in an independent DigitalOcean VPC (#776)
- Create droplets in a VPC per cluster (matches Typhoon AWS, Azure, and GCP)
- Require terraform-provider-digitalocean v1.16.0+ (action required)
- Output vpc_id for use with an attached DigitalOcean loadbalancer

Fedora CoreOS

Google Cloud

Promote Fedora CoreOS to stable
Remove os_image variable deprecated in v1.18.3 (#777)
- Use os_stream to select a Fedora CoreOS image stream

Flatcar Linux

Azure

Allow using Flatcar Linux Edge by setting os_image to "flatcar-edge" (#778)

Addons

Update Prometheus from v2.19.0 to v2.19.1
Update Grafana from v7.0.3 to v7.0.4

v1.18.4

Kubernetes v1.18.4
Update Kubelet image publishing (#749)
- Build Kubelet images internally and publish to Quay and Dockerhub
  - quay.io/poseidon/kubelet (official)
  - docker.io/psdn/kubelet (fallback)
- Continue offering automated image builds with an alternate tag strategy (see docs)
- Document use of alternate Kubelet images during registry incidents
Update Calico from v3.14.0 to v3.14.1
- Fix CVE-2020-13597
Rename controller NoSchedule taint from node-role.kubernetes.io/master to node-role.kubernetes.io/controller (#764)
- Tolerate the new taint name for workloads that may run on controller nodes
Remove node label node.kubernetes.io/master from controller nodes (#764)
- Use node.kubernetes.io/controller (present since v1.9.5, #160) to node select controllers
Remove unused Kubelet --lock-file and --exit-on-lock-contention (#758)

Fedora CoreOS

Azure

Use strict Fedora CoreOS Config (FCC) snippet parsing (#755)
Reduce Calico vxlan interface MTU to maintain performance (#767)

AWS

Fix Kubelet service race with hostname update (#766)
- Wait for a hostname to avoid Kubelet trying to register as localhost

Flatcar Linux

Use strict Container Linux Config (CLC) snippet parsing (#755)
- Require terraform-provider-ct v0.4+, recommend v0.5+ (action required)

Addons

Update nginx-ingress from v0.32.0 to v0.33.0
Update Prometheus from v2.18.1 to v2.19.0
Update node-exporter from v1.0.0-rc.1 to v1.0.1
Update kube-state-metrics from v1.9.6 to v1.9.7
Update Grafana from v7.0.0 to v7.0.3

v1.18.3

Kubernetes v1.18.3
Use Kubelet TLS bootstrap with bootstrap token authentication (#713)
- Enable Node Authorization and NodeRestriction to reduce authorization scope
- Renew Kubelet certificates every 72 hours
Update etcd from v3.4.7 to v3.4.9
Update Calico from v3.13.1 to v3.14.0
Add CoreDNS node affinity preference for controller nodes (#188)
Deprecate CoreOS Container Linux support (no OS updates after May 2020)
- Use a fedora-coreos module for Fedora CoreOS
- Use a container-linux module for Flatcar Linux

AWS

Fix Terraform plan error when controller_count exceeds AWS zones (e.g. 5 controllers) (#714)
- Regressed in v1.17.1 (#605)

Azure

Update Azure subnets to set address_prefixes list (#730)
- Fix warning that address_prefix is deprecated
- Require terraform-provider-azurerm v2.8.0+ (action required)

DigitalOcean

Promote DigitalOcean to beta on both Fedora CoreOS and Flatcar Linux

Fedora CoreOS

Fix Calico install-cni crashloop on Pod restarts (#724)
- SELinux enforcement requires consistent file context MCS level
- Restarting a node resolved the issue as a previous workaround

AWS

Support Fedora CoreOS image streams (#727)
- Add os_stream variable to set the stream to stable (default), testing, or next
- Remove unused os_image variable
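
To illustrate the os_stream variable described above, a cluster could opt into a non-default stream as sketched below. The module name and source path are assumptions; the accepted values (stable, testing, next) come from the notes.

```hcl
module "example" {
  # Assumed source path and ref.
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.3"

  # Other required inputs omitted.

  os_stream = "testing" # default is "stable"; "next" is also accepted
}
```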

Google

Support Fedora CoreOS image streams (#723)
- Add os_stream variable to set the stream to stable (default), testing, or next
- Deprecate os_image variable. Manual image uploads are no longer needed

Flatcar Linux

Azure

Use the Flatcar Linux Azure Marketplace image
- Restore #664 (reverted in #707) but use Flatcar Linux new free offer (not byol)
Change os_image to use a flatcar-stable default

Google

Promote Flatcar Linux to beta

Addons

Update nginx-ingress from v0.30.0 to v0.32.0
- Add support for IngressClass
Update Prometheus from v2.17.1 to v2.18.1
- Update kube-state-metrics from v1.9.5 to v1.9.6
- Update node-exporter from v1.0.0-rc.0 to v1.0.0-rc.1
Update Grafana from v6.7.2 to v7.0.0

v1.18.2

Kubernetes v1.18.2
Choose Fedora CoreOS or Flatcar Linux (action required)
- Use a fedora-coreos module for Fedora CoreOS
- Use a container-linux module for Flatcar Linux
Change Container Linux modules' defaults from CoreOS Container Linux to Flatcar Container Linux (#702)
- CoreOS Container Linux won't receive updates after May 2020

Fedora CoreOS

Fix bootstrap race condition from SELinux unshared content label (#708)

Azure

Add support for Fedora CoreOS (#704)

DigitalOcean

Fix race condition creating firewall allow rules (#709)

Flatcar Linux

AWS

Change os_image default from coreos-stable to flatcar-stable (#702)

Azure

Change os_image to be required. Recommend uploading a Flatcar Linux image (action required) (#702)
Disable Flatcar Linux Azure Marketplace image support (breaking, #707)
- Revert to manual uploading until marketplace issue is closed (#703)

Bare-Metal

Recommend changing os_channel from coreos-stable to flatcar-stable

Google

Change os_image to be required. Recommend uploading a Flatcar Linux image (action required) (#702)

DigitalOcean

Change os_image to be required. Recommend uploading a Flatcar Linux image (action required) (#702)
Fix race condition creating firewall allow rules (#709)

v1.18.1

Kubernetes v1.18.1
Choose Fedora CoreOS or Flatcar Linux (action recommended)
- Use a fedora-coreos module for Fedora CoreOS
- Use a container-linux module with OS set to Flatcar Linux
Update etcd from v3.4.5 to v3.4.7
Change kube-proxy and calico or flannel to tolerate specific taints (#682)
- Tolerate master and not-ready taints, rather than tolerating all taints
Update flannel from v0.11.0 to v0.12.0 (#690)
Fix bootstrap when networking mode flannel (non-default) is chosen (#689)
- Regressed in v1.18.0 changes for Calico (#675)
Rename Container Linux controller_clc_snippets to controller_snippets for consistency (#688)
Rename Container Linux worker_clc_snippets to worker_snippets for consistency
Rename Container Linux clc_snippets (bare-metal) to snippets for consistency
Drop support for gitRepo volumes (kubelet#3)

Azure

Fix Azure worker UDP outbound connections (#691)
- Fix Azure worker clock sync timeouts

DigitalOcean

Add support for Fedora CoreOS (#699)

Addons

Refresh Prometheus rules/alerts and Grafana dashboards (#692)
Update Grafana from v6.7.1 to v6.7.2

v1.18.0

Kubernetes v1.18.0
Update etcd from v3.4.4 to v3.4.5
Switch from upstream hyperkube image to individual images (#669)
- Use upstream k8s.gcr.io kube-apiserver, kube-controller-manager, kube-scheduler, and kube-proxy container images
- Use poseidon/kubelet to package the upstream Kubelet binary and dependencies as a container image (checksummed, automated build)
- Add quay.io/poseidon/kubelet as a Typhoon distributed artifact in the security policy
- Update base images from debian 9 to debian 10
- Background: Kubernetes will stop releasing the hyperkube container image and provide the Kubelet as a binary for packaging
Choose Fedora CoreOS or Flatcar Linux (action recommended)
- Use a fedora-coreos module for Fedora CoreOS
- Use a container-linux module with OS set for Flatcar Linux (varies, see docs)
- CoreOS Container Linux won't receive updates after May 2020
Add support for Fedora CoreOS snippets (terraform-provider-ct v0.5+) (#686)
Recommend updating terraform-provider-ct plugin from v0.4.0 to v0.5.0
Set Fedora CoreOS log driver back to the default journald (#681)
Deprecate asset_dir variable and remove docs (#678)
Deprecate support for gitRepo volumes. A future release will drop support.

AWS

Fix Fedora CoreOS AMI to filter for stable images (#685)
- Latest Fedora CoreOS testing or bodhi-update images could be chosen depending on the region

Bare-Metal

Update Fedora CoreOS default os_stream from testing to stable

Google Cloud

Known: Use of stale Fedora CoreOS image may require terraform re-apply during bootstrap (#687)

DigitalOcean

Rename image variable to os_image for consistency (#677) (action required)

Addons

Update Prometheus from v2.16.0 to v2.17.1
Update Grafana from v6.6.2 to v6.7.1

v1.17.4

Kubernetes v1.17.4
Update etcd from v3.4.3 to v3.4.4
- On Container Linux, fetch using the docker transport format (#659)
Update CoreDNS from v1.6.6 to v1.6.7 (#648)
Update Calico from v3.12.0 to v3.13.1

AWS

Promote Fedora CoreOS to stable (#668)
Allow VPC route table extension via reference (#654)
Fix worker_node_labels on Fedora CoreOS (#651)
Fix automatic worker node delete on shutdown on Fedora CoreOS (#657)

Azure

Upgrade to terraform-provider-azurerm v2.0+ (action required)
- Change worker_priority from Low to Spot if used (action required)
- Switch to Azure's new Linux VM and Linux VM Scale Set resources
- Set controller's Azure disk caching to None
- Associate subnets (in addition to NICs) with security groups (aesthetic)
Add support for Flatcar Container Linux (#664)
- Requires accepting Flatcar Linux Azure Marketplace terms

Bare-Metal

Add worker_node_labels map variable for per-worker node labels (#663)
Add worker_node_taints map variable for per-worker node taints (#663)

DigitalOcean

Add support for Flatcar Container Linux (#644)

Google Cloud

Promote Fedora CoreOS to beta (#668)
Fix worker_node_labels on Fedora CoreOS (#651)
Fix automatic worker node delete on shutdown on Fedora CoreOS (#657)

Addons

Update nginx-ingress from v0.28.0 to v0.30.0
Update Prometheus from v2.15.2 to v2.16.0
- Refresh Prometheus rules and alerts
- Add a BlackboxProbeFailure alert
- Update kube-state-metrics from v1.9.4 to v1.9.5
- Update node-exporter from v0.18.1 to v1.0.0-rc.0
Update Grafana from v6.6.1 to v6.6.2
- Refresh Grafana dashboards
Remove Container Linux Update Operator (CLUO) addon example (#667)
- CLUO hasn't been in active use in our clusters and won't be relevant beyond Container Linux. Requires patches for use on Kubernetes v1.16+

v1.17.3

Kubernetes v1.17.3
Update Calico from v3.11.2 to v3.12.0
Allow Fedora CoreOS clusters to pass CNCF conformance suite
- Set Docker log driver to json-file as a workaround
Try Fedora CoreOS or Flatcar Linux alongside CoreOS Container Linux clusters (recommended)

AWS

Promote Fedora CoreOS to beta (#645)

Bare-Metal

Promote Fedora CoreOS to beta (#645)
Add Fedora CoreOS kernel arguments initrd and console (#640)

Google Cloud

Add Terraform module for Fedora CoreOS (#632)
Add support for Flatcar Container Linux (#639)

Addons

Update nginx-ingress from v0.27.1 to v0.28.0
Update kube-state-metrics from v1.9.3 to v1.9.4
Update Grafana from v6.5.3 to v6.6.1

v1.17.2

Kubernetes v1.17.2

AWS

Promote Fedora CoreOS from preview to alpha

Bare-Metal

Promote Fedora CoreOS from preview to alpha
Update Fedora CoreOS images location
- Use Fedora CoreOS production download streams
- Use live PXE kernel and initramfs images

Addons

Update nginx-ingress from v0.26.1 to v0.27.1 (#625)
- Change runAsUser from 33 to 101 for alpine-based image
Update kube-state-metrics from v1.9.2 to v1.9.3

v1.17.1

Kubernetes v1.17.1
Update CoreDNS from v1.6.5 to v1.6.6 (#602)
Update Calico from v3.10.2 to v3.11.2 (#604)
Inline Kubelet service on Container Linux nodes (#606)
Disable unused Kubelet 127.0.0.1:10248 healthz listener (#607)
Enable kube-proxy metrics and allow Prometheus scrapes
- Allow TCP/10249 traffic with worker node sources

AWS

Update Fedora CoreOS AMI filter for fedora-coreos-31 (#620)

Google

Allow terraform-provider-google v3.0+ (#617)
- Only enforce v2.19+ to ease migration, as no v3.x features are used

Addons

Update Prometheus from v2.14.0 to v2.15.2
- Add discovery for kube-proxy service endpoints
Update kube-state-metrics from v1.8.0 to v1.9.2
Reduce node-exporter DaemonSet tolerations (#614)
Update Grafana from v6.5.1 to v6.5.3

v1.17.0

Kubernetes v1.17.0
Manage clusters without using a local asset_dir (#595)
- Change asset_dir to be optional. Remove the variable to skip writing assets locally (action recommended)
- Allow keeping cluster assets only in Terraform state (pluggable, encryption) and allow terraform apply from stateless automation systems
- Improve asset unpacking on controllers
- Obtain kubeconfig from Terraform module outputs
Replace usage of template_dir with templatefile function (#587)
- Require Terraform version v0.12.6+ (action required)
Update CoreDNS from v1.6.2 to v1.6.5 (#588)
- Add health lameduck option to wait before shutdown
Update Calico from v3.10.1 to v3.10.2 (#599)
Reduce pod eviction timeout for deleting pods on unready nodes from 5m to 1m (#597)
- Present since v1.13.3, but mistakenly removed in v1.16.0
Add CPU requests for control plane static pods (#589)
- May provide slight edge case benefits and aligns with upstream

Google

Use new google_compute_region_instance_group_manager version block format
- Fixes warning that instance_template is deprecated
- Require terraform-provider-google v2.19.0+ (action required)

Addons

Update Grafana from v6.4.4 to v6.5.1
Add pod networking details in dashboards (#593)
Add node alerts and Grafana dashboard from node-exporter (#591)
Reduce Prometheus high cardinality time series (#596)

v1.16.3

Kubernetes v1.16.3
Update etcd from v3.4.2 to v3.4.3 (#582)
Upgrade Calico from v3.9.2 to v3.10.1
- Allow advertising service ClusterIPs to peer routers via a BGPConfiguration
Switch kube-proxy from iptables to ipvs mode (#574)

Addons

Update Prometheus from v2.13.0 to v2.14.0
- Refresh rules, alerts, and dashboards from upstreams
Remove addon-resizer from kube-state-metrics (#575)
Update Grafana from v6.4.2 to v6.4.4

v1.16.2

Kubernetes v1.16.2
Update etcd from v3.4.1 to v3.4.2 (#570)
Update Calico from v3.9.1 to v3.9.2
- Default to using Calico and supporting NetworkPolicy on all platforms

Azure

Change default networking provider from "flannel" to "calico" (#573)

Bare-Metal

Add controllers and workers as typed lists of machine detail objects (#566)
- Define clusters' machines cleanly and with Terraform v0.12 type constraints (action required, see PR example)
- Remove controller_names, controller_macs, and controller_domains variables
- Remove worker_names, worker_macs, and worker_domains variables
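
A sketch of what the typed machine lists might look like. The object field names (name, mac, domain) are inferred from the removed variable names rather than stated in these notes, and the module name, source path, and machine values are placeholders; see the PR example referenced above for the authoritative form.

```hcl
module "example" {
  # Assumed source path and ref.
  source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.16.2"

  # Other required inputs omitted.

  # Replaces controller_names, controller_macs, and controller_domains.
  controllers = [{
    name   = "node1"
    mac    = "52:54:00:a1:9c:ae"
    domain = "node1.example.com"
  }]

  # Replaces worker_names, worker_macs, and worker_domains.
  workers = [
    {
      name   = "node2"
      mac    = "52:54:00:b2:2f:86"
      domain = "node2.example.com"
    },
    {
      name   = "node3"
      mac    = "52:54:00:c3:61:77"
      domain = "node3.example.com"
    },
  ]
}
```

Keeping each machine's details in one object avoids the index-alignment mistakes that parallel lists invite.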

DigitalOcean

Change default networking provider from "flannel" to "calico" (#573)

Addons

Update Grafana from v6.4.1 to v6.4.2
Change CLUO label from "app" to "name"

v1.16.1

Kubernetes v1.16.1
Update etcd from v3.4.0 to v3.4.1
Update Calico from v3.8.2 to v3.9.1
Add Terraform v0.12 variables types (#553, #557, #560, #556, #562)
- Deprecate cluster_domain_suffix variable

AWS

Add worker_node_labels variable to set initial worker node labels (#550)
Add node_labels variable to internal workers pool module (#550)
For Fedora CoreOS, detect most recent AMI in the region
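
As an illustration of the worker_node_labels variable above, initial labels might be set as follows. The list-of-strings type and the key=value format are assumptions, as are the module name and source path.

```hcl
module "example" {
  # Assumed source path and ref.
  source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.16.1"

  # Other required inputs omitted.

  # Assumed format: a list of key=value strings applied when workers register.
  worker_node_labels = ["pool=default"]
}
```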

Azure

Promote networking provider Calico VXLAN out of experimental (set networking = "calico")
Add worker_node_labels variable to set initial worker node labels (#550)
Add node_labels variable to internal workers pool module (#550)
Change workers module default vm_type to Standard_DS1_v2 (followup to #539)

Bare-Metal

For Fedora CoreOS, use new kernel, initrd, and raw paths (#563)
Fix Terraform missing comma error (#549)
Remove deprecated container_linux_oem variable (#562)

DigitalOcean

Promote networking provider Calico VXLAN out of experimental (set networking = "calico")
Fix Terraform missing comma error (#549)

Google Cloud

Add worker_node_labels variable to set initial worker node labels (#550)
Add node_labels variable to internal workers module (#550)

Addons

Update Prometheus from v2.12.0 to v2.13.0
- Fix Prometheus etcd target discovery and scraping (#561, regressed with Kubernetes v1.16.0)
Update kube-state-metrics from v1.7.2 to v1.8.0
Update nginx-ingress from v0.25.1 to v0.26.1 (#555)
- Add lifecycle hook to allow draining for up to 5 minutes
Update Grafana from v6.3.5 to v6.4.1

v1.16.0

Kubernetes v1.16.0 (#543)
- Read about several Kubernetes API deprecations!
- Remove legacy node role labels (no longer shown in kubectl get nodes)
- Rename node labels to node.kubernetes.io/master and node.kubernetes.io/node (migratory)
Migrate control plane from self-hosted to static pods (#536)
- Run kube-apiserver, kube-scheduler, and kube-controller-manager as static pods on each controller
- kubectl edits to kube-apiserver, kube-scheduler, and kube-controller-manager are no longer possible (change)
- Remove bootkube, self-hosted pivot, and pod-checkpointer
Update CoreDNS from v1.5.0 to v1.6.2 (#535)
Update etcd from v3.3.15 to v3.4.0
Recommend updating terraform-provider-ct plugin from v0.3.2 to v0.4.0

Azure

Change default controller_type to Standard_B2s (#539)
- B2s is cheaper by $17/month and provides 2 vCPU, 4GB RAM
Change default worker_type to Standard_DS1_v2 (#539)
- F1 is previous generation. DS1_v2 is newer, similar cost, and supports Low Priority mode

Addons

Update Grafana from v6.3.3 to v6.3.5

v1.15.3

Kubernetes v1.15.3
Update etcd from v3.3.13 to v3.3.15
Update Calico from v3.8.1 to v3.8.2

AWS

Enable root block device encryption by default (#527)
- Require terraform-provider-aws v2.23+ (action required)

Addons

Update Prometheus from v2.11.0 to v2.12.0
- Update kube-state-metrics from v1.7.1 to v1.7.2
Update Grafana from v6.2.5 to v6.3.3
- Use stable IDs for etcd, CoreDNS, and Nginx Ingress dashboards (#530)
Update nginx-ingress from v0.25.0 to v0.25.1
- Fix Nginx security advisories

v1.15.2

Kubernetes v1.15.2
Update Calico from v3.8.0 to v3.8.1
Publish new load balancing, TCP/UDP, and firewall docs (#523)

Addons

Add new Grafana dashboards for CoreDNS and Nginx Ingress Controller (#525)

v1.15.1

Kubernetes v1.15.1
Upgrade Calico from v3.7.3 to v3.8.0
- Enable CNI bandwidth plugin for traffic shaping
Run kube-apiserver with lower privilege user (nobody) (#506)
Relax terraform-provider-ct version constraint (v0.3.2+)
- Allow provider versions below v1.0.0 (e.g. upgrading to v0.4)

Azure

Fix to add all controller nodes to the apiserver load balancer backend address pool (#518)
- kube-apiserver availability relied on the 0th controller

Google Cloud

Allow controller nodes to span more than 3 zones if available in a region (#504)
Eliminate extraneous controller instance groups in single-controller clusters (#504)
Raise network deletion timeout from 4m to 6m (#505)

Addons

Update Prometheus from v2.10.0 to v2.11.0
- Refresh rules, alerts, and dashboards from upstreams
- Update kube-state-metrics from v1.6.0 to v1.7.1
Update Grafana from v6.2.4 to v6.2.5
Update nginx-ingress from v0.24.1 to v0.25.0
- Support networking.k8s.io/v1beta1 apiVersion

v1.15.0

Kubernetes v1.15.0
Migrate from Terraform v0.11 to v0.12.x (action required!)
- Migration instructions for Terraform v0.12
Require terraform-provider-ct v0.3.2+ to support Terraform v0.12 (action required)
Update Calico from v3.7.2 to v3.7.3
Remove Fedora Atomic modules (deprecated in March) (#501)

AWS

Require terraform-provider-aws v2.7+ to support Terraform v0.12 (action required)
Allow using Flatcar Linux Edge by setting os_image to "flatcar-edge"

Azure

Require terraform-provider-azurerm v1.27+ to support Terraform v0.12 (action required)
Avoid unneeded rotations of Regular priority virtual machine scale sets
- Azure only allows eviction_policy to be set for Low priority VMs. Supporting Low priority VMs meant that, when Regular VMs were used, each terraform apply rolled workers in order to set eviction_policy to null.
- Terraform v0.12 nullable variables fix the issue so plan does not produce a diff.

Bare-Metal

Require terraform-provider-matchbox v0.3.0+ to support Terraform v0.12 (action required)
Allow using Flatcar Linux Edge by setting os_channel to "flatcar-edge"

DigitalOcean

Require terraform-provider-digitalocean v1.3+ to support Terraform v0.12 (action required)
Change the default worker_type from s-1vcpu-1gb to s-1vcpu-2gb

Google Cloud

Require terraform-provider-google v2.5+ to support Terraform v0.12 (action required)

Addons

Update Grafana from v6.2.1 to v6.2.4
Update node-exporter from v0.18.0 to v0.18.1

v1.14.3

Kubernetes v1.14.3
Update CoreDNS from v1.3.1 to v1.5.0
- Add ready plugin to improve readinessProbe
Fix trailing slash in terraform-render-bootkube version (#479)
Recommend updating terraform-provider-ct plugin from v0.3.1 to v0.3.2 (#487)

AWS

Rename worker pool module count variable to worker_count (#485) (action required)
- count will become a reserved variable name in Terraform v0.12
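
The rename above affects worker pool definitions. A hedged before/after sketch, with a hypothetical pool name and an assumed source path for the workers module:

```hcl
module "example-pool" {
  # Assumed source path and ref for the internal workers module.
  source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.14.3"

  # Other required inputs omitted.

  # Before this release: count = 2
  worker_count = 2
}
```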

Azure

Replace azurerm_autoscale_setting with azurerm_monitor_autoscale_setting (#482)
Rename worker pool module count variable to worker_count (#485) (action required)
- count will become a reserved variable name in Terraform v0.12

Bare-Metal

Recommend updating terraform-provider-matchbox plugin from v0.2.3 to v0.3.0 (#487)

Google Cloud

Rename worker pool module count variable to worker_count (#485) (action required)
- count is a reserved variable in Terraform v0.12

Addons

Update Prometheus from v2.9.2 to v2.10.0
Update Grafana from v6.1.6 to v6.2.1

v1.14.2

Kubernetes v1.14.2
Update etcd from v3.3.12 to v3.3.13
Upgrade Calico from v3.6.1 to v3.7.2
Change flannel VXLAN port from 8472 (kernel default) to 4789 (IANA VXLAN)

AWS

Only set internal VXLAN rules when networking is "flannel" (default: calico)

Azure

Allow choosing Calico as the network provider (experimental) (#472)
- Add a networking variable accepting "flannel" (default) or "calico"
- Use VXLAN encapsulation since Azure doesn't support IPIP

DigitalOcean

Allow choosing Calico as the network provider (experimental) (#472)
- Add a networking variable accepting "flannel" (default) or "calico"
- Use VXLAN encapsulation since DigitalOcean doesn't support IPIP
Add explicit ordering between firewall rule creation and secure copying Kubelet credentials (#469)
- Fix a race in which copies to nodes could run before rule creation, blocking cluster creation

Addons

Update Prometheus from v2.8.1 to v2.9.2
- Update kube-state-metrics from v1.5.0 to v1.6.0
Update node-exporter from v0.17.0 to v0.18.0
Update Grafana from v6.1.3 to v6.1.6
Reduce nginx-ingress Role RBAC permissions (#458)

v1.14.1

Kubernetes v1.14.1

Addons

Update Grafana from v6.1.1 to v6.1.3
Update nginx-ingress from v0.23.0 to v0.24.1

v1.14.0

Kubernetes v1.14.0
Update Calico from v3.6.0 to v3.6.1
Add enable_aggregation option for CNCF conformance (#436)
- Aggregation is disabled by default to retain our security stance
- Aggregation increases the security surface area. Extensions become part of the control plane and must be scrutinized carefully and trusted. Favor leaving aggregation disabled.
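
For clusters that need the aggregation layer (for example, to run conformance tests), the option above could be enabled roughly as follows. The module name and source path are assumptions; the value is written as a quoted string on the assumption that this suits a release predating the Terraform v0.12 migration.

```hcl
module "example" {
  # Assumed source path and ref.
  source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.14.0"

  # Other required inputs omitted.

  enable_aggregation = "true" # disabled by default, per the notes
}
```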

AWS

Add ability to load balance TCP applications (#443)
- Output the network load balancer ARN as nlb_id
- Accept a worker_target_groups (ARN) list to which worker instances should be added

Azure

Add ability to load balance TCP/UDP applications (#447)
- Output the load balancer ID as loadbalancer_id
Output worker_security_group_name and worker_address_prefix for extending firewall rules (#447)

DigitalOcean

Harden internal (node-to-node) firewall rules to align with other platforms (#444)
Add ability to load balance TCP applications (#444)
- Output controller_tag and worker_tag for extending firewall rules (#444)

Google Cloud

Add ability to load balance TCP/UDP applications (#442)
- Add worker instances to a target pool, output as worker_target_pool
- Health check for workers with Ingress controllers. Forward rules don't support differing internal/external ports, but some Ingress controllers support TCP/UDP proxy as a workaround
Remove Haswell minimum CPU platform requirement (#439)
- Google Cloud API implements min_cpu_platform to mean "use exactly this CPU". Revert #405 added in v1.13.4.
- Fix error creating clusters in new regions without Haswell (e.g. europe-west2) (#438)

Addons

Update Prometheus from v2.8.0 to v2.8.1
Update Grafana from v6.0.2 to v6.1.1
- Add dashboard for pods in a workload (deployment/daemonset/statefulset) (#446)
- Add dashboard for workloads by namespace

v1.13.5

Kubernetes v1.13.5
Resolve in-addr.arpa reverse DNS lookups (PTR) for pod IPv4 addresses (#415)
- Reverse DNS lookups for service IPv4 addresses unchanged
Upgrade Calico from v3.5.2 to v3.6.0 (#430)
- Change pod IPAM from host-local to calico-ipam. pod_cidr is still divided into /24 subnets per node, but managed as ippools and ipamblocks
Recommend updating terraform-provider-ct from v0.3.0 to v0.3.1 (#434)
Announce: Fedora Atomic modules will not be updated beyond Kubernetes v1.13.x (#437)
- Thank you Project Atomic team and users, please see the deprecation notice

AWS

Support terraform-provider-aws v2.0+ (#419)

Bare-Metal

Change the default iPXE kernel and initrd download protocol from HTTP to HTTPS (#420)
- Require an iPXE-enabled network boot environment with support for TLS downloads. PXE clients must chainload to iPXE firmware compiled with DOWNLOAD_PROTO_HTTPS enabled. (action required)
- Only affects Container Linux and Flatcar Linux install profiles that pull public images (default)
- Add download_protocol variable. Because recognizing boot firmware TLS support is difficult in some environments, the protocol can be set to "http" to keep the old behavior (discouraged)
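
Where the boot environment cannot fetch over TLS, the download_protocol variable above can restore the earlier behavior. A minimal fragment, with an assumed module name and source path:

```hcl
module "example" {
  # Assumed source path and ref.
  source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.13.5"

  # Other required inputs omitted.

  download_protocol = "http" # discouraged; the default changed to HTTPS in this release
}
```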

DigitalOcean

Fix kubelet hostname-override to set node metadata InternalIP correctly (#424)
- Uniquely, DigitalOcean does not resolve hostnames to instance private IPs. Kubelet auto-detect mechanisms require the internal IP be set directly.
- Regressed in v1.12.3 (#337) which aimed to provide friendly hostname-based node names on DigitalOcean

Addons

Update Prometheus from v2.7.1 to v2.8.0
- Refresh rules based on upstreams (#426)
- Define NetworkPolicy to allow only traffic from the Grafana addon
Update Grafana from v6.0.0 to v6.0.2
- Add liveness and readiness probes
- Refresh dashboards and organize to stay below ConfigMap size limit (#426)
Remove heapster manifests from addons (#427)
- Heapster addon powers kubectl top (in early Kubernetes, running the addon was expected). Today, there are better monitoring options.
- kubectl top reliance on a non-core extension means it's not in scope for minimal Kubernetes
- Look to prior releases if you still wish to apply heapster

v1.13.4

Kubernetes v1.13.4
Update etcd from v3.3.11 to v3.3.12
Update Calico from v3.5.0 to v3.5.2
Assign priorityClassNames to critical cluster and node components (#406)
- Inform node out-of-resource eviction and scheduler preemption and ordering
Add CoreDNS readiness probe (#410)

Bare-Metal

Recommend updating terraform-provider-matchbox plugin from v0.2.2 to v0.2.3 (#402)
Improve docs on using Ubiquiti EdgeOS with bare-metal clusters (#413)

Google Cloud

Support terraform-provider-google v2.0+ (#407)
- Require terraform-provider-google v1.19+ (action required)
Set the minimum CPU platform to Intel Haswell (#405)
- Haswell or better is available in every zone (no price change)
- A few zones still default to Sandy/Ivy Bridge (shifts in April 2019)

Addons

Modernize Prometheus rules and alerts (#404)
- Drop extraneous metrics (#397)
- Add pod name label to metrics discovered via service endpoints
- Rename kubernetes_namespace label to namespace
Modernize Grafana and dashboards, see docs (#403, #404)
- Upgrade Grafana from v5.4.3 to v6.0.0!
- Enable Grafana Explore UI as a Viewer (inspect/edit without saving)
Update nginx-ingress from v0.22.0 to v0.23.0
- Raise nginx-ingress liveness/readiness timeout to 5 seconds
- Remove nginx-ingress default-backend (#401)

Fedora Atomic

Build Kubelet system container with buildah. The image uses the OCI format and is slightly larger.

v1.13.3

Kubernetes v1.13.3
Update etcd from v3.3.10 to v3.3.11
Update CoreDNS from v1.3.0 to v1.3.1
- Switch from the proxy plugin to the faster forward plugin for upstream resolvers
Update Calico from v3.4.0 to v3.5.0
Update flannel from v0.10.0 to v0.11.0
Reduce pod eviction timeout for deleting pods on unready nodes to 1 minute
- Respond more quickly to node preemption (previously 5 minutes)
Fix automatic worker deletion on shutdown for cloud platforms
- Lowering Kubelet privileges in #372 dropped a needed node deletion authorization. Scale-in due to manual terraform apply (any cloud), AWS spot termination, or Azure low priority deletion left old nodes registered, requiring manual deletion (kubectl delete node name)

AWS

Add ingress_zone_id output with the NLB DNS name's Route53 zone for use in alias records (#380)

Azure

Fix azure provider warning, public_ip allocation_method replaces public_ip_address_allocation
- Require terraform-provider-azurerm v1.21+ (action required)

Addons

Update nginx-ingress from v0.21.0 to v0.22.0
Update Prometheus from v2.6.0 to v2.7.1
Update kube-state-metrics from v1.4.0 to v1.5.0
- Fix ClusterRole to collect and export PodDisruptionBudget metrics (#383)
Update node-exporter from v0.15.2 to v0.17.0
Update Grafana from v5.4.2 to v5.4.3

v1.13.2

Kubernetes v1.13.2
Add ServiceAccounts for kube-apiserver and kube-scheduler (#370)
Use lower-privilege TLS client certificates for Kubelets (#372)
Use HTTPS liveness probes for kube-scheduler and kube-controller-manager (#377)
Update CoreDNS from v1.2.6 to v1.3.0
Allow the certificates.k8s.io API to issue certificates signed by the cluster CA (#376)
- Configure controller manager to sign CSRs that are manually approved by an administrator

AWS

Change controller_type and worker_type default from t2.small to t3.small (#365)
- t3.small is cheaper, provides 2 vCPU (instead of 1), and 5 Gbps of pod-to-pod bandwidth!

Bare-Metal

Remove the kubeconfig output variable

Addons

Update Prometheus from v2.5.0 to v2.6.0

v1.13.1

Kubernetes v1.13.1
Update Calico from v3.3.2 to v3.4.0 (#362)
- Install CNI plugins with an init container rather than a sidecar
- Improve the calico-node ClusterRole
Recommend updating terraform-provider-ct plugin from v0.2.1 to v0.3.0 (#363)
- Migration instructions for upgrading terraform-provider-ct in-place for v1.12.2+ clusters (action required)
- Require switching from ~/.terraformrc to the Terraform third-party plugins directory ~/.terraform.d/plugins/
- Require Container Linux 1688.5.3 or newer

Google Cloud

Increase TCP proxy apiserver backend service timeout from 1 minute to 5 minutes (#361)
- Align port-forward behavior closer to AWS/Azure (no timeout)

Addons

Update Grafana from v5.4.0 to v5.4.2

v1.13.0

Kubernetes v1.13.0
Update Calico from v3.3.1 to v3.3.2

Addons

Update Grafana from v5.3.4 to v5.4.0
Disable Grafana login form, since admin user can't be disabled (#352)
- Example manifests aim to provide a read-only dashboard view

v1.12.3

Kubernetes v1.12.3
Add enable_reporting variable (default "false") to provide upstreams with usage data (#345)
Change kube-apiserver --kubelet-preferred-address-types to InternalIP,ExternalIP,Hostname
Update Calico from v3.3.0 to v3.3.1
- Disable Felix usage reporting by default (#345)
Improve flannel manifests
- Rename kube-flannel DaemonSet to flannel and kube-flannel-cfg ConfigMap to flannel-config
- Drop unused mounts and add a CPU resource request
Update CoreDNS from v1.2.4 to v1.2.6
- Enable CoreDNS loop and loadbalance plugins (#340)
Fix pod-checkpointer log noise and checkpointable pods detection (#346)
Use kubernetes-incubator/bootkube v0.14.0
Recommend switching from ~/.terraformrc to the Terraform third-party plugins directory ~/.terraform.d/plugins/.
- Allows pinning terraform-provider-ct and terraform-provider-matchbox versions
- Improves safety of later plugin version migrations

Azure

Use eviction policy Delete for Low priority virtual machine scale set workers (#343)
- Fix issue where Azure defaults to Deallocate eviction policy, which required manually restarting deallocated instances. Delete policy aligns Azure with AWS and GCP behavior.
- Require terraform-provider-azurerm v1.19+ (action required)

Bare-Metal

Add Kubelet /etc/iscsi and iscsiadm mounts on bare-metal for iSCSI (#103)

Addons

Update nginx-ingress from v0.20.0 to v0.21.0
Update Prometheus from v2.4.3 to v2.5.0
Update Grafana from v5.3.2 to v5.3.4

v1.12.2

Kubernetes v1.12.2
Update CoreDNS from 1.2.2 to 1.2.4
Update Calico from v3.2.3 to v3.3.0
Disable Kubelet read-only port (#324)
Fix CoreDNS AntiAffinity spec to prefer spreading replicas
Ignore controller node user-data changes (#335)
- Once all managed clusters use v1.12.2, it is possible to update terraform-provider-ct

AWS

Add disk_iops variable for EBS volume IOPS (#314)

Azure

Use new azurerm_network_interface_backend_address_pool_association (#332)
- Require terraform-provider-azurerm v1.17+ (action required)
Add primary field to ip_configuration needed by v1.17+ (#331)

DigitalOcean

Add AAAA DNS records resolving to worker nodes (#333)
- Hosting IPv6 apps requires editing nginx-ingress with hostNetwork: true

Google Cloud

Add an IPv6 address and IPv6 forwarding rules for load balancing IPv6 Ingress (#334)
- Add ingress_static_ipv6 output variable for use in AAAA DNS records
- Allow serving IPv6 applications via Kubernetes Ingress

Addons

Configure Heapster to scrape Kubelets with bearer token auth (#323)
Update Grafana from v5.3.1 to v5.3.2

v1.12.1

Kubernetes v1.12.1
Update etcd from v3.3.9 to v3.3.10
Update CoreDNS from 1.1.3 to 1.2.2
Update Calico from v3.2.1 to v3.2.3
Raise scheduler and controller-manager replicas to the larger of 2 or the number of controller nodes (#312)
- Single-controller clusters continue to run 2 replicas as before
Raise default CoreDNS replicas to the larger of 2 or the number of controller nodes (#313)
- Add AntiAffinity preferred rule to favor spreading CoreDNS pods
Annotate control plane and addon containers to use the Docker runtime seccomp profile (#319)
- Override Kubernetes default behavior that starts containers with seccomp=unconfined
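
The replica rule above ("the larger of 2 or the number of controller nodes") can be sketched in Terraform as follows. This is only an illustration, not the project's actual implementation: the `controller_count` variable and the `control_plane_replicas` names are hypothetical, and the syntax assumes Terraform v0.12 or newer.

```hcl
# Hypothetical input: number of controller nodes in the cluster.
variable "controller_count" {
  type    = number
  default = 1
}

locals {
  # Larger of 2 or the controller count, so single-controller
  # clusters still run 2 replicas.
  control_plane_replicas = max(2, var.controller_count)
}

output "control_plane_replicas" {
  value = local.control_plane_replicas
}
```

With the default of 1 controller the output is 2. With 3 controllers it is 3.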

Azure

Remove admin_password field (disabled) since it is now optional
- Require terraform-provider-azurerm v1.16+ (action required)

Bare-Metal

Add support for cached_install mode with Flatcar Linux (#315)

DigitalOcean

Require terraform-provider-digitalocean v1.0+ (action required)

Addons

Update nginx-ingress from v0.19.0 to v0.20.0
Update Prometheus from v2.3.2 to v2.4.3
Update Grafana from v5.2.4 to v5.3.1

v1.11.3

Kubernetes v1.11.3
Introduce Typhoon for Azure as alpha (#288)
- Special thanks @justaugustus for an earlier variant
Update Calico from v3.1.3 to v3.2.1 (#278)

AWS

Remove firewall rule allowing ICMP packets to nodes (#285)

Bare-Metal

Remove controller_networkds and worker_networkds variables. Use Container Linux Config snippets instead (#277)

Google Cloud

Fix firewall to allow etcd client port 2379 traffic between controller nodes (#287)
- kube-apiservers were only able to connect to their node's local etcd peer. While master node outages were tolerated, reaching a healthy peer took longer than necessary in some cases
- Reduce time needed to bootstrap the cluster
Remove firewall rule allowing workers to access Nginx Ingress health check (#284)
- Nginx Ingress addon no longer uses hostNetwork, Prometheus scrapes via CNI network

Addons

Update nginx-ingress from 0.17.1 to 0.19.0
Update kube-state-metrics from v1.3.1 to v1.4.0
Update Grafana from 5.2.2 to 5.2.4

v1.11.2

Kubernetes v1.11.2
Update etcd from v3.3.8 to v3.3.9
Use kubernetes-incubator/bootkube v0.13.0
Fix Fedora Atomic modules' Kubelet version (#270)

Bare-Metal

Introduce Container Linux Config snippets on bare-metal
- Validate and additively merge custom Container Linux Configs during terraform plan
- Define files, systemd units, dropins, networkd configs, mounts, users, and more
- Require terraform-provider-ct plugin v0.2.1 (action required!)

Addons

Update nginx-ingress from 0.16.2 to 0.17.1
Add nginx-ingress manifests for bare-metal
Update Grafana from 5.2.1 to 5.2.2
Update heapster from v1.5.3 to v1.5.4

v1.11.1

Kubernetes v1.11.1

Addons

Update Prometheus from v2.3.1 to v2.3.2

Errata

Fedora Atomic modules shipped with Kubelet v1.11.0, instead of v1.11.1. Fixed in #270.

v1.11.0

Kubernetes v1.11.0
Force apiserver to stop listening on 127.0.0.1:8080
Replace kube-dns with CoreDNS (#261)
- Edit the coredns ConfigMap to customize
- CoreDNS doesn't use a resizer. For large clusters, scaling may be required.

AWS

Update from Fedora Atomic 27 to 28 (#258)

Bare-Metal

Update from Fedora Atomic 27 to 28 (#263)

Google

Promote Google Cloud to stable
Update from Fedora Atomic 27 to 28 (#259)
Remove ingress_static_ip module output. Use ingress_static_ipv4.
Remove controllers_ipv4_public module output.

Addons

Update nginx-ingress from 0.15.0 to 0.16.2
Update Grafana from 5.1.4 to 5.2.1
Update heapster from v1.5.2 to v1.5.3

v1.10.5

Kubernetes v1.10.5
Update etcd from v3.3.6 to v3.3.8 (#243, #247)

AWS

Switch kube-apiserver port from 443 to 6443 (#248)
Combine apiserver and ingress NLBs (#249)
- Reduce cost by ~$18/month per cluster. Typhoon AWS clusters now use one network load balancer.
- Ingress addon users may keep using CNAME records to the ingress_dns_name module output (few million RPS)
- Ingress users with heavy traffic (many million RPS) should create a separate NLB(s)
Worker pools no longer include an extraneous load balancer. Remove worker module's ingress_dns_name output
Disable detailed (paid) monitoring on worker nodes (#251)
- Favor Prometheus for cloud-agnostic metrics, aggregation, and alerting
Add worker_target_group_http and worker_target_group_https module outputs to allow custom load balancing
Add target_group_http and target_group_https worker module outputs to allow custom load balancing

Bare-Metal

Switch kube-apiserver port from 443 to 6443 (#248)
- Users who exposed kube-apiserver on a WAN via their router/load-balancer will need to adjust its configuration (e.g. DNAT 6443). Most apiservers are on a LAN (internal, VPN-only, etc) so if you didn't specially configure network gear for 443, no change is needed. (possible action required)
Fix possible deadlock when provisioning clusters larger than 10 nodes (#244)

DigitalOcean

Switch kube-apiserver port from 443 to 6443 (#248)
- Update firewall rules and generated kubeconfigs

Google Cloud

Use global HTTP and TCP proxy load balancing for Kubernetes Ingress (#252)
- Switch Ingress from regional network load balancers to global HTTP/TCP Proxy load balancing
- Reduce cost by ~$19/month per cluster. Google bills the first 5 global and regional forwarding rules separately. Typhoon clusters now use 3 global and 0 regional forwarding rules.
Worker pools no longer include an extraneous load balancer. Remove worker module's ingress_static_ip output
Allow using nginx-ingress addon on Fedora Atomic clusters (#200)
Add worker_instance_group module output to allow custom global load balancing
Add instance_group worker module output to allow custom global load balancing
Deprecate ingress_static_ip module output. Add ingress_static_ipv4 module output instead.
Deprecate controllers_ipv4_public module output

Addons

Update CLUO from v0.6.0 to v0.7.0 (#242)
Update Prometheus from v2.3.0 to v2.3.1
Update Grafana from 5.1.3 to 5.1.4
Drop hostNetwork from nginx-ingress addon
- Both flannel and Calico support host port via portmap
- Allows writing NetworkPolicies that reference ingress pods in from or to. HostNetwork pods were difficult to write network policy for since they could circumvent the CNI network to communicate with pods on the same node.

v1.10.4

Kubernetes v1.10.4
Update etcd from v3.3.5 to v3.3.6
Update Calico from v3.1.2 to v3.1.3

Addons

Update Prometheus from v2.2.1 to v2.3.0
Add Prometheus liveness and readiness probes
Annotate Grafana service so Prometheus scrapes metrics
Label namespaces to ease writing Network Policies

v1.10.3

Kubernetes v1.10.3
Add Flatcar Linux (Container Linux derivative) as an option for AWS and bare-metal (thanks @kinvolk folks)
Allow bearer token authentication to the Kubelet (#216)
- Require Webhook authorization to the Kubelet
- Switch apiserver X509 client cert org to satisfy new authorization requirement
Require Terraform v0.11.x and drop support for v0.10.x (migration guide)
Update etcd from v3.3.4 to v3.3.5 (#213)
Update Calico from v3.1.1 to v3.1.2

AWS

Allow Flatcar Linux by setting os_image to flatcar-stable (default), flatcar-beta, flatcar-alpha (#211)
Replace os_channel variable with os_image to align naming across clouds
- Please change values stable, beta, or alpha to coreos-stable, coreos-beta, coreos-alpha (action required!)
Allow preemptible workers via spot instances (#202)
- Add worker_price to allow worker spot instances. Default to empty string for the worker autoscaling group to use regular on-demand instances
- Add spot_price to internal workers module for spot worker pools
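
As a rough sketch of the AWS changes above, a cluster definition might set the renamed variable and opt into spot workers like this. The module name, the `source` path, and the price value are assumptions made for illustration, and the other variables the module requires are omitted from this fragment.

```hcl
module "aws_cluster" {
  # Hypothetical source; substitute the AWS module path and ref you actually use.
  source = "./typhoon-aws"

  # Other required variables are omitted from this fragment.

  # Replaces os_channel. The old value "stable" becomes "coreos-stable".
  os_image = "coreos-stable"

  # Optional. Leaving this as the default empty string keeps regular
  # on-demand workers. The price below is an illustrative value only.
  worker_price = "0.03"
}
```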

Bare-Metal

Allow Flatcar Linux by setting os_channel to flatcar-stable, flatcar-beta, flatcar-alpha (#220)
Replace container_linux_channel variable with os_channel
- Please change values stable, beta, or alpha to coreos-stable, coreos-beta, coreos-alpha (action required!)
Replace container_linux_version variable with os_version
Add network_ip_autodetection_method variable for Calico host IPv4 address detection
- Use Calico's default "first-found" to support single NIC and bonded NIC nodes
- Allow alternative methods for multi NIC nodes, like can-reach=IP or interface=REGEX
Deprecate container_linux_oem variable
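
The bare-metal variable renames and the Calico autodetection option above might look like this in a cluster definition. This is a fragment under stated assumptions: the module name, `source` path, version string, and IP address are hypothetical placeholders, and the remaining required variables are omitted.

```hcl
module "bare_metal_cluster" {
  # Hypothetical source; substitute the bare-metal module path and ref you actually use.
  source = "./typhoon-bare-metal"

  # Other required variables are omitted from this fragment.

  # Replaces container_linux_channel. The old value "stable" becomes "coreos-stable".
  os_channel = "coreos-stable"

  # Replaces container_linux_version. Placeholder value for illustration.
  os_version = "1688.5.3"

  # Optional. The default "first-found" suits single NIC and bonded NIC nodes.
  # On multi NIC nodes, pick the interface that can reach a given address
  # (placeholder IP), or use the interface=REGEX form instead.
  network_ip_autodetection_method = "can-reach=10.0.0.1"
}
```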

DigitalOcean

Update Fedora Atomic module to use Fedora Atomic 28 (#225)
- Fedora Atomic 27 images disappeared from DigitalOcean and forced this early update

Addons

Fix Prometheus data directory location (#203)
Configure Prometheus to scrape Kubelets directly with bearer token auth instead of proxying through the apiserver (#217)
- Security improvement: Drop RBAC permission from nodes/proxy to nodes/metrics
- Scale: Remove per-node proxied scrape load from the apiserver
Update Grafana from v5.0.4 to v5.1.3 (#208)
- Disable Grafana Google Analytics by default (#214)
Update nginx-ingress from 0.14.0 to 0.15.0
Annotate nginx-ingress service so Prometheus auto-discovers and scrapes service endpoints (#222)

v1.10.2

Kubernetes v1.10.2
Introduce Typhoon for Fedora Atomic (#199)
Update Calico from v3.0.4 to v3.1.1 (#197)
- https://www.projectcalico.org/announcing-calico-v3-1/
- https://github.com/projectcalico/calico/releases/tag/v3.1.0
Update etcd from v3.3.3 to v3.3.4
Update kube-dns from v1.14.9 to v1.14.10

Google Cloud

Add support for multi-controller clusters (i.e. multi-master) (#54, #190)
- Switch from Google Cloud network load balancer to a TCP proxy load balancer. Avoid a bug in Google network load balancers that limited clusters to only bootstrapping one controller node.
- Add TCP health check for apiserver pods on controllers. Replace kubelet check approximation.

Addons

Update nginx-ingress from 0.12.0 to 0.14.0
Update kube-state-metrics from v1.3.0 to v1.3.1

v1.10.1

Kubernetes v1.10.1
Enable etcd v3.3 metrics endpoint (#175)
Use k8s.gcr.io instead of gcr.io/google_containers (#180)
- Kubernetes recommends using the alias to pull from the nearest regional mirror and to abstract the backing container registry
Update etcd from v3.3.2 to v3.3.3
Update kube-dns from v1.14.8 to v1.14.9
Use kubernetes-incubator/bootkube v0.12.0

Bare-Metal

Fix need for multiple terraform apply runs to create a cluster with Terraform v0.11.4 (#181)
- To SSH during a disk install for debugging, SSH as user "core" with port 2222
- Remove the old trick of using a user "debug" during disk install

Google Cloud

Refactor out the controller internal module

Addons

Add Prometheus discovery for etcd peers on controller nodes (#175)
- Scrape etcd v3.3 --listen-metrics-urls for metrics
- Enable etcd alerts and populate the etcd Grafana dashboard
Update kube-state-metrics from v1.2.0 to v1.3.0

v1.10.0

Kubernetes v1.10.0
Remove unused, unmaintained pxe-worker internal module

AWS

Add disk_type optional variable for setting the EBS volume type (#176)
- Change default type from standard to gp2. Prometheus etcd alerts are tuned for fast disks.

Digital Ocean

Ensure etcd secrets are only distributed to controller hosts, not workers.
Remove networking optional variable. Only flannel works on Digital Ocean.

Google Cloud

Add disk_size optional variable for setting instance disk size in GB
Add controller_type optional variable for setting machine type for controllers
Add worker_type optional variable for setting machine type for workers
Remove machine_type optional variable. Use controller_type and worker_type.
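
For the Google Cloud variable changes above, a minimal sketch of migrating from `machine_type` to the per-role variables could look like this. The module name, `source` path, and the specific machine types and disk size are illustrative assumptions, and other required variables are omitted.

```hcl
module "google_cluster" {
  # Hypothetical source; substitute the Google Cloud module path and ref you actually use.
  source = "./typhoon-google"

  # Other required variables are omitted from this fragment.

  # Previously a single machine_type applied to all nodes.
  controller_type = "n1-standard-1"
  worker_type     = "n1-standard-2"

  # Instance disk size in GB (illustrative value).
  disk_size = 40
}
```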

Addons

Update Grafana from v4.6.3 to v5.0.4 (#153, #174)
- Restrict dashboard organization role to Viewer

v1.9.6

Kubernetes v1.9.6
Update Calico from v3.0.3 to v3.0.4

Addons

Update heapster from v1.5.1 to v1.5.2

v1.9.5

Kubernetes v1.9.5
- Fix subPath volume mounts regression (kubernetes#61076)
Introduce Container Linux Config snippets on cloud platforms (#145)
- Validate and additively merge custom Container Linux Configs during terraform plan
- Define files, systemd units, dropins, networkd configs, mounts, users, and more
- Require updating terraform-provider-ct plugin from v0.2.0 to v0.2.1
Add node-role.kubernetes.io/controller="true" node label to controllers (#160)

AWS

Require updating terraform-provider-ct plugin from v0.2.0 to v0.2.1 (action required!)

Digital Ocean

Require updating terraform-provider-ct plugin from v0.2.0 to v0.2.1 (action required!)

Google Cloud

Require updating terraform-provider-ct plugin from v0.2.0 to v0.2.1 (action required!)
Relax os_image to optional. Default to "coreos-stable".

Addons

Update nginx-ingress from 0.11.0 to 0.12.0
Update Prometheus from 2.2.0 to 2.2.1

v1.9.4

Kubernetes v1.9.4
- Secret, configMap, downward API, and projected volumes now read-only (breaking, kubernetes#58720)
- Regressed subPath volume mounts (regression, kubernetes#61076)
- Mitigated subPath CVE-2017-1002101
Introduce worker pools for AWS and Google Cloud for joining heterogeneous workers to existing clusters.
Use new Network Load Balancers and cross zone load balancing on AWS
Allow flexvolume plugins to be used on any Typhoon cluster (not just bare-metal)
Upgrade etcd from v3.2.15 to v3.3.2
Update Calico from v3.0.2 to v3.0.3
Use kubernetes-incubator/bootkube v0.11.0
Recommend updating terraform-provider-ct plugin from v0.2.0 to v0.2.1 (action recommended)

AWS

Promote AWS platform to stable
Allow groups of workers to be defined and joined to a cluster (i.e. worker pools) (#150)
Replace the apiserver elastic load balancer with a network load balancer (#136)
Replace the Ingress elastic load balancer with a network load balancer (#141)
- AWS NLBs can handle millions of RPS with high throughput and low latency.
- Require terraform-provider-aws 1.7.0 or higher
Enable NLB cross-zone load balancing (#159)
- Requests are automatically evenly distributed to targets regardless of AZ
- Require terraform-provider-aws 1.11.0 or higher
Add kubelet --volume-plugin-dir flag to allow flexvolume plugins (#142)
Fix controller and worker launch configs to ignore AMI changes (#126, #158)

Digital Ocean

Add kubelet --volume-plugin-dir flag to allow flexvolume plugins (#142)
Fix to pass ssh_fingerprints as a list to droplets (#143)

Google Cloud

Allow groups of workers to be defined and joined to a cluster (i.e. worker pools) (#148)
Add kubelet --volume-plugin-dir flag to allow flexvolume plugins (#142)
Add kubeconfig variable to controllers and workers submodules (#147)
Remove kubeconfig_* variables from controllers and workers submodules (#147)
Allow initial experimentation with accelerators (i.e. GPUs) on workers (#161) (unofficial)
- Require terraform-provider-google v1.6.0

Addons

Update Prometheus from 2.1.0 to 2.2.0 (#153)
- Scrape Prometheus itself to enable alerts about Prometheus itself
- Adjust KubeletDown rule to fire when 10% of kubelets are down
Update heapster from v1.5.0 to v1.5.1 (#131)
- Use separate service account
Update nginx-ingress from 0.10.2 to 0.11.0

v1.9.3

Kubernetes v1.9.3
Network improvements and fixes (#104)
- Switch from Calico v2.6.6 to v3.0.2
- Add Calico GlobalNetworkSet CRD
- Update flannel from v0.9.0 to v0.10.0
- Use separate service account for flannel
Update etcd from v3.2.14 to v3.2.15

Digital Ocean

Use new Droplet types which offer more CPU/memory, at lower cost. (#105)
- A small Digital Ocean cluster costs less than $25 a month!

Addons

Update Prometheus from v2.0.0 to v2.1.0 (#113)
- Improve alerting rules
- Relabel discovered kubelet, endpoint, service, and apiserver scrapes
- Use separate service accounts
- Update node-exporter and kube-state-metrics
Include Grafana dashboards for Kubernetes admins (#113)
- Add grafana-watcher to load bundled upstream dashboards
Update nginx-ingress from 0.9.0 to 0.10.2
Update CLUO from v0.5.0 to v0.6.0
Switch manifests to use apps/v1 Deployments and DaemonSets (#120)
Remove Kubernetes Dashboard manifests (#121)

v1.9.2

Kubernetes v1.9.2
Add Terraform v0.11.x support
- Add explicit "providers" section to modules for Terraform v0.11.x
- Retain support for Terraform v0.10.4+
Add migration guide from Terraform v0.10.x to v0.11.x (action required!)
Update etcd from 3.2.13 to 3.2.14
Update Calico from 2.6.5 to 2.6.6
Update kube-dns from v1.14.7 to v1.14.8
Use separate service account for kube-dns
Use kubernetes-incubator/bootkube v0.10.0

Bare-Metal

Use per-node Container Linux install profiles (#97)
- Allow Container Linux channel/version to be chosen per-cluster
- Fix issue where cluster deletion could require terraform apply multiple times

Digital Ocean

Relax digitalocean provider version constraint
Fix bug with terraform plan always showing a firewall diff to be applied (#3)

Addons

Update CLUO to v0.5.0 to fix compatibility with Kubernetes 1.9 (important)
- Earlier versions can't roll out Container Linux updates on Kubernetes 1.9 nodes (cluo#163)
Update kube-state-metrics from v1.1.0 to v1.2.0
Fix RBAC cluster role for kube-state-metrics

v1.9.1

Kubernetes v1.9.1
Update kube-dns from v1.14.5 to v1.14.7
Update etcd from 3.2.0 to 3.2.13
Update Calico from v2.6.4 to v2.6.5
Enable portmap to fix hostPort with Calico
Use separate service account for controller-manager

v1.8.6

Kubernetes v1.8.6
Update Calico from v2.6.3 to v2.6.4

v1.8.5

Kubernetes v1.8.5
Recommend Container Linux images with Docker 17.09
- Container Linux stable, beta, and alpha now provide Docker 17.09 (instead of 1.12)
- Older clusters (with CLUO addon) auto-update Container Linux version to begin using Docker 17.09
Fix race where etcd-member.service could fail to resolve peers (#69)
Add optional cluster_domain_suffix variable (#74)
Use kubernetes-incubator/bootkube v0.9.1

Bare-Metal

Add kubelet --volume-plugin-dir flag to allow flexvolume providers (#61)

Addons

Discourage deploying the Kubernetes Dashboard (security)

v1.8.4

Kubernetes v1.8.4
Calico-related bug fixes
Update Calico from v2.6.1 to v2.6.3
Update flannel from v0.9.0 to v0.9.1
Service accounts for kube-proxy and pod-checkpointer
Use kubernetes-incubator/bootkube v0.9.0

v1.8.3

Kubernetes v1.8.3
Run etcd on-host, across controllers
Promote AWS platform to beta
Use kubernetes-incubator/bootkube v0.8.2

Google Cloud

Add required variable region (e.g. "us-central1")
Reduce time to bootstrap a cluster
Change etcd to run on-host, across controllers (etcd-member.service)
Change controller instances to automatically span zones in the region
Change worker managed instance group to automatically span zones in the region
Improve internal firewall rules and use tag-based firewall policies
Remove support for self-hosted etcd
Remove the zone required variable
Remove the controller_preemptible optional variable

AWS

Promote AWS platform to beta
Reduce time to bootstrap a cluster
Change etcd to run on-host, across controllers (etcd-member.service)
Fix firewall rules for multi-controller kubelet scraping and node-exporter
Remove support for self-hosted etcd

Addons

Add Prometheus 2.0 addon with alerting rules
Add Grafana dashboard for observing metrics

v1.8.2

Kubernetes v1.8.2
- Fixes a memory leak in the v1.8.1 apiserver (kubernetes#53485)
Switch to using the gcr.io/google_containers/hyperkube image
Update flannel from v0.8.0 to v0.9.0
Add hairpinMode to flannel CNI config
Add --no-negcache to kube-dns dnsmasq
Use kubernetes-incubator/bootkube v0.8.1

v1.8.1

Kubernetes v1.8.1
Use kubernetes-incubator/bootkube v0.8.0

Digital Ocean

Run etcd cluster across controller nodes (etcd-member.service)
Remove support for self-hosted etcd
Reduce time to bootstrap a cluster

v1.7.7

Kubernetes v1.7.7
Use kubernetes-incubator/bootkube v0.7.0
Update kube-dns to 1.14.5 to fix dnsmasq vulnerability
Calico v2.6.1
flannel-cni v0.3.0
- Update flannel CNI config to fix hostPort

v1.7.5

Kubernetes v1.7.5
Use kubernetes-incubator/bootkube v0.6.2
Add AWS Terraform module (alpha)
Add support for Calico networking (bare-metal, Google Cloud, AWS)
Change networking default from "flannel" to "calico"

AWS

Add network_mtu to allow CNI interface MTU customization

Bare-Metal

Add network_mtu to allow CNI interface MTU customization
Remove support for experimental_self_hosted_etcd

v1.7.3

Kubernetes v1.7.3
Use kubernetes-incubator/bootkube v0.6.1

Digital Ocean

Add cloud firewall rules (requires Terraform v0.10)
Change nodes tags from strings to DO tags

v1.7.1

Kubernetes v1.7.1
Use kubernetes-incubator/bootkube v0.6.0
Add Bare-Metal Terraform module (stable)
Add Digital Ocean Terraform module (beta)

Google Cloud

Remove k8s_domain_name variable, cluster_name + dns_zone resolves to controllers
Rename dns_base_zone to dns_zone
Rename dns_base_zone_name to dns_zone_name

v1.6.7

Kubernetes v1.6.7
Use kubernetes-incubator/bootkube v0.5.1

v1.6.6

Kubernetes v1.6.6
Use kubernetes-incubator/bootkube v0.4.5
Disable locksmithd on hosts, in favor of CLUO.

v1.6.4

Kubernetes v1.6.4
Add Google Cloud Terraform module (stable)

Earlier

Earlier versions, back to v1.3.0, used different designs and mechanisms.
