typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-10-04 02:04:37 +02:00

Author	SHA1	Message	Date
Dalton Hubble	db947537d1	Migrate GCP, DO, Azure to static pod control plane * Run a kube-apiserver, kube-scheduler, and kube-controller-manager static pod on each controller node. Previously, kube-apiserver was self-hosted as a DaemonSet across controllers and kube-scheduler and kube-controller-manager were a Deployment (with 2 or controller_count many replicas). * Remove bootkube bootstrap and pivot to self-hosted * Remove pod-checkpointer manifests (no longer needed)	2019-09-09 22:37:31 -07:00
Dalton Hubble	c20683067d	Update etcd from v3.3.15 to v3.4.0 * https://github.com/etcd-io/etcd/releases/tag/v3.4.0	2019-09-08 15:32:49 -07:00
Dalton Hubble	4d5f962d76	Update CoreDNS from v1.5.0 to v1.6.2 * https://coredns.io/2019/06/26/coredns-1.5.1-release/ * https://coredns.io/2019/07/03/coredns-1.5.2-release/ * https://coredns.io/2019/07/28/coredns-1.6.0-release/ * https://coredns.io/2019/08/02/coredns-1.6.1-release/ * https://coredns.io/2019/08/13/coredns-1.6.2-release/	2019-08-31 15:57:42 -07:00
Dalton Hubble	c42139beaa	Update etcd from v3.3.14 to v3.3.15 * No functional changes, just changes to vendoring tools (go modules -> glide). Still, update to v3.3.15 anyway * https://github.com/etcd-io/etcd/compare/v3.3.14...v3.3.15	2019-08-19 15:05:21 -07:00
Dalton Hubble	35c2763ab0	Update Kubernetes from v1.15.2 to v1.15.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md/#v1153	2019-08-19 14:49:24 -07:00
Dalton Hubble	8f412e2f09	Update etcd from v3.3.13 to v3.3.14 * https://github.com/etcd-io/etcd/releases/tag/v3.3.14	2019-08-18 21:05:06 -07:00
Dalton Hubble	3c3708d58e	Update Calico from v3.8.1 to v3.8.2 * https://docs.projectcalico.org/v3.8/release-notes/	2019-08-16 15:38:23 -07:00
Dalton Hubble	2227f2cc62	Update Kubernetes from v1.15.1 to v1.15.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#v1152	2019-08-05 08:48:57 -07:00
Dalton Hubble	dcd6733649	Update Calico from v3.8.0 to v3.8.1 * https://docs.projectcalico.org/v3.8/release-notes/	2019-07-27 15:31:13 -07:00
Dalton Hubble	56d0b9eae4	Avoid creating extraneous GCE controller instance groups * Intended as part of #504 improvement * Single controller clusters only require one controller instance group (previously created zone-many) * Multi-controller clusters must "wrap" controllers over zonal heterogeneous instance groups. For example, 5 controllers over 3 zones (no change)	2019-07-20 16:58:45 -07:00
Dalton Hubble	e0c7676a15	Update Kubernetes from v1.15.0 to v1.15.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#downloads-for-v1151	2019-07-19 01:21:08 -07:00
Dalton Hubble	dfa6bcfecf	Relax terraform-provider-ct version constraint * Allow updating terraform-provider-ct to any release beyond v0.3.2, but below v1.0. This relaxes the prior constraint that allowed only v0.3.y provider versions	2019-07-16 22:07:37 -07:00
Dalton Hubble	9e91d7f011	Upgrade Calico from v3.7.4 to v3.8.0 * Enable CNI bandwidth plugin for traffic shaping * https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#support-traffic-shaping	2019-07-11 21:01:41 -07:00
Dalton Hubble	69d064bfdf	Run kube-apiserver with lower privilege user (nobody) * Run kube-apiserver as a non-root user (nobody). User no longer needs to bind low number ports. * On most platforms, the kube-apiserver load balancer listens on 6443 and fronts controllers with kube-apiserver pods using port 6443. Google Cloud TCP proxy load balancers cannot listen on 6443. However, GCP's load balancer can be made to listen on 443, while kube-apiserver uses 6443 across all platforms.	2019-07-08 20:52:00 -07:00
Dalton Hubble	7a69bae75e	Raise GCP network deletion timeout from 4m to 6m * Fix a GCP errata item https://github.com/poseidon/typhoon/wiki/Errata * Removal of a Google Cloud cluster often required 2 runs of `terraform apply` because network resource deletes time out after 4m. Raise the network deletion timeout to 6m to ensure apply only needs to be run once to remove a cluster	2019-07-06 13:15:33 -07:00
Dalton Hubble	3fcb04f68c	Improve apiserver backend service zone spanning * google_compute_backend_services use nested blocks to define backends (instance groups heterogeneous controllers) * Use Terraform v0.12.x dynamic blocks so the apiserver backend service can refer to (up to zone-many) controller instance groups * Previously, with Terraform v0.11.x, the apiserver backend service had to list a fixed set of backends to span controller nodes across zones in multi-controller setups. 3 backends were used because each GCP region offered at least 3 zones. Single-controller clusters had the cosmetic ugliness of unused instance groups * Allow controllers to span more than 3 zones if available in a region (e.g. currently only us-central1, with 4 zones) Related: * https://www.terraform.io/docs/providers/google/r/compute_backend_service.html * https://www.terraform.io/docs/configuration/expressions.html#dynamic-blocks	2019-07-05 19:46:26 -07:00
Dalton Hubble	8d373b5850	Update Calico from v3.7.3 to v3.7.4 * https://docs.projectcalico.org/v3.7/release-notes/	2019-07-02 20:18:02 -07:00
Dalton Hubble	408e60075a	Update Kubernetes from v1.14.3 to v1.15.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#v1150 * Remove docs referring to possible v1.14.4 release	2019-06-23 13:12:18 -07:00
Dalton Hubble	21fb632e90	Update Calico from v3.7.2 to v3.7.3 * https://docs.projectcalico.org/v3.7/release-notes/	2019-06-13 23:54:20 -07:00
Dalton Hubble	d6d9e6c4b9	Migrate Google Cloud module Terraform v0.11 to v0.12 * Replace v0.11 bracket type hints with Terraform v0.12 list expressions * Use expression syntax instead of interpolated strings, where suggested * Update Google Cloud tutorial and worker pools documentation * Define Terraform and plugin version requirements in versions.tf * Require google ~> 2.5 to support Terraform v0.12 * Require ct ~> 0.3.2 to support Terraform v0.12	2019-06-06 09:48:56 -07:00
Dalton Hubble	0ccb2217b5	Update Kubernetes from v1.14.2 to v1.14.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1143	2019-05-31 01:08:32 -07:00
Dalton Hubble	c565f9fd47	Rename worker pool modules' count variable to worker_count * This change affects users who use worker pools on AWS, GCP, or Azure with a Container Linux derivative * Rename worker pool modules' `count` variable to `worker_count`, because `count` will be a reserved variable name in Terraform v0.12	2019-05-27 16:40:00 -07:00
Dalton Hubble	2a71cba0e3	Update CoreDNS from v1.3.1 to v1.5.0 * Add `ready` plugin to improve readinessProbe * https://coredns.io/2019/04/06/coredns-1.5.0-release/	2019-05-27 00:11:52 -07:00
Dalton Hubble	6e4cf65c4c	Fix terraform-render-bootkube to remove trailing slash * Fix to remove a trailing slash that was erroneously introduced in the scripting that updated from v1.14.1 to v1.14.2 * Workaround before this fix was to re-run `terraform init`	2019-05-22 18:29:11 +02:00
Dalton Hubble	da97bd4f12	Update Kubernetes from v1.14.1 to v1.14.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1142	2019-05-17 13:09:15 +02:00
Dalton Hubble	f62286b677	Update Calico from v3.7.0 to v3.7.2 * https://docs.projectcalico.org/v3.7/release-notes/	2019-05-17 12:29:46 +02:00
Dalton Hubble	af18296bc5	Change flannel port from 8472 to 4789 * Change flannel port from the kernel default 8472 to the IANA assigned VXLAN port 4789 * Update firewall rules or security groups for VXLAN * Why now? Calico now offers its own VXLAN backend so standardizing on the IANA port will simplify config * https://github.com/coreos/flannel/blob/master/Documentation/backends.md#vxlan	2019-05-06 21:58:10 -07:00
Dalton Hubble	09e0230111	Upgrade Calico from v3.6.1 to v3.7.0 * https://docs.projectcalico.org/v3.7/release-notes/ * https://github.com/poseidon/terraform-render-bootkube/pull/131	2019-05-06 00:44:15 -07:00
Dalton Hubble	feb6192aac	Update etcd from v3.3.12 to v3.3.13 on Container Linux * Skip updating etcd for Fedora Atomic clusters, now that Fedora Atomic has been deprecated	2019-05-04 12:55:42 -07:00
Jordan Pittier	ecbbdd905e	Use ./ prefix for inner/local worker pool modules * Terraform v0.11 encouraged use of a "./" prefix for local module references and Terraform v0.12 will require it * https://www.terraform.io/docs/modules/sources.html#local-paths Related: https://github.com/hashicorp/terraform/issues/19745	2019-05-04 12:27:22 -07:00
JordanP	8da17fb7a2	Fix "google_compute_target_pool.workers: Cannot determine region" If no region is set at the Google provider level, Terraform fails to create the google_compute_target_pool.workers resource and complains with "Cannot determine region: set in this resource, or set provider-level 'region' or 'zone'." This commit fixes the issue by explicitly setting the region for the google_compute_target_pool.workers resource.	2019-04-13 11:53:56 -07:00
Dalton Hubble	452253081b	Update Kubernetes from v1.14.0 to v1.14.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#changelog-since-v1140	2019-04-09 21:47:23 -07:00
Dalton Hubble	be29f52039	Add enable_aggregation option (defaults to false) * Add an `enable_aggregation` variable to enable the kube-apiserver aggregation layer for adding extension apiservers to clusters * Aggregation is disabled by default. Typhoon recommends you not enable aggregation. Consider whether less invasive ways to achieve your goals are possible and whether those goals are well-founded * Enabling aggregation and extension apiservers increases the attack surface of a cluster and makes extensions a part of the control plane. Admins must scrutinize and trust any extension apiserver used. * Passing a v1.14 CNCF conformance test requires aggregation be enabled. Having an option for aggregation keeps compliance, but retains the stricter security posture on default clusters	2019-04-07 12:00:38 -07:00
Dalton Hubble	5271e410eb	Update Kubernetes from v1.13.5 to v1.14.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1140	2019-04-07 00:15:59 -07:00
Dalton Hubble	aaa8e0261a	Add Google Cloud worker instances to a target pool * Background: A managed instance group of workers is used in backend services for global load balancing (HTTP/HTTPS Ingress) and output for custom global load balancing use cases * Add worker instances to a target pool load balancing TCP/UDP applications (NodePort or proxied). Output as `worker_target_pool` * Health check for workers with a healthy Ingress controller. Forward rules (regional) to target pools don't support different external and internal ports so choosing nodes with Ingress allows proxying as a workaround * A target pool is a logical grouping only. It doesn't add costs to clusters or worker pools	2019-04-01 21:03:48 -07:00
Dalton Hubble	b3ec5f73e3	Update Calico from v3.6.0 to v3.6.1 * https://docs.projectcalico.org/v3.6/release-notes/	2019-03-31 17:43:43 -07:00
Dalton Hubble	46196af500	Remove Haswell minimum CPU platform requirement * Google Cloud API implements `min_cpu_platform` to mean "use exactly this CPU" * Fix error creating clusters in newer regions lacking Haswell platform (e.g. europe-west2) (#438) * Reverts #405, added in v1.13.4 * Original goal of ignoring old Ivy/Sandy bridge CPUs in older regions will be achieved shortly anyway. Google Cloud is deprecating those CPUs in April 2019 * https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform#how_selecting_a_minimum_cpu_platform_works	2019-03-27 19:51:32 -07:00
Dalton Hubble	4fea526ebf	Update Kubernetes from v1.13.4 to v1.13.5 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1135	2019-03-25 21:43:47 -07:00
Dalton Hubble	1feefbe9c6	Update Calico from v3.5.2 to v3.6.0 * Add calico-ipam CRDs and RBAC permissions * Switch IPAM from host-local to calico-ipam * `calico-ipam` subnets `ippools` (defaults to pod CIDR) into `ipamblocks` (defaults to /26, but set to /24 in Typhoon) * `host-local` subnets the pod CIDR based on the node PodCIDR field (set via kube-controller-manager as /24's) * Create a custom default IPv4 IPPool to ensure the block size is kept at /24 to allow 110 pods per node (Kubernetes default) * Retaining host-local was slightly preferred, but Calico v3.6 is migrating all usage to calico-ipam. The codepath that skipped calico-ipam for KDD was removed * https://docs.projectcalico.org/v3.6/release-notes/	2019-03-19 22:49:56 -07:00
Dalton Hubble	2019177b6b	Fix implicit map assignments to be explicit * Terraform v0.12 will require map assignments be explicit, part of v0.12 readiness	2019-03-12 01:19:54 -07:00
Dalton Hubble	deec512c14	Resolve in-addr.arpa and ip6.arpa zones with CoreDNS kubernetes plugin * Resolve in-addr.arpa and ip6.arpa DNS PTR requests for Kubernetes service IPs and pod IPs * Previously, CoreDNS was configured to resolve in-addr.arpa PTR records for service IPs (but not pod IPs)	2019-03-04 23:03:00 -08:00
Dalton Hubble	f598307998	Update Kubernetes from v1.13.3 to v1.13.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1134	2019-02-28 22:47:43 -08:00
Dalton Hubble	73ae5d5649	Update Calico from v3.5.1 to v3.5.2 * https://docs.projectcalico.org/v3.5/releases/	2019-02-25 21:23:13 -08:00
Dalton Hubble	42d7222f3d	Add a readiness probe to CoreDNS * https://github.com/poseidon/terraform-render-bootkube/pull/115	2019-02-23 13:25:23 -08:00
Dalton Hubble	7f8572030d	Upgrade to support terraform-provider-google v2.0+ * Support terraform-provider-google v1.19.0, v1.19.1, v1.20.0 and v2.0+ (and allow for future 2.x.y releases) * Require terraform-provider-google v1.19.0 or newer. v1.19.0 introduced `network_interface` fields `network_ip` and `nat_ip` to deprecate `address` and `assigned_nat_ip`. Those deprecated fields are removed in terraform-provider-google v2.0 * https://github.com/terraform-providers/terraform-provider-google/releases/tag/v2.0.0	2019-02-20 02:33:32 -08:00
Dalton Hubble	4294bd0292	Assign Pod Priority classes to critical cluster and node components * Assign pod priorityClassNames to critical cluster and node components (higher is higher priority) to inform node out-of-resource eviction order and scheduler preemption and scheduling order * Priority Admission Controller has been enabled since Typhoon v1.11.1	2019-02-19 22:21:39 -08:00
Dalton Hubble	ba4c5de052	Set the Google Cloud minimum CPU platform to Intel Haswell * Intel Haswell or better is available in every zone around the world * Neither Kubernetes nor Typhoon have a particular minimum processor family. However, a few Google Cloud zones still default to Sandy/Ivy bridge (scheduled to shift April 2019). Price is only based on machine type so it is beneficial to opt for the next processor family * Intel Haswell is a suitable minimum since it still allows plenty of liberty in choosing any region or machine type * Likely a slight increase to preemption probability in a few zones, but any lower probability on Sandy/Ivy bridge is due to lower desirability as they're phased out * https://cloud.google.com/compute/docs/regions-zones/	2019-02-18 12:55:04 -08:00
Dalton Hubble	584088397c	Update etcd from v3.3.11 to v3.3.12 * https://github.com/etcd-io/etcd/releases/tag/v3.3.12	2019-02-09 11:54:54 -08:00
Dalton Hubble	0200058e0e	Update Calico from v3.5.0 to v3.5.1 * Fix in confd https://github.com/projectcalico/confd/pull/205	2019-02-09 11:49:31 -08:00
Dalton Hubble	ccd96c37da	Update Kubernetes from v1.13.2 to v1.13.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1133	2019-02-01 23:26:13 -08:00
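
Commits 56d0b9eae4 and 3fcb04f68c above describe "wrapping" controllers over zonal instance groups, so that a single-controller cluster needs one group and 5 controllers over 3 zones use all three. The sketch below illustrates that idea only; it is not taken from the Typhoon modules. It assumes a simple round-robin (index modulo zone count) placement, and the function name `assign_controllers` and the zone labels are hypothetical.

```python
def assign_controllers(controller_count, zones):
    """Group controller indices by zone, wrapping round-robin over zones.

    Assumption: controller i is placed in zones[i % len(zones)].
    Only zones that receive at least one controller appear in the result,
    so the number of keys is the number of instance groups needed.
    """
    groups = {}
    for i in range(controller_count):
        zone = zones[i % len(zones)]
        groups.setdefault(zone, []).append(i)
    return groups


# Hypothetical zone labels for illustration.
zones = ["zone-a", "zone-b", "zone-c"]

single = assign_controllers(1, zones)  # one controller, one instance group
multi = assign_controllers(5, zones)   # five controllers wrapped over three zones

print(len(single), "instance group(s) for 1 controller")
print(len(multi), "instance group(s) for 5 controllers:", multi)
```

Under this assumption the number of instance groups is the smaller of the controller count and the zone count, which matches the "up to zone-many" wording in the commit message.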

283 Commits