typhoon

Commit Graph

Author	SHA1	Message	Date
Dalton Hubble	3fcb04f68c	Improve apiserver backend service zone spanning * google_compute_backend_services use nested blocks to define backends (instance groups heterogeneous controllers) * Use Terraform v0.12.x dynamic blocks so the apiserver backend service can refer to (up to zone-many) controller instance groups * Previously, with Terraform v0.11.x, the apiserver backend service had to list a fixed set of backends to span controller nodes across zones in multi-controller setups. 3 backends were used because each GCP region offered at least 3 zones. Single-controller clusters had the cosmetic ugliness of unused instance groups * Allow controllers to span more than 3 zones if available in a region (e.g. currently only us-central1, with 4 zones) Related: * https://www.terraform.io/docs/providers/google/r/compute_backend_service.html * https://www.terraform.io/docs/configuration/expressions.html#dynamic-blocks	2019-07-05 19:46:26 -07:00
Dalton Hubble	d6d9e6c4b9	Migrate Google Cloud module Terraform v0.11 to v0.12 * Replace v0.11 bracket type hints with Terraform v0.12 list expressions * Use expression syntax instead of interpolated strings, where suggested * Update Google Cloud tutorial and worker pools documentation * Define Terraform and plugin version requirements in versions.tf * Require google ~> 2.5 to support Terraform v0.12 * Require ct ~> 0.3.2 to support Terraform v0.12	2019-06-06 09:48:56 -07:00
Dalton Hubble	46196af500	Remove Haswell minimum CPU platform requirement * Google Cloud API implements `min_cpu_platform` to mean "use exactly this CPU" * Fix error creating clusters in newer regions lacking Haswell platform (e.g. europe-west2) (#438) * Reverts #405, added in v1.13.4 * Original goal of ignoring old Ivy/Sandy bridge CPUs in older regions will be achieved shortly anyway. Google Cloud is deprecating those CPUs in April 2019 * https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform#how_selecting_a_minimum_cpu_platform_works	2019-03-27 19:51:32 -07:00
Dalton Hubble	2019177b6b	Fix implicit map assignments to be explicit * Terraform v0.12 will require map assignments be explicit, part of v0.12 readiness	2019-03-12 01:19:54 -07:00
Dalton Hubble	7f8572030d	Upgrade to support terraform-provider-google v2.0+ * Support terraform-provider-google v1.19.0, v1.19.1, v1.20.0 and v2.0+ (and allow for future 2.x.y releases) * Require terraform-provider-google v1.19.0 or newer. v1.19.0 introduced `network_interface` fields `network_ip` and `nat_ip` to deprecate `address` and `assigned_nat_ip`. Those deprecated fields are removed in terraform-provider-google v2.0 * https://github.com/terraform-providers/terraform-provider-google/releases/tag/v2.0.0	2019-02-20 02:33:32 -08:00
Dalton Hubble	ba4c5de052	Set the Google Cloud minimum CPU platform to Intel Haswell * Intel Haswell or better is available in every zone around the world * Neither Kubernetes nor Typhoon have a particular minimum processor family. However, a few Google Cloud zones still default to Sandy/Ivy bridge (scheduled to shift April 2019). Price is only based on machine type so it is beneficial to opt for the next processor family * Intel Haswell is a suitable minimum since it still allows plenty of liberty in choosing any region or machine type * Likely a slight increase to preemption probability in a few zones, but any lower probability on Sandy/Ivy bridge is due to lower desirability as they're phased out * https://cloud.google.com/compute/docs/regions-zones/	2019-02-18 12:55:04 -08:00
Dalton Hubble	b57273b6f1	Rename internal kube_dns_service_ip to cluster_dns_service_ip * terraform-render-bootkube module deprecated kube_dns_service_ip output in favor of cluster_dns_service_ip * Rename k8s_dns_service_ip to cluster_dns_service_ip for consistency too	2019-01-05 13:32:03 -08:00
Dalton Hubble	812a1adb49	Use a lower-privilege Kubelet kubeconfig in system:nodes * Kubelets can use a lower-privilege TLS client certificate with Org system:nodes and a binding to the system:node ClusterRole * Admin kubeconfigs continue to belong to Org system:masters to provide cluster-admin (available in assets/auth/kubeconfig or as a Terraform output kubeconfig-admin) * Remove bare-metal output variable kubeconfig	2019-01-05 13:08:56 -08:00
Dalton Hubble	0e71f7e565	Ignore controller user_data changes to allow plugin updates * Updating the `terraform-provider-ct` plugin is known to produce a `user_data` diff in all pre-existing clusters. Applying the diff to a pre-existing cluster destroys controller nodes * Ignore changes to controller `user_data`. Once all managed clusters use a release containing this change, it is possible to update the `terraform-provider-ct` plugin (worker `user_data` will still be modified) * Changing the module `ref` for an existing cluster and re-applying is still NOT supported (although this PR would protect controllers from being destroyed)	2018-10-28 16:48:12 -07:00
Dalton Hubble	b8eeafe4f9	Template etcd_servers list to replace null_resource.repeat * Remove the last usage of null_resource.repeat, which has always been an eyesore for creating the etcd server list * Originally, #224 switched to templating the etcd_servers list for all clouds, but had to revert on GCP in #237 * https://github.com/poseidon/typhoon/pull/224 * https://github.com/poseidon/typhoon/pull/237	2018-08-21 22:46:24 -07:00
Dalton Hubble	6676484490	Partially revert b7ed6e7bd35cee39a3f65b47e731938c3006b5cd * Fix change that broke Google Cloud container-linux and fedora-atomic https://github.com/poseidon/typhoon/pull/224	2018-06-06 23:48:37 -07:00
Ben Drucker	6a581ab577	Render etcd_initial_cluster using a template_file	2018-05-30 21:14:49 -07:00
Dalton Hubble	9d4cbb38f6	Rerun terraform fmt	2018-05-01 21:41:22 -07:00
Dalton Hubble	ad2e4311d1	Switch GCP network lb to global TCP proxy lb * Allow multi-controller clusters on Google Cloud * GCP regional network load balancers have a long open bug in which requests originating from a backend instance are routed to the instance itself, regardless of whether the health check passes or not. As a result, only the 0th controller node registers. We've recommended just using single master GCP clusters for a while * https://issuetracker.google.com/issues/67366622 * Workaround issue by switching to a GCP TCP Proxy load balancer. TCP proxy lb routes traffic to a backend service (global) of instance group backends. In our case, spread controllers across 3 zones (all regions have 3+ zones) and organize them in 3 zonal unmanaged instance groups that serve as backends. Allows multi-controller cluster creation * GCP network load balancers only allowed legacy HTTP health checks so kubelet 10255 was checked as an approximation of controller health. Replace with TCP apiserver health checks to detect unhealthy or unresponsive apiservers. * Drawbacks: GCP provision time increases, tailed logs now time out (similar tradeoff in AWS), controllers only span 3 zones instead of the exact number in the region * Workaround in Typhoon has been known and posted for 5 months, but there still appears to be no better alternative. It's probably time to support multi-master and accept the downsides	2018-04-18 00:09:06 -07:00
Dalton Hubble	5035d56db2	Refactor GCP to remove controller internal module * Remove the controller internal module to align with other platforms and since it's not a supported use case	2018-04-12 19:41:51 -07:00

15 Commits