typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-02-18 22:51:27 +01:00

Author	SHA1	Message	Date
Dalton Hubble	69d064bfdf	Run kube-apiserver with lower privilege user (nobody) * Run kube-apiserver as a non-root user (nobody). User no longer needs to bind low-numbered ports. * On most platforms, the kube-apiserver load balancer listens on 6443 and fronts controllers with kube-apiserver pods using port 6443. Google Cloud TCP proxy load balancers cannot listen on 6443. However, GCP's load balancer can be made to listen on 443, while kube-apiserver uses 6443 across all platforms.	2019-07-08 20:52:00 -07:00
Dalton Hubble	3fcb04f68c	Improve apiserver backend service zone spanning * google_compute_backend_services use nested blocks to define backends (instance groups heterogeneous controllers) * Use Terraform v0.12.x dynamic blocks so the apiserver backend service can refer to (up to zone-many) controller instance groups * Previously, with Terraform v0.11.x, the apiserver backend service had to list a fixed set of backends to span controller nodes across zones in multi-controller setups. 3 backends were used because each GCP region offered at least 3 zones. Single-controller clusters had the cosmetic ugliness of unused instance groups * Allow controllers to span more than 3 zones if available in a region (e.g. currently only us-central1, with 4 zones) Related: * https://www.terraform.io/docs/providers/google/r/compute_backend_service.html * https://www.terraform.io/docs/configuration/expressions.html#dynamic-blocks	2019-07-05 19:46:26 -07:00
Dalton Hubble	d6d9e6c4b9	Migrate Google Cloud module Terraform v0.11 to v0.12 * Replace v0.11 bracket type hints with Terraform v0.12 list expressions * Use expression syntax instead of interpolated strings, where suggested * Update Google Cloud tutorial and worker pools documentation * Define Terraform and plugin version requirements in versions.tf * Require google ~> 2.5 to support Terraform v0.12 * Require ct ~> 0.3.2 to support Terraform v0.12	2019-06-06 09:48:56 -07:00
Dalton Hubble	e0c032be94	Increase GCP TCP proxy apiserver backend timeout to 5 minutes * On GCP, kubectl port-forward connections to pods are closed after a timeout (unlike AWS NLBs or Azure load balancers) * Increase the GCP apiserver backend service timeout from 1 minute to 5 minutes to be more similar to AWS/Azure LB behavior	2018-12-15 17:34:18 -08:00
Dalton Hubble	9d4cbb38f6	Rerun terraform fmt	2018-05-01 21:41:22 -07:00
Dalton Hubble	ad2e4311d1	Switch GCP network lb to global TCP proxy lb * Allow multi-controller clusters on Google Cloud * GCP regional network load balancers have a long open bug in which requests originating from a backend instance are routed to the instance itself, regardless of whether the health check passes or not. As a result, only the 0th controller node registers. We've recommended just using single-master GCP clusters for a while * https://issuetracker.google.com/issues/67366622 * Work around the issue by switching to a GCP TCP Proxy load balancer. TCP proxy lb routes traffic to a backend service (global) of instance group backends. In our case, spread controllers across 3 zones (all regions have 3+ zones) and organize them in 3 zonal unmanaged instance groups that serve as backends. Allows multi-controller cluster creation * GCP network load balancers only allowed legacy HTTP health checks so kubelet 10255 was checked as an approximation of controller health. Replace with TCP apiserver health checks to detect unhealthy or unresponsive apiservers. * Drawbacks: GCP provision time increases, tailed logs now time out (similar tradeoff in AWS), controllers only span 3 zones instead of the exact number in the region * Workaround in Typhoon has been known and posted for 5 months, but there still appears to be no better alternative. It's probably time to support multi-master and accept the downsides	2018-04-18 00:09:06 -07:00
Dalton Hubble	5035d56db2	Refactor GCP to remove controller internal module * Remove the controller internal module to align with other platforms and since it's not a supported use case	2018-04-12 19:41:51 -07:00

7 Commits
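
The backend service changes described in 3fcb04f68c, e0c032be94, and ad2e4311d1 can be sketched roughly as follows. This is a minimal Terraform v0.12 fragment, not the module's actual source: the resource names, `var.cluster_name`, the `apiserver` named port, the health check intervals, and the `google_compute_instance_group.controllers` resource (assumed to be defined elsewhere with `count`, one group per zone) are all assumptions made for illustration.

```hcl
# Hypothetical input; the real module may name this differently.
variable "cluster_name" {
  type = string
}

# TCP health check against kube-apiserver on 6443 (per ad2e4311d1 and 69d064bfdf).
# Interval and threshold values are illustrative assumptions.
resource "google_compute_health_check" "apiserver" {
  name = "${var.cluster_name}-apiserver-tcp-health"

  timeout_sec         = 5
  check_interval_sec  = 5
  healthy_threshold   = 1
  unhealthy_threshold = 3

  tcp_health_check {
    port = 6443
  }
}

# Global backend service fronted by the TCP proxy load balancer.
resource "google_compute_backend_service" "apiserver" {
  name      = "${var.cluster_name}-apiserver"
  protocol  = "TCP"
  port_name = "apiserver" # assumed named port on the instance groups

  # 5 minute timeout (per e0c032be94), up from 1 minute.
  timeout_sec = 300

  # One nested backend block per controller instance group, however many
  # zones are in use. With v0.11 these had to be written out as a fixed list.
  # google_compute_instance_group.controllers is assumed to exist elsewhere.
  dynamic "backend" {
    for_each = google_compute_instance_group.controllers
    content {
      group = backend.value.self_link
    }
  }

  health_checks = [google_compute_health_check.apiserver.self_link]
}
```

Because the `dynamic` block iterates over whatever instance groups exist, a single-controller cluster would get one backend and a region with four zones could get four, without editing the backend service itself.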