typhoon

Commit Graph

Author	SHA1	Message	Date
Dalton Hubble	6cd3e65267	Update kube-state-metrics from v1.7.0-rc.1 to v1.7.0 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.7.0 * Add storageclasses and verticalpodautoscalers to ClusterRole	2019-07-19 00:14:47 -07:00
Dalton Hubble	dfa6bcfecf	Relax terraform-provider-ct version constraint * Allow updating terraform-provider-ct to any release beyond v0.3.2, but below v1.0. This relaxes the prior constraint that allowed only v0.3.y provider versions	2019-07-16 22:07:37 -07:00
Dalton Hubble	70f5cfd33e	Update kube-state-metrics from v1.6.0 to v1.7.0-rc.1 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.7.0-rc.1 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.7.0-rc.0	2019-07-13 13:13:57 -07:00
Dalton Hubble	9e91d7f011	Upgrade Calico from v3.7.4 to v3.8.0 * Enable CNI bandwidth plugin for traffic shaping * https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#support-traffic-shaping	2019-07-11 21:01:41 -07:00
Dalton Hubble	eaf59bd33f	Update Prometheus from v2.11.0-rc.0 to v2.11.0 * https://github.com/prometheus/prometheus/releases/tag/v2.11.0	2019-07-09 21:33:24 -07:00
Dalton Hubble	40640f3697	Upgrade nginx-ingress from v0.24.1 to v0.25.0 * Support networking.k8s.io/v1beta1 apiVersion * Update RBAC cluster-role for networking.k8s.io/v1beta1 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.25.0	2019-07-08 22:04:50 -07:00
Dalton Hubble	28ab746068	Update Prometheus from v2.10.0 to v2.11.0-rc.0 * https://github.com/prometheus/prometheus/releases/tag/v2.11.0-rc.0	2019-07-08 21:32:50 -07:00
Dalton Hubble	69d064bfdf	Run kube-apiserver with lower privilege user (nobody) * Run kube-apiserver as a non-root user (nobody). User no longer needs to bind low number ports. * On most platforms, the kube-apiserver load balancer listens on 6443 and fronts controllers with kube-apiserver pods using port 6443. Google Cloud TCP proxy load balancers cannot listen on 6443. However, GCP's load balancer can be made to listen on 443, while kube-apiserver uses 6443 across all platforms.	2019-07-08 20:52:00 -07:00
Dalton Hubble	7a69bae75e	Raise GCP network deletion timeout from 4m to 6m * Fix a GCP errata item https://github.com/poseidon/typhoon/wiki/Errata * Removal of a Google Cloud cluster often required 2 runs of `terraform apply` because network resource deletes timeout after 4m. Raise the network deletion timeout to 6m to ensure apply only needs to be run once to remove a cluster	2019-07-06 13:15:33 -07:00
Dalton Hubble	3fcb04f68c	Improve apiserver backend service zone spanning * google_compute_backend_services use nested blocks to define backends (instance groups heterogeneous controllers) * Use Terraform v0.12.x dynamic blocks so the apiserver backend service can refer to (up to zone-many) controller instance groups * Previously, with Terraform v0.11.x, the apiserver backend service had to list a fixed set of backends to span controller nodes across zones in multi-controller setups. 3 backends were used because each GCP region offered at least 3 zones. Single-controller clusters had the cosmetic ugliness of unused instance groups * Allow controllers to span more than 3 zones if available in a region (e.g. currently only us-central1, with 4 zones) Related: * https://www.terraform.io/docs/providers/google/r/compute_backend_service.html * https://www.terraform.io/docs/configuration/expressions.html#dynamic-blocks	2019-07-05 19:46:26 -07:00
Dalton Hubble	8d373b5850	Update Calico from v3.7.3 to v3.7.4 * https://docs.projectcalico.org/v3.7/release-notes/	2019-07-02 20:18:02 -07:00
Dalton Hubble	9a395dbf88	Update Grafana from v6.2.4 to v6.2.5 * https://github.com/grafana/grafana/releases/tag/v6.2.5	2019-06-29 13:21:42 -07:00
Dalton Hubble	fff7cc035d	Remove Fedora Atomic modules * Typhoon for Fedora Atomic was deprecated in March 2019 * https://typhoon.psdn.io/announce/#march-27-2019	2019-06-23 13:40:51 -07:00
Dalton Hubble	408e60075a	Update Kubernetes from v1.14.3 to v1.15.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#v1150 * Remove docs referring to possible v1.14.4 release	2019-06-23 13:12:18 -07:00
Dalton Hubble	5c4486f57b	Allow using Flatcar Linux Edge on bare-metal and AWS * On AWS, use Flatcar Linux Edge by setting `os_image` to "flatcar-edge" * On bare-metal, use Flatcar Linux Edge by setting `os_channel` to "flatcar-edge"	2019-06-22 23:38:42 -07:00
Dalton Hubble	4ad69efc43	Update Grafana from v6.2.2 to v6.2.4 * https://github.com/grafana/grafana/releases/tag/v6.2.4	2019-06-19 21:51:54 -07:00
Dalton Hubble	21fb632e90	Update Calico from v3.7.2 to v3.7.3 * https://docs.projectcalico.org/v3.7/release-notes/	2019-06-13 23:54:20 -07:00
Dalton Hubble	cc4f7e09ab	Update node-exporter from v0.18.0 to v0.18.1 * https://github.com/prometheus/node_exporter/releases/tag/v0.18.1	2019-06-07 02:09:44 -07:00
Dalton Hubble	d449477272	Update Grafana from v6.2.1 to v6.2.2 * https://github.com/grafana/grafana/releases/tag/v6.2.2	2019-06-07 00:07:54 -07:00
Dalton Hubble	5303e32e38	Change DO worker_type default from s-1vcpu-1gb to s-1vcpu-2gb * On DigitalOcean, `s-1vcpu-1gb` worker nodes have 1GB of RAM, which is too small as a default, even for most cost constrained developers	2019-06-06 23:50:19 -07:00
Dalton Hubble	da3f2b5d95	Adjust README example and Terraform version in docs * Delay changing README example. Its prominent display on github.com may lead to new users copying it, even though it corresponds to an "in between releases" state and v1.14.4 doesn't exist yet * Leave docs tutorials the same, they can reflect master	2019-06-06 23:36:36 -07:00
Dalton Hubble	3276bf5878	Add migration instructions from Terraform v0.11 to v0.12 * Provide Terraform v0.11 to v0.12 migration guide. Show an in-place strategy and a move resources strategy * Describe in-place modifying an existing cluster and providers, using the Terraform helper to edit syntax, and checking the plan produces a zero diff * Describe replacing existing clusters by creating a new config directory for use with Terraform v0.12 only and moving resources one by one * Provide some limited advice on migrating non-Typhoon resources	2019-06-06 09:51:22 -07:00
Dalton Hubble	db36959178	Migrate bare-metal module Terraform v0.11 to v0.12 * Replace v0.11 bracket type hints with Terraform v0.12 list expressions * Use expression syntax instead of interpolated strings, where suggested * Update bare-metal tutorial * Define `clc_snippets` type constraint map(list(string)) * Define Terraform and plugin version requirements in versions.tf * Require matchbox ~> 0.3.0 to support Terraform v0.12 * Require ct ~> 0.3.2 to support Terraform v0.12	2019-06-06 09:51:21 -07:00
Dalton Hubble	28506df9c7	Avoid unneeded rotations of Regular priority virtual machine scale sets * Azure only allows `eviction_policy` to be set for Low priority VMs. Supporting Low priority VMs meant when Regular VMs were used, each `terraform apply` rolled workers, to set eviction_policy to null. * Terraform v0.12 nullable variables fix the issue and plan does not produce a diff	2019-06-06 09:50:37 -07:00
Dalton Hubble	189487ecaa	Migrate Azure module Terraform v0.11 to v0.12 * Replace v0.11 bracket type hints with Terraform v0.12 list expressions * Use expression syntax instead of interpolated strings, where suggested * Update Azure tutorial and worker pools documentation * Define Terraform and plugin version requirements in versions.tf * Require azurerm ~> 1.27 to support Terraform v0.12 * Require ct ~> 0.3.2 to support Terraform v0.12	2019-06-06 09:50:35 -07:00
Dalton Hubble	d6d9e6c4b9	Migrate Google Cloud module Terraform v0.11 to v0.12 * Replace v0.11 bracket type hints with Terraform v0.12 list expressions * Use expression syntax instead of interpolated strings, where suggested * Update Google Cloud tutorial and worker pools documentation * Define Terraform and plugin version requirements in versions.tf * Require google ~> 2.5 to support Terraform v0.12 * Require ct ~> 0.3.2 to support Terraform v0.12	2019-06-06 09:48:56 -07:00
Dalton Hubble	2ba0181dbe	Migrate AWS module Terraform v0.11 to v0.12 * Replace v0.11 bracket type hints with Terraform v0.12 list expressions * Use expression syntax instead of interpolated strings, where suggested * Update AWS tutorial and worker pools documentation * Define Terraform and plugin version requirements in versions.tf * Require aws ~> 2.7 to support Terraform v0.12 * Require ct ~> 0.3.2 to support Terraform v0.12	2019-06-06 09:45:59 -07:00
Dalton Hubble	1366ae404b	Migrate DigitalOcean module from Terraform v0.11 to v0.12 * Replace v0.11 bracket type hints with Terraform v0.12 list expressions * Use expression syntax instead of interpolated strings, where suggested * Update DigitalOcean tutorial documentation * Define Terraform and plugin version requirements in versions.tf * Require digitalocean ~> v1.3 to support Terraform v0.12 * Require ct ~> v0.3.2 to support Terraform v0.12	2019-06-06 09:44:58 -07:00
Dalton Hubble	0ccb2217b5	Update Kubernetes from v1.14.2 to v1.14.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1143	2019-05-31 01:08:32 -07:00
Dalton Hubble	c6faa6b5b8	Recommend updating Terraform providers ct and matchbox * Recommend updating Terraform provider plugins `terraform-provider-ct` and `terraform-provider-matchbox` to prepare for the upcoming Terraform v0.12 migration * https://github.com/poseidon/terraform-provider-ct/releases/tag/v0.3.2 * https://github.com/poseidon/terraform-provider-matchbox/releases/tag/v0.3.0	2019-05-31 00:48:37 -07:00
Dalton Hubble	c565f9fd47	Rename worker pool modules' count variable to worker_count * This change affects users who use worker pools on AWS, GCP, or Azure with a Container Linux derivative * Rename worker pool modules' `count` variable to `worker_count`, because `count` will be a reserved variable name in Terraform v0.12	2019-05-27 16:40:00 -07:00
Dalton Hubble	d9e7195477	Update Grafana from v6.2.0 to v6.2.1	2019-05-27 12:25:00 -07:00
Dalton Hubble	2a71cba0e3	Update CoreDNS from v1.3.1 to v1.5.0 * Add `ready` plugin to improve readinessProbe * https://coredns.io/2019/04/06/coredns-1.5.0-release/	2019-05-27 00:11:52 -07:00
Dalton Hubble	0a835ee403	Replace deprecated `azurerm_autoscale_setting` * Fix Terraform provider azure warning about `azurerm_autoscale_setting` * Require terraform-provider-azure v1.22+ version that introduces the new `azurerm_monitor_autoscale_setting` resource * https://github.com/terraform-providers/terraform-provider-azurerm/blob/master/CHANGELOG.md#1220-february-11-2019	2019-05-26 23:32:42 -07:00
Dalton Hubble	5d2684a04d	Update Grafana from v6.1.6 to v6.2.0 * https://github.com/grafana/grafana/releases/tag/v6.2.0	2019-05-26 22:00:47 -07:00
Dalton Hubble	221889cc9b	Update Prometheus from v2.9.2 to v2.10.0 * https://github.com/prometheus/prometheus/releases/tag/v2.10.0	2019-05-26 21:58:28 -07:00
Dalton Hubble	6e4cf65c4c	Fix terraform-render-bootkube to remove trailing slash * Fix to remove a trailing slash that was erroneously introduced in the scripting that updated from v1.14.1 to v1.14.2 * Workaround before this fix was to re-run `terraform init`	2019-05-22 18:29:11 +02:00
Dalton Hubble	bef9b991b7	Bump Terraform provider versions in docs * Bump Terraform provider versions to reflect the versions used by the maintainer	2019-05-20 18:29:56 +02:00
Dalton Hubble	147c21a4bd	Allow Calico networking on Azure and DigitalOcean * Introduce "calico" as a `networking` option on Azure and DigitalOcean using Calico's new VXLAN support (similar to flannel). Flannel remains the default on these platforms for now. * Historically, DigitalOcean and Azure only allowed Flannel as the CNI provider, since those platforms don't support IPIP traffic that was previously required for Calico. * Looking forward, it's desirable for Calico to become the default across Typhoon clusters, since it provides NetworkPolicy and a consistent experience * No changes to AWS, GCP, or bare-metal where Calico remains the default CNI provider. On these platforms, IPIP mode will always be used, since it's available and more performant than VXLAN	2019-05-20 17:17:20 +02:00
Dalton Hubble	222a94247c	Update node_exporter from v0.17.0 to v0.18.0 * https://github.com/prometheus/node_exporter/releases/tag/v0.18.0	2019-05-17 20:01:30 +02:00
Dalton Hubble	da97bd4f12	Update Kubernetes from v1.14.1 to v1.14.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1142	2019-05-17 13:09:15 +02:00
Dalton Hubble	37ce722f9c	Fix race condition in DigitalOcean cluster create * DigitalOcean clusters must secure copy a kubeconfig to worker nodes, but Terraform could decide to try copying before firewall rules have been added to allow SSH access. * Add an explicit dependency on adding firewall rules first	2019-05-17 13:05:08 +02:00
Dalton Hubble	f62286b677	Update Calico from v3.7.0 to v3.7.2 * https://docs.projectcalico.org/v3.7/release-notes/	2019-05-17 12:29:46 +02:00
Dalton Hubble	af18296bc5	Change flannel port from 8472 to 4789 * Change flannel port from the kernel default 8472 to the IANA assigned VXLAN port 4789 * Update firewall rules or security groups for VXLAN * Why now? Calico now offers its own VXLAN backend so standardizing on the IANA port will simplify config * https://github.com/coreos/flannel/blob/master/Documentation/backends.md#vxlan	2019-05-06 21:58:10 -07:00
Dalton Hubble	2d19ab8457	Update kube-state-metrics from v1.6.0-rc.2 to v1.6.0 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.6.0	2019-05-06 21:30:49 -07:00
Dalton Hubble	09e0230111	Upgrade Calico from v3.6.1 to v3.7.0 * https://docs.projectcalico.org/v3.7/release-notes/ * https://github.com/poseidon/terraform-render-bootkube/pull/131	2019-05-06 00:44:15 -07:00
Dalton Hubble	feb6192aac	Update etcd from v3.3.12 to v3.3.13 on Container Linux * Skip updating etcd for Fedora Atomic clusters, now that Fedora Atomic has been deprecated	2019-05-04 12:55:42 -07:00
Dalton Hubble	6e9b2450fe	Update Grafana from v6.1.4 to v6.1.6 * https://github.com/grafana/grafana/releases/tag/v6.1.6	2019-05-04 11:14:37 -07:00
Dalton Hubble	253831aac3	Update links to Matchbox, terraform-provider-ct, etc. * Matchbox, terraform-provider-matchbox, and terraform-provider-ct have moved to the poseidon Github organization	2019-05-04 10:50:53 -07:00
Dalton Hubble	0e94708fd8	Update kube-state-metrics from v1.5.0 to v1.6.0-rc.2 * Collect metrics about Ingress resources * Collect metrics about certificates.k8s.io certificatesigningrequests * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.6.0-rc.2	2019-04-27 20:54:40 -07:00
Dalton Hubble	2c11bad439	Update Prometheus from v2.9.1 to v2.9.2 * https://github.com/prometheus/prometheus/releases/tag/v2.9.2	2019-04-27 20:39:55 -07:00
Dalton Hubble	418597aa59	Update Grafana from v6.1.3 to v6.1.4 * https://github.com/grafana/grafana/releases/tag/v6.1.4	2019-04-18 23:30:43 -07:00
Dalton Hubble	f3174c2b7a	Update Prometheus from v2.8.1 to v2.9.1 * https://github.com/prometheus/prometheus/releases/tag/v2.9.1 * https://github.com/prometheus/prometheus/releases/tag/v2.9.0	2019-04-18 23:26:32 -07:00
Dalton Hubble	e73cccd7eb	Update provider versions in tutorial docs * Update Terraform provider plugin versions in docs to reflect the recommended versions currently in use	2019-04-16 00:05:13 -07:00
Dalton Hubble	a141c5fe9e	Update nginx-ingress from v0.23.0 to v0.24.1 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.24.1 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.24.0	2019-04-15 21:08:22 -07:00
Dalton Hubble	1b157a2fa4	Revert "Update kube-state-metrics from v1.5.0 to v1.6.0-rc.0" * This reverts commit `6e5d66cf66` * kube-state-metrics v1.6.0-rc.0 fires KubeDeploymentReplicasMismatch alerts where its own Deployment doesn't have replicas available, (kube_deployment_status_replicas_available) even though all replicas are available according to kubectl inspection * This problem was present even with the CSR ClusterRole fix (https://github.com/kubernetes/kube-state-metrics/pull/717)	2019-04-13 12:37:53 -07:00
Dalton Hubble	6e5d66cf66	Update kube-state-metrics from v1.5.0 to v1.6.0-rc.0 * Adds a metrics collector for Ingress resources and other improvements * https://github.com/kubernetes/kube-state-metrics/pull/640 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.6.0-rc.0	2019-04-09 22:16:36 -07:00
Dalton Hubble	44c293888b	Update Grafana from v6.1.1 to v6.1.3 * https://github.com/grafana/grafana/releases/tag/v6.1.3	2019-04-09 22:06:27 -07:00
Dalton Hubble	452253081b	Update Kubernetes from v1.14.0 to v1.14.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#changelog-since-v1140	2019-04-09 21:47:23 -07:00
Dalton Hubble	1aa4d2cdc1	Update CHANGES for v1.14.0 release	2019-04-08 18:49:52 -07:00
Dalton Hubble	c1fe41d34a	Add ability to load balance TCP/UDP applications on Azure * Add ability to load balance TCP/UDP applications (e.g. NodePort) * Output the load balancer ID as `loadbalancer_id` * Output `worker_security_group_name` and `worker_address_prefix` for extending firewall rules	2019-04-07 22:59:46 -07:00
Dalton Hubble	be29f52039	Add enable_aggregation option (defaults to false) * Add an `enable_aggregation` variable to enable the kube-apiserver aggregation layer for adding extension apiservers to clusters * Aggregation is disabled by default. Typhoon recommends you not enable aggregation. Consider whether less invasive ways to achieve your goals are possible and whether those goals are well-founded * Enabling aggregation and extension apiservers increases the attack surface of a cluster and makes extensions a part of the control plane. Admins must scrutinize and trust any extension apiserver used. * Passing a v1.14 CNCF conformance test requires aggregation be enabled. Having an option for aggregation keeps compliance, but retains the stricter security posture on default clusters	2019-04-07 12:00:38 -07:00
Dalton Hubble	5271e410eb	Update Kubernetes from v1.13.5 to v1.14.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1140	2019-04-07 00:15:59 -07:00
Dalton Hubble	ce78d5988e	Refresh Prometheus rules and Grafana dashboards * Refresh rules and dashboards from upstreams * Add new Kubernetes "workload" dashboards * View pods in a workload (deployment/daemonset/statefulset) * View workloads in a namespace	2019-04-06 23:31:44 -07:00
Dalton Hubble	29a3035245	Update Grafana from v6.1.0 to v6.1.1	2019-04-06 18:32:14 -07:00
Dalton Hubble	3e7a38cb13	Update Grafana from v6.0.2 to v6.1.0 * https://github.com/grafana/grafana/releases/tag/v6.1.0	2019-04-03 20:47:48 -07:00
Dalton Hubble	2a07c97538	Harden internal firewall rules on DigitalOcean * Define firewall rules on DigitalOcean to match rules used on AWS, GCP, and Azure * Output `controller_tag` and `worker_tag` to simplify custom firewall rule creation	2019-04-03 20:38:22 -07:00
Dalton Hubble	60265f9b58	Add ability to load balance TCP applications on AWS * Add ability to load balance TCP applications (e.g. NodePort) * Output the network load balancer ARN as `nlb_id` * Accept a `worker_target_groups` (ARN) list to which worker instances should be added * AWS NLBs and target groups don't support UDP	2019-04-01 21:22:20 -07:00
Dalton Hubble	aaa8e0261a	Add Google Cloud worker instances to a target pool * Background: A managed instance group of workers is used in backend services for global load balancing (HTTP/HTTPS Ingress) and output for custom global load balancing use cases * Add worker instances to a target pool load balancing TCP/UDP applications (NodePort or proxied). Output as `worker_target_pool` * Health check for workers with a healthy Ingress controller. Forward rules (regional) to target pools don't support different external and internal ports so choosing nodes with Ingress allows proxying as a workaround * A target pool is a logical grouping only. It doesn't add costs to clusters or worker pools	2019-04-01 21:03:48 -07:00
Dalton Hubble	b3ec5f73e3	Update Calico from v3.6.0 to v3.6.1 * https://docs.projectcalico.org/v3.6/release-notes/	2019-03-31 17:43:43 -07:00
Dalton Hubble	3e9dc28a00	Update Prometheus from v2.8.0 to v2.8.1 * https://github.com/prometheus/prometheus/releases/tag/v2.8.1	2019-03-31 17:40:20 -07:00
Dalton Hubble	46196af500	Remove Haswell minimum CPU platform requirement * Google Cloud API implements `min_cpu_platform` to mean "use exactly this CPU" * Fix error creating clusters in newer regions lacking Haswell platform (e.g. europe-west2) (#438) * Reverts #405, added in v1.13.4 * Original goal of ignoring old Ivy/Sandy bridge CPUs in older regions will be achieved shortly anyway. Google Cloud is deprecating those CPUs in April 2019 * https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform#how_selecting_a_minimum_cpu_platform_works	2019-03-27 19:51:32 -07:00
Dalton Hubble	5a1bc423a1	Announce Fedora Atomic modules won't be updated beyond v1.13.x * Thank you Project Atomic team and users * See the deprecation announcement https://typhoon.psdn.io/announce/#march-27-2019	2019-03-26 23:56:33 -07:00
Dalton Hubble	32fe72fb2d	Update mkdocs and plugin versions used in tutorials * Recommend provider plugin versions that are currently used by the author * Recommend updating terraform-provider-ct plugin from v0.3.0 to v0.3.1 * https://github.com/coreos/terraform-provider-ct/releases	2019-03-26 01:00:44 -07:00
Dalton Hubble	4fea526ebf	Update Kubernetes from v1.13.4 to v1.13.5 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1135	2019-03-25 21:43:47 -07:00
Dalton Hubble	41a9d86bc3	Add NetworkPolicy to limit traffic into Prometheus * Allow traffic from Grafana to Prometheus in monitoring * Allow traffic from Prometheus to Prometheus in monitoring * NetworkPolicy denies non-whitelisted traffic. Define policy to allow other access	2019-03-23 21:38:34 -07:00
Dalton Hubble	36e31fc9fa	Add liveness and readiness probes to Grafana * https://github.com/grafana/grafana/issues/3302	2019-03-23 17:55:37 -07:00
Dalton Hubble	619a0370dc	Update Grafana from v6.0.1 to v6.0.2 * https://github.com/grafana/grafana/releases/tag/v6.0.2	2019-03-21 23:41:25 -07:00
Dalton Hubble	1feefbe9c6	Update Calico from v3.5.2 to v3.6.0 * Add calico-ipam CRDs and RBAC permissions * Switch IPAM from host-local to calico-ipam * `calico-ipam` subnets `ippools` (defaults to pod CIDR) into `ipamblocks` (defaults to /26, but set to /24 in Typhoon) * `host-local` subnets the pod CIDR based on the node PodCIDR field (set via kube-controller-manager as /24's) * Create a custom default IPv4 IPPool to ensure the block size is kept at /24 to allow 110 pods per node (Kubernetes default) * Retaining host-local was slightly preferred, but Calico v3.6 is migrating all usage to calico-ipam. The codepath that skipped calico-ipam for KDD was removed * https://docs.projectcalico.org/v3.6/release-notes/	2019-03-19 22:49:56 -07:00
Dalton Hubble	aa630003a4	Refresh Prometheus rules and Grafana dashboards * Refresh rules and dashboards from upstreams * Organize dashboards and stay below the ConfigMap size limit	2019-03-17 13:23:04 -07:00
Dalton Hubble	bf97a45b9d	Remove heapster manifests from addons * Heapster addon powers `kubectl top` * In early Kubernetes, people legitimately used and expected `kubectl top` to work, so the optional addon was provided * Today the standards are different. Many better monitoring tools exist that are also less coupled to Kubernetes. The reliance of `kubectl top` on a non-core extension means it's not in scope for minimal Kubernetes clusters. No more exceptionalism * Finally, Heapster isn't that useful anymore. Its manifests have no need for Typhoon-specific modification * Look to prior releases if you still wish to apply heapster	2019-03-17 12:41:59 -07:00
Dalton Hubble	3d6a6d4adb	Re-add Kubelet metadata service dependency on DigitalOcean * Restore the original special-casing of DigitalOcean Kubelets * Fix node metadata InternalIP being set to the IP of the default gateway on DigitalOcean nodes (regressed in v1.12.3) * Reverts the "pretty" node names on DigitalOcean (worker-2 vs IP) * Closes #424 (full details)	2019-03-17 12:39:25 -07:00
Dalton Hubble	e0bee2e417	Update Prometheus from v2.7.2 to v2.8.0 * https://github.com/prometheus/prometheus/releases/tag/v2.8.0	2019-03-13 22:11:38 -07:00
Dalton Hubble	9493ed3b1d	Change default iPXE kernel/initrd download from HTTP to HTTPS * Require an iPXE-enabled network boot environment with support for TLS downloads. PXE clients must chainload to iPXE firmware compiled with `DOWNLOAD_PROTO_HTTPS` enabled ([crypto](https://ipxe.org/crypto)) * iPXE's pre-compiled firmware binaries do _not_ enable HTTPS. Admins should build iPXE from source with support enabled * Affects the Container Linux and Flatcar Linux install profiles that pull from public downloads. No effect when cached_install=true or using Fedora Atomic, as those download from Matchbox * Add `download_protocol` variable. Recognizing boot firmware TLS support is difficult in some environments, set the protocol to "http" for the old behavior (discouraged)	2019-03-09 23:23:40 -08:00
Dalton Hubble	4201eb1efa	Update Grafana from v6.0.0 to v6.0.1 * https://github.com/grafana/grafana/releases/tag/v6.0.1	2019-03-09 12:44:18 -08:00
Dalton Hubble	fe96da27d7	Add support for terraform-provider-aws v2.0+ * Allow terraform-provider-aws >= v1.13, but < 3.0. No change to the minimum version, but allow using v2.x.y releases * Verify compatibility with terraform-provider-aws v2.1.0	2019-03-09 12:06:44 -08:00
Dalton Hubble	4d9a692424	Update Prometheus from v2.7.1 to v2.7.2 * https://github.com/prometheus/prometheus/releases/tag/v2.7.2	2019-03-04 23:08:12 -08:00
Dalton Hubble	deec512c14	Resolve in-addr.arpa and ip6.arpa zones with CoreDNS kubernetes plugin * Resolve in-addr.arpa and ip6.arpa DNS PTR requests for Kubernetes service IPs and pod IPs * Previously, CoreDNS was configured to resolve in-addr.arpa PTR records for service IPs (but not pod IPs)	2019-03-04 23:03:00 -08:00
Dalton Hubble	5066a25d89	Add links and clarifications in CHANGES for release	2019-03-02 11:26:12 -08:00
Dalton Hubble	a08adc92b5	Update nginx-ingress from v0.22.0 to v0.23.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.23.0	2019-03-01 01:18:54 -08:00
Dalton Hubble	f598307998	Update Kubernetes from v1.13.3 to v1.13.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1134	2019-02-28 22:47:43 -08:00
Dalton Hubble	daee5a9d60	Update Grafana from v6.0.0-beta3 to v6.0.0 * https://github.com/grafana/grafana/releases/tag/v6.0.0 * http://docs.grafana.org/guides/whats-new-in-v6-0/	2019-02-25 21:43:43 -08:00
Dalton Hubble	73ae5d5649	Update Calico from v3.5.1 to v3.5.2 * https://docs.projectcalico.org/v3.5/releases/	2019-02-25 21:23:13 -08:00
Dalton Hubble	42d7222f3d	Add a readiness probe to CoreDNS * https://github.com/poseidon/terraform-render-bootkube/pull/115	2019-02-23 13:25:23 -08:00
Dalton Hubble	d10c2b4cb9	Update Grafana from v6.0.0-beta2 to v6.0.0-beta3 * Update Grafana dashboards	2019-02-23 13:03:25 -08:00
Dalton Hubble	7f8572030d	Upgrade to support terraform-provider-google v2.0+ * Support terraform-provider-google v1.19.0, v1.19.1, v1.20.0 and v2.0+ (and allow for future 2.x.y releases) * Require terraform-provider-google v1.19.0 or newer. v1.19.0 introduced `network_interface` fields `network_ip` and `nat_ip` to deprecate `address` and `assigned_nat_ip`. Those deprecated fields are removed in terraform-provider-google v2.0 * https://github.com/terraform-providers/terraform-provider-google/releases/tag/v2.0.0	2019-02-20 02:33:32 -08:00
Dalton Hubble	4294bd0292	Assign Pod Priority classes to critical cluster and node components * Assign pod priorityClassNames to critical cluster and node components (higher is higher priority) to inform node out-of-resource eviction order and scheduler preemption and scheduling order * Priority Admission Controller has been enabled since Typhoon v1.11.1	2019-02-19 22:21:39 -08:00
Dalton Hubble	ba4c5de052	Set the Google Cloud minimum CPU platform to Intel Haswell * Intel Haswell or better is available in every zone around the world * Neither Kubernetes nor Typhoon have a particular minimum processor family. However, a few Google Cloud zones still default to Sandy/Ivy bridge (scheduled to shift April 2019). Price is only based on machine type so it is beneficial to opt for the next processor family * Intel Haswell is a suitable minimum since it still allows plenty of liberty in choosing any region or machine type * Likely a slight increase to preemption probability in a few zones, but any lower probability on Sandy/Ivy bridge is due to lower desirability as they're phased out * https://cloud.google.com/compute/docs/regions-zones/	2019-02-18 12:55:04 -08:00
Dalton Hubble	e483c81ce9	Improve Prometheus rules and alerts and Grafana dashboards * Collate upstream rules, alerts, and dashboards and tune for use in Typhoon * Previously, a well-chosen (but older) set of rules, alerts, and dashboards were maintained to reflect metric name changes	2019-02-18 12:19:23 -08:00
Dalton Hubble	6fa3b8a13f	Upgrade Grafana to v6.0.0-beta2 and enable Explore UI * Upgrade Grafana from v5.4.3 to v6.0.0-beta2 * Enable Grafana Explore UI while still using only the Viewer role (inspect/edit without saving) * http://docs.grafana.org/guides/whats-new-in-v6-0/	2019-02-17 13:26:42 -08:00
Dalton Hubble	d988822741	Document and recommend terraform-provider-matchbox v0.2.3 * https://github.com/coreos/terraform-provider-matchbox/releases/tag/v0.2.3	2019-02-16 15:07:49 -08:00
Dalton Hubble	170ef74eea	Remove Nginx Ingress default backend * nginx-ingress no longer requires a configured default-backend, it will respond with its own 404 page starting in v0.21.0 * https://github.com/kubernetes/ingress-nginx/pull/3196	2019-02-16 14:18:15 -08:00
Dalton Hubble	b13a651cfe	Drop metrics that are unset, high cardinality, or extraneous * https://github.com/coreos/prometheus-operator/pull/2387 * https://github.com/coreos/prometheus-operator/pull/1959	2019-02-10 23:56:11 -08:00
Dalton Hubble	9c59f393a5	Add Kubernetes pod name to metrics discovered from service endpoints * Prometheus queries from some upstreams use joins of node-exporter and kube-state-metrics metrics by (namespace,pod). Add the Kubernetes pod name to service endpoint metrics * Rename the kubernetes_namespace field to namespace * Honor labels since kube-state-metrics already include a `pod` field that should not be overridden	2019-02-10 23:54:30 -08:00
Dalton Hubble	3e4b3bfb04	Raise nginx-ingress liveness/readiness timeout * Under heavy load, avoid timeouts causing nginx-ingress restarts https://github.com/kubernetes/ingress-nginx/pull/3737	2019-02-09 12:53:09 -08:00
Dalton Hubble	584088397c	Update etcd from v3.3.11 to v3.3.12 * https://github.com/etcd-io/etcd/releases/tag/v3.3.12	2019-02-09 11:54:54 -08:00
Dalton Hubble	0200058e0e	Update Calico from v3.5.0 to v3.5.1 * Fix in confd https://github.com/projectcalico/confd/pull/205	2019-02-09 11:49:31 -08:00
Dalton Hubble	d5537405e1	Add CHANGES note about reducing the pod eviction timeout	2019-02-02 14:54:18 -08:00
Dalton Hubble	949ce21fb2	Update Prometheus from v2.7.0 to v2.7.1 * https://github.com/prometheus/prometheus/releases/tag/v2.7.1	2019-02-02 00:13:24 -08:00
Dalton Hubble	ccd96c37da	Update Kubernetes from v1.13.2 to v1.13.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1133	2019-02-01 23:26:13 -08:00
Dalton Hubble	244a1a601a	Switch CoreDNS to use the forward plugin instead of proxy * Use the forward plugin to forward to upstream resolvers, instead of the proxy plugin. The forward plugin is reported to be a faster alternative since it can re-use open sockets * https://coredns.io/explugins/forward/ * https://coredns.io/plugins/proxy/ * https://github.com/kubernetes/kubernetes/issues/73254	2019-01-30 22:25:23 -08:00
Dalton Hubble	130daeac26	Update Prometheus from v2.6.1 to v2.7.0	2019-01-29 22:31:20 -08:00
Dalton Hubble	1ab06f69d7	Update flannel from v0.10.0 to v0.11.0 * https://github.com/coreos/flannel/releases/tag/v0.11.0	2019-01-29 21:51:25 -08:00
Dalton Hubble	eb08593eae	Fix azure provider warning, rename a public_ip field * azurerm_public_ip (used internally) added a field `allocation_method` to replace the field `public_ip_address_allocation` (deprecated) * Require terraform-provider-azurerm v1.21+ * https://github.com/terraform-providers/terraform-provider-azurerm/pull/2576	2019-01-27 17:52:35 -08:00
Dalton Hubble	e9659a8539	Update Calico from v3.4.0 to v3.5.0 * https://docs.projectcalico.org/v3.5/releases/	2019-01-27 16:34:30 -08:00
Dalton Hubble	f5ff003d0e	Update node-exporter from v0.15.2 to v0.17.0 * node-exporter renamed multiple metrics that are reflected in changes to Prometheus rules and Grafana dashboard expressions	2019-01-22 01:14:00 -08:00
Dalton Hubble	d697dd46dc	Allow kube-state-metrics PodDisruptionBudget metrics * Update kube-state-metrics ClusterRole to allow collecting poddisruptionbudget metrics (exported as kube_poddisruptionbudget_) https://github.com/kubernetes/kube-state-metrics/pull/551 * Bump addon-resizer from v1.7 to v1.8.4	2019-01-22 01:12:32 -08:00
Dalton Hubble	2f3097ebea	Update nginx-ingress from v0.21.0 to v0.22.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.22.0	2019-01-16 23:01:22 -08:00
Dalton Hubble	f4d3508578	Update CoreDNS from v1.3.0 to v1.3.1 * https://coredns.io/2019/01/13/coredns-1.3.1-release/	2019-01-15 22:50:25 -08:00
Dalton Hubble	67fb9602e7	Update Prometheus from v2.6.0 to v2.6.1 * https://github.com/prometheus/prometheus/releases/tag/v2.6.1	2019-01-15 21:13:40 -08:00
Dalton Hubble	c8a85fabe1	Update Grafana from v5.4.2 to v5.4.3 * https://github.com/grafana/grafana/releases/tag/v5.4.3	2019-01-15 21:13:16 -08:00
Dalton Hubble	7eafa59d8f	Fix instance shutdown automatic worker deletion on clouds * Fix a regression caused by lowering the Kubelet TLS client certificate to system:nodes group (#100) since dropping cluster-admin dropped the Kubelet's ability to delete nodes. * On clouds where workers can scale down (manual terraform apply, AWS spot termination, Azure low priority deletion), worker shutdown runs the delete-node.service to remove a node to prevent NotReady nodes from accumulating * Allow Kubelets to delete cluster nodes via system:nodes group. Kubelets acting with system:node and kubelet-delete ClusterRoles is still an improvement over acting as cluster-admin	2019-01-14 23:27:48 -08:00
Dalton Hubble	679079b242	Add AWS ingress_zone_id output with NLB DNS name's Route53 zone id * DNS zones served by AWS Route53 may use AWS's special alias records (other DNS providers would use a CNAME) to resolve the ingress NLB. Alias records require the NLB DNS name's DNS zone id (not the cluster `dns_zone_id`)	2019-01-13 16:45:52 -08:00
Dalton Hubble	1d27dc6528	Update kube-state-metrics exporter from v1.4.0 to v1.5.0 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.5.0	2019-01-12 14:24:57 -08:00
Dalton Hubble	b74cc8afd2	Update etcd from v3.3.10 to v3.3.11 * https://github.com/etcd-io/etcd/releases/tag/v3.3.11	2019-01-12 14:17:25 -08:00
Dalton Hubble	1d66ad33f7	Change AWS worker modules' default type from t2.small to t3.small * Worker instance types weren't updated in #365	2019-01-12 00:07:48 -08:00
Dalton Hubble	4d32b79c6f	Update Kubernetes from v1.13.1 to v1.13.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1132	2019-01-12 00:00:53 -08:00
Dalton Hubble	df4c0ba05d	Use HTTPS liveness probes for kube-scheduler and kube-controller-manager * Disable kube-scheduler and kube-controller-manager HTTP ports	2019-01-09 20:56:50 -08:00
Dalton Hubble	bfe0c74793	Enable the certificates.k8s.io API to issue cluster certificates * System components that require certificates signed by the cluster CA can submit a CSR to the apiserver, have an administrator inspect and approve it, and be issued a certificate * Configure kube-controller-manager to sign Approved CSRs using the cluster CA private key * Admins are responsible for approving or denying CSRs; otherwise, no certificate is issued. Read the Kubernetes docs carefully and verify the entity making the request and the authorization level * https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster	2019-01-06 17:33:37 -08:00
Dalton Hubble	6795a753ea	Update CoreDNS from v1.2.6 to v1.3.0 * https://coredns.io/2018/12/15/coredns-1.3.0-release/	2019-01-05 13:35:03 -08:00
Dalton Hubble	812a1adb49	Use a lower-privilege Kubelet kubeconfig in system:nodes * Kubelets can use a lower-privilege TLS client certificate with Org system:nodes and a binding to the system:node ClusterRole * Admin kubeconfigs continue to belong to Org system:masters to provide cluster-admin (available in assets/auth/kubeconfig or as a Terraform output kubeconfig-admin) * Remove bare-metal output variable kubeconfig	2019-01-05 13:08:56 -08:00
Dalton Hubble	66e1365cc4	Add ServiceAccounts for kube-apiserver and kube-scheduler * Add ServiceAccounts and ClusterRoleBindings for kube-apiserver and kube-scheduler * Remove the ClusterRoleBinding for the kube-system default ServiceAccount * Rename the CA certificate CommonName for consistency with upstream	2019-01-01 20:16:14 -08:00
Dalton Hubble	ea8b0d1c84	Update Prometheus addon from v2.5.0 to v2.6.0 * https://github.com/prometheus/prometheus/releases/tag/v2.6.0	2018-12-27 07:35:12 -08:00
Dalton Hubble	f2f4deb8bb	Change AWS default type from t2.small to t3.small * T3 is the next generation general purpose burstable instance type. Compared with t2.small, the t3.small is cheaper, has 2 vCPU (instead of 1) and provides 5 Gbps of pod-to-pod bandwidth (instead of 1 Gbps)	2018-12-18 12:38:35 -08:00
Dalton Hubble	4d2f33aee6	Update changelog for v1.13.1 release	2018-12-17 14:28:27 -08:00
Dalton Hubble	d42f47c49e	Update terraform-provider-ct plugin from v0.2.1 to v0.3.0 * Provide migration instructions for upgrading terraform-provider-ct in-place for v1.12.2+ clusters * Require switching from ~/.terraformrc to the Terraform third-party plugins directory ~/.terraform.d/plugins/ * Require Container Linux 1688.5.3 or newer	2018-12-17 14:13:50 -08:00
Dalton Hubble	479d498024	Update Calico from v3.3.2 to v3.4.0 * https://docs.projectcalico.org/v3.4/releases/	2018-12-15 18:05:16 -08:00
Dalton Hubble	e0c032be94	Increase GCP TCP proxy apiserver backend timeout to 5 minutes * On GCP, kubectl port-forward connections to pods are closed after a timeout (unlike AWS NLBs or Azure load balancers) * Increase the GCP apiserver backend service timeout from 1 minute to 5 minutes to be more similar to AWS/Azure LB behavior	2018-12-15 17:34:18 -08:00
Dalton Hubble	b74bf11772	Update Grafana from v5.4.0 to v5.4.2 * https://github.com/grafana/grafana/releases/tag/v5.4.2 * https://github.com/grafana/grafana/releases/tag/v5.4.1	2018-12-15 12:39:03 -08:00
Dalton Hubble	018c5edc25	Update Kubernetes from v1.13.0 to v1.13.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1131	2018-12-15 11:44:57 -08:00
Dalton Hubble	ff6ab571f3	Update Calico from v3.3.1 to v3.3.2 * https://docs.projectcalico.org/v3.3/releases/	2018-12-06 22:56:55 -08:00
Dalton Hubble	991fb44c37	Update Grafana from v5.3.4 to v5.4.0 * https://github.com/grafana/grafana/releases/tag/v5.4.0	2018-12-06 01:33:50 -08:00
Dalton Hubble	d31f444fcd	Update Kubernetes from v1.12.3 to v1.13.0	2018-12-03 20:44:32 -08:00
Dalton Hubble	b6016d0a26	Disable Grafana login form, admin user can't be disabled * Example manifests aim to provide a read-only dashboard visible to any users with network access (i.e. kubectl port-forward, LAN) * Problem: Grafana always has an admin user, even with the user management system disabled * Disable the login form to prevent admin login	2018-11-28 22:04:08 -08:00
Dalton Hubble	eec314b52f	Update CHANGES changelog for release	2018-11-28 09:23:13 -08:00
yokhahn	bcce02a9ce	Add Kubelet /etc/iscsi and iscsiadm mounts on bare-metal * Allow using iSCSI with Container Linux bare-metal clusters * Warning, iSCSI isn't part of Kubernetes conformance and isn't regularly evaluated	2018-11-28 00:28:46 -08:00
Dalton Hubble	42c523e6a2	Recommend switch from ~/.terraformrc to 3rd-party plugin dir * Switch tutorials from using ~/.terraformrc to using the 3rd-party plugin directory so 3rd-party plugins can be pinned * Continue to show using terraform-provider-ct v0.2.2. Updating to a newer version is only safe once all managed clusters are v1.12.2 or higher	2018-11-28 00:03:15 -08:00
Dalton Hubble	872b11b948	Update nginx-ingress from v0.20.0 to v0.21.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.21.0	2018-11-26 21:57:34 -08:00
Dalton Hubble	5b27d8d889	Update Kubernetes from v1.12.2 to v1.12.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md/#v1123	2018-11-26 21:06:09 -08:00
Dalton Hubble	840b73f9ba	Update pod-checkpointer image to query Kubelet secure API * Updates pod-checkpointer to prefer the Kubelet secure API (before falling back to the Kubelet read-only API that is disabled on Typhoon clusters since https://github.com/poseidon/typhoon/pull/324) * Previously, pod-checkpointer checkpointed an initial set of pods during bootstrapping so recovery from power cycling clusters was unaffected, but logs were noisy * https://github.com/kubernetes-incubator/bootkube/pull/1027 * https://github.com/kubernetes-incubator/bootkube/pull/1025	2018-11-26 20:24:32 -08:00
Dalton Hubble	915af3c6cc	Fix Calico Felix reporting usage data, require opt-in * Calico Felix has been reporting anonymous usage data about the version and cluster size, which violates Typhoon's privacy policy where analytics should be opt-in only * Add a variable enable_reporting (default: false) to allow opting in to reporting usage data to Calico (or future components)	2018-11-20 01:03:00 -08:00
Dalton Hubble	c6586b69fd	Use eviction policy Delete for Low priority VMSS workers * Fix issue where Azure defaults to Deallocate eviction policy, which required manually restarting deallocated workers * Require terraform-provider-azurerm v1.19+ to support setting the eviction_policy	2018-11-18 21:04:50 -08:00
Dalton Hubble	ea3fc6d2a7	Update CoreDNS from v1.2.4 to v1.2.6 * https://coredns.io/2018/11/05/coredns-1.2.6-release/	2018-11-18 16:45:53 -08:00
Dalton Hubble	c8c43f3991	Update Grafana from v5.3.2 to v5.3.4 * https://github.com/grafana/grafana/releases/tag/v5.3.3 * https://github.com/grafana/grafana/releases/tag/v5.3.4	2018-11-18 16:42:50 -08:00
Dalton Hubble	7f8e781ae4	Measure DigitalOcean network performance * Measuring pod-to-pod bandwidth in a few regions (NYC3, FRA1, SFO1) shows DigitalOcean has made some improvements	2018-11-11 21:08:10 -08:00
Dalton Hubble	56e9a82984	Add flannel resource request and mount only /run/flannel	2018-11-11 20:35:21 -08:00
Dalton Hubble	e95b856a22	Enable CoreDNS loop and loadbalance plugins * loop sends an initial query to detect infinite forwarding loops in configured upstream DNS servers and exits fast with an error (it's a fatal misconfiguration on the network that will otherwise cause resolvers to consume memory/CPU until crashing, masking the problem) * https://github.com/coredns/coredns/tree/master/plugin/loop * loadbalance randomizes the ordering of A, AAAA, and MX records in responses to provide round-robin load balancing (as usual, clients may still cache responses though) * https://github.com/coredns/coredns/tree/master/plugin/loadbalance	2018-11-10 17:36:56 -08:00
Dalton Hubble	31f48a81a8	Update docs to show flannel DaemonSet instead of kube-flannel * No functional change, the rename is just for consistency	2018-11-10 15:16:06 -08:00
Dalton Hubble	2b3f61d1bb	Update Calico from v3.3.0 to v3.3.1 * Structure Calico and flannel manifests * Rename kube-flannel mentions to just flannel	2018-11-10 13:37:12 -08:00
Dalton Hubble	8fd2978c31	Update bootkube image version from v0.13.0 to v0.14.0 * https://github.com/kubernetes-incubator/bootkube/releases/tag/v0.14.0	2018-11-06 23:35:11 -08:00
Dalton Hubble	be9f7b87d6	Update Prometheus from v2.4.3 to v2.5.0 * https://github.com/prometheus/prometheus/releases/tag/v2.5.0	2018-11-06 22:16:12 -08:00
Dalton Hubble	721c847943	Set kube-apiserver kubelet preferred address types * Prefer InternalIP and ExternalIP over the node's hostname, to match upstream behavior and kubeadm * Previously, hostname-override was used to set node names to internal IPs to work around some cloud providers not resolving hostnames for instances (e.g. DO droplets)	2018-11-03 22:31:55 -07:00
Dalton Hubble	884c8b39dc	Update Grafana from v5.3.1 to v5.3.2 * https://github.com/grafana/grafana/releases/tag/v5.3.2	2018-10-28 19:44:22 -07:00
Dalton Hubble	0e71f7e565	Ignore controller user_data changes to allow plugin updates * Updating the `terraform-provider-ct` plugin is known to produce a `user_data` diff in all pre-existing clusters. Applying the diff to a pre-existing cluster destroys controller nodes * Ignore changes to controller `user_data`. Once all managed clusters use a release containing this change, it is possible to update the `terraform-provider-ct` plugin (worker `user_data` will still be modified) * Changing the module `ref` for an existing cluster and re-applying is still NOT supported (although this PR would protect controllers from being destroyed)	2018-10-28 16:48:12 -07:00
Dalton Hubble	5be5b261e2	Add an IPv6 address and forwarding rules on Google Cloud * Allow serving IPv6 applications via Kubernetes Ingress on Typhoon Google Cloud clusters * Add `ingress_static_ipv6` output variable for use in AAAA DNS records	2018-10-28 14:30:58 -07:00
Dalton Hubble	f034ef90ae	Add DigitalOcean AAAA DNS records resolving to workers * Improve the workers "round-robin" DNS FQDN that is created with each cluster by adding AAAA records * CNAMEs resolving to the DigitalOcean `workers_dns` output can be followed to find a droplet's IPv4 or IPv6 address * The CNI portmap plugin doesn't support IPv6. Hosting IPv6 apps is possible, but requires editing the nginx-ingress addon with `hostNetwork: true`	2018-10-27 23:09:24 -07:00
Dalton Hubble	3bba1ba0dc	Use new azurerm_network_interface_backend_address_pool_association * Require terraform-provider-azurerm v1.17+ * Inline load_balancer_backend_address_pools_ids is deprecated and scheduled for removal in the v2.0 provider * https://github.com/terraform-providers/terraform-provider-azurerm/pull/2079	2018-10-27 22:55:05 -07:00
Dalton Hubble	dbe7604b67	Add primary field to ip_configuration required by Azure * Required by terraform-provider-azurerm v1.17+ * https://github.com/terraform-providers/terraform-provider-azurerm/pull/2035	2018-10-27 16:44:44 -07:00
Dalton Hubble	f1da0731d8	Update Kubernetes from v1.12.1 to v1.12.2 * Update CoreDNS from v1.2.2 to v1.2.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#v1122 * https://coredns.io/2018/10/17/coredns-1.2.4-release/ * https://coredns.io/2018/10/16/coredns-1.2.3-release/	2018-10-27 15:47:57 -07:00
Dalton Hubble	d641a058fe	Update Calico from v3.2.3 to v3.3.0 * https://docs.projectcalico.org/v3.3/releases/	2018-10-23 20:30:30 -07:00
Dalton Hubble	99a6d5478b	Disable Kubelet read-only port 10255 * We can finally disable the Kubelet read-only port 10255! * Journey: https://github.com/poseidon/typhoon/issues/322#issuecomment-431073073	2018-10-18 21:14:14 -07:00
Dalton Hubble	bc750aec33	Configure Heapster to source metrics from Kubelet authenticated API * Heapster can now get nodes (i.e. kubelets) from the apiserver and source metrics from the Kubelet authenticated API (10250) instead of the Kubelet HTTP read-only API (10255) * https://github.com/kubernetes/heapster/blob/master/docs/source-configuration.md * Use the heapster service account token via Kubelet bearer token authn/authz. * Permit Heapster to skip CA verification. The CA cert does not contain IP SANs and cannot since nodes get random IPs that aren't known upfront. Heapster obtains the node list from the apiserver, so the risk of spoofing a node is limited. For the same reason, Prometheus scrapes must skip CA verification for scraping Kubelets provided by the apiserver. * https://github.com/poseidon/typhoon/blob/v1.12.1/addons/prometheus/config.yaml#L68 * Create a heapster ClusterRole to work around the default Kubernetes `system:heapster` ClusterRole lacking the proper GET `nodes/stats` access. See https://github.com/kubernetes/heapster/issues/1936	2018-10-18 21:03:01 -07:00
Dalton Hubble	d55bfd5589	Fix CoreDNS AntiAffinity spec to prefer spreading replicas * Pods were still being scheduled at random due to a typo	2018-10-17 22:19:57 -07:00
Robert Fairburn	0be4673e44	Add disk_iops variable for AWS * Setting disk_iops is required for disk_type io1 * https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html#EBSVolumeTypes	2018-10-17 22:18:54 -07:00
Dalton Hubble	3b44972d78	Add links to header to CHANGES	2018-10-17 09:08:58 -07:00
Dalton Hubble	0127ee82c1	Update nginx-ingress from v0.19.0 to v0.20.0	2018-10-16 21:35:29 -07:00
Dalton Hubble	a10d6977b8	Update Prometheus from v2.4.2 to v2.4.3 * https://github.com/prometheus/prometheus/releases/tag/v2.4.3	2018-10-16 21:29:41 -07:00
Dalton Hubble	05fe923c14	Update Grafana from v5.3.0 to v5.3.1 * https://github.com/grafana/grafana/releases/tag/v5.3.1	2018-10-16 21:23:44 -07:00
Michael Schubert	d10620fb58	Add support for Flatcar Linux bare-metal cached_install * Support bare-metal cached_install=true mode with Flatcar Linux where assets are fetched from the Matchbox assets cache instead of from the upstream Flatcar download server * Skipped in original Flatcar support to keep it simple https://github.com/poseidon/typhoon/pull/209	2018-10-16 21:15:24 -07:00
Dalton Hubble	9b6113a058	Update Kubernetes from v1.11.3 to v1.12.1 * Mount an empty dir for the controller-manager to work around https://github.com/kubernetes/kubernetes/issues/68973 * Update coreos/pod-checkpointer to strip affinity from checkpointed pod manifests. Kubernetes v1.12.0-rc.1 introduced a default affinity that appears on checkpointed manifests; but it prevented scheduling and checkpointed pods should not have an affinity, they're run directly by the Kubelet on the local node * https://github.com/kubernetes-incubator/bootkube/issues/1001 * https://github.com/kubernetes/kubernetes/pull/68173	2018-10-16 20:28:13 -07:00
Dalton Hubble	5eb4078d68	Add docker/default seccomp to control plane and addons * Annotate pods, deployments, and daemonsets to start containers with the Docker runtime's default seccomp profile * Overrides Kubernetes default behavior which started containers with seccomp=unconfined * https://docs.docker.com/engine/security/seccomp/#pass-a-profile-for-a-container	2018-10-16 20:07:29 -07:00
Dalton Hubble	8f0d2b5db4	Update Grafana from v5.2.4 to v5.3.0	2018-10-13 23:03:31 -07:00
Dalton Hubble	2e89e161e9	Remove Azure admin_password (disabled) now that it's optional * Requires terraform-provider-azurerm v1.16.0 or higher https://github.com/terraform-providers/terraform-provider-azurerm/pull/1958	2018-10-13 22:40:58 -07:00
Dalton Hubble	55bb4dfba6	Raise CoreDNS replica count to 2 or more * Run at least two replicas of CoreDNS to better support rolling updates (previously, kube-dns had a pod nanny) * On multi-master clusters, set the CoreDNS replica count to match the number of masters (e.g. a 3-master cluster previously used replicas:1, now replicas:3) * Add AntiAffinity preferred rule to favor distributing CoreDNS pods across controller nodes	2018-10-13 20:31:29 -07:00
Dalton Hubble	43fe78a2cc	Raise scheduler/controller-manager replicas in multi-master * Continue to ensure scheduler and controller-manager run at least two replicas to support performing kubectl edits on single-master clusters (no change) * For multi-master clusters, set scheduler / controller-manager replica count to the number of masters (e.g. a 3-master cluster previously used replicas:2, now replicas:3)	2018-10-13 16:16:29 -07:00
Dalton Hubble	5a283b6443	Update etcd from v3.3.9 to v3.3.10 * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3310-2018-10-10	2018-10-13 13:14:37 -07:00
Dalton Hubble	db36036c81	Require terraform-provider-digitalocean plugin ~> 1.0 * Require a terraform-provider-digitalocean plugin version of 1.0 or higher within the same major version (e.g. allow 1.1 but not 2.0) * Change requirement from ~> 0.1.2 (which allowed up to but not including 1.0 release)	2018-10-02 17:09:19 +02:00
Dalton Hubble	7653e511be	Update CoreDNS and Calico versions * Update CoreDNS from 1.1.3 to 1.2.2 * Update Calico from v3.2.1 to v3.2.3	2018-10-02 16:07:48 +02:00
Dalton Hubble	032a24133b	Update Prometheus from v2.3.2 to v2.4.2 * https://github.com/prometheus/prometheus/releases/tag/v2.4.0 * https://github.com/prometheus/prometheus/releases/tag/v2.4.1 * https://github.com/prometheus/prometheus/releases/tag/v2.4.2	2018-09-21 22:27:11 -07:00
Dalton Hubble	ad871dbfa9	Update Kubernetes from v1.11.2 to v1.11.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1113	2018-09-13 18:50:41 -07:00
Dalton Hubble	dc03f7a4a9	Update nginx-ingress from 0.17.1 to 0.19.0 * If using --enable-ssl-passthrough or exposing TCP/UDP services, be aware of https://github.com/kubernetes/ingress-nginx/pull/3038 * Workarounds until the fix merges are to stay on 0.17.1, use the suggested development image, or revert to securityContext `runAsNonRoot: false` for a while (less secure)	2018-09-08 17:57:01 -07:00
Dalton Hubble	1b8234eb91	Update Grafana from v5.2.2 to v5.2.4 * https://github.com/grafana/grafana/releases/tag/v5.2.3 * https://github.com/grafana/grafana/releases/tag/v5.2.4	2018-09-08 15:41:20 -07:00
Dalton Hubble	4ba090feb0	Update kube-state-metrics from v1.3.1 to v1.4.0	2018-08-29 09:37:50 -07:00
Dalton Hubble	4882fe1053	Add docs for Azure Ingress and worker pools * Azure worker pools must be in the same region as the cluster itself unfortunately	2018-08-27 23:30:56 -07:00
Dalton Hubble	7eb09237f4	Update Calico from v3.1.3 to v3.2.1 * Add new bird and felix readiness checks * Read MTU from ConfigMap veth_mtu * Add RBAC read for serviceaccounts * Remove invalid description from CRDs	2018-08-25 17:53:11 -07:00
Dalton Hubble	e58b424882	Fix firewall to allow etcd client traffic between controllers * Broaden internal-etcd firewall rule to allow etcd client traffic (2379) from other controller nodes * Previously, kube-apiservers were only able to connect to their node's local etcd peer. While master node outages were tolerated, reaching a healthy peer took longer than necessary in some cases * Reduce time needed to bootstrap a cluster	2018-08-21 23:51:40 -07:00
Dalton Hubble	ea365b551a	Fix docs mentions of ELBs to NLBs * Typhoon AWS clusters use an NLB rather than an ELB, since v1.10.5 * Add a few missing links in CHANGES	2018-08-21 21:40:06 -07:00
Dalton Hubble	bbf2c13eef	Remove AWS security rule allowing ICMP packets to nodes * Deny ICMP packets for consistency across Typhoon clusters on various clouds and because there isn't much need to allow them	2018-08-21 21:16:16 -07:00
Dalton Hubble	da5d2c5321	Remove GCP firewall rule allowing Nginx Ingress health * Nginx Ingress addon no longer uses hostNetwork so Prometheus may scrape port 10254 via the CNI network, rather than via the host address	2018-08-21 21:06:03 -07:00
Dalton Hubble	bec5250e73	Remove unofficial bare-metal _networkds variables * Remove controller_networkds and worker_networkds variables. These variables were always listed as experimental, unsupported, and excluded from documentation in anticipation of Container Linux Config snippets * Use Container Linux Config snippets on bare-metal instead. They provide safer, more powerful, and more elegant host customization	2018-08-13 23:33:29 -07:00

516 Commits