Commit Graph

652 Commits

Author SHA1 Message Date
Dalton Hubble
96afa6a531 Update Calico from v3.8.2 to v3.9.1
* https://docs.projectcalico.org/v3.9/release-notes/
2019-09-29 11:22:53 -07:00
Dalton Hubble
a407ff72df Add stricter types for AWS modules and update docs
* Review variables available in AWS kubernetes and workers
modules and documentation
* Switching between spot and on-demand has worked since
Terraform v0.12
* Generally, there are too many knobs. Less useful ones
should be de-emphasized or removed
* Remove `cluster_domain_suffix` documentation
2019-09-29 11:19:38 -07:00
Dalton Hubble
f453c54956 Update Grafana from v6.3.5 to v6.3.6
* https://github.com/grafana/grafana/releases/tag/v6.3.6
2019-09-28 15:13:46 -07:00
Dalton Hubble
3e34fb075b Update etcd from v3.4.0 to v3.4.1
* https://github.com/etcd-io/etcd/releases/tag/v3.4.1
2019-09-28 15:09:57 -07:00
Dalton Hubble
9bfb1c5faf Update docs and variable types for worker node_labels
* Document worker pools `node_labels` variable to set the
initial node labels for a homogeneous set of workers
* Document `worker_node_labels` convenience variable to
set the initial node labels for default worker nodes
2019-09-28 15:05:12 -07:00
Dalton Hubble
8703f2c3c5 Fix missing comma separator on bare-metal and DO
* Introduced in bare-metal and DigitalOcean in #544
while addressing possible ordering race, but after
the v1.16 upgrade validation
2019-09-23 11:05:26 -07:00
Dalton Hubble
078f084220 Update CHANGES and docs for v1.16.0 release 2019-09-22 17:37:23 -07:00
Dalton Hubble
9da3725738 Update Kubernetes from v1.15.3 to v1.16.0
* Drop `node-role.kubernetes.io/master` and
`node-role.kubernetes.io/node` node labels
* Kubelet (v1.16) now rejects the node labels used
in the kubectl get nodes ROLES output
* https://github.com/kubernetes/kubernetes/issues/75457
2019-09-18 22:53:06 -07:00
Dalton Hubble
b15c60fa2f Update CHANGES for control plane static pod switch
* Remove old references to bootkube / self-hosted
2019-09-09 22:48:48 -07:00
Dalton Hubble
4a7083d94a Change Azure default controller_type and worker_type
* Change default controller_type to Standard_B2s. A B2s is cheaper
by $17/month and provides 2 vCPU, 4GB RAM (vs 1 vCPU, 3.5GB RAM)
* Change default worker_type to Standard_DS1_v2. F1 was the previous
generation. The DS1_v2 is newer, similar cost, more memory, and still
supports Low Priority mode, if desired
2019-09-09 22:34:28 -07:00
Dalton Hubble
c20683067d Update etcd from v3.3.15 to v3.4.0
* https://github.com/etcd-io/etcd/releases/tag/v3.4.0
2019-09-08 15:32:49 -07:00
Dalton Hubble
dc436b8fe9 Update Grafana from v6.3.4 to v6.3.5
* https://github.com/grafana/grafana/releases/tag/v6.3.5
2019-09-07 14:21:59 -07:00
Dalton Hubble
b74f470701 Recommend updating terraform-provider-ct from v0.3.2 to v0.4.0
* v0.4.0 adds a "strict" mode we'll start using in future and
also adds support for Fedora CoreOS
* https://github.com/poseidon/terraform-provider-ct/releases/tag/v0.4.0
2019-08-31 16:07:22 -07:00
Dalton Hubble
45bc52d156 Update Grafana from v6.3.3 to v6.3.4
* https://github.com/grafana/grafana/releases/tag/v6.3.4
2019-08-31 15:59:13 -07:00
Dalton Hubble
4d5f962d76 Update CoreDNS from v1.5.0 to v1.6.2
* https://coredns.io/2019/06/26/coredns-1.5.1-release/
* https://coredns.io/2019/07/03/coredns-1.5.2-release/
* https://coredns.io/2019/07/28/coredns-1.6.0-release/
* https://coredns.io/2019/08/02/coredns-1.6.1-release/
* https://coredns.io/2019/08/13/coredns-1.6.2-release/
2019-08-31 15:57:42 -07:00
Dalton Hubble
c42139beaa Update etcd from v3.3.14 to v3.3.15
* No functional changes, just changes to vendoring tools
(go modules -> glide). Still, update to v3.3.15 anyway
* https://github.com/etcd-io/etcd/compare/v3.3.14...v3.3.15
2019-08-19 15:05:21 -07:00
Dalton Hubble
35c2763ab0 Update Kubernetes from v1.15.2 to v1.15.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md/#v1153
2019-08-19 14:49:24 -07:00
Dalton Hubble
8f412e2f09 Update etcd from v3.3.13 to v3.3.14
* https://github.com/etcd-io/etcd/releases/tag/v3.3.14
2019-08-18 21:05:06 -07:00
Dalton Hubble
4ef2eb7e6b Update Prometheus from v2.11.2 to v2.12.0
* https://github.com/prometheus/prometheus/releases/tag/v2.12.0
2019-08-18 20:59:44 -07:00
Dalton Hubble
99990e3cbb Use stable IDs for etcd, CoreDNS, and Ngnix dashboards
* Use unique dashboard ID so that multiple replicas of Grafana
serve dashboards with uniform paths
* Fix issue where refreshing a dashboard served by one replica
could show a 404 unless the request went to the same replica
2019-08-18 12:45:49 -07:00
Dalton Hubble
3c3708d58e Update Calico from v3.8.1 to v3.8.2
* https://docs.projectcalico.org/v3.8/release-notes/
2019-08-16 15:38:23 -07:00
Dalton Hubble
0c45cd0f06 Update Grafana from v6.3.2 to v6.3.3
* https://github.com/grafana/grafana/releases/tag/v6.3.3
2019-08-16 14:40:47 -07:00
Dalton Hubble
976452825e Update Prometheus from v2.11.0 to v2.11.2
* https://github.com/prometheus/prometheus/releases/tag/v2.11.2
2019-08-14 21:26:46 -07:00
Dalton Hubble
7bc5633c38 Update nginx-ingress from v0.25.0 to v0.25.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.25.1
2019-08-14 21:26:46 -07:00
Dalton Hubble
6db11d5908 Enable AWS root block device encryption by default
* terraform-provider-aws v2.23.0 allows AWS root block devices
to enable encryption by default.
* Require updating terraform-provider-aws to v2.23.0 or higher
* Enable root EBS device encryption by default for controller
instances and worker instances in auto-scaling groups

For comparison:

* Google Cloud persistent disks have been encrypted by
default for years
* Azure managed disk encryption is not ready yet (#486)
2019-08-07 21:13:44 -07:00
Dalton Hubble
eaea4d37a2 Update Grafana from v6.2.5 to v6.3.2
* https://github.com/grafana/grafana/releases/tag/v6.3.2
* https://github.com/grafana/grafana/releases/tag/v6.3.1
* https://github.com/grafana/grafana/releases/tag/v6.3.0
2019-08-07 20:01:18 -07:00
Dalton Hubble
457ad18daa Update kube-state-metrics from v1.7.1 to v1.7.2
* Add a separate liveness and readiness probe
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.7.2
2019-08-07 20:00:24 -07:00
Dalton Hubble
f79568c02a Add CHANGES section for v1.15.2 release 2019-08-06 09:01:22 -07:00
Dalton Hubble
10d4d9e565 Add Grafana dashboards for CoreDNS and Nginx Ingress Controller
* Add a CoreDNS dashboard originally based on an upstream dashboard,
but now customized according to preferences
* Add an Nginx Ingress Controller based on an upstream dashboard,
but customized according to preferences
2019-08-05 22:49:19 -07:00
Dalton Hubble
2227f2cc62 Update Kubernetes from v1.15.1 to v1.15.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#v1152
2019-08-05 08:48:57 -07:00
Dalton Hubble
dcd6733649 Update Calico from v3.8.0 to v3.8.1
* https://docs.projectcalico.org/v3.8/release-notes/
2019-07-27 15:31:13 -07:00
Dalton Hubble
b9ccfedfe5 Update CHANGES for v1.15.1 release 2019-07-21 11:58:56 -07:00
Dalton Hubble
68d8717924 Refresh Prometheus rules/alerts and Grafana dashboards
* Refresh rules, alerts, and dashboards from upstreams
2019-07-21 11:29:34 -07:00
Dalton Hubble
c8df349e55 Fix to add all Azure controller nodes to address pool
* Add all Azure controllers to the apiserver load balancer
backend address pool
* Previously, kube-apiserver availability relied on the 0th
controller being up. Multi-controller was just providing etcd
data redundancy
2019-07-21 10:38:17 -07:00
Dalton Hubble
e0be091acc Update kube-state-metrics from v1.7.0 to v1.7.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.7.1
2019-07-20 20:17:08 -07:00
Dalton Hubble
e0c7676a15 Update Kubernetes from v1.15.0 to v1.15.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#downloads-for-v1151
2019-07-19 01:21:08 -07:00
Dalton Hubble
6cd3e65267 Update kube-state-metrics from v1.7.0-rc.1 to v1.7.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.7.0
* Add storageclasses and verticalpodautoscalers to ClusterRole
2019-07-19 00:14:47 -07:00
Dalton Hubble
dfa6bcfecf Relax terraform-provider-ct version constraint
* Allow updating terraform-provider-ct to any release
beyond v0.3.2, but below v1.0. This relaxes the prior
constraint that allowed only v0.3.y provider versions
2019-07-16 22:07:37 -07:00
Dalton Hubble
70f5cfd33e Update kube-state-metrics from v1.6.0 to v1.7.0-rc.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.7.0-rc.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.7.0-rc.0
2019-07-13 13:13:57 -07:00
Dalton Hubble
9e91d7f011 Upgrade Calico from v3.7.4 to v3.8.0
* Enable CNI bandwidth plugin for traffic shaping
* https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#support-traffic-shaping
2019-07-11 21:01:41 -07:00
Dalton Hubble
eaf59bd33f Update Prometheus from v2.11.0-rc.0 to v2.11.0
* https://github.com/prometheus/prometheus/releases/tag/v2.11.0
2019-07-09 21:33:24 -07:00
Dalton Hubble
40640f3697 Upgrade nginx-ingress from v0.24.1 to v0.25.0
* Support networking.k8s.io/v1beta1 apiVersion
* Update RBAC cluster-role for networking.k8s.io/v1beta1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.25.0
2019-07-08 22:04:50 -07:00
Dalton Hubble
28ab746068 Update Prometheus from v2.10.0 to v2.11.0-rc.0
* https://github.com/prometheus/prometheus/releases/tag/v2.11.0-rc.0
2019-07-08 21:32:50 -07:00
Dalton Hubble
69d064bfdf Run kube-apiserver with lower privilege user (nobody)
* Run kube-apiserver as a non-root user (nobody). User
no longer needs to bind low number ports.
* On most platforms, the kube-apiserver load balancer listens
on 6443 and fronts controllers with kube-apiserver pods using
port 6443. Google Cloud TCP proxy load balancers cannot listen
on 6443. However, GCP's load balancer can be made to listen on
443, while kube-apiserver uses 6443 across all platforms.
2019-07-08 20:52:00 -07:00
Dalton Hubble
7a69bae75e Raise GCP network deletion timeout from 4m to 6m
* Fix a GCP errata item https://github.com/poseidon/typhoon/wiki/Errata
* Removal of a Google Cloud cluster often required 2 runs of
`terraform apply` because network resource deletes timeout
after 4m. Raise the network deletion timeout to 6m to
ensure apply only needs to be run once to remove a cluster
2019-07-06 13:15:33 -07:00
Dalton Hubble
3fcb04f68c Improve apiserver backend service zone spanning
* google_compute_backend_services use nested blocks to define
backends (instance groups heterogeneous controllers)
* Use Terraform v0.12.x dynamic blocks so the apiserver backend
service can refer to (up to zone-many) controller instance groups
* Previously, with Terraform v0.11.x, the apiserver backend service
had to list a fixed set of backends to span controller nodes across
zones in multi-controller setups. 3 backends were used because each
GCP region offered at least 3 zones. Single-controller clusters had
the cosmetic ugliness of unused instance groups
* Allow controllers to span more than 3 zones if avilable in a
region (e.g. currently only us-central1, with 4 zones)

Related:

* https://www.terraform.io/docs/providers/google/r/compute_backend_service.html
* https://www.terraform.io/docs/configuration/expressions.html#dynamic-blocks
2019-07-05 19:46:26 -07:00
Dalton Hubble
8d373b5850 Update Calico from v3.7.3 to v3.7.4
* https://docs.projectcalico.org/v3.7/release-notes/
2019-07-02 20:18:02 -07:00
Dalton Hubble
9a395dbf88 Update Grafana from v6.2.4 to v6.2.5
* https://github.com/grafana/grafana/releases/tag/v6.2.5
2019-06-29 13:21:42 -07:00
Dalton Hubble
fff7cc035d Remove Fedora Atomic modules
* Typhoon for Fedora Atomic was deprecated in March 2019
* https://typhoon.psdn.io/announce/#march-27-2019
2019-06-23 13:40:51 -07:00
Dalton Hubble
408e60075a Update Kubernetes from v1.14.3 to v1.15.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#v1150
* Remove docs referring to possible v1.14.4 release
2019-06-23 13:12:18 -07:00
Dalton Hubble
5c4486f57b Allow using Flatcar Linux Edge on bare-metal and AWS
* On AWS, use Flatcar Linux Edge by setting `os_image` to
"flatcar-edge"
* On bare-metal, Flatcar Linux Edge by setting `os_channel` to
"flatcar-edge"
2019-06-22 23:38:42 -07:00
Dalton Hubble
4ad69efc43 Update Grafana from v6.2.2 to v6.2.4
* https://github.com/grafana/grafana/releases/tag/v6.2.4
2019-06-19 21:51:54 -07:00
Dalton Hubble
21fb632e90 Update Calico from v3.7.2 to v3.7.3
* https://docs.projectcalico.org/v3.7/release-notes/
2019-06-13 23:54:20 -07:00
Dalton Hubble
cc4f7e09ab Update node-exporter from v0.18.0 to v0.18.1
* https://github.com/prometheus/node_exporter/releases/tag/v0.18.1
2019-06-07 02:09:44 -07:00
Dalton Hubble
d449477272 Update Grafana from v6.2.1 to v6.2.2
* https://github.com/grafana/grafana/releases/tag/v6.2.2
2019-06-07 00:07:54 -07:00
Dalton Hubble
5303e32e38 Change DO worker_type default from s-1vcpu-1gb to s-1vcpu-2gb
* On DigitalOcean, `s-1vcpu-1gb` worker nodes have 1GB of RAM, which
is too small as a default, even for most cost constrained developers
2019-06-06 23:50:19 -07:00
Dalton Hubble
da3f2b5d95 Adjust README example and Terraform version in docs
* Delay changing README example. Its prominent display
on github.com may lead to new users copying it, even
though it corresponds to an "in between releases" state
and v1.14.4 doesn't exist yet
* Leave docs tutorials the same, they can reflect master
2019-06-06 23:36:36 -07:00
Dalton Hubble
3276bf5878 Add migration instructions from Terraform v0.11 to v0.12
* Provide Terraform v0.11 to v0.12 migration guide. Show an
in-place strategy and a move resources strategy
* Describe in-place modifying an existing cluster and providers,
using the Terraform helper to edit syntax, and checking the
plan produces a zero diff
* Describe replacing existing clusters by creating a new config
directory for use with Terraform v0.12 only and moving resources
one by one
* Provide some limited advise on migrating non-Typhoon resources
2019-06-06 09:51:22 -07:00
Dalton Hubble
db36959178 Migrate bare-metal module Terraform v0.11 to v0.12
* Replace v0.11 bracket type hints with Terraform v0.12 list expressions
* Use expression syntax instead of interpolated strings, where suggested
* Update bare-metal tutorial
* Define `clc_snippets` type constraint map(list(string))
* Define Terraform and plugin version requirements in versions.tf
  * Require matchbox ~> 0.3.0 to support Terraform v0.12
  * Require ct ~> 0.3.2 to support Terraform v0.12
2019-06-06 09:51:21 -07:00
Dalton Hubble
28506df9c7 Avoid unneeded rotations of Regular priority virtual machine scale sets
* Azure only allows `eviction_policy` to be set for Low priority VMs.
Supporting Low priority VMs meant when Regular VMs were used, each
`terraform apply` rolled workers, to set eviction_policy to null.
* Terraform v0.12 nullable variables fix the issue and plan does not
produce a diff
2019-06-06 09:50:37 -07:00
Dalton Hubble
189487ecaa Migrate Azure module Terraform v0.11 to v0.12
* Replace v0.11 bracket type hints with Terraform v0.12 list expressions
* Use expression syntax instead of interpolated strings, where suggested
* Update Azure tutorial and worker pools documentation
* Define Terraform and plugin version requirements in versions.tf
  * Require azurerm ~> 1.27 to support Terraform v0.12
  * Require ct ~> 0.3.2 to support Terraform v0.12
2019-06-06 09:50:35 -07:00
Dalton Hubble
d6d9e6c4b9 Migrate Google Cloud module Terraform v0.11 to v0.12
* Replace v0.11 bracket type hints with Terraform v0.12 list expressions
* Use expression syntax instead of interpolated strings, where suggested
* Update Google Cloud tutorial and worker pools documentation
* Define Terraform and plugin version requirements in versions.tf
  * Require google ~> 2.5 to support Terraform v0.12
  * Require ct ~> 0.3.2 to support Terraform v0.12
2019-06-06 09:48:56 -07:00
Dalton Hubble
2ba0181dbe Migrate AWS module Terraform v0.11 to v0.12
* Replace v0.11 bracket type hints with Terraform v0.12 list expressions
* Use expression syntax instead of interpolated strings, where suggested
* Update AWS tutorial and worker pools documentation
* Define Terraform and plugin version requirements in versions.tf
  * Require aws ~> 2.7 to support Terraform v0.12
  * Require ct ~> 0.3.2 to support Terraform v0.12
2019-06-06 09:45:59 -07:00
Dalton Hubble
1366ae404b Migrate DigitalOcean module from Terraform v0.11 to v0.12
* Replace v0.11 bracket type hints with Terraform v0.12 list expressions
* Use expression syntax instead of interpolated strings, where suggested
* Update DigitalOcean tutorial documentation
* Define Terraform and plugin version requirements in versions.tf
  * Require digitalocean ~> v1.3 to support Terraform v0.12
  * Require ct ~> v0.3.2 to support Terraform v0.12
2019-06-06 09:44:58 -07:00
Dalton Hubble
0ccb2217b5 Update Kubernetes from v1.14.2 to v1.14.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1143
2019-05-31 01:08:32 -07:00
Dalton Hubble
c6faa6b5b8 Recommend updating Terraform providers ct and matchbox
* Recomment updating Terraform provider plugins `terraform-provider-ct`
and `terraform-provider-matchbox` to prepare for the upcoming Terraform
v0.12 migration
* https://github.com/poseidon/terraform-provider-ct/releases/tag/v0.3.2
* https://github.com/poseidon/terraform-provider-matchbox/releases/tag/v0.3.0
2019-05-31 00:48:37 -07:00
Dalton Hubble
c565f9fd47 Rename worker pool modules' count variable to worker_count
* This change affects users who use worker pools on AWS, GCP, or
Azure with a Container Linux derivative
* Rename worker pool modules' `count` variable to `worker_count`,
because `count` will be a reserved variable name in Terraform v0.12
2019-05-27 16:40:00 -07:00
Dalton Hubble
d9e7195477 Update Grafana from v2.6.0 to v2.6.1 2019-05-27 12:25:00 -07:00
Dalton Hubble
2a71cba0e3 Update CoreDNS from v1.3.1 to v1.5.0
* Add `ready` plugin to improve readinessProbe
* https://coredns.io/2019/04/06/coredns-1.5.0-release/
2019-05-27 00:11:52 -07:00
Dalton Hubble
0a835ee403 Replace deprecated azurerm_autoscale_setting
* Fix Terraform provider azure warning about `azurerm_autoscale_setting`
* Require terraform-provider-azure v1.22+ version that introduces
the new `azurerm_monitor_autoscale_setting` resource
* https://github.com/terraform-providers/terraform-provider-azurerm/blob/master/CHANGELOG.md#1220-february-11-2019
2019-05-26 23:32:42 -07:00
Dalton Hubble
5d2684a04d Update Grafana from v6.1.6 to v6.2.0
* https://github.com/grafana/grafana/releases/tag/v6.2.0
2019-05-26 22:00:47 -07:00
Dalton Hubble
221889cc9b Update Prometheus from v2.9.2 to v2.10.0
* https://github.com/prometheus/prometheus/releases/tag/v2.10.0
2019-05-26 21:58:28 -07:00
Dalton Hubble
6e4cf65c4c Fix terraform-render-bootkube to remove trailing slash
* Fix to remove a trailing slash that was erroneously introduced
in the scripting that updated from v1.14.1 to v1.14.2
* Workaround before this fix was to re-run `terraform init`
2019-05-22 18:29:11 +02:00
Dalton Hubble
bef9b991b7 Bump Terraform provider versions in docs
* Bump Terraform provider versions to reflect the versions
used by the maintainer
2019-05-20 18:29:56 +02:00
Dalton Hubble
147c21a4bd Allow Calico networking on Azure and DigitalOcean
* Introduce "calico" as a `networking` option on Azure and DigitalOcean
using Calico's new VXLAN support (similar to flannel). Flannel remains
the default on these platforms for now.
* Historically, DigitalOcean and Azure only allowed Flannel as the
CNI provider, since those platforms don't support IPIP traffic that
was previously required for Calico.
* Looking forward, its desireable for Calico to become the default
across Typhoon clusters, since it provides NetworkPolicy and a
consistent experience
* No changes to AWS, GCP, or bare-metal where Calico remains the
default CNI provider. On these platforms, IPIP mode will always
be used, since its available and more performant than vxlan
2019-05-20 17:17:20 +02:00
Dalton Hubble
222a94247c Update node_exporter from v0.17.0 to v0.18.0
* https://github.com/prometheus/node_exporter/releases/tag/v0.18.0
2019-05-17 20:01:30 +02:00
Dalton Hubble
da97bd4f12 Update Kubernetes from v1.14.1 to v1.14.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1142
2019-05-17 13:09:15 +02:00
Dalton Hubble
37ce722f9c Fix race condition in DigitalOcean cluster create
* DigitalOcean clusters must secure copy a kubeconfig to
worker nodes, but Terraform could decide to try copying
before firewall rules have been added to allow SSH access.
* Add an explicit dependency on adding firewall rules first
2019-05-17 13:05:08 +02:00
Dalton Hubble
f62286b677 Update Calico from v3.7.0 to v3.7.2
* https://docs.projectcalico.org/v3.7/release-notes/
2019-05-17 12:29:46 +02:00
Dalton Hubble
af18296bc5 Change flannel port from 8472 to 4789
* Change flannel port from the kernel default 8472 to the
IANA assigned VXLAN port 4789
* Update firewall rules or security groups for VXLAN
* Why now? Calico now offers its own VXLAN backend so
standardizing on the IANA port will simplify config
* https://github.com/coreos/flannel/blob/master/Documentation/backends.md#vxlan
2019-05-06 21:58:10 -07:00
Dalton Hubble
2d19ab8457 Update kube-state-metrics from v1.6.0-rc.2 to v1.6.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.6.0
2019-05-06 21:30:49 -07:00
Dalton Hubble
09e0230111 Upgrade Calico from v3.6.1 to v3.7.0
* https://docs.projectcalico.org/v3.7/release-notes/
* https://github.com/poseidon/terraform-render-bootkube/pull/131
2019-05-06 00:44:15 -07:00
Dalton Hubble
feb6192aac Update etcd from v3.3.12 to v3.3.13 on Container Linux
* Skip updating etcd for Fedora Atomic clusters, now that
Fedora Atomic has been deprecated
2019-05-04 12:55:42 -07:00
Dalton Hubble
6e9b2450fe Update Grafana from v6.1.4 to v6.1.6
* https://github.com/grafana/grafana/releases/tag/v6.1.6
2019-05-04 11:14:37 -07:00
Dalton Hubble
253831aac3 Update links to Matchbox, terraform-provider-ct, etc.
* Matchbox, terraform-provider-matchbox, and terraform-provider-ct
have moved to the poseidon Github organization
2019-05-04 10:50:53 -07:00
Dalton Hubble
0e94708fd8 Update kube-state-metrics from v1.5.0 to v1.6.0-rc.2
* Collect metrics Ingress resources
* Collects metrics about certificates.k8s.io certificatesigningrequests
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.6.0-rc.2
2019-04-27 20:54:40 -07:00
Dalton Hubble
2c11bad439 Update Prometheus from v2.9.1 to v2.9.2
* https://github.com/prometheus/prometheus/releases/tag/v2.9.2
2019-04-27 20:39:55 -07:00
Dalton Hubble
418597aa59 Update Grafana from v6.1.3 to v6.1.4
* https://github.com/grafana/grafana/releases/tag/v6.1.4
2019-04-18 23:30:43 -07:00
Dalton Hubble
f3174c2b7a Update Prometheus from v2.8.1 to v2.9.1
* https://github.com/prometheus/prometheus/releases/tag/v2.9.1
* https://github.com/prometheus/prometheus/releases/tag/v2.9.0
2019-04-18 23:26:32 -07:00
Dalton Hubble
e73cccd7eb Update provider versions in tutorial docs
* Update terraform provider plugin version in docs to reflect
the recommended current versions that are currently used
2019-04-16 00:05:13 -07:00
Dalton Hubble
a141c5fe9e Update nginx-ingress from v0.23.0 to v0.24.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.24.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.24.0
2019-04-15 21:08:22 -07:00
Dalton Hubble
1b157a2fa4 Revert "Update kube-state-metrics from v1.5.0 to v1.6.0-rc.0"
* This reverts commit 6e5d66cf66
* kube-state-metrics v1.6.0-rc.0 fires KubeDeploymentReplicasMismatch
alerts where its own Deployment doesn't have replicas available,
(kube_deployment_status_replicas_available) even though all replicas
are available according to kubectl inspection
* This problem was present even with the CSR ClusterRole fix
(https://github.com/kubernetes/kube-state-metrics/pull/717)
2019-04-13 12:37:53 -07:00
Dalton Hubble
6e5d66cf66 Update kube-state-metrics from v1.5.0 to v1.6.0-rc.0
* Adds a metrics collector for Ingress resources and other
improvements
* https://github.com/kubernetes/kube-state-metrics/pull/640
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.6.0-rc.0
2019-04-09 22:16:36 -07:00
Dalton Hubble
44c293888b Update Grafana from v6.1.1 to v6.1.3
* https://github.com/grafana/grafana/releases/tag/v6.1.3
2019-04-09 22:06:27 -07:00
Dalton Hubble
452253081b Update Kubernetes from v1.14.0 to v1.14.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#changelog-since-v1140
2019-04-09 21:47:23 -07:00
Dalton Hubble
1aa4d2cdc1 Update CHANGES for v1.14.0 release 2019-04-08 18:49:52 -07:00
Dalton Hubble
c1fe41d34a Add ability to load balance TCP/UDP applications on Azure
* Add ability to load balance TCP/UDP applications (e.g. NodePort)
* Output the load balancer ID as `loadbalancer_id`
* Output `worker_security_group_name` and `worker_address_prefix`
for extending firewall rules
2019-04-07 22:59:46 -07:00
Dalton Hubble
be29f52039 Add enable_aggregation option (defaults to false)
* Add an `enable_aggregation` variable to enable the kube-apiserver
aggregation layer for adding extension apiservers to clusters
* Aggregation is **disabled** by default. Typhoon recommends you not
enable aggregation. Consider whether less invasive ways to achieve your
goals are possible and whether those goals are well-founded
* Enabling aggregation and extension apiservers increases the attack
surface of a cluster and makes extensions a part of the control plane.
Admins must scrutinize and trust any extension apiserver used.
* Passing a v1.14 CNCF conformance test requires aggregation be enabled.
Having an option for aggregation keeps compliance, but retains the
stricter security posture on default clusters
2019-04-07 12:00:38 -07:00
Dalton Hubble
5271e410eb Update Kubernetes from v1.13.5 to v1.14.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1140
2019-04-07 00:15:59 -07:00
Dalton Hubble
ce78d5988e Refresh Prometheus rules and Grafana dashboards
* Refresh rules and dashboards from upstreams
* Add new Kubernetes "workload" dashboards
  * View pods in a workload (deployment/daemonset/statefulset)
  * View workloads in a namespace
2019-04-06 23:31:44 -07:00
Dalton Hubble
29a3035245 Update Grafana from v6.1.0 to v6.1.1 2019-04-06 18:32:14 -07:00
Dalton Hubble
3e7a38cb13 Update Grafana from v6.0.2 to v6.1.0
* https://github.com/grafana/grafana/releases/tag/v6.1.0
2019-04-03 20:47:48 -07:00
Dalton Hubble
2a07c97538 Harden internal firewall rules on DigitalOcean
* Define firewall rules on DigitialOcean to match rules used on AWS,
GCP, and Azure
* Output `controller_tag` and `worker_tag` to simplify custom firewall
rule creation
2019-04-03 20:38:22 -07:00
Dalton Hubble
60265f9b58 Add ability to load balance TCP applications on AWS
* Add ability to load balance TCP applications (e.g. NodePort)
* Output the network load balancer ARN as `nlb_id`
* Accept a `worker_target_groups` (ARN) list to which worker
instances should be added
* AWS NLBs and target groups don't support UDP
2019-04-01 21:22:20 -07:00
Dalton Hubble
aaa8e0261a Add Google Cloud worker instances to a target pool
* Background: A managed instance group of workers is used in backend
services for global load balancing (HTTP/HTTPS Ingress) and output
for custom global load balancing use cases
* Add worker instances to a target pool load balancing TCP/UDP
applications (NodePort or proxied). Output as `worker_target_pool`
* Health check for workers with a healthy Ingress controller. Forward
rules (regional) to target pools don't support different external and
internal ports so choosing nodes with Ingress allows proxying as a
workaround
* A target pool is a logical grouping only. It doesn't add costs to
clusters or worker pools
2019-04-01 21:03:48 -07:00
Dalton Hubble
b3ec5f73e3 Update Calico from v3.6.0 to v3.6.1
* https://docs.projectcalico.org/v3.6/release-notes/
2019-03-31 17:43:43 -07:00
Dalton Hubble
3e9dc28a00 Update Prometheus from v2.8.0 to v2.8.1
* https://github.com/prometheus/prometheus/releases/tag/v2.8.1
2019-03-31 17:40:20 -07:00
Dalton Hubble
46196af500 Remove Haswell minimum CPU platform requirement
* Google Cloud API implements `min_cpu_platform` to mean
"use exactly this CPU"
* Fix error creating clusters in newer regions lacking Haswell
platform (e.g. europe-west2) (#438)
* Reverts #405, added in v1.13.4
* Original goal of ignoring old Ivy/Sandy bridge CPUs in older regions
will be achieved shortly anyway. Google Cloud is deprecating those CPUs
in April 2019
* https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform#how_selecting_a_minimum_cpu_platform_works
2019-03-27 19:51:32 -07:00
Dalton Hubble
5a1bc423a1 Announce Fedora Atomic modules won't be updated beyond v1.13.x
* Thank you Project Atomic team and users
* See the deprecation announcement https://typhoon.psdn.io/announce/#march-27-2019
2019-03-26 23:56:33 -07:00
Dalton Hubble
32fe72fb2d Update mkdocs and plugin versions used in tutorials
* Recommend provider plugin versions that are currently used
by the author
* Recommend updating terraform-provider-ct plugin from v0.3.0
to v0.3.1
* https://github.com/coreos/terraform-provider-ct/releases
2019-03-26 01:00:44 -07:00
Dalton Hubble
4fea526ebf Update Kubernetes from v1.13.4 to v1.13.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1135
2019-03-25 21:43:47 -07:00
Dalton Hubble
41a9d86bc3 Add NetworkPolicy to limit traffic into Prometheus
* Allow traffic from Grafana to Prometheus in monitoring
* Allow traffic from Prometheus to Prometheus in monitoring
* NetworkPolicy denies non-whitelisted traffic. Define policy
to allow other access
2019-03-23 21:38:34 -07:00
Dalton Hubble
36e31fc9fa Add liveness and readiness probes to Grafana
* https://github.com/grafana/grafana/issues/3302
2019-03-23 17:55:37 -07:00
Dalton Hubble
619a0370dc Update Grafana from v6.0.1 to v6.0.2
* https://github.com/grafana/grafana/releases/tag/v6.0.2
2019-03-21 23:41:25 -07:00
Dalton Hubble
1feefbe9c6 Update Calico from v3.5.2 to v3.6.0
* Add calico-ipam CRDs and RBAC permissions
* Switch IPAM from host-local to calico-ipam
  * `calico-ipam` subnets `ippools` (defaults to pod CIDR) into
`ipamblocks` (defaults to /26, but set to /24 in Typhoon)
  * `host-local` subnets the pod CIDR based on the node PodCIDR
field (set via kube-controller-manager as /24's)
* Create a custom default IPv4 IPPool to ensure the block size
is kept at /24 to allow 110 pods per node (Kubernetes default)
* Retaining host-local was slightly preferred, but Calico v3.6
is migrating all usage to calico-ipam. The codepath that skipped
calico-ipam for KDD was removed
*  https://docs.projectcalico.org/v3.6/release-notes/
2019-03-19 22:49:56 -07:00
Dalton Hubble
aa630003a4 Refresh Prometheus rules and Grafana dashboards
* Refresh rules and dashboards from upstreams
* Organize dashboards and stay below the ConfigMap size
limit
2019-03-17 13:23:04 -07:00
Dalton Hubble
bf97a45b9d Remove heapster manifests from addons
* Heapster addon powers `kubectl top`
* In early Kubernetes, people legitimately used and expected
`kubectl top` to work, so the optional addon was provided
* Today the standards are different. Many better monitoring
tools exist, that are also less coupled to Kubernetes "kubectl
top" reliance on a non-core extensions means its not in-scope
for minimal Kubernetes clusters. No more exceptionalism
* Finally, Heapster isn't that useful anymore. Its manifests
have no need for Typhoon-specific modification
* Look to prior releases if you still wish to apply heapster
2019-03-17 12:41:59 -07:00
Dalton Hubble
3d6a6d4adb Re-add Kubelet metadata service dependency on DigitalOcean
* Restore the original special-casing of DigitalOcean Kubelets
* Fix node metadata InternalIP being set to the IP of the default
gateway on DigitalOcean nodes (regressed in v1.12.3)
* Reverts the "pretty" node names on DigitalOcean (worker-2 vs IP)
* Closes #424 (full details)
2019-03-17 12:39:25 -07:00
Dalton Hubble
e0bee2e417 Update Prometheus from v2.7.2 to v2.8.0
* https://github.com/prometheus/prometheus/releases/tag/v2.8.0
2019-03-13 22:11:38 -07:00
Dalton Hubble
9493ed3b1d Change default iPXE kernel/initrd download from HTTP to HTTPS
* Require an iPXE-enabled network boot environment with support for
TLS downloads. PXE clients must chainload to iPXE firmware compiled
with `DOWNLOAD_PROTO_HTTPS` enabled ([crypto](https://ipxe.org/crypto))
* iPXE's pre-compiled firmware binaries do _not_ enable HTTPS. Admins
should build iPXE from source with support enabled
* Affects the Container Linux and Flatcar Linux install profiles that
pull from public downloads. No effect when cached_install=true
or using Fedora Atomic, as those download from Matchbox
* Add `download_protocol` variable. Recognizing boot firmware TLS
support is difficult in some environments, set the protocol to "http"
for the old behavior (discouraged)
2019-03-09 23:23:40 -08:00
Dalton Hubble
4201eb1efa Update Grafana from v6.0.0 to v6.0.1
* https://github.com/grafana/grafana/releases/tag/v6.0.1
2019-03-09 12:44:18 -08:00
Dalton Hubble
fe96da27d7 Add support for terraform-provider-aws v2.0+
* Allow terraform-provider-aws >= v1.13, but < 3.0. No change
to the minimum version, but allow using v2.x.y releases
* Verify compatability with terraform-provider-aws v2.1.0
2019-03-09 12:06:44 -08:00
Dalton Hubble
4d9a692424 Update Prometheus from v2.7.1 to v2.7.2
* https://github.com/prometheus/prometheus/releases/tag/v2.7.2
2019-03-04 23:08:12 -08:00
Dalton Hubble
deec512c14 Resolve in-addr.arpa and ip6.arpa zones with CoreDNS kubernetes plugin
* Resolve in-addr.arpa and ip6.arpa DNS PTR requests for Kubernetes
service IPs and pod IPs
* Previously, CoreDNS was configured to resolve in-addr.arpa PTR
records for service IPs (but not pod IPs)
2019-03-04 23:03:00 -08:00
Dalton Hubble
5066a25d89 Add links and clarifications in CHANGES for release 2019-03-02 11:26:12 -08:00
Dalton Hubble
a08adc92b5 Update nginx-ingress from v0.22.0 to v0.23.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.23.0
2019-03-01 01:18:54 -08:00
Dalton Hubble
f598307998 Update Kubernetes from v1.13.3 to v1.13.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1134
2019-02-28 22:47:43 -08:00
Dalton Hubble
daee5a9d60 Update Grafana from v6.0.0-beta3 to v6.0.0
* https://github.com/grafana/grafana/releases/tag/v6.0.0
* http://docs.grafana.org/guides/whats-new-in-v6-0/
2019-02-25 21:43:43 -08:00
Dalton Hubble
73ae5d5649 Update Calico from v3.5.1 to v3.5.2
* https://docs.projectcalico.org/v3.5/releases/
2019-02-25 21:23:13 -08:00
Dalton Hubble
42d7222f3d Add a readiness probe to CoreDNS
* https://github.com/poseidon/terraform-render-bootkube/pull/115
2019-02-23 13:25:23 -08:00
Dalton Hubble
d10c2b4cb9 Update Grafana from v6.0.0-beta2 to v6.0.0-beta3
* Update Grafana dashboards
2019-02-23 13:03:25 -08:00
Dalton Hubble
7f8572030d Upgrade to support terraform-provider-google v2.0+
* Support terraform-provider-google v1.19.0, v1.19.1, v1.20.0
and v2.0+ (and allow for future 2.x.y releases)
* Require terraform-provider-google v1.19.0 or newer. v1.19.0
introduced `network_interface` fields `network_ip` and `nat_ip`
to deprecate `address` and `assigned_nat_ip`. Those deprecated
fields are removed in terraform-provider-google v2.0
* https://github.com/terraform-providers/terraform-provider-google/releases/tag/v2.0.0
2019-02-20 02:33:32 -08:00
Dalton Hubble
4294bd0292 Assign Pod Priority classes to critical cluster and node components
* Assign pod priorityClassNames to critical cluster and node
components (higher is higher priority) to inform node out-of-resource
eviction order and scheduler preemption and scheduling order
* Priority Admission Controller has been enabled since Typhoon
v1.11.1
2019-02-19 22:21:39 -08:00
Dalton Hubble
ba4c5de052 Set the Google Cloud minimum CPU platform to Intel Haswell
* Intel Haswell or better is available in every zone around the world
* Neither Kubernetes nor Typhoon have a particular minimum processor
family. However, a few Google Cloud zones still default to Sandy/Ivy
bridge (scheduled to shift April 2019). Price is only based on machine
type so it is beneficial to opt for the next processor family
* Intel Haswell is a suitable minimum since it still allows plenty of
liberty in choosing any region or machine type
* Likely a slight increase to preemption probability in a few zones,
but any lower probability on Sandy/Ivy bridge is due to lower
desirability as they're phased out
* https://cloud.google.com/compute/docs/regions-zones/
2019-02-18 12:55:04 -08:00
Dalton Hubble
e483c81ce9 Improve Prometheus rules and alerts and Grafana dashboards
* Collate upstream rules, alerts, and dashboards and tune for use
in Typhoon
* Previously, a well-chosen (but older) set of rules, alerts, and
dashboards were maintained to reflect metric name changes
2019-02-18 12:19:23 -08:00
Dalton Hubble
6fa3b8a13f Upgrade Grafana to v6.0.0-beta2 and enable Explore UI
* Upgrade Grafana from v5.4.3 to v6.0.0-beta2
* Enable Grafana Explore UI while still using only the Viewer
role (inspect/edit without saving)
* http://docs.grafana.org/guides/whats-new-in-v6-0/
2019-02-17 13:26:42 -08:00
Dalton Hubble
d988822741 Document and recommend terraform-provider-matchbox v0.2.3
* https://github.com/coreos/terraform-provider-matchbox/releases/tag/v0.2.3
2019-02-16 15:07:49 -08:00
Dalton Hubble
170ef74eea Remove Nginx Ingress default backend
* nginx-ingress no longer requires a configured default-backend,
it will respond with its own 404 page starting in v0.21.0
* https://github.com/kubernetes/ingress-nginx/pull/3196
2019-02-16 14:18:15 -08:00
Dalton Hubble
b13a651cfe Drop metrics that are unset, high cardinality, or extraneous
* https://github.com/coreos/prometheus-operator/pull/2387
* https://github.com/coreos/prometheus-operator/pull/1959
2019-02-10 23:56:11 -08:00
Dalton Hubble
9c59f393a5 Add Kubernetes pod name to metrics discovered from service endpoints
* Prometheus queries from some upstreams use joins of node-exporter
and kube-state-metrics metrics by (namespace,pod). Add the Kubernetes
pod name to service endpoint metrics
* Rename the kubernetes_namespace field to namespace
* Honor labels since kube-state-metrics already include a `pod` field
that should not be overridden
2019-02-10 23:54:30 -08:00
Dalton Hubble
3e4b3bfb04 Raise nginx-ingress liveness/readiness timeout
* Under heavy load, avoid timeouts causing nginx-ingress
restarts https://github.com/kubernetes/ingress-nginx/pull/3737
2019-02-09 12:53:09 -08:00
Dalton Hubble
584088397c Update etcd from v3.3.11 to v3.3.12
* https://github.com/etcd-io/etcd/releases/tag/v3.3.12
2019-02-09 11:54:54 -08:00
Dalton Hubble
0200058e0e Update Calico from v3.5.0 to v3.5.1
* Fix in confd https://github.com/projectcalico/confd/pull/205
2019-02-09 11:49:31 -08:00
Dalton Hubble
d5537405e1 Add CHANGES note about reducing the pod eviciton timeout 2019-02-02 14:54:18 -08:00
Dalton Hubble
949ce21fb2 Update Prometheus from v2.7.0 to v2.7.1
* https://github.com/prometheus/prometheus/releases/tag/v2.7.1
2019-02-02 00:13:24 -08:00
Dalton Hubble
ccd96c37da Update Kubernetes from v1.13.2 to v1.13.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1133
2019-02-01 23:26:13 -08:00
Dalton Hubble
244a1a601a Switch CoreDNS to use the forward plugin instead of proxy
* Use the forward plugin to forward to upstream resolvers, instead
of the proxy plugin. The forward plugin is reported to be a faster
alternative since it can re-use open sockets
* https://coredns.io/explugins/forward/
* https://coredns.io/plugins/proxy/
* https://github.com/kubernetes/kubernetes/issues/73254
2019-01-30 22:25:23 -08:00
Dalton Hubble
130daeac26 Update Prometheus from v2.6.1 to v2.7.0 2019-01-29 22:31:20 -08:00
Dalton Hubble
1ab06f69d7 Update flannel from v0.10.0 to v0.11.0
* https://github.com/coreos/flannel/releases/tag/v0.11.0
2019-01-29 21:51:25 -08:00
Dalton Hubble
eb08593eae Fix azure provider warning, rename a public_ip field
* azurerm_public_ip (used internally) added a field `allocation_method`
to replace the field `public_ip_address_allocation` (deprecated)
* Require terraform-provider-azurerm v1.21+
* https://github.com/terraform-providers/terraform-provider-azurerm/pull/2576
2019-01-27 17:52:35 -08:00
Dalton Hubble
e9659a8539 Update Calico from v3.4.0 to v3.5.0
* https://docs.projectcalico.org/v3.5/releases/
2019-01-27 16:34:30 -08:00
Dalton Hubble
f5ff003d0e Update node-exporter from v0.15.2 to v0.17.0
* node-exporter renamed multiple metrics that are reflected
in changes to Prometheus rules and Grafana dashboard expressions
2019-01-22 01:14:00 -08:00
Dalton Hubble
d697dd46dc Allow kube-state-metrics PodDisruptionBudget metrics
* Update kube-state-metrics ClusterRole to allow collecting
poddisruptionbudget metrics (exported as kube_poddisruptionbudget_*)
* https://github.com/kubernetes/kube-state-metrics/pull/551
* Bump addon-resizer from v1.7 to v1.8.4
2019-01-22 01:12:32 -08:00
Dalton Hubble
2f3097ebea Update nginx-ingress from v0.21.0 to v0.22.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.22.0
2019-01-16 23:01:22 -08:00
Dalton Hubble
f4d3508578 Update CoreDNS from v1.3.0 to v1.3.1
* https://coredns.io/2019/01/13/coredns-1.3.1-release/
2019-01-15 22:50:25 -08:00
Dalton Hubble
67fb9602e7 Update Prometheus from v2.6.0 to v2.6.1
* https://github.com/prometheus/prometheus/releases/tag/v2.6.1
2019-01-15 21:13:40 -08:00
Dalton Hubble
c8a85fabe1 Update Grafana from v5.4.2 to v5.4.3
* https://github.com/grafana/grafana/releases/tag/v5.4.3
2019-01-15 21:13:16 -08:00
Dalton Hubble
7eafa59d8f Fix instance shutdown automatic worker deletion on clouds
* Fix a regression caused by lowering the Kubelet TLS client
certificate to system:nodes group (#100) since dropping
cluster-admin dropped the Kubelet's ability to delete nodes.
* On clouds where workers can scale down (manual terraform apply,
AWS spot termination, Azure low priority deletion), worker shutdown
runs the delete-node.service to remove a node to prevent NotReady
nodes from accumulating
* Allow Kubelets to delete cluster nodes via system:nodes group. Kubelets
acting with system:node and kubelet-delete ClusterRoles is still an
improvement over acting as cluster-admin
2019-01-14 23:27:48 -08:00
Dalton Hubble
679079b242 Add AWS ingress_zone_id output with NLB DNS name's Route53 zone id
* DNS zones served by AWS Route53 may use AWS's special alias records
(other DNS providers would use a CNAME) to resolve the ingress NLB.
Alias records require the NLB DNS name's DNS zone id (not the cluster
`dns_zone_id`)
2019-01-13 16:45:52 -08:00
Dalton Hubble
1d27dc6528 Update kube-state-metrics exporter from v1.4.0 to v1.5.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.5.0
2019-01-12 14:24:57 -08:00
Dalton Hubble
b74cc8afd2 Update etcd from v3.3.10 to v3.3.11
* https://github.com/etcd-io/etcd/releases/tag/v3.3.11
2019-01-12 14:17:25 -08:00
Dalton Hubble
1d66ad33f7 Change AWS worker modules' default type from t2.small to t3.small
* Worker instance types weren't updated in #365
2019-01-12 00:07:48 -08:00
Dalton Hubble
4d32b79c6f Update Kubernetes from v1.13.1 to v1.13.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1132
2019-01-12 00:00:53 -08:00
Dalton Hubble
df4c0ba05d Use HTTPS liveness probes for kube-scheduler and kube-controller-manager
* Disable kube-scheduler and kube-controller-manager HTTP ports
2019-01-09 20:56:50 -08:00
Dalton Hubble
bfe0c74793 Enable the certificates.k8s.io API to issue cluster certificates
* System components that require certificates signed by the cluster
CA can submit a CSR to the apiserver, have an administrator inspect
and approve it, and be issued a certificate
* Configure kube-controller-manager to sign Approved CSR's using the
cluster CA private key
* Admins are responsible for approving or denying CSRs, otherwise,
no certificate is issued. Read the Kubernetes docs carefully and
verify the entity making the request and the authorization level
* https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster
2019-01-06 17:33:37 -08:00
Dalton Hubble
6795a753ea Update CoreDNS from v1.2.6 to v1.3.0
* https://coredns.io/2018/12/15/coredns-1.3.0-release/
2019-01-05 13:35:03 -08:00
Dalton Hubble
812a1adb49 Use a lower-privilege Kubelet kubeconfig in system:nodes
* Kubelets can use a lower-privilege TLS client certificate with
Org system:nodes and a binding to the system:node ClusterRole
* Admin kubeconfig's continue to belong to Org system:masters to
provide cluster-admin (available in assets/auth/kubeconfig or as
a Terraform output kubeconfig-admin)
* Remove bare-metal output variable kubeconfig
2019-01-05 13:08:56 -08:00
Dalton Hubble
66e1365cc4 Add ServiceAccounts for kube-apiserver and kube-scheduler
* Add ServiceAccounts and ClusterRoleBindings for kube-apiserver
and kube-scheduler
* Remove the ClusterRoleBinding for the kube-system default ServiceAccount
* Rename the CA certificate CommonName for consistency with upstream
2019-01-01 20:16:14 -08:00
Dalton Hubble
ea8b0d1c84 Update Prometheus addon from v2.5.0 to v2.6.0
* https://github.com/prometheus/prometheus/releases/tag/v2.6.0
2018-12-27 07:35:12 -08:00
Dalton Hubble
f2f4deb8bb Change AWS default type from t2.small to t3.small
* T3 is the next generation general purpose burstable
instance type. Compared with t2.small, the t3.small is
cheaper, has 2 vCPU (instead of 1) and provides 5 Gbps
of pod-to-pod bandwidth (instead of 1 Gbps)
2018-12-18 12:38:35 -08:00
Dalton Hubble
4d2f33aee6 Update changelog for v1.13.1 release 2018-12-17 14:28:27 -08:00
Dalton Hubble
d42f47c49e Update terraform-provider-ct plugin from v0.2.1 to v0.3.0
* Provide migration instructions for upgrading terraform-provider-ct
in-place for v1.12.2+ clusters
* Require switching from ~/.terraformrc to the Terraform third-party
plugins directory ~/.terraform.d/plugins/
* Require Container Linux 1688.5.3 or newer
2018-12-17 14:13:50 -08:00
Dalton Hubble
479d498024 Update Calico from v3.3.2 to v3.4.0
* https://docs.projectcalico.org/v3.4/releases/
2018-12-15 18:05:16 -08:00
Dalton Hubble
e0c032be94 Increase GCP TCP proxy apiserver backend timeout to 5 minutes
* On GCP, kubectl port-forward connections to pods are closed
after a timeout (unlike AWS NLB's or Azure load balancers)
* Increase the GCP apiserver backend service timeout from 1 minute
to 5 minutes to be more similar to AWS/Azure LB behavior
2018-12-15 17:34:18 -08:00
Dalton Hubble
b74bf11772 Update Grafana from v5.4.0 to v5.4.2
* https://github.com/grafana/grafana/releases/tag/v5.4.2
* https://github.com/grafana/grafana/releases/tag/v5.4.1
2018-12-15 12:39:03 -08:00
Dalton Hubble
018c5edc25 Update Kubernetes from v1.13.0 to v1.13.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1131
2018-12-15 11:44:57 -08:00
Dalton Hubble
ff6ab571f3 Update Calico from v3.3.1 to v3.3.2
* https://docs.projectcalico.org/v3.3/releases/
2018-12-06 22:56:55 -08:00
Dalton Hubble
991fb44c37 Update Grafana from v5.3.4 to v5.4.0
* https://github.com/grafana/grafana/releases/tag/v5.4.0
2018-12-06 01:33:50 -08:00
Dalton Hubble
d31f444fcd Update Kubernetes from v1.12.3 to v1.13.0 2018-12-03 20:44:32 -08:00
Dalton Hubble
b6016d0a26 Disable Grafana login form, admin user can't be disabled
* Example manifests aim to provide a read-only dashboard visible
to any users with network access (i.e. kubectl port-forward, LAN)
* Problem: Grafana always has an admin user, even with the user
management system disabled
* Disable the login form to prevent admin login
2018-11-28 22:04:08 -08:00
Dalton Hubble
eec314b52f Update CHANGES changelog for release 2018-11-28 09:23:13 -08:00
yokhahn
bcce02a9ce Add Kubelet /etc/iscsi and iscsiadm mounts on bare-metal
* Allow using iSCSI with Container Linux bare-metal clusters
* Warning, iSCSI isn't part of Kubernetes conformance and isn't
regularly evaluated
2018-11-28 00:28:46 -08:00
Dalton Hubble
42c523e6a2 Recommend switch from ~/.terraformrc to 3rd-party plugin dir
* Switch tutorials from using ~/.terraformrc to using the 3rd-party
plugin directory so 3rd-party plugins can be pinned
* Continue to show using terraform-provider-ct v0.2.2. Updating to
a newer version is only safe once all managed clusters are v1.12.2
or higher
2018-11-28 00:03:15 -08:00
Dalton Hubble
872b11b948 Update ngninx-ingress from v0.20.0 to v0.21.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.21.0
2018-11-26 21:57:34 -08:00
Dalton Hubble
5b27d8d889 Update Kubernetes from v1.12.2 to v1.12.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md/#v1123
2018-11-26 21:06:09 -08:00
Dalton Hubble
840b73f9ba Update pod-checkpointer image to query Kubelet secure API
* Updates pod-checkpointer to prefer the Kubelet secure
API (before falling back to the Kubelet read-only API that
is disabled on Typhoon clusters since
https://github.com/poseidon/typhoon/pull/324)
* Previously, pod-checkpointer checkpointed an initial set
of pods during bootstrapping so recovery from power cycling
clusters was unaffected, but logs were noisy
* https://github.com/kubernetes-incubator/bootkube/pull/1027
* https://github.com/kubernetes-incubator/bootkube/pull/1025
2018-11-26 20:24:32 -08:00
Dalton Hubble
915af3c6cc Fix Calico Felix reporting usage data, require opt-in
* Calico Felix has been reporting anonymous usage data about the
version and cluster size, which violates Typhoon's privacy policy
where analytics should be opt-in only
* Add a variable enable_reporting (default: false) to allow opting
in to reporting usage data to Calico (or future components)
2018-11-20 01:03:00 -08:00
Dalton Hubble
c6586b69fd Use eviction policy Delete for Low priority VMSS workers
* Fix issue where Azure defaults to Deallocate eviction policy,
which required manually restarting deallocated workers
* Require terraform-provider-azurerm v1.19+ to support setting
the eviction_policy
2018-11-18 21:04:50 -08:00
Dalton Hubble
ea3fc6d2a7 Update CoreDNS from v1.2.4 to v1.2.6
* https://coredns.io/2018/11/05/coredns-1.2.6-release/
2018-11-18 16:45:53 -08:00
Dalton Hubble
c8c43f3991 Update Grafana from v5.3.2 to v5.3.4
* https://github.com/grafana/grafana/releases/tag/v5.3.3
* https://github.com/grafana/grafana/releases/tag/v5.3.4
2018-11-18 16:42:50 -08:00
Dalton Hubble
7f8e781ae4 Measure DigitalOcean network performance
* Measuring pod-to-pod bandwidth in a few regions (NYC3, FRA1,
SFO1) shows DigitalOcean has made some improvements
2018-11-11 21:08:10 -08:00
Dalton Hubble
56e9a82984 Add flannel resource request and mount only /run/flannel 2018-11-11 20:35:21 -08:00
Dalton Hubble
e95b856a22 Enable CoreDNS loop and loadbalance plugins
* loop sends an initial query to detect infinite forwarding
loops in configured upstream DNS servers and fast exit with
an error (its a fatal misconfiguration on the network that
will otherwise cause resolvers to consume memory/CPU until
crashing, masking the problem)
* https://github.com/coredns/coredns/tree/master/plugin/loop
* loadbalance randomizes the ordering of A, AAAA, and MX records
in responses to provide round-robin load balancing (as usual,
clients may still cache responses though)
* https://github.com/coredns/coredns/tree/master/plugin/loadbalance
2018-11-10 17:36:56 -08:00
Dalton Hubble
31f48a81a8 Update docs to show flannel DaemonSet instead of kube-flannel
* No functional change, the rename is just for consistency
2018-11-10 15:16:06 -08:00
Dalton Hubble
2b3f61d1bb Update Calico from v3.3.0 to v3.3.1
* Structure Calico and flannel manifests
* Rename kube-flannel mentions to just flannel
2018-11-10 13:37:12 -08:00
Dalton Hubble
8fd2978c31 Update bootkube image version from v0.13.0 to v0.14.0
* https://github.com/kubernetes-incubator/bootkube/releases/tag/v0.14.0
2018-11-06 23:35:11 -08:00
Dalton Hubble
be9f7b87d6 Update Prometheus from v2.4.3 to v2.5.0
* https://github.com/prometheus/prometheus/releases/tag/v2.5.0
2018-11-06 22:16:12 -08:00
Dalton Hubble
721c847943 Set kube-apiserver kubelet preferred address types
* Prefer InternalIP and ExternalIP over the node's hostname,
to match upstream behavior and kubeadm
* Previously, hostname-override was used to set node names
to internal IP's to work around some cloud providers not
resolving hostnames for instances (e.g. DO droplets)
2018-11-03 22:31:55 -07:00
Dalton Hubble
884c8b39dc Update Grafana from v5.3.1 to v5.3.2
* https://github.com/grafana/grafana/releases/tag/v5.3.2
2018-10-28 19:44:22 -07:00
Dalton Hubble
0e71f7e565 Ignore controller user_data changes to allow plugin updates
* Updating the `terraform-provider-ct` plugin is known to produce
a `user_data` diff in all pre-existing clusters. Applying the
diff to pre-existing cluster destroys controller nodes
* Ignore changes to controller `user_data`. Once all managed
clusters use a release containing this change, it is possible
to update the `terraform-provider-ct` plugin (worker `user_data`
will still be modified)
* Changing the module `ref` for an existing cluster and
re-applying is still NOT supported (although this PR
would protect controllers from being destroyed)
2018-10-28 16:48:12 -07:00
Dalton Hubble
5be5b261e2 Add an IPv6 address and forwarding rules on Google Cloud
* Allowing serving IPv6 applications via Kubernetes Ingress
on Typhoon Google Cloud clusters
* Add `ingress_static_ipv6` output variable for use in AAAA
DNS records
2018-10-28 14:30:58 -07:00
Dalton Hubble
f034ef90ae Add DigitalOcean AAAA DNS records resolving to workers
* Improve the workers "round-robin" DNS FQDN that is created
with each cluster by adding AAAA records
* CNAME's resolving to the DigitalOcean `workers_dns` output
can be followed to find a droplet's IPv4 or IPv6 address
* The CNI portmap plugin doesn't support IPv6. Hosting IPv6
apps is possible, but requires editing the nginx-ingress
addon with `hostNetwork: true`
2018-10-27 23:09:24 -07:00
Dalton Hubble
3bba1ba0dc Use new azurerm_network_interface_backend_address_pool_association
* Require terraform-provider-azurerm v1.17+
* Inline load_balancer_backend_address_pools_ids is deprecated
and scheduled for removal in the v2.0 provider
* https://github.com/terraform-providers/terraform-provider-azurerm/pull/2079
2018-10-27 22:55:05 -07:00
Dalton Hubble
dbe7604b67 Add primary field to ip_configuration required by Azure
* Required by terraform-provider-azurerm v1.17+
* https://github.com/terraform-providers/terraform-provider-azurerm/pull/2035
2018-10-27 16:44:44 -07:00
Dalton Hubble
f1da0731d8 Update Kubernetes from v1.12.1 to v1.12.2
* Update CoreDNS from v1.2.2 to v1.2.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#v1122
* https://coredns.io/2018/10/17/coredns-1.2.4-release/
* https://coredns.io/2018/10/16/coredns-1.2.3-release/
2018-10-27 15:47:57 -07:00
Dalton Hubble
d641a058fe Update Calico from v3.2.3 to v3.3.0
* https://docs.projectcalico.org/v3.3/releases/
2018-10-23 20:30:30 -07:00
Dalton Hubble
99a6d5478b Disable Kubelet read-only port 10255
* We can finally disable the Kubelet read-only port 10255!
* Journey: https://github.com/poseidon/typhoon/issues/322#issuecomment-431073073
2018-10-18 21:14:14 -07:00
Dalton Hubble
bc750aec33 Configure Heapster to source metrics from Kubelet authenticated API
* Heapster can now get nodes (i.e. kubelets) from the apiserver and
source metrics from the Kubelet authenticated API (10250) instead of
the Kubelet HTTP read-only API (10255)
* https://github.com/kubernetes/heapster/blob/master/docs/source-configuration.md
* Use the heapster service account token via Kubelet bearer token
authn/authz.
* Permit Heapster to skip CA verification. The CA cert does not contain
IP SANs and cannot since nodes get random IPs that aren't known upfront.
Heapster obtains the node list from the apiserver, so the risk of
spoofing a node is limited. For the same reason, Prometheus scrapes
must skip CA verification for scraping Kubelet's provided by the apiserver.
* https://github.com/poseidon/typhoon/blob/v1.12.1/addons/prometheus/config.yaml#L68
* Create a heapster ClusterRole to work around the default Kubernetes
`system:heapster` ClusterRole lacking the proper GET `nodes/stats`
access. See https://github.com/kubernetes/heapster/issues/1936
2018-10-18 21:03:01 -07:00
Dalton Hubble
d55bfd5589 Fix CoreDNS AntiAffinity spec to prefer spreading replicas
* Pods were still being scheduled at random due to a typo
2018-10-17 22:19:57 -07:00
Robert Fairburn
0be4673e44 Add disk_iops variable for AWS
* Setting disk_iops is required for disk_type io1
* https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html#EBSVolumeTypes
2018-10-17 22:18:54 -07:00
Dalton Hubble
3b44972d78 Add links to header to CHANGES 2018-10-17 09:08:58 -07:00
Dalton Hubble
0127ee82c1 Update nginx-ingress from v0.19.0 to v0.20.0 2018-10-16 21:35:29 -07:00
Dalton Hubble
a10d6977b8 Update Prometheus from v2.4.2 to v2.4.3
* https://github.com/prometheus/prometheus/releases/tag/v2.4.3
2018-10-16 21:29:41 -07:00
Dalton Hubble
05fe923c14 Update Grafana from v5.3.0 to v5.3.1
* https://github.com/grafana/grafana/releases/tag/v5.3.1
2018-10-16 21:23:44 -07:00
Michael Schubert
d10620fb58 Add support for Flatcar Linux bare-metal cached_install
* Support bare-metal cached_install=true mode with Flatcar Linux
where assets are fetched from the Matchbox assets cache instead
of from the upstream Flatcar download server
* Skipped in original Flatcar support to keep it simple
https://github.com/poseidon/typhoon/pull/209
2018-10-16 21:15:24 -07:00
Dalton Hubble
9b6113a058 Update Kubernetes from v1.11.3 to v1.12.1
* Mount an empty dir for the controller-manager to work around
https://github.com/kubernetes/kubernetes/issues/68973
* Update coreos/pod-checkpointer to strip affinity from
checkpointed pod manifests. Kubernetes v1.12.0-rc.1 introduced
a default affinity that appears on checkpointed manifests; but
it prevented scheduling and checkpointed pods should not have an
affinity, they're run directly by the Kubelet on the local node
* https://github.com/kubernetes-incubator/bootkube/issues/1001
* https://github.com/kubernetes/kubernetes/pull/68173
2018-10-16 20:28:13 -07:00
Dalton Hubble
5eb4078d68 Add docker/default seccomp to control plane and addons
* Annotate pods, deployments, and daemonsets to start containers
with the Docker runtime's default seccomp profile
* Overrides Kubernetes default behavior which started containers
with seccomp=unconfined
* https://docs.docker.com/engine/security/seccomp/#pass-a-profile-for-a-container
2018-10-16 20:07:29 -07:00
Dalton Hubble
8f0d2b5db4 Update Grafana from v5.2.4 to v5.3.0 2018-10-13 23:03:31 -07:00
Dalton Hubble
2e89e161e9 Remove Azure admin_password (disabled) now that its optional
* Requires terraform-provider-azurerm v1.16.0 or higher
https://github.com/terraform-providers/terraform-provider-azurerm/pull/1958
2018-10-13 22:40:58 -07:00
Dalton Hubble
55bb4dfba6 Raise CoreDNS replica count to 2 or more
* Run at least two replicas of CoreDNS to better support
rolling updates (previously, kube-dns had a pod nanny)
* On multi-master clusters, set the CoreDNS replica count
to match the number of masters (e.g. a 3-master cluster
previously used replicas:1, now replicas:3)
* Add AntiAffinity preferred rule to favor distributing
CoreDNS pods across controller nodes nodes
2018-10-13 20:31:29 -07:00
Dalton Hubble
43fe78a2cc Raise scheduler/controller-manager replicas in multi-master
* Continue to ensure scheduler and controller-manager run
at least two replicas to support performing kubectl edits
on single-master clusters (no change)
* For multi-master clusters, set scheduler / controller-manager
replica count to the number of masters (e.g. a 3-master cluster
previously used replicas:2, now replicas:3)
2018-10-13 16:16:29 -07:00
Dalton Hubble
5a283b6443 Update etcd from v3.3.9 to v3.3.10
* https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3310-2018-10-10
2018-10-13 13:14:37 -07:00
Dalton Hubble
db36036c81 Require terraform-provider-digitalocean plugin ~> 1.0
* Require a terraform-provider-digitalocean plugin version of
1.0 or higher within the same major version (e.g. allow 1.1 but
not 2.0)
* Change requirement from ~> 0.1.2 (which allowed up to but not
including 1.0 release)
2018-10-02 17:09:19 +02:00
Dalton Hubble
7653e511be Update CoreDNS and Calico versions
* Update CoreDNS from 1.1.3 to 1.2.2
* Update Calico from v3.2.1 to v3.2.3
2018-10-02 16:07:48 +02:00
Dalton Hubble
032a24133b Update Prometheus from v2.3.2 to v2.4.2
* https://github.com/prometheus/prometheus/releases/tag/v2.4.0
* https://github.com/prometheus/prometheus/releases/tag/v2.4.1
* https://github.com/prometheus/prometheus/releases/tag/v2.4.2
2018-09-21 22:27:11 -07:00
Dalton Hubble
ad871dbfa9 Update Kubernetes from v1.11.2 to v1.11.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1113
2018-09-13 18:50:41 -07:00
Dalton Hubble
dc03f7a4a9 Update nginx-ingress from 0.17.1 to 0.19.0
* If using --enable-ssl-passthrough or exposing TCP/UDP services,
be aware of https://github.com/kubernetes/ingress-nginx/pull/3038
* Workarounds until the fix merges are to stay on 0.17.1, use the
suggested development image, or revert to securityContext
`runAsNonRoot: false` for a while (less secure)
2018-09-08 17:57:01 -07:00
Dalton Hubble
1b8234eb91 Update Grafana from v5.2.2 to v5.2.4
* https://github.com/grafana/grafana/releases/tag/v5.2.3
* https://github.com/grafana/grafana/releases/tag/v5.2.4
2018-09-08 15:41:20 -07:00
Dalton Hubble
4ba090feb0 Update kube-state-metrics from v1.3.1 to v1.4.0 2018-08-29 09:37:50 -07:00
Dalton Hubble
4882fe1053 Add docs for Azure Ingress and worker pools
* Azure worker pools must be in the same region as
the cluster itself unfortunately
2018-08-27 23:30:56 -07:00
Dalton Hubble
7eb09237f4 Update Calico from v3.1.3 to v3.2.1
* Add new bird and felix readiness checks
* Read MTU from ConfigMap veth_mtu
* Add RBAC read for serviceaccounts
* Remove invalid description from CRDs
2018-08-25 17:53:11 -07:00
Dalton Hubble
e58b424882 Fix firewall to allow etcd client traffic between controllers
* Broaden internal-etcd firewall rule to allow etcd client
traffic (2379) from other controller nodes
* Previously, kube-apiservers were only able to connect to their
node's local etcd peer. While master node outages were tolerated,
reaching a healthy peer took longer than neccessary in some cases
* Reduce time needed to bootstrap a cluster
2018-08-21 23:51:40 -07:00
Dalton Hubble
ea365b551a Fix docs mentions of ELBs to NLBs
* Typhoon AWS clusters use an NLB rather than an ELB,
since v1.10.5
* Add a few missing links in CHANGES
2018-08-21 21:40:06 -07:00
Dalton Hubble
bbf2c13eef Remove AWS security rule allowing ICMP packets to nodes
* Deny ICMP packets for consistency across Typhoon clusters on
various clouds and because there isn't much need to allow them
2018-08-21 21:16:16 -07:00
Dalton Hubble
da5d2c5321 Remove GCP firewall rule allowing Nginx Ingress health
* Nginx Ingress addon no longer uses hostNework so Prometheus may
scrape port 10254 via the CNI network, rather than via the host
address
2018-08-21 21:06:03 -07:00
Dalton Hubble
bec5250e73 Remove unofficial bare-metal *_networkds variables
* Remove controller_networkds and worker_networkds variables. These
variables were always listed as experimental, unsupported, and excluded
from documentation in anticipation of Container Linux Config snippets
* Use Container Linux Config snippets on bare-metal instead. They
provide safer, more powerful, and more elegant host customization
2018-08-13 23:33:29 -07:00
Dalton Hubble
dbdc3fc850 Add nginx-ingress addon manifests for bare-metal 2018-08-11 12:14:23 -07:00
Dalton Hubble
e00f97c578 Update nginx-ingress from 0.16.2 to 0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.0
2018-08-08 00:45:20 -07:00
Dalton Hubble
f7ebdf475d Update Kubernetes from v1.11.1 to v1.11.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1112
2018-08-07 21:57:25 -07:00
Dalton Hubble
edc250d62a Fix Kublet version for Fedora Atomic modules
* Release v1.11.1 erroneously left Fedora Atomic clusters using
the v1.11.0 Kubelet. The rest of the control plane ran v1.11.1
as expected
* Update Kubelet from v1.11.0 to v1.11.1 so Fedora Atomic matches
Container Linux
* Container Linux modules were not affected
2018-07-29 12:13:29 -07:00
Dalton Hubble
db64ce3312 Update etcd from v3.3.8 to v3.3.9
* https://github.com/coreos/etcd/blob/master/CHANGELOG-3.3.md#v339-2018-07-24
2018-07-29 11:27:37 -07:00
Dalton Hubble
7c327b8bf4 Update from bootkube v0.12.0 to v0.13.0 2018-07-29 11:20:17 -07:00
Dalton Hubble
e6720cf738 Update heapster from v1.5.3 to v1.5.4
* https://github.com/kubernetes/heapster/releases/tag/v1.5.4
2018-07-29 11:19:57 -07:00
Dalton Hubble
844f380b4e Update Grafana from v5.2.1 to v5.2.2
* https://github.com/grafana/grafana/releases/tag/v5.2.2
2018-07-29 11:12:56 -07:00
Dalton Hubble
4e7dfc115d Support Container Linux Config snippets on bare-metal 2018-07-25 23:14:54 -07:00
Dalton Hubble
d8d524d10b Update Kubernetes from v1.11.0 to v1.11.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1111
2018-07-20 00:41:27 -07:00
Dalton Hubble
02cd8eb8d3 Update Prometheus from v2.3.1 to v2.3.2
* https://github.com/prometheus/prometheus/releases/tag/v2.3.2
2018-07-14 14:25:49 -07:00
Dalton Hubble
3352388fe6 Update changelog and docs for release 2018-07-04 12:28:25 -07:00
Dalton Hubble
915f89d3c8 Update Fedora Atomic from 27 to 28 on bare-metal 2018-07-04 11:41:54 -07:00
Dalton Hubble
f40f60b83c Update Nginx Ingress controller from 0.15.0 to 0.16.2
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.16.2
* https://github.com/kubernetes/ingress-nginx/blob/master/Changelog.md
2018-07-02 22:06:22 -07:00
Dalton Hubble
6f958d7577 Replace kube-dns with CoreDNS
* Add system:coredns ClusterRole and binding
* Annotate CoreDNS for Prometheus metrics scraping
* Remove kube-dns deployment, service, & service account
* https://github.com/poseidon/terraform-render-bootkube/pull/71
* https://kubernetes.io/blog/2018/06/27/kubernetes-1.11-release-announcement/
2018-07-01 22:55:01 -07:00
Dalton Hubble
ee31074679 Promote Typhoon Google Cloud for Container Linux to stable 2018-07-01 22:52:27 -07:00
Dalton Hubble
18502d64d6 Update Fedora Atomic from 27 to 28 on GCP 2018-07-01 22:46:51 -07:00
Dalton Hubble
a3349b5c68 Update heapster from v1.5.2 to v1.5.3 2018-07-01 21:07:52 -07:00
Dalton Hubble
74dc6b0bf9 Update Grafana from 5.1.4 to 5.2.1
* http://docs.grafana.org/guides/whats-new-in-v5-2/
* https://github.com/grafana/grafana/releases/tag/v5.2.0
* https://github.com/grafana/grafana/releases/tag/v5.2.1
2018-07-01 20:55:34 -07:00
Dalton Hubble
fd1de27aef Remove deprecated ingress_static_ip and controllers_ipv4_public outputs 2018-07-01 20:47:46 -07:00
Dalton Hubble
93de7506ef Update Fedora Atomic from 27 to 28 on AWS 2018-06-30 18:55:18 -07:00
Dalton Hubble
8464b258d8 Update Kubernetes from v1.10.5 to v1.11.0
* Force apiserver to stop listening on 127.0.0.1:8080
* Remove deprecated Kubelet `--allow-privileged`. Defaults to
true. Use `PodSecurityPolicy` if limiting is desired
* https://github.com/kubernetes/kubernetes/releases/tag/v1.11.0
* https://github.com/poseidon/terraform-render-bootkube/pull/68
2018-06-27 22:47:35 -07:00
Dalton Hubble
855aec5af3 Clarify AWS module output names and changes 2018-06-23 15:29:13 -07:00
Dalton Hubble
0c4d59db87 Use global HTTP/TCP proxy load balancing for Ingress on GCP
* Switch Ingress from regional network load balancers to global
HTTP/TCP Proxy load balancing
* Reduce cost by ~$19/month per cluster. Google bills the first 5
global and regional forwarding rules separately. Typhoon clusters now
use 3 global and 0 regional forwarding rules.
* Worker pools no longer include an extraneous load balancer. Remove
worker module's `ingress_static_ip` output.
* Add `ingress_static_ipv4` output variable
* Add `worker_instance_group` output to allow custom global load
balancing
* Deprecate `controllers_ipv4_public` module output
* Deprecate `ingress_static_ip` module output. Use `ingress_static_ipv4`
2018-06-23 14:37:40 -07:00
Dalton Hubble
2eaf04c68b Drop hostNetwork from nginx-ingress addon
* Both flannel and Calico support host port via `portmap`
* Allows writing NetworkPolicies that reference ingress pods in `from`
or `to`. HostNetwork pods were difficult to write network policy for
since they could circumvent the CNI network to communicate with pods on
the same node.
2018-06-22 00:46:41 -07:00
Dalton Hubble
fb6f40051f Disable AWS detailed monitoring on worker nodes
* Basic monitoring (free) is sufficient for casual console browsing
* Detailed monitoring (paid) is not leveraged for CloudWatch anyway
* Favor Prometheus for cloud-agnostic metrics, aggregation, and alerting
2018-06-22 00:26:06 -07:00
Dalton Hubble
316f06df06 Combine NLBs to use one NLB per cluster
* Simplify clusters to come with a single NLB
* Listen for apiserver traffic on port 6443 and forward
to controllers (with healthy apiserver)
* Listen for ingress traffic on ports 80/443 and forward
to workers (with healthy ingress controller)
* Reduce cost of default clusters by 1 NLB ($18.14/month)
* Keep using CNAME records to the `ingress_dns_name` NLB and
the nginx-ingress addon for Ingress (up to a few million RPS)
* Users with heavy traffic (many million RPS) can create their
own separate NLB(s) for Ingress and use the new output worker
target groups
* Fix issue where additional worker pools come with an
extraneous network load balancer
2018-06-21 23:46:57 -07:00
Dalton Hubble
f4d3059b00 Update Kubernetes from v1.10.4 to v1.10.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1105
2018-06-21 22:51:39 -07:00
Dalton Hubble
6c5a1964aa Change kube-apiserver port from 443 to 6443
* Adjust firewall rules, security groups, cloud load balancers,
and generated kubeconfig's
* Facilitates some future simplifications and cost reductions
* Bare-Metal users who exposed kube-apiserver on a WAN via their
router or load balancer will need to adjust its configuration.
This is uncommon, most apiserver are on LAN and/or behind VPN
so no routing infrastructure is configured with the port number
2018-06-19 23:48:51 -07:00
Dalton Hubble
6e64634748 Update etcd from v3.3.7 to v3.3.8
* https://github.com/coreos/etcd/releases/tag/v3.3.8
2018-06-19 21:56:21 -07:00
Dalton Hubble
d5de41e07a Update Grafana from 5.1.3 to 5.1.4
* https://github.com/grafana/grafana/releases/tag/v5.1.4
2018-06-19 21:45:15 -07:00
Dalton Hubble
05b99178ae Update prometheus from v2.3.0 to v2.3.1
* https://github.com/prometheus/prometheus/releases/tag/v2.3.1
2018-06-19 21:43:50 -07:00
Dalton Hubble
ed0b781296 Fix possible deadlock for provisioning bare-metal clusters
* Closes #235
2018-06-14 23:15:28 -07:00
Dalton Hubble
51906bf398 Update etcd from v3.3.6 to v3.3.7 2018-06-14 22:46:16 -07:00
Stephen Demos
18dd7ccc09 Update CLUO from v0.6.0 to v0.7.0 2018-06-14 22:32:36 -07:00
Dalton Hubble
cbe646fba6 Label namespaces to ease writing Network Policies 2018-06-09 11:45:11 -07:00
Dalton Hubble
c166b2ba33 Update prometheus from v2.2.1 to v2.3.0 2018-06-09 11:43:10 -07:00
Dalton Hubble
79260c48f6 Update Kubernetes from v1.10.3 to v1.10.4 2018-06-06 23:23:11 -07:00
Dalton Hubble
589c3569b7 Update etcd from v3.3.5 to v3.3.6
* https://github.com/coreos/etcd/releases/tag/v3.3.6
2018-06-06 23:19:30 -07:00
Dalton Hubble
d32e6797ae Annotate Grafana so Prometheus scrapes metrics 2018-05-30 22:37:47 -07:00
Dalton Hubble
32a9a83190 Add Prometheus liveness and readiness probes 2018-05-30 22:34:07 -07:00
Dalton Hubble
6e968cd152 Update Calico from v3.1.2 to v3.1.3
* https://github.com/projectcalico/calico/releases/tag/v3.1.3
* https://github.com/projectcalico/cni-plugin/releases/tag/v3.1.3
2018-05-30 21:32:12 -07:00
Dalton Hubble
4ea1fde9c5 Update Kubernetes from v1.10.2 to v1.10.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1103
* Update Calico from v3.1.1 to v3.1.2
2018-05-21 21:38:43 -07:00
Dalton Hubble
1e2eec6487 Update Fedora Atomic from 27 to 28 on DigitalOcean
* Fedora Atomic 27 images disappeared from DigitalOcean and
forced this early update (there are known bugs)
2018-05-21 21:30:23 -07:00
Dalton Hubble
28d0891729 Annotate nginx-ingress addon for Prometheus auto-discovery
* Add Google Cloud firewall rule to allow worker to worker access
to health and metrics
2018-05-19 13:13:14 -07:00
Dalton Hubble
714419342e Update nginx-ingress from 0.14.0 to 0.15.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.15.0
2018-05-17 21:42:55 -07:00
Dalton Hubble
3701c0b1fe Update Grafana from v5.1.2 to v5.1.3
* https://github.com/grafana/grafana/releases/tag/v5.1.3
2018-05-17 21:36:09 -07:00
Dalton Hubble
0c3557e68e Allow Flatcar Linux os_channel on bare-metal
* Choose the Container Linux derivative Flatcar Linux on
bare-metal by setting os_channel to flatcar-stable, flatcar-beta
or flatcar-alpha
* As with Container Linux from Red Hat, the version (os_version)
must correspond to the channel being used
* Thank you to @dongsupark from Kinvolk
2018-05-17 20:09:36 -07:00
Dalton Hubble
adc6c6866d Rename container_linux_ bare-metal variables
* Allow for Container Linux derivatives
* Replace container_linux_channel variable with `os_channel`
* Replace `container_linux_version` variable with `os_version`
* Please change values `stable`, `beta`, or `alpha` to `coreos-stable`,
`coreos-beta`, `coreos-alpha` (action required!)
2018-05-16 22:40:39 -07:00
Dalton Hubble
9ac7b0655f Add bare-metal network_ip_autodetection_method variable for multi-NIC
* Allow setting the Calico host IPv4 address autodetection method
* Use Calico's default "first-found" method to support single NIC
and bonded NIC nodes
* Allow methods like `can-reach=IP` or `interface=REGEX` for multi
NIC nodes
* https://docs.projectcalico.org/v3.1/reference/node/configuration#ip-autodetection-methods
2018-05-15 23:27:34 -07:00
Dalton Hubble
c2b719dc75 Configure Prometheus to scrape Kubelets directly
* Use Kubelet bearer token authn/authz to scrape metrics
* Drop RBAC permission from nodes/proxy to nodes/metrics
* Stop proxying kubelet scrapes through the apiserver, since
this required higher privilege (nodes/proxy) and can add
load to the apiserver on large clusters
2018-05-14 23:06:50 -07:00
Dalton Hubble
37981f9fb1 Allow bearer token authn/authz to the Kubelet
* Require Webhook authorization to the Kubelet
* Switch apiserver X509 client cert org to systems:masters
to grant the apiserver admin and satisfy the authorization
requirement. kubectl commands like logs or exec that have
the apiserver make requests of a kubelet continue to work
as before
* https://kubernetes.io/docs/admin/kubelet-authentication-authorization/
* https://github.com/poseidon/typhoon/issues/215
2018-05-13 23:20:42 -07:00
Dalton Hubble
5eb11f5104 Allow Flatcar Linux os_image on AWS, rename os_channel
* Replace os_channel variable with os_image to align naming
across clouds. Users who set this option to stable, beta, or
alpha should now set os_image to coreos-stable, coreos-beta,
or coreos-alpha.
* Default os_image to coreos-stable. This continues to use
the most recent image from the stable channel as always.
* Allow Container Linux derivative Flatcar Linux by setting
os_image to `flatcar-stable`, `flatcar-beta`, `flatcar-alpha`
2018-05-12 11:41:58 -07:00
Dalton Hubble
f2ee75ac98 Require Terraform v0.11.x, drop v0.10.x support
* Raise minimum Terraform version to v0.11.0
* Terraform v0.11.x has been supported since Typhoon v1.9.2
and Terraform v0.10.x was last released in Nov 2017. I'd like
to stop worrying about v0.10.x and remove migration docs as
a later followup
* Migration docs docs/topics/maintenance.md#terraform-v011x
2018-05-10 02:20:46 -07:00
Dalton Hubble
8b8e364915 Update etcd from v3.3.4 to v3.3.5
* https://github.com/coreos/etcd/releases/tag/v3.3.5
2018-05-10 02:12:53 -07:00
Dalton Hubble
fb88113523 Disable default Google Analytics in Grafana addon
* Its come to my attention Grafana reports analytics data
by default. Typhoon's philosophy requires user permission
for data collection so the addon should have this disabled
* http://docs.grafana.org/installation/configuration/#analytics
2018-05-10 01:18:47 -07:00
Dalton Hubble
1854f5c104 Update Grafana from v5.1.1 to v5.1.2
* https://github.com/grafana/grafana/releases/tag/v5.1.2
2018-05-10 01:09:08 -07:00
Dalton Hubble
726b58b697 Update Grafana from v5.0.4 to v5.1.1
* https://github.com/grafana/grafana/releases/tag/v5.1.1
* https://github.com/grafana/grafana/releases/tag/v5.1.0
2018-05-07 22:05:19 -07:00
Dalton Hubble
a54e3c0da1 Fix Prometheus data dir to /var/lib/prometheus
* A data volume (emptyDir) is mounted to /var/lib/prometheus
* Users could swap emptyDir for any desired volume if data
persistence is desired. Prometheus previously defaulted to
keeping its data in ./data relative to /prometheus. Override
this behavior to store data in /var/lib/prometheus
2018-05-01 22:05:27 -07:00
Dalton Hubble
cc29530ba0 Allow preemptible workers on AWS via spot instances
* Add `worker_price` to allow worker spot instances. Defaults
to empty string for the worker autoscaling group to use regular
on-demand instances.
* Add `spot_price` to internal `workers` module for spot worker
pools
* Note: Unlike GCP `preemptible` workers, spot instances require
you to pick a bid price.
2018-04-29 13:31:17 -07:00
Dalton Hubble
385584b712 Add changelog notes for release 2018-04-29 12:04:44 -07:00
Dalton Hubble
32ddfa94e1 Update Kubernetes from v1.10.1 to v1.10.2
* https://github.com/kubernetes/kubernetes/releases/tag/v1.10.2
2018-04-28 00:27:00 -07:00
Dalton Hubble
681450aa0d Update etcd from v3.3.3 to v3.3.4
* https://github.com/coreos/etcd/releases/tag/v3.3.4
2018-04-27 23:57:26 -07:00
Dalton Hubble
fafa028052 Add Typhoon for Fedora Atomic to changelog 2018-04-27 23:55:59 -07:00