* Use etcd v3.3 --listen-metrics-urls to expose only metrics
data via http://0.0.0.0:2381 on controllers
* Add Prometheus discovery for etcd peers on controller nodes
* Temporarily drop two noisy Prometheus alerts
* Change EBS volume type from `standard` ("prior generation)
to `gp2`. Prometheus alerts are tuned for SSDs
* Other platforms have fast enough disks by default
* Calico isn't viable on Digital Ocean because their firewalls
do not support IP-IP protocol. Its not viable to run a cluster
without firewalls just to use Calico.
* Remove the caveat note. Don't allow users to shoot themselves
in the foot
* Allow groups of workers to be defined and joined to
a cluster (i.e. worker pools)
* Move worker resources into a Terraform submodule
* Output variables needed for passing to worker pools
* Add usage docs for AWS worker pools (advanced)
* Set defaults for internal worker module's count,
machine_type, and os_image
* Allow "pools" of homogeneous workers to be created
using the google-cloud/kubernetes/workers module
* Template terraform-render-bootkube's multi-line kubeconfig
output using the right indentation
* Add `kubeconfig` variable to google-cloud controllers and
workers Terraform submodules
* Remove `kubeconfig_*` variables from google-cloud controllers
and workers Terraform submodules
* Upcoming releases may begin to use features that require
the `terraform-provider-ct` plugin v0.2.1
* New users should use `terraform-provider-ct` v0.2.1. Existing
users can safely drop-in replace their v0.2.0 plugin with v0.2.1
as well (location referenced in ~/.terraformrc).
* See https://github.com/poseidon/typhoon/pull/145
* Set Kubelet search path for flexvolume plugins
to /var/lib/kubelet/volumeplugins
* Add support for flexvolume plugins on AWS, GCE, and DO
* See 9548572d98 which added flexvolume support for bare-metal
* Stop maintaining Kubernetes Dashboard manifests. Dashboard takes
an unusual approch to security and is often a security weak point.
* Recommendation: Use `kubectl` and avoid using the dashboard. If
you must use the dashboard, explore hardening and consider using an
authenticating proxy rather than the dashboard's auth features
* Deployments now belong to the apps/v1 API group
* DaemonSets now belong to the apps/v1 API group
* RBAC types now belong to the rbac.authorization.k8s.io/v1 API group
* Add flannel service account and limited RBAC cluster role
* Change DaemonSets to tolerate NoSchedule and NoExecute taints
* Remove deprecated apiserver --etcd-quorum-read flag
* Update Calico from v3.0.1 to v3.0.2
* Add Calico GlobalNetworkSet CRD
* https://github.com/poseidon/terraform-render-bootkube/pull/44
* Update CLUO from v0.4.1 to v0.5.0
* Earlier versions of CLUO fail to drain nodes on Kubernetes 1.9
so nodes drain one at a time repeatedly and Container Linux OS
updates are not applied to nodes.
* Check current OS versions via `kubectl get nodes --show-labels`
* Create separate container-linux-install profiles (and
cached-container-linux-install) for each node in a cluster
* Fix contention bug on bare-metal during `terraform apply`.
With only a global install profile, terraform would create
(or retain) the profile for each cluster and try to delete
it for each cluster being deleted. As a result, in some cases
apply had to be run multiple times before terraform's repr
of constraints was satisfied (profile deleted and recreated)
* Allow Container Linux install properties to vary between
clusters, such as using a different Container Linux channel
or version for different clusters
* Add explicit "providers" section to modules for Terraform v0.11.x
* Retain support for Terraform v0.10.4+
* Add migration guide from Terraform v0.10.x to v0.11.x for those managing
existing clusters (action required!)
* Container Linux stable and beta now provide Docker 17.09 (instead
of 1.12). Recommend images which provide 17.09.
* Older clusters (with CLUO addon) auto-update node's Container Linux version
and will begin using Docker 17.09.
* Adapt the coreos/prometheus-operator alerting rules for Typhoon,
https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus/manifests
* Add controller manager and scheduler shim services to let
prometheus discover them via service endpoints
* Fix several alert rules to use service endpoint discovery
* A few rules still don't do much, but they default to green
* Change controllers ASG to heterogeneous EC2 instances
* Create DNS records for each controller's private IP for etcd
* Change etcd to run on-host, across controllers (etcd-member.service)
* Reduce time to bootstrap a cluster
* Deprecate self-hosted-etcd on the AWS platform
* Change controllers from a managed group to individual instances
* Create discrete DNS records to each controller's private IP for etcd
* Change etcd to run on-host, across controllers (etcd-member.service)
* Reduce time to bootstrap a cluster
* Deprecate self-hosted-etcd on the Google Cloud platform
* Controller preemption is not safe or covered in documentation. Delete
the option, the variable is a holdover from old experiments
* Note, worker_preemeptible is still a great feature that's supported
* Change Google Cloud module to require the `region` variable
* Workers are created in random zones within the given region
* Tolerate Google Cloud zone failures or capacity issues
* If workers are preempted (if enabled), replacement instances can
be drawn from any zone in the region, which should avoid scheduling
issues that were possible before if a single zone aggressively
preempts instances (presumably due to Google Cloud capacity)
* Kubernetes v1.8.2 fixes a memory leak in the v1.8.1 apiserver
* Switch to using the `gcr.io/google_containers/hyperkube` for the
on-host kubelet and shutdown drains
* Update terraform-render-bootkube manifests generation
* Update flannel from v0.8.0 to v0.9.0
* Add `hairpinMode` to flannel CNI config
* Add `--no-negcache` to kube-dns dnsmasq
* Run etcd peers with TLS across controller nodes
* Deprecate self-hosted-etcd on the Digital Ocean platform
* Distribute etcd TLS certificates as part of initial provisioning
* Check the status of etcd by running `systemctl status etcd-member`