Commit Graph

1384 Commits

Author SHA1 Message Date
Dalton Hubble 5ef4155e08 Detect most recent Fedora CoreOS AMI in region
* Detect the most recent Fedora CoreOS AMI to allow usage
of Fedora CoreOS in supported regions (previously just
us-east-1)
* Unpin the Fedora CoreOS AMI image which was pinned to
images that had been checked. This does mean if Fedora
publishes a broken image, it will be selected
* Filter out "dev" images which have similar naming
2019-10-06 18:13:55 -07:00
Dalton Hubble 15c4b793c3 Use new Fedora CoreOS kernel/initrd/raw asset names
* Fedora CoreOS changed the kernel, initramfs, and raw
image asset download paths and names in 30.20191002.0
2019-10-06 17:31:21 -07:00
Dalton Hubble 36ed53924f Add stricter types for bare-metal modules
* Review variables available in bare-metal kubernetes modules
for Container Linux and Fedora CoreOS
* Deprecate cluster_domain_suffix variable
* Remove deprecated container_linux_oem variable
2019-10-06 17:18:50 -07:00
Dalton Hubble 19de38b30d Fix Prometheus etcd metrics scraping
* Prometheus was configured to use kubernetes discovery
of etcd targets based on nodes matching the node label
node-role.kubernetes.io/controller=true
* Kubernetes v1.16 stopped permitting node role labels
node-role.kubernetes.io/* so Typhoon renamed these labels
(no longer any association with roles) to
node.kubermetes.io/controller=true
* As a result, Prometheus didn't discover etcd targets,
etcd metrics were missing, etcd alerts were ineffective,
and the etcd Grafana dashboard was empty
* Introduced: https://github.com/poseidon/typhoon/pull/543
2019-10-03 19:07:05 -07:00
Dalton Hubble 995824fa6d Add stricter types for DigitalOcean module
* Review variables available in DigitalOcean kubernetes
module and sync with documentation
* Promote Calico for DigitalOcean and Azure beyond experimental
(its the primary mode I've used since it was introduced)
2019-10-02 21:48:24 -07:00
Dalton Hubble 1c5ed84fc2 Update Kubernetes from v1.16.0 to v1.16.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1161
2019-10-02 21:31:55 -07:00
Dalton Hubble ca7d62720e Update Grafana from v6.3.6 to v6.4.1
* https://github.com/grafana/grafana/releases/tag/v6.4.1
2019-10-02 20:36:05 -07:00
Dalton Hubble 26f8d76755 Update kube-state-metrics from v1.7.2 to v1.8.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.8.0
2019-10-01 20:50:33 -07:00
Dalton Hubble fdd6882a87 Add stricter types to Azure modules
* Review variables available in Azure kubernetes and workers
modules and sync with documentation
* Fix internal workers module default type to Standard_DS1_v2
2019-09-30 22:20:20 -07:00
Dalton Hubble f82266ac8c Add stricter types for GCP modules
* Review variables available in google-cloud kubernetes
and workers modules and in documentation
2019-09-30 22:04:35 -07:00
Dalton Hubble 7bcf2d7831 Update nginx-ingress from v0.25.1 to v0.26.1
* Add lifecycle hook to allow draining connections for
up to 5 minutes
2019-09-30 22:01:07 -07:00
Dalton Hubble 78bfff0afe Update Fedora CoreOS to testing 30.20190905.0
* Fix duplicated cluster_domain_suffix variable
2019-09-29 11:34:31 -07:00
Dalton Hubble a6de245d8a Rename bootkube.tf to bootstrap.tf
* Typhoon no longer uses the bootkube project
2019-09-29 11:30:49 -07:00
Dalton Hubble 96afa6a531 Update Calico from v3.8.2 to v3.9.1
* https://docs.projectcalico.org/v3.9/release-notes/
2019-09-29 11:22:53 -07:00
Dalton Hubble a407ff72df Add stricter types for AWS modules and update docs
* Review variables available in AWS kubernetes and workers
modules and documentation
* Switching between spot and on-demand has worked since
Terraform v0.12
* Generally, there are too many knobs. Less useful ones
should be de-emphasized or removed
* Remove `cluster_domain_suffix` documentation
2019-09-29 11:19:38 -07:00
Dalton Hubble f453c54956 Update Grafana from v6.3.5 to v6.3.6
* https://github.com/grafana/grafana/releases/tag/v6.3.6
2019-09-28 15:13:46 -07:00
Dalton Hubble 3e34fb075b Update etcd from v3.4.0 to v3.4.1
* https://github.com/etcd-io/etcd/releases/tag/v3.4.1
2019-09-28 15:09:57 -07:00
Dalton Hubble 9bfb1c5faf Update docs and variable types for worker node_labels
* Document worker pools `node_labels` variable to set the
initial node labels for a homogeneous set of workers
* Document `worker_node_labels` convenience variable to
set the initial node labels for default worker nodes
2019-09-28 15:05:12 -07:00
Valer Cara 99ab81f79c Add node_labels variable in workers modules to set initial node labels (#550)
* Also add `worker_node_labels` variable in `kubernetes` modules to set
initial node labels for the default workers
2019-09-28 14:59:24 -07:00
Dalton Hubble 8703f2c3c5 Fix missing comma separator on bare-metal and DO
* Introduced in bare-metal and DigitalOcean in #544
while addressing possible ordering race, but after
the v1.16 upgrade validation
2019-09-23 11:05:26 -07:00
Dalton Hubble 078f084220 Update CHANGES and docs for v1.16.0 release 2019-09-22 17:37:23 -07:00
Dalton Hubble 81a1ae38e6 Update Terraform provider plugin versions
* Recommend provider plugin versions that Typhoon
authors use
2019-09-22 17:14:30 -07:00
Dalton Hubble 5b06e0e869 Organize and cleanup Kubelet ExecStartPre
* Sort Kubelet ExecStartPre mkdir commands
* Remove unused inactive-manifests and checkpoint-secrets
directories (were used by bootkube self-hosting)
2019-09-19 00:38:34 -07:00
Dalton Hubble b951aca66f Create /etc/kubernetes/manifests before asset copy
* Fix issue (present since bootkube->bootstrap switch) where
controller asset copy could fail if /etc/kubernetes/manifests
wasn't created in time on platforms using path activation for
the Kubelet (observed on DigitalOcean, also possible on
bare-metal)
2019-09-19 00:30:53 -07:00
Dalton Hubble 9da3725738 Update Kubernetes from v1.15.3 to v1.16.0
* Drop `node-role.kubernetes.io/master` and
`node-role.kubernetes.io/node` node labels
* Kubelet (v1.16) now rejects the node labels used
in the kubectl get nodes ROLES output
* https://github.com/kubernetes/kubernetes/issues/75457
2019-09-18 22:53:06 -07:00
Dalton Hubble fd12f3612b Rename CA organization from bootkube to typhoon
* Rename the organization in generated CA certificates from
bootkube to typhoon. Avoid confusion with the bootkube project
* https://github.com/poseidon/terraform-render-bootstrap/pull/149
2019-09-14 16:56:53 -07:00
Dalton Hubble 96b646cf6d Rename bootkube modules to bootstrap
* Rename render module from bootkube to bootstrap. Avoid
confusion with the kubernetes-incubator/bootkube tool since
it is no longer used
* Use the poseidon/terraform-render-bootstrap Terraform module
(formerly poseidon/terraform-render-bootkube)
* https://github.com/poseidon/terraform-render-bootkube/pull/149
2019-09-14 16:24:32 -07:00
Dalton Hubble b15c60fa2f Update CHANGES for control plane static pod switch
* Remove old references to bootkube / self-hosted
2019-09-09 22:48:48 -07:00
Dalton Hubble db947537d1 Migrate GCP, DO, Azure to static pod control plane
* Run a kube-apiserver, kube-scheduler, and kube-controller-manager
static pod on each controller node. Previously, kube-apiserver was
self-hosted as a DaemonSet across controllers and kube-scheduler
and kube-controller-manager were a Deployment (with 2 or
controller_count many replicas).
* Remove bootkube bootstrap and pivot to self-hosted
* Remove pod-checkpointer manifests (no longer needed)
2019-09-09 22:37:31 -07:00
Dalton Hubble c933bdfc26 Migrate Container Linux AWS to static pod control plane
* Run a kube-apiserver, kube-scheduler, and kube-controller-manager
static pod on each controller node. Previously, kube-apiserver was
self-hosted as a DaemonSet across controllers and kube-scheduler
and kube-controller-manager were a Deployment (with 2 or
controller_count many replicas).
* Remove bootkube bootstrap and pivot to self-hosted
* Remove pod-checkpointer manifests (no longer needed)
2019-09-09 22:37:31 -07:00
Dalton Hubble 21632c6674 Migrate Container Linux bare-metal to static pod control plane
* Run a kube-apiserver, kube-scheduler, and kube-controller-manager
static pod on each controller node. Previously, kube-apiserver was
self-hosted as a DaemonSet across controllers and kube-scheduler
and kube-controller-manager were a Deployment (with 2 or
controller_count many replicas).
* Remove bootkube bootstrap and pivot to self-hosted
* Remove pod-checkpointer manifests (no longer needed)
2019-09-09 22:37:31 -07:00
Dalton Hubble 74780fb09f Migrate Fedora CoreOS bare-metal to static pod control plane
* Run a kube-apiserver, kube-scheduler, and kube-controller-manager
static pod on each controller node. Previously, kube-apiserver was
self-hosted as a DaemonSet across controllers and kube-scheduler
and kube-controller-manager were a Deployment (with 2 or
controller_count many replicas).
* Remove bootkube bootstrap and pivot to self-hosted
* Remove pod-checkpointer manifests (no longer needed)
2019-09-09 22:37:31 -07:00
Dalton Hubble b60a2ecdf7 Migrate Fedora CoreOS AWS to a static pod control plane
* Run a kube-apiserver, kube-scheduler, and kube-controller-manager
static pod on each controller node. Previously, kube-apiserver was
self-hosted as a DaemonSet across controllers and kube-scheduler
and kube-controller-manager were a Deployment (with 2 or
controller_count many replicas).
* Remove bootkube bootstrap and pivot to self-hosted
* Remove pod-checkpointer manifests (no longer needed)
2019-09-09 22:37:31 -07:00
Dalton Hubble 4a7083d94a Change Azure default controller_type and worker_type
* Change default controller_type to Standard_B2s. A B2s is cheaper
by $17/month and provides 2 vCPU, 4GB RAM (vs 1 vCPU, 3.5GB RAM)
* Change default worker_type to Standard_DS1_v2. F1 was the previous
generation. The DS1_v2 is newer, similar cost, more memory, and still
supports Low Priority mode, if desired
2019-09-09 22:34:28 -07:00
Dalton Hubble c20683067d Update etcd from v3.3.15 to v3.4.0
* https://github.com/etcd-io/etcd/releases/tag/v3.4.0
2019-09-08 15:32:49 -07:00
Dalton Hubble dc436b8fe9 Update Grafana from v6.3.4 to v6.3.5
* https://github.com/grafana/grafana/releases/tag/v6.3.5
2019-09-07 14:21:59 -07:00
Dalton Hubble efb9a2d09a Update Fedora CoreOS bare-metal docs for 30.20190801.0 2019-09-04 21:11:22 -07:00
Dalton Hubble e8d586f3b3 Enable QoS on Fedora CoreOS controllers
* Kubelet race should be fixed in Kubernetes v1.15.1
* https://github.com/kubernetes/kubernetes/issues/79046
* Reverts temporary mitigation https://github.com/poseidon/typhoon/pull/515
2019-09-04 21:09:45 -07:00
Dalton Hubble b74f470701 Recommend updating terraform-provider-ct from v0.3.2 to v0.4.0
* v0.4.0 adds a "strict" mode we'll start using in future and
also adds support for Fedora CoreOS
* https://github.com/poseidon/terraform-provider-ct/releases/tag/v0.4.0
2019-08-31 16:07:22 -07:00
Dalton Hubble 45bc52d156 Update Grafana from v6.3.3 to v6.3.4
* https://github.com/grafana/grafana/releases/tag/v6.3.4
2019-08-31 15:59:13 -07:00
Dalton Hubble 4d5f962d76 Update CoreDNS from v1.5.0 to v1.6.2
* https://coredns.io/2019/06/26/coredns-1.5.1-release/
* https://coredns.io/2019/07/03/coredns-1.5.2-release/
* https://coredns.io/2019/07/28/coredns-1.6.0-release/
* https://coredns.io/2019/08/02/coredns-1.6.1-release/
* https://coredns.io/2019/08/13/coredns-1.6.2-release/
2019-08-31 15:57:42 -07:00
Dalton Hubble e7d805d9a4 Sync recommended versions of Terraform providers for clouds
* Align Terraform provider plugin versions with those tested against
2019-08-27 22:00:08 -07:00
Dalton Hubble d95bf2d1ea Update mkdocs-material from v4.4.0 to v4.4.2 2019-08-27 21:57:20 -07:00
Dalton Hubble c42139beaa Update etcd from v3.3.14 to v3.3.15
* No functional changes, just changes to vendoring tools
(go modules -> glide). Still, update to v3.3.15 anyway
* https://github.com/etcd-io/etcd/compare/v3.3.14...v3.3.15
2019-08-19 15:05:21 -07:00
Dalton Hubble 35c2763ab0 Update Kubernetes from v1.15.2 to v1.15.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md/#v1153
2019-08-19 14:49:24 -07:00
Dalton Hubble 2067356ae9 Update Fedora CoreOS to testing 30.20190801.0 2019-08-18 21:46:59 -07:00
Dalton Hubble 8f412e2f09 Update etcd from v3.3.13 to v3.3.14
* https://github.com/etcd-io/etcd/releases/tag/v3.3.14
2019-08-18 21:05:06 -07:00
Dalton Hubble 4ef2eb7e6b Update Prometheus from v2.11.2 to v2.12.0
* https://github.com/prometheus/prometheus/releases/tag/v2.12.0
2019-08-18 20:59:44 -07:00
Dalton Hubble 99990e3cbb Use stable IDs for etcd, CoreDNS, and Ngnix dashboards
* Use unique dashboard ID so that multiple replicas of Grafana
serve dashboards with uniform paths
* Fix issue where refreshing a dashboard served by one replica
could show a 404 unless the request went to the same replica
2019-08-18 12:45:49 -07:00
Dalton Hubble 3c3708d58e Update Calico from v3.8.1 to v3.8.2
* https://docs.projectcalico.org/v3.8/release-notes/
2019-08-16 15:38:23 -07:00