typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-10-03 23:44:38 +02:00

Author	SHA1	Message	Date
Arve Knudsen	aa275796cb	Fix DigitalOcean controller and worker ipv4/ipv6 outputs (#594) * Fix controller and worker ipv4/ipv6 outputs to be lists of strings * With Terraform v0.11 syntax, an enclosing list was required to coerce the output to be a list of strings * With Terraform v0.12 syntax, the enclosing list shouldn't be needed	2019-12-02 21:20:47 -08:00
Dalton Hubble	26674083b6	Update Grafana from v6.5.0 to v6.5.1 * https://github.com/grafana/grafana/releases/tag/v6.5.1	2019-11-28 14:11:25 -08:00
Dalton Hubble	030a4cec19	Update Grafana from v6.4.4 to v6.5.0 * https://grafana.com/docs/guides/whats-new-in-v6-5/	2019-11-25 22:45:58 -08:00
Dalton Hubble	ddea7dc452	Use new resource dashboards in Grafana deployment * kubernetes-mixin pod resource dashboards were split into two ConfigMap parts because they provide richer networking details * New dashboards have been used by the author at the global level, but were missing in the per-cluster Grafana tracked here	2019-11-25 22:27:11 -08:00
Dalton Hubble	4b485a9bf2	Fix recent deletion of bootstrap module pinned SHA * Fix deletion of bootstrap module pinned SHA, which was introduced recently through an automation mistake creating https://github.com/poseidon/typhoon/pull/589	2019-11-21 22:34:09 -08:00
Dalton Hubble	4704b494f0	Update mkdocs-material from v4.4.3 to v4.5.0 * Upgrade dependency packages as well	2019-11-18 23:05:29 -08:00
Dalton Hubble	525ae23305	Add node-exporter alerts and Grafana dashboard * Add Prometheus alerts from node-exporter * Add Grafana dashboard nodes.json, from node-exporter * Not adding recording rules, since those are only used by some node-exporter USE dashboards not being included	2019-11-16 13:47:20 -08:00
Dalton Hubble	8a9e8595ae	Fix terraform fmt formatting	2019-11-13 23:44:02 -08:00
Dalton Hubble	19ee57dc04	Use GCP region_instance_group_manager version block format * terraform-provider-google v2.19.0 deprecates `instance_template` within `google_compute_region_instance_group_manager` in order to support a scheme with multiple version blocks. Adapt our single version to the new format to resolve deprecation warnings. * Fixes: Warning: "instance_template": [DEPRECATED] This field will be replaced by `version.instance_template` in 3.0.0 * Require terraform-provider-google v2.19.0+ (action required)	2019-11-13 17:41:13 -08:00
Dalton Hubble	0e4ee5efc9	Add small CPU resource requests to static pods * Set small CPU requests on static pods kube-apiserver, kube-controller-manager, and kube-scheduler to align with upstream tooling and for edge cases * Effectively, a practical case for these requests hasn't been observed. However, a small static pod CPU request may offer a slight benefit if a controller became overloaded and the below mechanisms were insufficient Existing safeguards: * Control plane nodes are tainted to isolate them from ordinary workloads. Even dense workloads can only compress CPU resources on worker nodes. * Control plane static pods use the highest priority class, so contention favors control plane pods (over say node-exporter) and CPU is compressible too. See: https://github.com/poseidon/terraform-render-bootstrap/pull/161	2019-11-13 17:18:45 -08:00
Dalton Hubble	a271b9f340	Update CoreDNS from v1.6.2 to v1.6.5 * Add health `lameduck` option 5s. Before CoreDNS shuts down, it will wait and report unhealthy for 5s to allow time for plugins to shutdown cleanly * Minor bug fixes over a few releases * https://coredns.io/2019/08/31/coredns-1.6.3-release/ * https://coredns.io/2019/09/27/coredns-1.6.4-release/ * https://coredns.io/2019/11/05/coredns-1.6.5-release/	2019-11-13 16:47:44 -08:00
Dalton Hubble	cb0598e275	Adopt Terraform v0.12 templatefile function * Update terraform-render-bootstrap module to adopt the Terraform v0.12 templatefile function feature to replace the use of terraform-provider-template's `template_dir` * Require Terraform v0.12.6+ which adds `for_each` Background: * `template_dir` was added to `terraform-provider-template` to add support for template directory rendering in CoreOS Tectonic Kubernetes distribution (~2017) * Terraform v0.12 introduced a native `templatefile` function and v0.12.6 introduced native `for_each` support (July 2019) that makes it possible to replace `template_dir` usage	2019-11-13 16:33:36 -08:00
Dalton Hubble	ad117f4592	Update recommended Terraform provider versions * Recommend provider plugin version tested against v1.16.3	2019-11-13 13:53:46 -08:00
Dalton Hubble	42b6df89c8	Update Prometheus from v2.14.0-rc.0 to v2.14.0 * https://github.com/prometheus/prometheus/releases/tag/v2.14.0	2019-11-13 13:41:11 -08:00
Dalton Hubble	d7061020ba	Update Kubernetes from v1.16.2 to v1.16.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1163	2019-11-13 13:05:15 -08:00
Dalton Hubble	a8b7792338	Update Grafana from v6.4.3 to v6.4.4 * https://github.com/grafana/grafana/releases/tag/v6.4.4	2019-11-07 12:00:25 -08:00
Dalton Hubble	a3807086d4	Update Prometheus from v2.13.1 to v2.14.0-rc.0 * Happy PromCon 2019! * https://github.com/prometheus/prometheus/releases/tag/v2.14.0-rc.0	2019-11-07 11:48:23 -08:00
Dalton Hubble	2c163503f1	Update etcd from v3.4.2 to v3.4.3 * etcd v3.4.3 builds with Go v1.12.12 instead of v1.12.9 and adds a few minor metrics fixes * https://github.com/etcd-io/etcd/compare/v3.4.2...v3.4.3	2019-11-07 11:41:01 -08:00
Dalton Hubble	0034a15711	Update Calico from v3.10.0 to v3.10.1 * https://docs.projectcalico.org/v3.10/release-notes/	2019-11-07 11:38:32 -08:00
Konstantinos Koukopoulos	38957163cb	Output resource_group_id in Azure (#577) * Add an output variable `resource_group_id` to the azure module	2019-10-31 01:05:04 -07:00
Dalton Hubble	d4573092b5	Improve Kubelet and Compute Resource dashboards * Add cluster filter to Kubelet dashboard * Add network details in resource dashboards * https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/275 * https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/284 * https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/285	2019-10-28 02:22:15 -07:00
Dalton Hubble	4775e9d0f7	Upgrade Calico v3.9.2 to v3.10.0 * Allow advertising Kubernetes service ClusterIPs to BGPPeer routers via a BGPConfiguration * Improve EdgeRouter docs about routes and BGP * https://docs.projectcalico.org/v3.10/release-notes/ * https://docs.projectcalico.org/v3.10/networking/advertise-service-ips	2019-10-27 14:13:41 -07:00
Dalton Hubble	d418045929	Switch kube-proxy from iptables mode to ipvs mode * Kubernetes v1.11 considered kube-proxy IPVS mode GA * Many problems were found #321 * Since then, major blockers seem to have been addressed	2019-10-27 00:37:41 -07:00
Dalton Hubble	eb7b6d39f2	Improve minor aspects of CoreDNS and nginx-ingress dashboards * Add default 10s refresh rate to custom dashboards to match those from Kubernetes * Show labels for "instance" as "pod" for clarity * Add cluster filter for internal use	2019-10-20 23:16:55 -07:00
Dalton Hubble	33d4c2fd68	Add explicit annotation for Prometheus port to scrape * Without the prometheus.io/port annotation, Prometheus service discovery can scrape other Prometheus ports that may be available. * For example, Prometheus sidecars (not included) may be scraped and that may be unintended	2019-10-20 16:05:09 -07:00
Dalton Hubble	de90cb9246	Remove kube-state-metrics addon-resizer * addon-resizer is outdated and has been dropped from kube-state-metrics examples. Those using it should look to the cluster-proportional-vertical-autoscaler. * Eliminate addon-resizer log spew * Remove associated Role and RoleBinding * Also fix kube-state-metrics readinessProbe port	2019-10-20 16:03:29 -07:00
Dalton Hubble	68da420adc	Refresh Prometheus rules/alerts and Grafana dashboards * Update Prometheus rules/alerts and Grafana dashboards * Remove dashboards that were moved to node-exporter, they may be added back later if valuable * Remove kube-prometheus based rules/alerts (ClockSkew alert)	2019-10-19 17:43:47 -07:00
Dalton Hubble	130c97f8eb	Update Prometheus from v2.13.0 to v2.13.1 * https://github.com/prometheus/prometheus/releases/tag/v2.13.1	2019-10-18 00:10:25 -07:00
Dalton Hubble	271d2f6b52	Update Grafana from v6.4.2 to v6.4.3 * https://github.com/grafana/grafana/releases/tag/v6.4.3	2019-10-18 00:08:39 -07:00
Dalton Hubble	0595915a19	Cleanup CHANGES notes v1.16.2	2019-10-15 23:25:45 -07:00
Dalton Hubble	e6bc5143aa	Default to Calico as the CNI provider on Azure/DigitalOcean * Change `networking` default from flannel to calico on Azure and DigitalOcean * AWS, bare-metal, and Google Cloud continue to default to Calico (as they have since v1.7.5) * Typhoon now defaults to using Calico and supporting NetworkPolicy on all platforms	2019-10-15 23:15:40 -07:00
Dalton Hubble	e4ac1027c8	Update Grafana from v6.4.1 to v6.4.2 * https://github.com/grafana/grafana/releases/tag/v6.4.2	2019-10-15 22:58:43 -07:00
Dalton Hubble	24fc440d83	Update Kubernetes from v1.16.1 to v1.16.2 * Update Calico from v3.9.1 to v3.9.2	2019-10-15 22:42:52 -07:00
Dalton Hubble	a6702573a2	Update etcd from v3.4.1 to v3.4.2 * https://github.com/etcd-io/etcd/releases/tag/v3.4.2	2019-10-15 00:06:15 -07:00
Dalton Hubble	69188af565	Rename CLUO label from "app" to "name" * Match the labeling pattern in other addons	2019-10-15 00:05:02 -07:00
Dalton Hubble	d874bdd17d	Update bootstrap module control plane manifests and type constraints * Remove unneeded control plane flags that correspond to defaults * Adopt Terraform v0.12 type constraints in bootstrap module	2019-10-06 21:09:30 -07:00
Dalton Hubble	5b9dab6659	Introduce list of detail objects for bare-metal machines * Define bare-metal `controllers` and `workers` as a complex type list(object({name=string, mac=string, domain=string})) to allow clusters with many machines to be defined more cleanly * Remove `controller_names` list variable * Remove `controller_macs` list variable * Remove `controller_domains` list variable * Remove `worker_names` list variable * Remove `worker_macs` list variable * Remove `worker_domains` list variable	2019-10-06 20:22:45 -07:00
Dalton Hubble	5196709fe0	Update docs, CHANGES, and mkdocs-material * Update mkdocs-material from v4.4.2 to v4.4.3 * Update recommended Terraform provider versions * Cleanup the changelog before release v1.16.1	2019-10-06 18:41:25 -07:00
Dalton Hubble	ab72f1ab2d	Update Prometheus from v2.12.0 to v2.13.0 * https://github.com/prometheus/prometheus/releases/tag/v2.13.0	2019-10-06 18:22:20 -07:00
Dalton Hubble	5ef4155e08	Detect most recent Fedora CoreOS AMI in region * Detect the most recent Fedora CoreOS AMI to allow usage of Fedora CoreOS in supported regions (previously just us-east-1) * Unpin the Fedora CoreOS AMI image which was pinned to images that had been checked. This does mean if Fedora publishes a broken image, it will be selected * Filter out "dev" images which have similar naming	2019-10-06 18:13:55 -07:00
Dalton Hubble	15c4b793c3	Use new Fedora CoreOS kernel/initrd/raw asset names * Fedora CoreOS changed the kernel, initramfs, and raw image asset download paths and names in 30.20191002.0	2019-10-06 17:31:21 -07:00
Dalton Hubble	36ed53924f	Add stricter types for bare-metal modules * Review variables available in bare-metal kubernetes modules for Container Linux and Fedora CoreOS * Deprecate cluster_domain_suffix variable * Remove deprecated container_linux_oem variable	2019-10-06 17:18:50 -07:00
Dalton Hubble	19de38b30d	Fix Prometheus etcd metrics scraping * Prometheus was configured to use kubernetes discovery of etcd targets based on nodes matching the node label node-role.kubernetes.io/controller=true * Kubernetes v1.16 stopped permitting node role labels node-role.kubernetes.io/* so Typhoon renamed these labels (no longer any association with roles) to node.kubernetes.io/controller=true * As a result, Prometheus didn't discover etcd targets, etcd metrics were missing, etcd alerts were ineffective, and the etcd Grafana dashboard was empty * Introduced: https://github.com/poseidon/typhoon/pull/543	2019-10-03 19:07:05 -07:00
Dalton Hubble	995824fa6d	Add stricter types for DigitalOcean module * Review variables available in DigitalOcean kubernetes module and sync with documentation * Promote Calico for DigitalOcean and Azure beyond experimental (it's the primary mode I've used since it was introduced)	2019-10-02 21:48:24 -07:00
Dalton Hubble	1c5ed84fc2	Update Kubernetes from v1.16.0 to v1.16.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1161	2019-10-02 21:31:55 -07:00
Dalton Hubble	ca7d62720e	Update Grafana from v6.3.6 to v6.4.1 * https://github.com/grafana/grafana/releases/tag/v6.4.1	2019-10-02 20:36:05 -07:00
Dalton Hubble	26f8d76755	Update kube-state-metrics from v1.7.2 to v1.8.0 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.8.0	2019-10-01 20:50:33 -07:00
Dalton Hubble	fdd6882a87	Add stricter types to Azure modules * Review variables available in Azure kubernetes and workers modules and sync with documentation * Fix internal workers module default type to Standard_DS1_v2	2019-09-30 22:20:20 -07:00
Dalton Hubble	f82266ac8c	Add stricter types for GCP modules * Review variables available in google-cloud kubernetes and workers modules and in documentation	2019-09-30 22:04:35 -07:00
Dalton Hubble	7bcf2d7831	Update nginx-ingress from v0.25.1 to v0.26.1 * Add lifecycle hook to allow draining connections for up to 5 minutes	2019-09-30 22:01:07 -07:00
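
As a rough illustration of the variable shape described in commit 5b9dab6659 above, the bare-metal `controllers` and `workers` inputs might be declared as below. This is a minimal sketch written from the commit message, not copied from the module source: the descriptions and the `controller_names` output are hypothetical and only show how a per-field list can still be derived from the list of objects. It assumes Terraform v0.12 syntax.

```hcl
# Sketch only: variable shapes follow the commit message for 5b9dab6659.
# Descriptions and the output below are illustrative assumptions.

variable "controllers" {
  type = list(object({
    name   = string
    mac    = string
    domain = string
  }))
  description = "Controller machine details (hypothetical description)"
}

variable "workers" {
  type = list(object({
    name   = string
    mac    = string
    domain = string
  }))
  description = "Worker machine details (hypothetical description)"
}

# Hypothetical output: derives the list of names that the removed
# `controller_names` variable used to carry, using a Terraform v0.12
# splat expression over the list of objects.
output "controller_names" {
  value = var.controllers[*].name
}
```

A caller would then pass one object per machine, for example `controllers = [{ name = "node1", mac = "52:54:00:a1:9c:ae", domain = "node1.example.com" }]`, where the name, MAC address, and domain are placeholder values.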

823 Commits