typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-02-18 22:51:27 +01:00

Author	SHA1	Message	Date
Dalton Hubble	ab7913a061	Accept initial worker node labels and taints map on bare-metal * Add `worker_node_labels` map from node name to a list of initial node label strings * Add `worker_node_taints` map from node name to a list of initial node taint strings * Unlike cloud platforms, bare-metal node labels and taints are defined via a map from node name to list of labels/taints. Bare-metal clusters may have heterogeneous hardware so per node labels and taints are accepted * Only worker node names are allowed. Workloads are not scheduled on controller nodes so altering their labels/taints isn't suitable ``` module "mercury" { ... worker_node_labels = { "node2" = ["role=special"] } worker_node_taints = { "node2" = ["role=special:NoSchedule"] } } ``` Related: https://github.com/poseidon/typhoon/issues/429	2020-03-09 00:12:02 -07:00
Dalton Hubble	7b0ea23cdc	Upgrade terraform-provider-azurerm to v2.0+ * Add support for `terraform-provider-azurerm` v2.0+. Require `terraform-provider-azurerm` v2.0+ and drop v1.x support since the Azure provider major release is not backwards compatible * Use Azure's new Linux VM and Linux VM Scale Set resources * Change controller's Azure disk caching to None * Associate subnets (in addition to NICs) with security groups (aesthetic) * If set, change `worker_priority` from `Low` to `Spot` (action required) Related: * https://www.terraform.io/docs/providers/azurerm/guides/2.0-upgrade-guide.html	2020-03-08 17:40:13 -07:00
Dalton Hubble	c4683c5bad	Refresh Prometheus alerts and Grafana dashboards * Add 2 min wait before KubeNodeUnreachable to be less noisy on premeptible clusters * Add a BlackboxProbeFailure alert for any failing probes for services annotated `prometheus.io/probe: true`	2020-03-02 20:08:37 -08:00
Dalton Hubble	51cee6d5a4	Change Container Linux etcd-member to fetch with docker:// * Quay has historically generated ACI signatures for images to facilitate rkt's notions of verification (it allowed authors to actually sign images, though `--trust-keys-from-https` is in use since etcd and most authors don't sign images). OCI standardization didn't adopt verification ideas and checking signatures has fallen out of favor. * Fix an issue where Quay no longer seems to be generating ACI signatures for new images (e.g. quay.io/coreos/etcd:v.3.4.4) * Don't be alarmed by rkt `--insecure-options=image`. It refers to disabling image signature checking (i.e. docker pull doesn't check signatures either) * System containers for Kubelet and bootstrap have transitioned to the docker:// transport, so there is precedent and this brings all the system containers on Container Linux controllers into alignment	2020-03-02 19:57:45 -08:00
Dalton Hubble	87f9a2fc35	Add automatic worker deletion on Fedora CoreOS clouds * On clouds where workers can scale down or be preempted (AWS, GCP, Azure), shutdown runs delete-node.service to remove a node a prevent NotReady nodes from lingering * Add the delete-node.service that wasn't carried over from Container Linux and port it to use podman	2020-02-29 20:22:03 -08:00
Dalton Hubble	6de5cf5a55	Update etcd from v3.4.3 to v3.4.4 * https://github.com/etcd-io/etcd/releases/tag/v3.4.4	2020-02-29 16:19:29 -08:00
Dalton Hubble	3250994c95	Use a route table with separate (rather than inline) routes * Allow users to extend the route table using a data reference and adding route resources (e.g. unusual peering setups) * Note: Internally connecting AWS clusters can reduce cross-cloud flexibility and inhibits blue-green cluster patterns. It is not recommended	2020-02-25 23:21:58 -08:00
Dalton Hubble	f4d260645c	Update node-exporter from v0.18.1 to v1.0.0-rc.0 * Update mdadm alert rule; node-exporter adds `state` label to `node_md_disks` and removes `node_md_disks_active` * https://github.com/prometheus/node_exporter/releases/tag/v1.0.0-rc.0	2020-02-25 22:29:52 -08:00
Dalton Hubble	d9219a6722	Update nginx-ingress from v0.29.0 to v0.30.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.30.0	2020-02-25 22:11:59 -08:00
Dalton Hubble	60c7eb85ee	Update nginx-ingress from v0.28.0 to v0.29.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.29.0	2020-02-22 15:57:59 -08:00
Dalton Hubble	4c964b56a0	Update kube-state-metrics from v1.9.4 to v1.9.5 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.5	2020-02-22 15:21:10 -08:00
Dalton Hubble	1fbd6835f2	Update Grafana from v6.6.1 to v6.6.2 * https://github.com/grafana/grafana/releases/tag/v6.6.2	2020-02-22 15:19:24 -08:00
Dalton Hubble	e4d977bfcd	Fix worker_node_labels for initial Fedora CoreOS * Add Terraform strip markers to consume beginning and trailing whitespace in templated Kubelet arguments for podman (Fedora CoreOS only) * Fix initial `worker_node_labels` being quietly ignored on Fedora CoreOS cloud platforms that offer the feature * Close https://github.com/poseidon/typhoon/issues/650	2020-02-22 15:12:35 -08:00
Dalton Hubble	4a38fb5927	Update CoreDNS from v1.6.6 to v1.6.7 * https://coredns.io/2020/01/28/coredns-1.6.7-release/	2020-02-18 21:46:19 -08:00
Dalton Hubble	7ca03e5219	Update Prometheus from v1.15.2 to v1.16.0 * https://github.com/prometheus/prometheus/releases/tag/v2.16.0	2020-02-14 12:10:56 -08:00
Dalton Hubble	362b3fac5c	Add guide for Typhoon with Flatcar Linux on DigitalOcean * Add docs on manually uploading a Flatcar Linux DigitalOcean bin image as a custom image and using a data reference * Set status of Flatcar Linux on DigitalOcean to alpha * IPv6 is not supported for DigitalOcean custom images	2020-02-14 12:08:58 -08:00
Dalton Hubble	32db59b9eb	Update CHANGELOG sections and links	2020-02-14 12:05:51 -08:00
Dalton Hubble	008817b0aa	Promote Fedora CoreOS AWS/bare-metal to beta * Remove alpha warnings from docs headers	2020-02-13 14:25:22 -08:00
Dalton Hubble	49d3b9e6b3	Set docker log driver to json-file on Fedora CoreOS * Fix the last minor issue for Fedora CoreOS clusters to pass CNCF's Kubernetes conformance tests * Kubelet supports a seldom used feature `kubectl logs --limit-bytes=N` to trim a log stream to a desired length. Kubelet handles this in the CRI driver. The Kubelet docker shim only supports the limit bytes feature when Docker is configured with the default `json-file` logging driver * CNCF conformance tests started requiring limit-bytes be supported, indirectly forcing the log driver choice until either the Kubelet or the conformance tests are fixed * Fedora CoreOS defaults Docker to use `journald` (desired). For now, as a workaround to offer conformant clusters, the log driver can be set back to `json-file`. RHEL CoreOS likely won't have noticed the non-conformance since its using crio runtime * https://github.com/kubernetes/kubernetes/issues/86367 Note: When upstream has a fix, the aim is to drop the docker config override and use the journald default	2020-02-11 23:00:38 -08:00
Dalton Hubble	1243f395d1	Update Kubernetes from v1.17.2 to v1.17.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.17.md#v1173	2020-02-11 20:22:14 -08:00
Dalton Hubble	846f11097f	Update Fedora CoreOS kernel arguments to align with upstream * Align bare-metal kernel arguments with upstream docs * Add missing initrd argument which can cause issues if not present. Fix #638 * Add tty0 and ttyS0 consoles (matches Container Linux) * Remove unused coreos.inst=yes Related: https://docs.fedoraproject.org/en-US/fedora-coreos/bare-metal/	2020-02-11 20:11:19 -08:00
Dalton Hubble	ba84f86dc7	Add guide for Typhoon with Flatcar Linux on Google Cloud * Add docs on manually uploading a Flatcar Linux GCE/GCP gzipped tarball image as a Compute Engine image for use with the Typhoon container-linux module * Set status of Flatcar Linux on Google Cloud to alpha	2020-02-11 19:38:40 -08:00
Dalton Hubble	34c3d7cc39	Update Grafana from v6.6.0 to v6.6.1 * https://github.com/grafana/grafana/releases/tag/v6.6.1	2020-02-08 14:50:33 -08:00
Dalton Hubble	ca96a1335c	Update Calico from v3.11.2 to v3.12.0 * https://docs.projectcalico.org/release-notes/#v3120 * Remove reverse packet filter override, since Calico no longer relies on the setting * https://github.com/coreos/fedora-coreos-tracker/issues/219 * https://github.com/projectcalico/felix/pull/2189	2020-02-06 00:43:33 -08:00
Dalton Hubble	e339fbd2b6	Update kube-state-metrics from v1.9.3 to v1.9.4 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.4	2020-02-04 21:33:34 -08:00
Dalton Hubble	8cc303c9ac	Add module for Fedora CoreOS on Google Cloud * Add Typhoon Fedora CoreOS on Google Cloud as alpha * Add docs on uploading the Fedora CoreOS GCP gzipped tarball to Google Cloud storage to create a boot disk image	2020-02-01 15:21:40 -08:00
Dalton Hubble	b19ba16afa	Update nginx-ingress from v0.27.1 to v0.28.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.28.0	2020-01-30 18:00:23 -08:00
Dalton Hubble	d127a7345c	Update Grafana from v6.5.3 to v6.6.0 * https://github.com/grafana/grafana/releases/tag/v6.6.0	2020-01-27 20:46:32 -08:00
Dalton Hubble	5643ad525f	Promote Fedora CoreOS from preview to alpha in docs * Add an announcement to the website as well	2020-01-23 08:47:18 -08:00
Dalton Hubble	d5b7ce8f27	Update kube-state-metrics from v1.9.2 to v1.9.3 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.3	2020-01-23 00:03:16 -08:00
Dalton Hubble	1cda5bcd2a	Update Kubernetes from v1.17.1 to v1.17.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md#v1172	2020-01-21 18:27:39 -08:00
Dalton Hubble	bda73264f7	Update nginx-ingress from v0.26.1 to v0.27.1 * Change runAsUser from 33 to 101 for new alpine-based image * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.27.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.27.1	2020-01-20 15:22:16 -08:00
Dalton Hubble	dd930a2ff9	Update bare-metal Fedora CoreOS image location * Use Fedora CoreOS production download streams (change) * Use live PXE kernel and initramfs images * https://getfedora.org/coreos/download/ * Update docs example to use public images (cache is still recommended at large scale) and stable stream	2020-01-20 14:44:06 -08:00
Dalton Hubble	03ff3a9cf3	Update kube-state-metrics from v1.9.1 to v1.9.2 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.2	2020-01-18 15:32:10 -08:00
Dalton Hubble	48703f9906	Update Grafana from v6.5.2 to v6.5.3 * https://github.com/grafana/grafana/releases/tag/v6.5.3	2020-01-18 15:30:39 -08:00
Dalton Hubble	7daabd28b5	Update Calico from v3.11.1 to v3.11.2 * https://docs.projectcalico.org/v3.11/release-notes/	2020-01-18 13:45:24 -08:00
Dalton Hubble	0e2fc89f78	Update kube-state-metrics from v1.9.0 to v1.9.1 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.1	2020-01-11 14:15:55 -08:00
Dalton Hubble	b1f521fc4a	Allow terraform-provider-google v3.x plugin versions * Typhoon Google Cloud is compatible with `terraform-provider-google` v3.x releases * No v3.x specific features are used, so v2.19+ provider versions are still allowed, to ease migrations	2020-01-11 14:07:18 -08:00
Dalton Hubble	73588cfad3	Update Prometheus from v2.15.1 to v2.15.2 * https://github.com/prometheus/prometheus/releases/tag/v2.15.2	2020-01-06 22:08:34 -08:00
Dalton Hubble	bb586b60da	Reduce Prometheus addon's node-exporter tolerations * Change node-exporter DaemonSet tolerations from tolerating all possible NoSchedule taints to tolerating the master taint and the not ready taint (we'd like metrics regardless) * Users who add custom node taints must add their custom taints to the addon node-exporter DaemonSet. As an addon, its expected users copy and manipulate manifests out-of-band in their own systems	2020-01-06 21:24:24 -08:00
Dalton Hubble	43e05b9131	Enable kube-proxy metrics and allow Prometheus scrapes * Configure kube-proxy --metrics-bind-address=0.0.0.0 (default 127.0.0.1) to serve metrics on 0.0.0.0:10249 * Add firewall rules to allow Prometheus (resides on a worker) to scrape kube-proxy service endpoints on controllers or workers * Add a clusterIP: None service for kube-proxy endpoint discovery	2020-01-06 21:11:18 -08:00
Dalton Hubble	b2eb3e05d0	Disable Kubelet 127.0.0.1.10248 healthz endpoint * Kubelet runs a healthz server listening on 127.0.0.1:10248 by default. Its unused by Typhoon and can be disabled * https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/	2019-12-29 11:23:25 -08:00
Dalton Hubble	f1f4cd6fc0	Inline Container Linux kubelet.service, deprecate kubelet-wrapper * Change kubelet.service on Container Linux nodes to ExecStart Kubelet inline to replace the use of the host OS kubelet-wrapper script * Express rkt run flags and volume mounts in a clear, uniform way to make the Kubelet service easier to audit, manage, and understand * Eliminate reliance on a Container Linux kubelet-wrapper script * Typhoon for Fedora CoreOS developed a kubelet.service that similarly uses an inline ExecStart (except with podman instead of rkt) and a more minimal set of volume mounts. Adopt the volume improvements: * Change Kubelet /etc/kubernetes volume to read-only * Change Kubelet /etc/resolv.conf volume to read-only * Remove unneeded /var/lib/cni volume mount Background: * kubelet-wrapper was added in CoreOS around the time of Kubernetes v1.0 to simplify running a CoreOS-built hyperkube ACI image via rkt-fly. The script defaults are no longer ideal (e.g. rkt's notion of trust dates back to quay.io ACI image serving and signing, which informed the OCI standard images we use today, though they still lack rkt's signing ideas). * Shipping kubelet-wrapper was regretted at CoreOS, but remains in the distro for compatibility. The script is not updated to track hyperkube changes, but it is stable and kubelet.env overrides bridge most gaps * Typhoon Container Linux nodes have used kubelet-wrapper to rkt/rkt-fly run the Kubelet via the official k8s.gcr.io hyperkube image using overrides (new image registry, new image format, restart handling, new mounts, new entrypoint in v1.17). * Observation: Most of what it takes to run a Kubelet container is defined in Typhoon, not in kubelet-wrapper. The wrapper's value is now undermined by having to workaround its dated defaults. Typhoon may be better served defining Kubelet.service explicitly * Typhoon for Fedora CoreOS developed a kubelet.service without the use of a host OS kubelet-wrapper which is both clearer and eliminated some volume mounts	2019-12-29 11:17:26 -08:00
Dalton Hubble	11565ffa8a	Update Calico from v3.10.2 to v3.11.1 * https://docs.projectcalico.org/v3.11/release-notes/	2019-12-28 11:08:03 -08:00
Dalton Hubble	a4e843693f	Update Prometheus from v2.15.0 to v2.15.1 * https://github.com/prometheus/prometheus/releases/tag/v2.15.1	2019-12-26 09:12:55 -05:00
Dalton Hubble	f48e43c0b1	Update Prometheus from v2.14.0 to v2.15.0 * https://github.com/prometheus/prometheus/releases/tag/v2.15.0	2019-12-24 10:52:19 -05:00
Dalton Hubble	daa8d9d9ec	Update CoreDNS from v1.6.5 to v1.6.6 * https://coredns.io/2019/12/11/coredns-1.6.6-release/	2019-12-22 10:47:19 -05:00
Dalton Hubble	52d11096dc	Update kube-state-metrics from v1.9.0-rc.1 to v1.9.0 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0-rc.1 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0-rc.0	2019-12-20 13:53:37 -08:00
Dalton Hubble	1b9fa2e688	Update Grafana from v6.5.1 to v6.5.2 * https://github.com/grafana/grafana/releases/tag/v6.5.2	2019-12-14 15:25:48 -08:00
Dalton Hubble	f69dc2ea0f	Update CHANGES and tutorial notes for release * Update recommended Terraform and provider plugin versions * Update the rough count of resources created per cluster since its not been refreshed in a while (will vary based on cluster options)	2019-12-10 23:03:39 -08:00

... 3 4 5 6 7 ...

696 Commits