typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-10-03 22:34:38 +02:00

Author	SHA1	Message	Date
Dalton Hubble	fc277eaab6	Document the GCP DNS admin requirement for cluster provisioning * Configure the google terraform provider to use GCP service account credentials with compute and dns admin privileges	2019-03-02 10:54:35 -08:00
Dalton Hubble	a08adc92b5	Update nginx-ingress from v0.22.0 to v0.23.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.23.0	2019-03-01 01:18:54 -08:00
Dalton Hubble	d42f42df4e	Re-measure cluster provision times and document	2019-03-01 01:15:08 -08:00
Dalton Hubble	4ff7fe2c29	Update Grafana dashboards from upstreams	2019-02-28 23:22:07 -08:00
Dalton Hubble	f598307998	Update Kubernetes from v1.13.3 to v1.13.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1134	2019-02-28 22:47:43 -08:00
Dalton Hubble	8ae552ebda	Update documentation for use with Ubiquiti EdgeOS * Show creation of a PXE-enabled network boot environment when using dnsmasq as the DHCP server * Recommend TFTP be served from /config/tftpboot since /config is preserved between firmware upgrades * Recommend compiling undionly.kpxe from source to enable TLS features * Add a note that equal-cost multi-path service IP routing (e.g. for ingress) requires EdgeOS v2.0. Previously, it was known that TLS handshakes couldn't be completed with packet balancing. I've verified this is no longer the case when using the v2.0 EdgeOS firmware; ECMP works as expected.	2019-02-27 23:36:27 -08:00
Dalton Hubble	daee5a9d60	Update Grafana from v6.0.0-beta3 to v6.0.0 * https://github.com/grafana/grafana/releases/tag/v6.0.0 * http://docs.grafana.org/guides/whats-new-in-v6-0/	2019-02-25 21:43:43 -08:00
Dalton Hubble	73ae5d5649	Update Calico from v3.5.1 to v3.5.2 * https://docs.projectcalico.org/v3.5/releases/	2019-02-25 21:23:13 -08:00
Dalton Hubble	42d7222f3d	Add a readiness probe to CoreDNS * https://github.com/poseidon/terraform-render-bootkube/pull/115	2019-02-23 13:25:23 -08:00
Dalton Hubble	d10c2b4cb9	Update Grafana from v6.0.0-beta2 to v6.0.0-beta3 * Update Grafana dashboards	2019-02-23 13:03:25 -08:00
Dalton Hubble	7f8572030d	Upgrade to support terraform-provider-google v2.0+ * Support terraform-provider-google v1.19.0, v1.19.1, v1.20.0 and v2.0+ (and allow for future 2.x.y releases) * Require terraform-provider-google v1.19.0 or newer. v1.19.0 introduced `network_interface` fields `network_ip` and `nat_ip` to deprecate `address` and `assigned_nat_ip`. Those deprecated fields are removed in terraform-provider-google v2.0 * https://github.com/terraform-providers/terraform-provider-google/releases/tag/v2.0.0	2019-02-20 02:33:32 -08:00
Dalton Hubble	4294bd0292	Assign Pod Priority classes to critical cluster and node components * Assign pod priorityClassNames to critical cluster and node components (higher is higher priority) to inform node out-of-resource eviction order and scheduler preemption and scheduling order * Priority Admission Controller has been enabled since Typhoon v1.11.1	2019-02-19 22:21:39 -08:00
Dalton Hubble	ba4c5de052	Set the Google Cloud minimum CPU platform to Intel Haswell * Intel Haswell or better is available in every zone around the world * Neither Kubernetes nor Typhoon has a particular minimum processor family. However, a few Google Cloud zones still default to Sandy/Ivy Bridge (scheduled to shift April 2019). Price is only based on machine type so it is beneficial to opt for the next processor family * Intel Haswell is a suitable minimum since it still allows plenty of liberty in choosing any region or machine type * Likely a slight increase to preemption probability in a few zones, but any lower probability on Sandy/Ivy Bridge is due to lower desirability as they're phased out * https://cloud.google.com/compute/docs/regions-zones/	2019-02-18 12:55:04 -08:00
Dalton Hubble	e483c81ce9	Improve Prometheus rules and alerts and Grafana dashboards * Collate upstream rules, alerts, and dashboards and tune for use in Typhoon * Previously, a well-chosen (but older) set of rules, alerts, and dashboards were maintained to reflect metric name changes	2019-02-18 12:19:23 -08:00
Dalton Hubble	6fa3b8a13f	Upgrade Grafana to v6.0.0-beta2 and enable Explore UI * Upgrade Grafana from v5.4.3 to v6.0.0-beta2 * Enable Grafana Explore UI while still using only the Viewer role (inspect/edit without saving) * http://docs.grafana.org/guides/whats-new-in-v6-0/	2019-02-17 13:26:42 -08:00
Dalton Hubble	ac95e83249	Update mkdocs-material from v3.3.0 to v4.0.1	2019-02-16 15:55:38 -08:00
Dalton Hubble	d988822741	Document and recommend terraform-provider-matchbox v0.2.3 * https://github.com/coreos/terraform-provider-matchbox/releases/tag/v0.2.3	2019-02-16 15:07:49 -08:00
Dalton Hubble	170ef74eea	Remove Nginx Ingress default backend * nginx-ingress no longer requires a configured default-backend, it will respond with its own 404 page starting in v0.21.0 * https://github.com/kubernetes/ingress-nginx/pull/3196	2019-02-16 14:18:15 -08:00
Dalton Hubble	b13a651cfe	Drop metrics that are unset, high cardinality, or extraneous * https://github.com/coreos/prometheus-operator/pull/2387 * https://github.com/coreos/prometheus-operator/pull/1959	2019-02-10 23:56:11 -08:00
Dalton Hubble	9c59f393a5	Add Kubernetes pod name to metrics discovered from service endpoints * Prometheus queries from some upstreams use joins of node-exporter and kube-state-metrics metrics by (namespace,pod). Add the Kubernetes pod name to service endpoint metrics * Rename the kubernetes_namespace field to namespace * Honor labels since kube-state-metrics already include a `pod` field that should not be overridden	2019-02-10 23:54:30 -08:00
Dalton Hubble	3e4b3bfb04	Raise nginx-ingress liveness/readiness timeout * Under heavy load, avoid timeouts causing nginx-ingress restarts https://github.com/kubernetes/ingress-nginx/pull/3737	2019-02-09 12:53:09 -08:00
Dalton Hubble	584088397c	Update etcd from v3.3.11 to v3.3.12 * https://github.com/etcd-io/etcd/releases/tag/v3.3.12	2019-02-09 11:54:54 -08:00
Dalton Hubble	0200058e0e	Update Calico from v3.5.0 to v3.5.1 * Fix in confd https://github.com/projectcalico/confd/pull/205	2019-02-09 11:49:31 -08:00
Dalton Hubble	d5537405e1	Add CHANGES note about reducing the pod eviction timeout v1.13.3	2019-02-02 14:54:18 -08:00
Dalton Hubble	949ce21fb2	Update Prometheus from v2.7.0 to v2.7.1 * https://github.com/prometheus/prometheus/releases/tag/v2.7.1	2019-02-02 00:13:24 -08:00
Dalton Hubble	ccd96c37da	Update Kubernetes from v1.13.2 to v1.13.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1133	2019-02-01 23:26:13 -08:00
Carlos Cobo	acd539f865	Fix architecture title for DigitalOcean (#390)	2019-02-01 23:20:06 -08:00
Dalton Hubble	244a1a601a	Switch CoreDNS to use the forward plugin instead of proxy * Use the forward plugin to forward to upstream resolvers, instead of the proxy plugin. The forward plugin is reported to be a faster alternative since it can re-use open sockets * https://coredns.io/explugins/forward/ * https://coredns.io/plugins/proxy/ * https://github.com/kubernetes/kubernetes/issues/73254	2019-01-30 22:25:23 -08:00
Dalton Hubble	d02af3d40d	Update mkdocs-material from v3.2.0 to v3.3.0 * Fix minor docs typos and errors * Allow a transient version of the six PyPI package, the docs build system can use the 0.12.0 (0.11.0 broke sync tools so pinning to 0.10.0 was previously needed)	2019-01-29 23:16:57 -08:00
Dalton Hubble	130daeac26	Update Prometheus from v2.6.1 to v2.7.0	2019-01-29 22:31:20 -08:00
Dalton Hubble	1ab06f69d7	Update flannel from v0.10.0 to v0.11.0 * https://github.com/coreos/flannel/releases/tag/v0.11.0	2019-01-29 21:51:25 -08:00
Dalton Hubble	eb08593eae	Fix azure provider warning, rename a public_ip field * azurerm_public_ip (used internally) added a field `allocation_method` to replace the field `public_ip_address_allocation` (deprecated) * Require terraform-provider-azurerm v1.21+ * https://github.com/terraform-providers/terraform-provider-azurerm/pull/2576	2019-01-27 17:52:35 -08:00
Dalton Hubble	e9659a8539	Update Calico from v3.4.0 to v3.5.0 * https://docs.projectcalico.org/v3.5/releases/	2019-01-27 16:34:30 -08:00
Dalton Hubble	6b87132aa1	Fix per platform/OS links on the docs home page * Considering the reader of each, the GitHub README module links can go to module source code and docs module links can go to the associated tutorial docs for the platform/OS	2019-01-26 16:50:00 -08:00
Dalton Hubble	f5ff003d0e	Update node-exporter from v0.15.2 to v0.17.0 * node-exporter renamed multiple metrics that are reflected in changes to Prometheus rules and Grafana dashboard expressions	2019-01-22 01:14:00 -08:00
Dalton Hubble	d697dd46dc	Allow kube-state-metrics PodDisruptionBudget metrics * Update kube-state-metrics ClusterRole to allow collecting poddisruptionbudget metrics (exported as kube_poddisruptionbudget_) https://github.com/kubernetes/kube-state-metrics/pull/551 * Bump addon-resizer from v1.7 to v1.8.4	2019-01-22 01:12:32 -08:00
Dalton Hubble	2f3097ebea	Update nginx-ingress from v0.21.0 to v0.22.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.22.0	2019-01-16 23:01:22 -08:00
Dalton Hubble	f4d3508578	Update CoreDNS from v1.3.0 to v1.3.1 * https://coredns.io/2019/01/13/coredns-1.3.1-release/	2019-01-15 22:50:25 -08:00
Dalton Hubble	67fb9602e7	Update Prometheus from v2.6.0 to v2.6.1 * https://github.com/prometheus/prometheus/releases/tag/v2.6.1	2019-01-15 21:13:40 -08:00
Dalton Hubble	c8a85fabe1	Update Grafana from v5.4.2 to v5.4.3 * https://github.com/grafana/grafana/releases/tag/v5.4.3	2019-01-15 21:13:16 -08:00
Dalton Hubble	7eafa59d8f	Fix instance shutdown automatic worker deletion on clouds * Fix a regression caused by lowering the Kubelet TLS client certificate to system:nodes group (#100) since dropping cluster-admin dropped the Kubelet's ability to delete nodes. * On clouds where workers can scale down (manual terraform apply, AWS spot termination, Azure low priority deletion), worker shutdown runs the delete-node.service to remove a node to prevent NotReady nodes from accumulating * Allow Kubelets to delete cluster nodes via system:nodes group. Kubelets acting with system:node and kubelet-delete ClusterRoles is still an improvement over acting as cluster-admin	2019-01-14 23:27:48 -08:00
Dalton Hubble	679079b242	Add AWS ingress_zone_id output with NLB DNS name's Route53 zone id * DNS zones served by AWS Route53 may use AWS's special alias records (other DNS providers would use a CNAME) to resolve the ingress NLB. Alias records require the NLB DNS name's DNS zone id (not the cluster `dns_zone_id`)	2019-01-13 16:45:52 -08:00
Dalton Hubble	1d27dc6528	Update kube-state-metrics exporter from v1.4.0 to v1.5.0 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.5.0	2019-01-12 14:24:57 -08:00
Dalton Hubble	b74cc8afd2	Update etcd from v3.3.10 to v3.3.11 * https://github.com/etcd-io/etcd/releases/tag/v3.3.11	2019-01-12 14:17:25 -08:00
Dalton Hubble	1d66ad33f7	Change AWS worker modules' default type from t2.small to t3.small * Worker instance types weren't updated in #365 v1.13.2	2019-01-12 00:07:48 -08:00
Dalton Hubble	4d32b79c6f	Update Kubernetes from v1.13.1 to v1.13.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1132	2019-01-12 00:00:53 -08:00
Dalton Hubble	df4c0ba05d	Use HTTPS liveness probes for kube-scheduler and kube-controller-manager * Disable kube-scheduler and kube-controller-manager HTTP ports	2019-01-09 20:56:50 -08:00
Dalton Hubble	bfe0c74793	Enable the certificates.k8s.io API to issue cluster certificates * System components that require certificates signed by the cluster CA can submit a CSR to the apiserver, have an administrator inspect and approve it, and be issued a certificate * Configure kube-controller-manager to sign Approved CSRs using the cluster CA private key * Admins are responsible for approving or denying CSRs, otherwise, no certificate is issued. Read the Kubernetes docs carefully and verify the entity making the request and the authorization level * https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster	2019-01-06 17:33:37 -08:00
Dalton Hubble	60c70797ec	Use a single format of the admin kubeconfig * Use a single admin kubeconfig for initial bootkube bootstrap and for use by a human admin. Previously, an admin kubeconfig without a named context was used for bootstrap and direct usage with KUBECONFIG=path, while one with a named context was used for `kubectl config use-context` style usage. Confusing. * Provide the admin kubeconfig via `assets/auth/kubeconfig`, `assets/auth/CLUSTER-config`, or output `kubeconfig-admin`	2019-01-05 14:57:18 -08:00
Dalton Hubble	6795a753ea	Update CoreDNS from v1.2.6 to v1.3.0 * https://coredns.io/2018/12/15/coredns-1.3.0-release/	2019-01-05 13:35:03 -08:00

735 Commits