Fix Prometheus etcd metrics scraping

* Prometheus was configured to use kubernetes discovery
of etcd targets based on nodes matching the node label
node-role.kubernetes.io/controller=true
* Kubernetes v1.16 stopped permitting node role labels
node-role.kubernetes.io/* so Typhoon renamed these labels
(no longer any association with roles) to
node.kubernetes.io/controller=true
* As a result, Prometheus didn't discover etcd targets,
etcd metrics were missing, etcd alerts were ineffective,
and the etcd Grafana dashboard was empty
* Introduced: https://github.com/poseidon/typhoon/pull/543
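
The label rename matters because Prometheus derives a discovery meta label name from each node label key. A minimal sketch of that mapping, assuming Prometheus sanitizes label names by replacing every character outside `[a-zA-Z0-9_]` with an underscore (the helper name `node_meta_label` is hypothetical, for illustration only):

```python
import re

# Assumption: approximates Prometheus label-name sanitization, where any
# character outside [a-zA-Z0-9_] becomes an underscore.
_INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def node_meta_label(node_label_key: str) -> str:
    """Return the node-discovery meta label name for a Kubernetes node label key."""
    return "__meta_kubernetes_node_label_" + _INVALID_CHARS.sub("_", node_label_key)


# Label key used before Kubernetes v1.16 and the renamed key used afterwards.
old_meta = node_meta_label("node-role.kubernetes.io/controller")
new_meta = node_meta_label("node.kubernetes.io/controller")

print(old_meta)
print(new_meta)
```

The two keys map to different meta label names, so a matcher written against the old name finds no value on nodes that carry only the renamed label.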
Dalton Hubble 2019-10-03 18:56:51 -07:00
parent 995824fa6d
commit 19de38b30d
2 changed files with 3 additions and 1 deletions


@@ -35,6 +35,8 @@ Notable changes between versions.
 #### Addons
+* Fix Prometheus etcd target discovery and scraping ([#561](https://github.com/poseidon/typhoon/pull/561))
+  * Fix node label matcher for etcd target discovery (regressed in v1.16.0)
 * Update kube-state-metrics from v1.7.2 to v1.8.0
 * Update nginx-ingress from v0.25.1 to [v0.26.1](https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.26.1) ([#555](https://github.com/poseidon/typhoon/pull/555))
   * Add lifecycle hook to allow draining for up to 5 minutes


@@ -115,7 +115,7 @@ data:
 - role: node
 scheme: http
 relabel_configs:
-- source_labels: [__meta_kubernetes_node_label_node_role_kubernetes_io_controller]
+- source_labels: [__meta_kubernetes_node_label_node_kubernetes_io_controller]
   action: keep
   regex: 'true'
 - action: labelmap
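
To illustrate why the stale matcher dropped every node, here is a rough sketch of how a `keep` relabel rule behaves. It assumes that a missing source label is treated as an empty string, that source values are joined with `;`, and that the regex must match the whole joined value. Python's `re.fullmatch` stands in for Prometheus's anchored RE2 matching, and the function name `keep_target` is hypothetical:

```python
import re

OLD_LABEL = "__meta_kubernetes_node_label_node_role_kubernetes_io_controller"
NEW_LABEL = "__meta_kubernetes_node_label_node_kubernetes_io_controller"


def keep_target(labels: dict, source_labels: list, regex: str, separator: str = ";") -> bool:
    """Sketch of a relabel 'keep' action: keep the target only on a full regex match."""
    # Assumption: labels absent from the target contribute an empty string.
    value = separator.join(labels.get(name, "") for name in source_labels)
    return re.fullmatch(regex, value) is not None


# A controller node as discovered after the label rename, and a worker node.
controller = {NEW_LABEL: "true"}
worker = {}

kept_by_old_rule = keep_target(controller, [OLD_LABEL], "true")
kept_by_new_rule = keep_target(controller, [NEW_LABEL], "true")
worker_kept = keep_target(worker, [NEW_LABEL], "true")

print(kept_by_old_rule, kept_by_new_rule, worker_kept)
```

Under these assumptions the old rule drops the controller because the label it reads is empty, while the updated rule keeps controllers and still drops workers.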