Dalton Hubble
c4683c5bad
Refresh Prometheus alerts and Grafana dashboards
...
* Add a 2 min wait before KubeNodeUnreachable fires to be less
noisy on preemptible clusters
* Add a BlackboxProbeFailure alert for any failing probes
for services annotated `prometheus.io/probe: true`
2020-03-02 20:08:37 -08:00
Dalton Hubble
1fbd6835f2
Update Grafana from v6.6.1 to v6.6.2
...
* https://github.com/grafana/grafana/releases/tag/v6.6.2
2020-02-22 15:19:24 -08:00
Dalton Hubble
34c3d7cc39
Update Grafana from v6.6.0 to v6.6.1
...
* https://github.com/grafana/grafana/releases/tag/v6.6.1
2020-02-08 14:50:33 -08:00
Dalton Hubble
d127a7345c
Update Grafana from v6.5.3 to v6.6.0
...
* https://github.com/grafana/grafana/releases/tag/v6.6.0
2020-01-27 20:46:32 -08:00
Dalton Hubble
48703f9906
Update Grafana from v6.5.2 to v6.5.3
...
* https://github.com/grafana/grafana/releases/tag/v6.5.3
2020-01-18 15:30:39 -08:00
Dalton Hubble
1b9fa2e688
Update Grafana from v6.5.1 to v6.5.2
...
* https://github.com/grafana/grafana/releases/tag/v6.5.2
2019-12-14 15:25:48 -08:00
Dalton Hubble
26674083b6
Update Grafana from v6.5.0 to v6.5.1
...
* https://github.com/grafana/grafana/releases/tag/v6.5.1
2019-11-28 14:11:25 -08:00
Dalton Hubble
030a4cec19
Update Grafana from v6.4.4 to v6.5.0
...
* https://grafana.com/docs/guides/whats-new-in-v6-5/
2019-11-25 22:45:58 -08:00
Dalton Hubble
ddea7dc452
Use new resource dashboards in Grafana deployment
...
* kubernetes-mixin pod resource dashboards were split into
two ConfigMap parts because they provide richer networking
details
* New dashboards have been used by the author at the global
level, but were missing in the per-cluster Grafana tracked
here
2019-11-25 22:27:11 -08:00
Dalton Hubble
525ae23305
Add node-exporter alerts and Grafana dashboard
...
* Add Prometheus alerts from node-exporter
* Add Grafana dashboard nodes.json, from node-exporter
* Not adding recording rules, since those are only used
by some node-exporter USE dashboards not being included
2019-11-16 13:47:20 -08:00
Dalton Hubble
a8b7792338
Update Grafana from v6.4.3 to v6.4.4
...
* https://github.com/grafana/grafana/releases/tag/v6.4.4
2019-11-07 12:00:25 -08:00
Dalton Hubble
d4573092b5
Improve Kubelet and Compute Resource dashboards
...
* Add cluster filter to Kubelet dashboard
* Add network details in resource dashboards
* https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/275
* https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/284
* https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/285
2019-10-28 02:22:15 -07:00
Dalton Hubble
eb7b6d39f2
Improve minor aspects of CoreDNS and nginx-ingress dashboards
...
* Add default 10s refresh rate to custom dashboards to match
those from Kubernetes
* Show labels for "instance" as "pod" for clarity
* Add cluster filter for internal use
2019-10-20 23:16:55 -07:00
Dalton Hubble
68da420adc
Refresh Prometheus rules/alerts and Grafana dashboards
...
* Update Prometheus rules/alerts and Grafana dashboards
* Remove dashboards that were moved to node-exporter; they
may be added back later if valuable
* Remove kube-prometheus based rules/alerts (ClockSkew alert)
2019-10-19 17:43:47 -07:00
Dalton Hubble
271d2f6b52
Update Grafana from v6.4.2 to v6.4.3
...
* https://github.com/grafana/grafana/releases/tag/v6.4.3
2019-10-18 00:08:39 -07:00
Dalton Hubble
e4ac1027c8
Update Grafana from v6.4.1 to v6.4.2
...
* https://github.com/grafana/grafana/releases/tag/v6.4.2
2019-10-15 22:58:43 -07:00
Dalton Hubble
ca7d62720e
Update Grafana from v6.3.6 to v6.4.1
...
* https://github.com/grafana/grafana/releases/tag/v6.4.1
2019-10-02 20:36:05 -07:00
Dalton Hubble
f453c54956
Update Grafana from v6.3.5 to v6.3.6
...
* https://github.com/grafana/grafana/releases/tag/v6.3.6
2019-09-28 15:13:46 -07:00
Dalton Hubble
dc436b8fe9
Update Grafana from v6.3.4 to v6.3.5
...
* https://github.com/grafana/grafana/releases/tag/v6.3.5
2019-09-07 14:21:59 -07:00
Dalton Hubble
45bc52d156
Update Grafana from v6.3.3 to v6.3.4
...
* https://github.com/grafana/grafana/releases/tag/v6.3.4
2019-08-31 15:59:13 -07:00
Dalton Hubble
99990e3cbb
Use stable IDs for etcd, CoreDNS, and Nginx dashboards
...
* Use a unique dashboard ID so that multiple replicas of Grafana
serve dashboards with uniform paths
* Fix issue where refreshing a dashboard served by one replica
could show a 404 unless the request went to the same replica
2019-08-18 12:45:49 -07:00
Dalton Hubble
0c45cd0f06
Update Grafana from v6.3.2 to v6.3.3
...
* https://github.com/grafana/grafana/releases/tag/v6.3.3
2019-08-16 14:40:47 -07:00
Dalton Hubble
eaea4d37a2
Update Grafana from v6.2.5 to v6.3.2
...
* https://github.com/grafana/grafana/releases/tag/v6.3.2
* https://github.com/grafana/grafana/releases/tag/v6.3.1
* https://github.com/grafana/grafana/releases/tag/v6.3.0
2019-08-07 20:01:18 -07:00
Dalton Hubble
10d4d9e565
Add Grafana dashboards for CoreDNS and Nginx Ingress Controller
...
* Add a CoreDNS dashboard originally based on an upstream dashboard,
but now customized according to preferences
* Add an Nginx Ingress Controller dashboard based on an upstream
dashboard, but customized according to preferences
2019-08-05 22:49:19 -07:00
Dalton Hubble
68d8717924
Refresh Prometheus rules/alerts and Grafana dashboards
...
* Refresh rules, alerts, and dashboards from upstreams
2019-07-21 11:29:34 -07:00
Dalton Hubble
9a395dbf88
Update Grafana from v6.2.4 to v6.2.5
...
* https://github.com/grafana/grafana/releases/tag/v6.2.5
2019-06-29 13:21:42 -07:00
Dalton Hubble
4ad69efc43
Update Grafana from v6.2.2 to v6.2.4
...
* https://github.com/grafana/grafana/releases/tag/v6.2.4
2019-06-19 21:51:54 -07:00
Dalton Hubble
d449477272
Update Grafana from v6.2.1 to v6.2.2
...
* https://github.com/grafana/grafana/releases/tag/v6.2.2
2019-06-07 00:07:54 -07:00
Dalton Hubble
d9e7195477
Update Grafana from v6.2.0 to v6.2.1
2019-05-27 12:25:00 -07:00
Dalton Hubble
5d2684a04d
Update Grafana from v6.1.6 to v6.2.0
...
* https://github.com/grafana/grafana/releases/tag/v6.2.0
2019-05-26 22:00:47 -07:00
Dalton Hubble
6e9b2450fe
Update Grafana from v6.1.4 to v6.1.6
...
* https://github.com/grafana/grafana/releases/tag/v6.1.6
2019-05-04 11:14:37 -07:00
Dalton Hubble
ec5aef5c92
Refresh Prometheus rules and Grafana dashboards
...
* Add several network-related alerts from upstream
2019-04-27 22:41:13 -07:00
Dalton Hubble
418597aa59
Update Grafana from v6.1.3 to v6.1.4
...
* https://github.com/grafana/grafana/releases/tag/v6.1.4
2019-04-18 23:30:43 -07:00
Dalton Hubble
44c293888b
Update Grafana from v6.1.1 to v6.1.3
...
* https://github.com/grafana/grafana/releases/tag/v6.1.3
2019-04-09 22:06:27 -07:00
Dalton Hubble
ce78d5988e
Refresh Prometheus rules and Grafana dashboards
...
* Refresh rules and dashboards from upstreams
* Add new Kubernetes "workload" dashboards
* View pods in a workload (deployment/daemonset/statefulset)
* View workloads in a namespace
2019-04-06 23:31:44 -07:00
Dalton Hubble
29a3035245
Update Grafana from v6.1.0 to v6.1.1
2019-04-06 18:32:14 -07:00
Dalton Hubble
3e7a38cb13
Update Grafana from v6.0.2 to v6.1.0
...
* https://github.com/grafana/grafana/releases/tag/v6.1.0
2019-04-03 20:47:48 -07:00
Dalton Hubble
36e31fc9fa
Add liveness and readiness probes to Grafana
...
* https://github.com/grafana/grafana/issues/3302
2019-03-23 17:55:37 -07:00
Dalton Hubble
619a0370dc
Update Grafana from v6.0.1 to v6.0.2
...
* https://github.com/grafana/grafana/releases/tag/v6.0.2
2019-03-21 23:41:25 -07:00
Dalton Hubble
6dd2731046
Set cpu/memory resources requests/limits for some addons
...
* Set resource requests and limits for Grafana and CLUO
* Set resource requests for Prometheus, but allow usage
to grow since needs vary widely
* Leave nginx without resource requests/limits for now,
it's typically well behaved
2019-03-20 00:15:08 -07:00
Dalton Hubble
aa630003a4
Refresh Prometheus rules and Grafana dashboards
...
* Refresh rules and dashboards from upstreams
* Organize dashboards and stay below the ConfigMap size
limit
2019-03-17 13:23:04 -07:00
Dalton Hubble
4201eb1efa
Update Grafana from v6.0.0 to v6.0.1
...
* https://github.com/grafana/grafana/releases/tag/v6.0.1
2019-03-09 12:44:18 -08:00
Dalton Hubble
4ff7fe2c29
Update Grafana dashboards from upstreams
2019-02-28 23:22:07 -08:00
Dalton Hubble
daee5a9d60
Update Grafana from v6.0.0-beta3 to v6.0.0
...
* https://github.com/grafana/grafana/releases/tag/v6.0.0
* http://docs.grafana.org/guides/whats-new-in-v6-0/
2019-02-25 21:43:43 -08:00
Dalton Hubble
d10c2b4cb9
Update Grafana from v6.0.0-beta2 to v6.0.0-beta3
...
* Update Grafana dashboards
2019-02-23 13:03:25 -08:00
Dalton Hubble
e483c81ce9
Improve Prometheus rules and alerts and Grafana dashboards
...
* Collate upstream rules, alerts, and dashboards and tune for use
in Typhoon
* Previously, a well-chosen (but older) set of rules, alerts, and
dashboards was maintained to reflect metric name changes
2019-02-18 12:19:23 -08:00
Dalton Hubble
6fa3b8a13f
Upgrade Grafana to v6.0.0-beta2 and enable Explore UI
...
* Upgrade Grafana from v5.4.3 to v6.0.0-beta2
* Enable Grafana Explore UI while still using only the Viewer
role (inspect/edit without saving)
* http://docs.grafana.org/guides/whats-new-in-v6-0/
2019-02-17 13:26:42 -08:00
Dalton Hubble
f5ff003d0e
Update node-exporter from v0.15.2 to v0.17.0
...
* node-exporter renamed multiple metrics that are reflected
in changes to Prometheus rules and Grafana dashboard expressions
2019-01-22 01:14:00 -08:00
Dalton Hubble
c8a85fabe1
Update Grafana from v5.4.2 to v5.4.3
...
* https://github.com/grafana/grafana/releases/tag/v5.4.3
2019-01-15 21:13:16 -08:00
Dalton Hubble
b74bf11772
Update Grafana from v5.4.0 to v5.4.2
...
* https://github.com/grafana/grafana/releases/tag/v5.4.2
* https://github.com/grafana/grafana/releases/tag/v5.4.1
2018-12-15 12:39:03 -08:00