typhoon/addons/prometheus
Dalton Hubble 178afe4a9b Reduce apiserver metrics cardinality and extraneous labels
* Stop mapping node labels to targets discovered via Kubernetes
nodes (e.g. etcd, kubelet, cadvisor). It is rarely useful to
store node labels (e.g. kubernetes.io/os=linux) on these metrics.
* kube-apiserver's apiserver_request_duration_seconds_bucket metric
has high cardinality: it carries labels for the API group, verb,
scope, resource, and component for each object type, including
each CRD. This one metric has ~10k time series in a typical cluster
(between 10% and 40% of the total).
* Removing the apiserver request duration metric outright would make latency
alerts a no-op and break a Grafana apiserver panel. Instead, drop series
that have a "group" label. Effectively, only request durations for
core Kubernetes APIs are kept (so cardinality won't grow with
each CRD added). This reduces the metric to ~2k unique series.
2019-12-08 22:48:25 -08:00
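
The two changes described in the commit message can be sketched as a Prometheus scrape configuration fragment. This is an illustration under assumptions, not necessarily the exact contents of config.yaml: the job names are hypothetical, and the fragment omits TLS and authentication settings. It relies on documented Prometheus relabeling behavior: values of `source_labels` are joined with `;` before matching, `regex` is fully anchored, and a missing label is treated as an empty string.

```yaml
scrape_configs:
# Hypothetical job name; scrapes kube-apiserver endpoints.
- job_name: 'kubernetes-apiservers'
  kubernetes_sd_configs:
  - role: endpoints
  metric_relabel_configs:
  # Drop request duration buckets that carry a non-empty "group" label.
  # Core API series have no group, so ".+" does not match and they are kept.
  - source_labels: [__name__, group]
    regex: apiserver_request_duration_seconds_bucket;.+
    action: drop

# Hypothetical job name; targets discovered via Kubernetes nodes.
- job_name: 'kubelet'
  kubernetes_sd_configs:
  - role: node
  # No labelmap rule here. A rule like the one commented out below would
  # copy every node label (e.g. kubernetes.io/os) onto each scraped series.
  # relabel_configs:
  # - action: labelmap
  #   regex: __meta_kubernetes_node_label_(.+)
```

Because `metric_relabel_configs` runs after the scrape and before ingestion, the dropped series are never stored, which is what reduces the series count.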
Name                  Last commit                                                 Date
discovery             addons: Add better alerting rules to Prometheus manifests   2017-11-10 20:57:47 -08:00
exporters             Remove kube-state-metrics addon-resizer                     2019-10-20 16:03:29 -07:00
rbac                  Configure Prometheus to scrape Kubelets directly            2018-05-14 23:06:50 -07:00
0-namespace.yaml      Label namespaces to ease writing Network Policies           2018-06-09 11:45:11 -07:00
config.yaml           Reduce apiserver metrics cardinality and extraneous labels  2019-12-08 22:48:25 -08:00
deployment.yaml       Update Prometheus from v2.14.0-rc.0 to v2.14.0              2019-11-13 13:41:11 -08:00
network-policy.yaml   Add NetworkPolicy to limit traffic into Prometheus          2019-03-23 21:38:34 -07:00
rules.yaml            Add node-exporter alerts and Grafana dashboard              2019-11-16 13:47:20 -08:00
service-account.yaml  addons: Update Prometheus to v2.1.0                         2018-01-27 21:00:15 -08:00
service.yaml          Add explicit annotation for Prometheus port to scrape       2019-10-20 16:05:09 -07:00