Dalton Hubble
02cd8eb8d3
Update Prometheus from v2.3.1 to v2.3.2
...
* https://github.com/prometheus/prometheus/releases/tag/v2.3.2
2018-07-14 14:25:49 -07:00
Dalton Hubble
84d6cfe7b3
Add Prometheus alert rule for inactive md devices
...
* node-exporter exposes metrics to Prometheus about total and
active md devices (e.g. disks in mdadm RAID arrays)
* Add alert that fires when a RAID disk fails or becomes inactive
for another reason
2018-07-10 00:20:30 -07:00
Dalton Hubble
f40f60b83c
Update Nginx Ingress controller from 0.15.0 to 0.16.2
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.16.2
* https://github.com/kubernetes/ingress-nginx/blob/master/Changelog.md
2018-07-02 22:06:22 -07:00
Dalton Hubble
a3349b5c68
Update heapster from v1.5.2 to v1.5.3
2018-07-01 21:07:52 -07:00
Dalton Hubble
74dc6b0bf9
Update Grafana from 5.1.4 to 5.2.1
...
* http://docs.grafana.org/guides/whats-new-in-v5-2/
* https://github.com/grafana/grafana/releases/tag/v5.2.0
* https://github.com/grafana/grafana/releases/tag/v5.2.1
2018-07-01 20:55:34 -07:00
Dalton Hubble
2eaf04c68b
Drop hostNetwork from nginx-ingress addon
...
* Both flannel and Calico support host port via `portmap`
* Allows writing NetworkPolicies that reference ingress pods in `from`
or `to`. HostNetwork pods were difficult to write network policy for
since they could circumvent the CNI network to communicate with pods on
the same node.
2018-06-22 00:46:41 -07:00
Dalton Hubble
d5de41e07a
Update Grafana from 5.1.3 to 5.1.4
...
* https://github.com/grafana/grafana/releases/tag/v5.1.4
2018-06-19 21:45:15 -07:00
Dalton Hubble
05b99178ae
Update prometheus from v2.3.0 to v2.3.1
...
* https://github.com/prometheus/prometheus/releases/tag/v2.3.1
2018-06-19 21:43:50 -07:00
Stephen Demos
18dd7ccc09
Update CLUO from v0.6.0 to v0.7.0
2018-06-14 22:32:36 -07:00
Dalton Hubble
cbe646fba6
Label namespaces to ease writing Network Policies
2018-06-09 11:45:11 -07:00
Dalton Hubble
c166b2ba33
Update prometheus from v2.2.1 to v2.3.0
2018-06-09 11:43:10 -07:00
Dalton Hubble
d32e6797ae
Annotate Grafana so Prometheus scrapes metrics
2018-05-30 22:37:47 -07:00
Dalton Hubble
32a9a83190
Add Prometheus liveness and readiness probes
2018-05-30 22:34:07 -07:00
Dalton Hubble
28d0891729
Annotate nginx-ingress addon for Prometheus auto-discovery
...
* Add Google Cloud firewall rule to allow worker to worker access
to health and metrics
2018-05-19 13:13:14 -07:00
Dalton Hubble
714419342e
Update nginx-ingress from 0.14.0 to 0.15.0
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.15.0
2018-05-17 21:42:55 -07:00
Dalton Hubble
3701c0b1fe
Update Grafana from v5.1.2 to v5.1.3
...
* https://github.com/grafana/grafana/releases/tag/v5.1.3
2018-05-17 21:36:09 -07:00
Dalton Hubble
c2b719dc75
Configure Prometheus to scrape Kubelets directly
...
* Use Kubelet bearer token authn/authz to scrape metrics
* Drop RBAC permission from nodes/proxy to nodes/metrics
* Stop proxying kubelet scrapes through the apiserver, since
this required higher privilege (nodes/proxy) and can add
load to the apiserver on large clusters
2018-05-14 23:06:50 -07:00
Dalton Hubble
fb88113523
Disable default Google Analytics in Grafana addon
...
* Its come to my attention Grafana reports analytics data
by default. Typhoon's philosophy requires user permission
for data collection so the addon should have this disabled
* http://docs.grafana.org/installation/configuration/#analytics
2018-05-10 01:18:47 -07:00
Dalton Hubble
1854f5c104
Update Grafana from v5.1.1 to v5.1.2
...
* https://github.com/grafana/grafana/releases/tag/v5.1.2
2018-05-10 01:09:08 -07:00
Dalton Hubble
726b58b697
Update Grafana from v5.0.4 to v5.1.1
...
* https://github.com/grafana/grafana/releases/tag/v5.1.1
* https://github.com/grafana/grafana/releases/tag/v5.1.0
2018-05-07 22:05:19 -07:00
Dalton Hubble
a54e3c0da1
Fix Prometheus data dir to /var/lib/prometheus
...
* A data volume (emptyDir) is mounted to /var/lib/prometheus
* Users could swap emptyDir for any desired volume if data
persistence is desired. Prometheus previously defaulted to
keeping its data in ./data relative to /prometheus. Override
this behavior to store data in /var/lib/prometheus
2018-05-01 22:05:27 -07:00
Dalton Hubble
731a6ec23a
Update nginx-ingress from 0.13.0 to 0.14.0
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.14.0
2018-04-28 13:10:03 -07:00
Dalton Hubble
e0d9e9979c
Update nginx-ingress from 0.12.0 to 0.13.0
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.13.0
2018-04-18 21:12:09 -07:00
Dalton Hubble
9789881243
Update kube-state-metrics from v1.3.0 to v1.3.1
...
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.3.1
2018-04-15 17:10:02 -07:00
Dalton Hubble
6b08bde479
Use k8s.gcr.io instead of gcr.io/google_containers
...
* Kubernetes recommends using the alias to fetch images
from the nearest GCR regional mirror, to abstract the use
of GCR, and to drop names containing 'google'
* https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ
2018-04-08 12:57:52 -07:00
Dalton Hubble
f4b2396718
Return Prometheus deployment to be a worker workload
...
* Expose etcd metrics to workers so Prometheus can
run on a worker, rather than a controller
* Drop temporary firewall rules allowing Prometheus
to run on a controller and scrape targes
* Related to https://github.com/poseidon/typhoon/pull/175
2018-04-08 12:20:00 -07:00
Dalton Hubble
7186aa46da
Update kube-state-metrics from v1.2.0 to v1.3.0
...
* https://github.com/kubernetes/kube-state-metrics/pull/412
* https://github.com/kubernetes/kube-state-metrics/pull/413
2018-04-04 21:04:13 -07:00
Dalton Hubble
d770393dbc
Add etcd metrics, Prometheus scrapes, and Grafana dash
...
* Use etcd v3.3 --listen-metrics-urls to expose only metrics
data via http://0.0.0.0:2381 on controllers
* Add Prometheus discovery for etcd peers on controller nodes
* Temporarily drop two noisy Prometheus alerts
2018-04-03 20:31:00 -07:00
Dalton Hubble
b1e41dcb99
addons: Update from Grafana v4.6.3 to v5.0.4
...
This reverts commit c59a9c66b1
.
2018-03-28 19:45:19 -07:00
Dalton Hubble
65a2751f77
addons: Update heapster from v1.5.1 to v1.5.2
...
* https://github.com/kubernetes/heapster/releases/tag/v1.5.2
2018-03-21 20:32:01 -07:00
Dalton Hubble
851bc1a3f8
Update nginx-ingress from 0.11.0 to 0.12.0
2018-03-19 23:17:17 -07:00
Dalton Hubble
46226a8015
Update Prometheus from 2.2.0 to 2.2.1
2018-03-18 15:56:44 -07:00
Dalton Hubble
c59a9c66b1
Revert "addons: Update from Grafana v4.6.3 to v5.0.0"
...
* Revert commit 9dcc255f8e
.
* Grafana v5.0 is not compatible with Kubernetes v1.9.4. See
https://github.com/poseidon/typhoon/pull/162
2018-03-12 21:01:14 -07:00
Dalton Hubble
42708f9a70
Update Prometheus from v2.2.0-rc.1 to v2.2.0
...
* https://github.com/prometheus/prometheus/releases/tag/v2.2.0
2018-03-09 00:20:40 -08:00
Dalton Hubble
d54709f89c
Update Grafana from v5.0.0 to 5.0.1
...
* https://github.com/grafana/grafana/releases/tag/v5.0.1
2018-03-09 00:20:40 -08:00
Dalton Hubble
9dcc255f8e
addons: Update from Grafana v4.6.3 to v5.0.0
2018-03-09 00:20:40 -08:00
Dalton Hubble
9307e97c46
addons: Update Prometheus from v2.1.0 to v2.2.0
...
* Annotate Prometheus service to scrape metrics from
Prometheus itself (enables Prometheus* alerts)
* Update kube-state-metrics addon-resizer to 1.7
* Use port 8080 for kube-state-metrics
* Add PrometheusNotIngestingSamples alert rule
* Change K8SKubeletDown alert rule to fire when 10%
of kubelets are down, not 1%
* https://github.com/coreos/prometheus-operator/pull/1032
2018-03-09 00:20:40 -08:00
Paul Saunders
86420fd507
Rename namespace manifests to be applied first
...
* Ensure kubectl apply -R creates manifests in the right order
2018-02-22 01:04:30 -08:00
Dalton Hubble
5c383f4184
addons: Update nginx-ingress from 0.10.2 to 0.11.0
2018-02-21 23:54:12 -08:00
Dalton Hubble
de88fa5457
addons: Update Heapster from v1.5.0 to v1.5.1
...
* Switch to k8s.gcr.io vanity image name
* Add service account, Role, and ClusterRole for heapster
2018-02-15 10:57:47 -08:00
Stephen Augustus
d9a0183f3f
addons/nginx-ingress: Fix typo in GCP selector name
2018-02-14 03:07:36 -05:00
Dalton Hubble
03d23bfde7
addons: Remove Kubernetes Dashboard manifests and docs
...
* Stop maintaining Kubernetes Dashboard manifests. Dashboard takes
an unusual approch to security and is often a security weak point.
* Recommendation: Use `kubectl` and avoid using the dashboard. If
you must use the dashboard, explore hardening and consider using an
authenticating proxy rather than the dashboard's auth features
2018-02-11 10:33:23 -08:00
Dalton Hubble
2c10d24113
addons: Switch to apps/v1 workload APIs
...
* Deployments now belong to the apps/v1 API group
* DaemonSets now belong to the apps/v1 API group
* RBAC types now belong to the rbac.authorization.k8s.io/v1 API group
2018-02-10 23:56:31 -08:00
Dalton Hubble
65321acad2
addons: Add grafana-watcher and bundle dashboards
...
* Add separate Grafana addons docs and screenshots
2018-01-28 01:01:30 -08:00
Dalton Hubble
064ce83f25
addons: Update Prometheus to v2.1.0
...
* Change service discovery to relabel jobs to align with
rule expressions in upstream examples
* Use a separate service account for prometheus instead
of granting roles to the namespace's default
* Use a separate service account for node-exporter
* Update node-exporter and kube-state-metrics exporters
2018-01-27 21:00:15 -08:00
Dalton Hubble
c3b0cdddf3
addons: Update nginx-ingress from v0.10.1 to v0.10.2
2018-01-26 17:27:36 -08:00
Dalton Hubble
211ec94c75
addons: Update CLUO from v0.5.0 to v0.6.0
...
* https://github.com/coreos/container-linux-update-operator/releases/tag/v0.6.0
2018-01-26 17:24:09 -08:00
Dalton Hubble
8aca5a089e
addons: Update nginx-ingress to 0.10.1
2018-01-24 20:34:05 -08:00
Dalton Hubble
103f1e16d7
addons: Update nginx-ingress to 0.10.0
2018-01-23 09:28:37 -08:00
Dalton Hubble
bc967ddcd0
addons: Update CLUO to fix compatability with Kubernetes 1.9
...
* Update CLUO from v0.4.1 to v0.5.0
* Earlier versions of CLUO fail to drain nodes on Kubernetes 1.9
so nodes drain one at a time repeatedly and Container Linux OS
updates are not applied to nodes.
* Check current OS versions via `kubectl get nodes --show-labels`
2018-01-19 08:33:26 -08:00
Dalton Hubble
996651c605
Update kube-state-metrics version and RBAC cluster role
...
* https://github.com/kubernetes/kube-state-metrics/pull/345
* https://github.com/kubernetes/kube-state-metrics/pull/334
2018-01-15 08:33:44 -08:00
Dalton Hubble
21e540159b
addons: Update grafana from v4.6.2 to v4.6.3
...
* https://github.com/grafana/grafana/releases/tag/v4.6.3
2017-12-15 16:09:14 -08:00
Dalton Hubble
521a1f0fee
addons: Update heapster from v1.4.3 to v1.5.0
...
* Rollback addon-resizer to 1.7 to address issues in large
clusters https://github.com/kubernetes/kubernetes/pull/52536
2017-12-11 23:34:25 -08:00
Dalton Hubble
7345cb6419
addons: Update nginx-ingress to 0.9.0
2017-12-11 00:48:15 -08:00
Dalton Hubble
a481d71d7d
addons: Update nginx-ingress to 0.9.0-beta.19
...
* Undo rollback f00ecde854
* Port binding regression only occurs with --enable-ssl-passthrough,
which isn't used in these examples. See
https://github.com/kubernetes/ingress-nginx/issues/1788
2017-12-11 00:44:32 -08:00
Dalton Hubble
f00ecde854
Rollback nginx-ingress on GCE to 0.9.0-beta.17
...
* https://github.com/kubernetes/ingress-nginx/issues/1788
2017-12-02 14:06:22 -08:00
Dalton Hubble
65f006e6cc
addons: Sync prometheus alerts to upstream
...
* https://github.com/coreos/prometheus-operator/pull/774
2017-12-01 23:24:08 -08:00
Dalton Hubble
8d3817e0ae
addons: Update nginx-ingress to 0.9.0-beta.19
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.19
2017-12-01 22:32:33 -08:00
Dalton Hubble
63ab117205
addons: Add prometheus rules for DaemonSets
...
* https://github.com/coreos/prometheus-operator/pull/755
2017-11-16 23:51:21 -08:00
Dalton Hubble
1cd262e712
addons: Fix prometheus K8SApiServerLatency alert rule
...
* https://github.com/coreos/prometheus-operator/issues/751
2017-11-16 23:37:15 -08:00
Dalton Hubble
32bdda1b6c
addons: Update Grafana from v4.6.1 to v4.6.2
...
* https://github.com/grafana/grafana/releases/tag/v4.6.2
2017-11-16 23:34:36 -08:00
Dalton Hubble
159443bae7
addons: Add better alerting rules to Prometheus manifests
...
* Adapt the coreos/prometheus-operator alerting rules for Typhoon,
https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus/manifests
* Add controller manager and scheduler shim services to let
prometheus discover them via service endpoints
* Fix several alert rules to use service endpoint discovery
* A few rules still don't do much, but they default to green
2017-11-10 20:57:47 -08:00
Dalton Hubble
119dc859d3
addons: Update nginx-ingress to 0.9.0-beta.17
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.17
2017-11-10 20:16:40 -08:00
Dalton Hubble
f570af9418
addons: Update from Prometheus v1.8.2 to v2.0.0
2017-11-08 22:48:23 -08:00
Dalton Hubble
8eaa72c1ca
addons: Update nginx-ingress to 0.9.0-beta.16
...
* Image registry changed from gcr.io to quay.io
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.16
2017-11-06 23:15:15 -08:00
Dalton Hubble
10b977d54a
addons: Set kube-state-metrics to have clusterIP None
...
* kube-state-metrics service exists to facilitate prometheus discovery
2017-11-05 17:54:09 -08:00
Dalton Hubble
b7a268fc45
addons: Add prometheus alertmanager flag
...
* Pass -alertmanager.url to work with a user's in-cluster
alertmanager deployment, if any
2017-11-05 15:50:46 -08:00
Dalton Hubble
279f36effd
addons: Add grafana 4.6.1 and extend prometheus docs
2017-11-05 15:23:56 -08:00
Dalton Hubble
ae07a21e3d
addons: Omit static resource requests/limits for kube-state-metrics
...
* Allow the addon-resizer to dynamically set resource values
* https://github.com/kubernetes/kube-state-metrics/pull/285
2017-11-04 14:41:04 -07:00
Dalton Hubble
0ab1ae3210
addons: Fix typo in kube-state-metrics strategy
2017-11-04 14:39:56 -07:00
Dalton Hubble
e32885c9cd
addons: Update prometheus from v1.8.0 to v1.8.2
...
* https://github.com/prometheus/prometheus/releases/tag/v1.8.2
2017-11-04 11:00:39 -07:00
Dalton Hubble
8582e19077
Expand Nginx Ingress liveness and readiness probes
...
* Remove dnsPolicy: ClusterFirst
* https://github.com/kubernetes/ingress-nginx/pull/1584
2017-10-25 22:29:20 -07:00
Dalton Hubble
3727c40c6c
Update Nginx Ingress defaultbackend from 1.0 to 1.4
...
* https://github.com/kubernetes/ingress-nginx/pull/1568
2017-10-25 22:16:23 -07:00
Dalton Hubble
b608f9c615
addons: Use service endpoints to scrape node-exporter
2017-10-24 22:59:00 -07:00
Dalton Hubble
ec1dbb853c
addons: Include kube-state-metrics exporter manifests
2017-10-24 22:59:00 -07:00
Dalton Hubble
d046d45769
addons: Include Prometheus and node-exporter manifests
2017-10-24 22:58:59 -07:00
Dalton Hubble
a73f57fe4e
Update CLUO from v0.4.0 to v0.4.1
2017-10-24 22:14:03 -07:00
Dalton Hubble
f86c00288f
Add missing update-agent RBAC role to get pods
...
* Drain now gets pods, deletes pods, and waits for deletion
2017-10-20 01:21:46 -07:00
Dalton Hubble
a57b3cf973
Update CLUO addon to v0.4.0 and RBAC ClusterRole
2017-10-20 00:40:17 -07:00
Dalton Hubble
7b5ffd0085
Add Container Linux reboot-coordinator RBAC
...
* Add a reboot-coordinator namespace for CLUO components
* Define an RBAC ClusterRole for update-operator and update-agent
* Replace the older-style where CLUO ran in kube-system, with
admin privilege
2017-10-14 19:35:06 -07:00
Dalton Hubble
11453bac91
Update heapster addon from v1.4.0 to v1.4.3
...
* Use normal name and phase labels
2017-10-14 19:07:37 -07:00
Dalton Hubble
dd0c61d1d9
Update Nginx Ingress controller addon to 0.9.0-beta.15
2017-10-14 18:30:58 -07:00
Dalton Hubble
f7f983c7da
docs: Add docs and addons for Nginx AWS Ingress
2017-09-28 01:09:31 -07:00
Dalton Hubble
7c733bd314
Add Nginx Ingress controller addons and docs
2017-09-18 01:48:21 -07:00
Dalton Hubble
a2609c14c0
addons: Disable Google Analytics in CLUO
2017-08-27 21:06:49 -07:00
Dalton Hubble
564c0160bf
Add heapster, dashboard, and CLUO addons
2017-08-27 17:20:29 -07:00