Dalton Hubble
be9f7b87d6
Update Prometheus from v2.4.3 to v2.5.0
...
* https://github.com/prometheus/prometheus/releases/tag/v2.5.0
2018-11-06 22:16:12 -08:00
Dalton Hubble
884c8b39dc
Update Grafana from v5.3.1 to v5.3.2
...
* https://github.com/grafana/grafana/releases/tag/v5.3.2
2018-10-28 19:44:22 -07:00
Dalton Hubble
bc750aec33
Configure Heapster to source metrics from Kubelet authenticated API
...
* Heapster can now get nodes (i.e. kubelets) from the apiserver and
source metrics from the Kubelet authenticated API (10250) instead of
the Kubelet HTTP read-only API (10255)
* https://github.com/kubernetes/heapster/blob/master/docs/source-configuration.md
* Use the heapster service account token via Kubelet bearer token
authn/authz.
* Permit Heapster to skip CA verification. The CA cert does not contain
IP SANs and cannot since nodes get random IPs that aren't known upfront.
Heapster obtains the node list from the apiserver, so the risk of
spoofing a node is limited. For the same reason, Prometheus must
skip CA verification when scraping Kubelets discovered via the apiserver.
* https://github.com/poseidon/typhoon/blob/v1.12.1/addons/prometheus/config.yaml#L68
* Create a heapster ClusterRole to work around the default Kubernetes
`system:heapster` ClusterRole lacking the proper GET `nodes/stats`
access. See https://github.com/kubernetes/heapster/issues/1936
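
A rough sketch of the Heapster changes described above, assuming the source options documented in the linked source-configuration.md. The exact option string and the layout of the ClusterRole are illustrative and may differ from the actual addon manifests.

```yaml
# Container command fragment: read from the Kubelet authenticated API (10250)
# using the pod's service account token, and skip CA verification.
command:
  - /heapster
  - --source=kubernetes.summary_api:''?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250&insecure=true
---
# ClusterRole fragment granting the access that system:heapster lacks.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: heapster
rules:
  - apiGroups: [""]
    resources: ["nodes/stats"]
    verbs: ["get"]
```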
2018-10-18 21:03:01 -07:00
Dalton Hubble
0127ee82c1
Update nginx-ingress from v0.19.0 to v0.20.0
2018-10-16 21:35:29 -07:00
Dalton Hubble
a10d6977b8
Update Prometheus from v2.4.2 to v2.4.3
...
* https://github.com/prometheus/prometheus/releases/tag/v2.4.3
2018-10-16 21:29:41 -07:00
Dalton Hubble
05fe923c14
Update Grafana from v5.3.0 to v5.3.1
...
* https://github.com/grafana/grafana/releases/tag/v5.3.1
2018-10-16 21:23:44 -07:00
Dalton Hubble
5eb4078d68
Add docker/default seccomp to control plane and addons
...
* Annotate pods, deployments, and daemonsets to start containers
with the Docker runtime's default seccomp profile
* Overrides Kubernetes default behavior which started containers
with seccomp=unconfined
* https://docs.docker.com/engine/security/seccomp/#pass-a-profile-for-a-container
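
As an illustration of the annotation described above, a pod template fragment might look like the following. This is a sketch; where the annotation sits depends on the workload type.

```yaml
# Pod template fragment (e.g. inside a Deployment or DaemonSet spec).
template:
  metadata:
    annotations:
      # Use the Docker runtime's default seccomp profile instead of unconfined.
      seccomp.security.alpha.kubernetes.io/pod: docker/default
```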
2018-10-16 20:07:29 -07:00
Dalton Hubble
8f0d2b5db4
Update Grafana from v5.2.4 to v5.3.0
2018-10-13 23:03:31 -07:00
Dalton Hubble
032a24133b
Update Prometheus from v2.3.2 to v2.4.2
...
* https://github.com/prometheus/prometheus/releases/tag/v2.4.0
* https://github.com/prometheus/prometheus/releases/tag/v2.4.1
* https://github.com/prometheus/prometheus/releases/tag/v2.4.2
2018-09-21 22:27:11 -07:00
Dalton Hubble
dc03f7a4a9
Update nginx-ingress from 0.17.1 to 0.19.0
...
* If using --enable-ssl-passthrough or exposing TCP/UDP services,
be aware of https://github.com/kubernetes/ingress-nginx/pull/3038
* Workarounds until the fix merges are to stay on 0.17.1, use the
suggested development image, or revert to securityContext
`runAsNonRoot: false` for a while (less secure)
2018-09-08 17:57:01 -07:00
Dalton Hubble
1b8234eb91
Update Grafana from v5.2.2 to v5.2.4
...
* https://github.com/grafana/grafana/releases/tag/v5.2.3
* https://github.com/grafana/grafana/releases/tag/v5.2.4
2018-09-08 15:41:20 -07:00
Dalton Hubble
4ba090feb0
Update kube-state-metrics from v1.3.1 to v1.4.0
2018-08-29 09:37:50 -07:00
Dalton Hubble
4882fe1053
Add docs for Azure Ingress and worker pools
...
* Unfortunately, Azure worker pools must be in the same region as
the cluster itself
2018-08-27 23:30:56 -07:00
Becca Powell
49a9dc9b8b
Fix typo in Prometheus alerting rules
2018-08-21 16:55:49 -07:00
Dalton Hubble
dbdc3fc850
Add nginx-ingress addon manifests for bare-metal
2018-08-11 12:14:23 -07:00
Dalton Hubble
e00f97c578
Update nginx-ingress from 0.16.2 to 0.17.1
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.0
2018-08-08 00:45:20 -07:00
Dalton Hubble
e6720cf738
Update heapster from v1.5.3 to v1.5.4
...
* https://github.com/kubernetes/heapster/releases/tag/v1.5.4
2018-07-29 11:19:57 -07:00
Dalton Hubble
844f380b4e
Update Grafana from v5.2.1 to v5.2.2
...
* https://github.com/grafana/grafana/releases/tag/v5.2.2
2018-07-29 11:12:56 -07:00
Dalton Hubble
02cd8eb8d3
Update Prometheus from v2.3.1 to v2.3.2
...
* https://github.com/prometheus/prometheus/releases/tag/v2.3.2
2018-07-14 14:25:49 -07:00
Dalton Hubble
84d6cfe7b3
Add Prometheus alert rule for inactive md devices
...
* node-exporter exposes metrics to Prometheus about total and
active md devices (e.g. disks in mdadm RAID arrays)
* Add alert that fires when a RAID disk fails or becomes inactive
for another reason
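
A minimal sketch of such a rule in the Prometheus 2.x rule file format. The metric names `node_md_disks` and `node_md_disks_active`, the alert name, and the `for` duration are assumptions for illustration and may not match the rule actually added.

```yaml
groups:
  - name: node.rules            # hypothetical group name
    rules:
      - alert: InactiveRAIDDisk # hypothetical alert name
        # Fires when fewer md devices are active than are configured.
        expr: node_md_disks - node_md_disks_active > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          description: '{{ $labels.instance }} has an inactive or failed md device'
```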
2018-07-10 00:20:30 -07:00
Dalton Hubble
f40f60b83c
Update Nginx Ingress controller from 0.15.0 to 0.16.2
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.16.2
* https://github.com/kubernetes/ingress-nginx/blob/master/Changelog.md
2018-07-02 22:06:22 -07:00
Dalton Hubble
a3349b5c68
Update heapster from v1.5.2 to v1.5.3
2018-07-01 21:07:52 -07:00
Dalton Hubble
74dc6b0bf9
Update Grafana from 5.1.4 to 5.2.1
...
* http://docs.grafana.org/guides/whats-new-in-v5-2/
* https://github.com/grafana/grafana/releases/tag/v5.2.0
* https://github.com/grafana/grafana/releases/tag/v5.2.1
2018-07-01 20:55:34 -07:00
Dalton Hubble
2eaf04c68b
Drop hostNetwork from nginx-ingress addon
...
* Both flannel and Calico support host port via `portmap`
* Allows writing NetworkPolicies that reference ingress pods in `from`
or `to`. HostNetwork pods were difficult to write network policy for
since they could circumvent the CNI network to communicate with pods on
the same node.
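
A sketch of what the change might look like: the controller binds host ports through the CNI `portmap` plugin instead of sharing the host network, and a NetworkPolicy can then select ingress traffic. The container name, policy name, and namespace label below are hypothetical.

```yaml
# Pod spec fragment: hostPort replaces hostNetwork: true.
spec:
  containers:
    - name: nginx-ingress-controller
      ports:
        - name: http
          containerPort: 80
          hostPort: 80
        - name: https
          containerPort: 443
          hostPort: 443
---
# Example policy allowing traffic only from pods in the ingress namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-ingress      # hypothetical
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress     # hypothetical namespace label
```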
2018-06-22 00:46:41 -07:00
Dalton Hubble
d5de41e07a
Update Grafana from 5.1.3 to 5.1.4
...
* https://github.com/grafana/grafana/releases/tag/v5.1.4
2018-06-19 21:45:15 -07:00
Dalton Hubble
05b99178ae
Update prometheus from v2.3.0 to v2.3.1
...
* https://github.com/prometheus/prometheus/releases/tag/v2.3.1
2018-06-19 21:43:50 -07:00
Stephen Demos
18dd7ccc09
Update CLUO from v0.6.0 to v0.7.0
2018-06-14 22:32:36 -07:00
Dalton Hubble
cbe646fba6
Label namespaces to ease writing Network Policies
2018-06-09 11:45:11 -07:00
Dalton Hubble
c166b2ba33
Update prometheus from v2.2.1 to v2.3.0
2018-06-09 11:43:10 -07:00
Dalton Hubble
d32e6797ae
Annotate Grafana so Prometheus scrapes metrics
2018-05-30 22:37:47 -07:00
Dalton Hubble
32a9a83190
Add Prometheus liveness and readiness probes
2018-05-30 22:34:07 -07:00
Dalton Hubble
28d0891729
Annotate nginx-ingress addon for Prometheus auto-discovery
...
* Add Google Cloud firewall rule to allow worker to worker access
to health and metrics
2018-05-19 13:13:14 -07:00
Dalton Hubble
714419342e
Update nginx-ingress from 0.14.0 to 0.15.0
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.15.0
2018-05-17 21:42:55 -07:00
Dalton Hubble
3701c0b1fe
Update Grafana from v5.1.2 to v5.1.3
...
* https://github.com/grafana/grafana/releases/tag/v5.1.3
2018-05-17 21:36:09 -07:00
Dalton Hubble
c2b719dc75
Configure Prometheus to scrape Kubelets directly
...
* Use Kubelet bearer token authn/authz to scrape metrics
* Reduce RBAC permission from nodes/proxy to nodes/metrics
* Stop proxying kubelet scrapes through the apiserver, since
this required higher privilege (nodes/proxy) and can add
load to the apiserver on large clusters
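
A minimal sketch of a direct Kubelet scrape job, assuming the standard in-cluster service account paths. The job name is illustrative and relabeling is omitted.

```yaml
scrape_configs:
  - job_name: kubelet
    scheme: https
    kubernetes_sd_configs:
      - role: node              # discover Kubelets from the apiserver node list
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true  # node certs lack IP SANs
    # Kubelet bearer token authn/authz with the pod's service account token.
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
---
# ClusterRole rule fragment: nodes/metrics instead of nodes/proxy.
rules:
  - apiGroups: [""]
    resources: ["nodes/metrics"]
    verbs: ["get"]
```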
2018-05-14 23:06:50 -07:00
Dalton Hubble
fb88113523
Disable default Google Analytics in Grafana addon
...
* It's come to my attention Grafana reports analytics data
by default. Typhoon's philosophy requires user permission
for data collection so the addon should have this disabled
* http://docs.grafana.org/installation/configuration/#analytics
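
One way to express this in a Grafana container spec is through Grafana's environment variable overrides for the `[analytics]` config section. This is a sketch; the addon may set the option differently.

```yaml
# Container env fragment: overrides [analytics] reporting_enabled.
env:
  - name: GF_ANALYTICS_REPORTING_ENABLED
    value: "false"
```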
2018-05-10 01:18:47 -07:00
Dalton Hubble
1854f5c104
Update Grafana from v5.1.1 to v5.1.2
...
* https://github.com/grafana/grafana/releases/tag/v5.1.2
2018-05-10 01:09:08 -07:00
Dalton Hubble
726b58b697
Update Grafana from v5.0.4 to v5.1.1
...
* https://github.com/grafana/grafana/releases/tag/v5.1.1
* https://github.com/grafana/grafana/releases/tag/v5.1.0
2018-05-07 22:05:19 -07:00
Dalton Hubble
a54e3c0da1
Fix Prometheus data dir to /var/lib/prometheus
...
* A data volume (emptyDir) is mounted to /var/lib/prometheus
* Users could swap emptyDir for any desired volume if data
persistence is desired. Prometheus previously defaulted to
keeping its data in ./data relative to /prometheus. Override
this behavior to store data in /var/lib/prometheus
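
A sketch of the relevant pod spec fragment, assuming Prometheus 2.x flags. The config file path and volume name are hypothetical.

```yaml
containers:
  - name: prometheus
    args:
      - --config.file=/etc/prometheus/prometheus.yaml  # hypothetical path
      - --storage.tsdb.path=/var/lib/prometheus        # override the ./data default
    volumeMounts:
      - name: data
        mountPath: /var/lib/prometheus
volumes:
  - name: data
    emptyDir: {}   # swap for a persistent volume if data should survive rescheduling
```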
2018-05-01 22:05:27 -07:00
Dalton Hubble
731a6ec23a
Update nginx-ingress from 0.13.0 to 0.14.0
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.14.0
2018-04-28 13:10:03 -07:00
Dalton Hubble
e0d9e9979c
Update nginx-ingress from 0.12.0 to 0.13.0
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.13.0
2018-04-18 21:12:09 -07:00
Dalton Hubble
9789881243
Update kube-state-metrics from v1.3.0 to v1.3.1
...
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.3.1
2018-04-15 17:10:02 -07:00
Dalton Hubble
6b08bde479
Use k8s.gcr.io instead of gcr.io/google_containers
...
* Kubernetes recommends using the alias to fetch images
from the nearest GCR regional mirror, to abstract the use
of GCR, and to drop names containing 'google'
* https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ
2018-04-08 12:57:52 -07:00
Dalton Hubble
f4b2396718
Return Prometheus deployment to be a worker workload
...
* Expose etcd metrics to workers so Prometheus can
run on a worker, rather than a controller
* Drop temporary firewall rules allowing Prometheus
to run on a controller and scrape targets
* Related to https://github.com/poseidon/typhoon/pull/175
2018-04-08 12:20:00 -07:00
Dalton Hubble
7186aa46da
Update kube-state-metrics from v1.2.0 to v1.3.0
...
* https://github.com/kubernetes/kube-state-metrics/pull/412
* https://github.com/kubernetes/kube-state-metrics/pull/413
2018-04-04 21:04:13 -07:00
Dalton Hubble
d770393dbc
Add etcd metrics, Prometheus scrapes, and Grafana dash
...
* Use etcd v3.3 --listen-metrics-urls to expose only metrics
data via http://0.0.0.0:2381 on controllers
* Add Prometheus discovery for etcd peers on controller nodes
* Temporarily drop two noisy Prometheus alerts
2018-04-03 20:31:00 -07:00
Dalton Hubble
b1e41dcb99
addons: Update from Grafana v4.6.3 to v5.0.4
...
This reverts commit c59a9c66b1.
2018-03-28 19:45:19 -07:00
Dalton Hubble
65a2751f77
addons: Update heapster from v1.5.1 to v1.5.2
...
* https://github.com/kubernetes/heapster/releases/tag/v1.5.2
2018-03-21 20:32:01 -07:00
Dalton Hubble
851bc1a3f8
Update nginx-ingress from 0.11.0 to 0.12.0
2018-03-19 23:17:17 -07:00
Dalton Hubble
46226a8015
Update Prometheus from 2.2.0 to 2.2.1
2018-03-18 15:56:44 -07:00
Dalton Hubble
c59a9c66b1
Revert "addons: Update from Grafana v4.6.3 to v5.0.0"
...
* Revert commit 9dcc255f8e.
* Grafana v5.0 is not compatible with Kubernetes v1.9.4. See
https://github.com/poseidon/typhoon/pull/162
2018-03-12 21:01:14 -07:00
Dalton Hubble
42708f9a70
Update Prometheus from v2.2.0-rc.1 to v2.2.0
...
* https://github.com/prometheus/prometheus/releases/tag/v2.2.0
2018-03-09 00:20:40 -08:00
Dalton Hubble
d54709f89c
Update Grafana from v5.0.0 to 5.0.1
...
* https://github.com/grafana/grafana/releases/tag/v5.0.1
2018-03-09 00:20:40 -08:00
Dalton Hubble
9dcc255f8e
addons: Update from Grafana v4.6.3 to v5.0.0
2018-03-09 00:20:40 -08:00
Dalton Hubble
9307e97c46
addons: Update Prometheus from v2.1.0 to v2.2.0
...
* Annotate Prometheus service to scrape metrics from
Prometheus itself (enables Prometheus* alerts)
* Update kube-state-metrics addon-resizer to 1.7
* Use port 8080 for kube-state-metrics
* Add PrometheusNotIngestingSamples alert rule
* Change K8SKubeletDown alert rule to fire when 10%
of kubelets are down, not 1%
* https://github.com/coreos/prometheus-operator/pull/1032
2018-03-09 00:20:40 -08:00
Paul Saunders
86420fd507
Rename namespace manifests to be applied first
...
* Ensure kubectl apply -R creates manifests in the right order
2018-02-22 01:04:30 -08:00
Dalton Hubble
5c383f4184
addons: Update nginx-ingress from 0.10.2 to 0.11.0
2018-02-21 23:54:12 -08:00
Dalton Hubble
de88fa5457
addons: Update Heapster from v1.5.0 to v1.5.1
...
* Switch to k8s.gcr.io vanity image name
* Add service account, Role, and ClusterRole for heapster
2018-02-15 10:57:47 -08:00
Stephen Augustus
d9a0183f3f
addons/nginx-ingress: Fix typo in GCP selector name
2018-02-14 03:07:36 -05:00
Dalton Hubble
03d23bfde7
addons: Remove Kubernetes Dashboard manifests and docs
...
* Stop maintaining Kubernetes Dashboard manifests. Dashboard takes
an unusual approach to security and is often a security weak point.
* Recommendation: Use `kubectl` and avoid using the dashboard. If
you must use the dashboard, explore hardening and consider using an
authenticating proxy rather than the dashboard's auth features
2018-02-11 10:33:23 -08:00
Dalton Hubble
2c10d24113
addons: Switch to apps/v1 workload APIs
...
* Deployments now belong to the apps/v1 API group
* DaemonSets now belong to the apps/v1 API group
* RBAC types now belong to the rbac.authorization.k8s.io/v1 API group
2018-02-10 23:56:31 -08:00
Dalton Hubble
65321acad2
addons: Add grafana-watcher and bundle dashboards
...
* Add separate Grafana addons docs and screenshots
2018-01-28 01:01:30 -08:00
Dalton Hubble
064ce83f25
addons: Update Prometheus to v2.1.0
...
* Change service discovery to relabel jobs to align with
rule expressions in upstream examples
* Use a separate service account for prometheus instead
of granting roles to the namespace's default
* Use a separate service account for node-exporter
* Update node-exporter and kube-state-metrics exporters
2018-01-27 21:00:15 -08:00
Dalton Hubble
c3b0cdddf3
addons: Update nginx-ingress from v0.10.1 to v0.10.2
2018-01-26 17:27:36 -08:00
Dalton Hubble
211ec94c75
addons: Update CLUO from v0.5.0 to v0.6.0
...
* https://github.com/coreos/container-linux-update-operator/releases/tag/v0.6.0
2018-01-26 17:24:09 -08:00
Dalton Hubble
8aca5a089e
addons: Update nginx-ingress to 0.10.1
2018-01-24 20:34:05 -08:00
Dalton Hubble
103f1e16d7
addons: Update nginx-ingress to 0.10.0
2018-01-23 09:28:37 -08:00
Dalton Hubble
bc967ddcd0
addons: Update CLUO to fix compatibility with Kubernetes 1.9
...
* Update CLUO from v0.4.1 to v0.5.0
* Earlier versions of CLUO fail to drain nodes on Kubernetes 1.9
so nodes drain one at a time repeatedly and Container Linux OS
updates are not applied to nodes.
* Check current OS versions via `kubectl get nodes --show-labels`
2018-01-19 08:33:26 -08:00
Dalton Hubble
996651c605
Update kube-state-metrics version and RBAC cluster role
...
* https://github.com/kubernetes/kube-state-metrics/pull/345
* https://github.com/kubernetes/kube-state-metrics/pull/334
2018-01-15 08:33:44 -08:00
Dalton Hubble
21e540159b
addons: Update grafana from v4.6.2 to v4.6.3
...
* https://github.com/grafana/grafana/releases/tag/v4.6.3
2017-12-15 16:09:14 -08:00
Dalton Hubble
521a1f0fee
addons: Update heapster from v1.4.3 to v1.5.0
...
* Rollback addon-resizer to 1.7 to address issues in large
clusters https://github.com/kubernetes/kubernetes/pull/52536
2017-12-11 23:34:25 -08:00
Dalton Hubble
7345cb6419
addons: Update nginx-ingress to 0.9.0
2017-12-11 00:48:15 -08:00
Dalton Hubble
a481d71d7d
addons: Update nginx-ingress to 0.9.0-beta.19
...
* Undo rollback f00ecde854
* Port binding regression only occurs with --enable-ssl-passthrough,
which isn't used in these examples. See
https://github.com/kubernetes/ingress-nginx/issues/1788
2017-12-11 00:44:32 -08:00
Dalton Hubble
f00ecde854
Rollback nginx-ingress on GCE to 0.9.0-beta.17
...
* https://github.com/kubernetes/ingress-nginx/issues/1788
2017-12-02 14:06:22 -08:00
Dalton Hubble
65f006e6cc
addons: Sync prometheus alerts to upstream
...
* https://github.com/coreos/prometheus-operator/pull/774
2017-12-01 23:24:08 -08:00
Dalton Hubble
8d3817e0ae
addons: Update nginx-ingress to 0.9.0-beta.19
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.19
2017-12-01 22:32:33 -08:00
Dalton Hubble
63ab117205
addons: Add prometheus rules for DaemonSets
...
* https://github.com/coreos/prometheus-operator/pull/755
2017-11-16 23:51:21 -08:00
Dalton Hubble
1cd262e712
addons: Fix prometheus K8SApiServerLatency alert rule
...
* https://github.com/coreos/prometheus-operator/issues/751
2017-11-16 23:37:15 -08:00
Dalton Hubble
32bdda1b6c
addons: Update Grafana from v4.6.1 to v4.6.2
...
* https://github.com/grafana/grafana/releases/tag/v4.6.2
2017-11-16 23:34:36 -08:00
Dalton Hubble
159443bae7
addons: Add better alerting rules to Prometheus manifests
...
* Adapt the coreos/prometheus-operator alerting rules for Typhoon,
https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus/manifests
* Add controller manager and scheduler shim services to let
prometheus discover them via service endpoints
* Fix several alert rules to use service endpoint discovery
* A few rules still don't do much, but they default to green
2017-11-10 20:57:47 -08:00
Dalton Hubble
119dc859d3
addons: Update nginx-ingress to 0.9.0-beta.17
...
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.17
2017-11-10 20:16:40 -08:00
Dalton Hubble
f570af9418
addons: Update from Prometheus v1.8.2 to v2.0.0
2017-11-08 22:48:23 -08:00
Dalton Hubble
8eaa72c1ca
addons: Update nginx-ingress to 0.9.0-beta.16
...
* Image registry changed from gcr.io to quay.io
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.16
2017-11-06 23:15:15 -08:00
Dalton Hubble
10b977d54a
addons: Set kube-state-metrics to have clusterIP None
...
* kube-state-metrics service exists to facilitate prometheus discovery
2017-11-05 17:54:09 -08:00
Dalton Hubble
b7a268fc45
addons: Add prometheus alertmanager flag
...
* Pass -alertmanager.url to work with a user's in-cluster
alertmanager deployment, if any
2017-11-05 15:50:46 -08:00
Dalton Hubble
279f36effd
addons: Add grafana 4.6.1 and extend prometheus docs
2017-11-05 15:23:56 -08:00
Dalton Hubble
ae07a21e3d
addons: Omit static resource requests/limits for kube-state-metrics
...
* Allow the addon-resizer to dynamically set resource values
* https://github.com/kubernetes/kube-state-metrics/pull/285
2017-11-04 14:41:04 -07:00
Dalton Hubble
0ab1ae3210
addons: Fix typo in kube-state-metrics strategy
2017-11-04 14:39:56 -07:00
Dalton Hubble
e32885c9cd
addons: Update prometheus from v1.8.0 to v1.8.2
...
* https://github.com/prometheus/prometheus/releases/tag/v1.8.2
2017-11-04 11:00:39 -07:00
Dalton Hubble
8582e19077
Expand Nginx Ingress liveness and readiness probes
...
* Remove dnsPolicy: ClusterFirst
* https://github.com/kubernetes/ingress-nginx/pull/1584
2017-10-25 22:29:20 -07:00
Dalton Hubble
3727c40c6c
Update Nginx Ingress defaultbackend from 1.0 to 1.4
...
* https://github.com/kubernetes/ingress-nginx/pull/1568
2017-10-25 22:16:23 -07:00
Dalton Hubble
b608f9c615
addons: Use service endpoints to scrape node-exporter
2017-10-24 22:59:00 -07:00
Dalton Hubble
ec1dbb853c
addons: Include kube-state-metrics exporter manifests
2017-10-24 22:59:00 -07:00
Dalton Hubble
d046d45769
addons: Include Prometheus and node-exporter manifests
2017-10-24 22:58:59 -07:00
Dalton Hubble
a73f57fe4e
Update CLUO from v0.4.0 to v0.4.1
2017-10-24 22:14:03 -07:00
Dalton Hubble
f86c00288f
Add missing update-agent RBAC role to get pods
...
* Drain now gets pods, deletes pods, and waits for deletion
2017-10-20 01:21:46 -07:00
Dalton Hubble
a57b3cf973
Update CLUO addon to v0.4.0 and RBAC ClusterRole
2017-10-20 00:40:17 -07:00
Dalton Hubble
7b5ffd0085
Add Container Linux reboot-coordinator RBAC
...
* Add a reboot-coordinator namespace for CLUO components
* Define an RBAC ClusterRole for update-operator and update-agent
* Replace the older-style setup where CLUO ran in kube-system with
admin privilege
2017-10-14 19:35:06 -07:00
Dalton Hubble
11453bac91
Update heapster addon from v1.4.0 to v1.4.3
...
* Use normal name and phase labels
2017-10-14 19:07:37 -07:00
Dalton Hubble
dd0c61d1d9
Update Nginx Ingress controller addon to 0.9.0-beta.15
2017-10-14 18:30:58 -07:00