303 Commits

Author SHA1 Message Date
Jordan Pittier
fd3c81d04d Remove create/update endpoints from nginx-ingress Role (#458)
* nginx-ingress no longer requires endpoints create/update RBAC Role permissions
* https://github.com/kubernetes/ingress-nginx/pull/1527
2019-05-04 11:36:02 -07:00
Dalton Hubble
6e9b2450fe Update Grafana from v6.1.4 to v6.1.6
* https://github.com/grafana/grafana/releases/tag/v6.1.6
2019-05-04 11:14:37 -07:00
Dalton Hubble
ec5aef5c92 Refresh Prometheus rules and Grafana dashboards
* Adds several network related alerts from upstream
2019-04-27 22:41:13 -07:00
Dalton Hubble
0e94708fd8 Update kube-state-metrics from v1.5.0 to v1.6.0-rc.2
* Collect metrics about Ingress resources
* Collect metrics about certificates.k8s.io certificatesigningrequests
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.6.0-rc.2
2019-04-27 20:54:40 -07:00
Dalton Hubble
2c11bad439 Update Prometheus from v2.9.1 to v2.9.2
* https://github.com/prometheus/prometheus/releases/tag/v2.9.2
2019-04-27 20:39:55 -07:00
Dalton Hubble
418597aa59 Update Grafana from v6.1.3 to v6.1.4
* https://github.com/grafana/grafana/releases/tag/v6.1.4
2019-04-18 23:30:43 -07:00
Dalton Hubble
f3174c2b7a Update Prometheus from v2.8.1 to v2.9.1
* https://github.com/prometheus/prometheus/releases/tag/v2.9.1
* https://github.com/prometheus/prometheus/releases/tag/v2.9.0
2019-04-18 23:26:32 -07:00
Dalton Hubble
a141c5fe9e Update nginx-ingress from v0.23.0 to v0.24.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.24.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.24.0
2019-04-15 21:08:22 -07:00
Dalton Hubble
1b157a2fa4 Revert "Update kube-state-metrics from v1.5.0 to v1.6.0-rc.0"
* This reverts commit 6e5d66cf66
* kube-state-metrics v1.6.0-rc.0 fires KubeDeploymentReplicasMismatch
alerts where its own Deployment doesn't have replicas available
(kube_deployment_status_replicas_available), even though all replicas
are available according to kubectl inspection
* This problem was present even with the CSR ClusterRole fix
(https://github.com/kubernetes/kube-state-metrics/pull/717)
2019-04-13 12:37:53 -07:00
Dalton Hubble
6e5d66cf66 Update kube-state-metrics from v1.5.0 to v1.6.0-rc.0
* Adds a metrics collector for Ingress resources and other
improvements
* https://github.com/kubernetes/kube-state-metrics/pull/640
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.6.0-rc.0
2019-04-09 22:16:36 -07:00
Dalton Hubble
44c293888b Update Grafana from v6.1.1 to v6.1.3
* https://github.com/grafana/grafana/releases/tag/v6.1.3
2019-04-09 22:06:27 -07:00
Dalton Hubble
ce78d5988e Refresh Prometheus rules and Grafana dashboards
* Refresh rules and dashboards from upstreams
* Add new Kubernetes "workload" dashboards
  * View pods in a workload (deployment/daemonset/statefulset)
  * View workloads in a namespace
2019-04-06 23:31:44 -07:00
Dalton Hubble
29a3035245 Update Grafana from v6.1.0 to v6.1.1 2019-04-06 18:32:14 -07:00
Dalton Hubble
3e7a38cb13 Update Grafana from v6.0.2 to v6.1.0
* https://github.com/grafana/grafana/releases/tag/v6.1.0
2019-04-03 20:47:48 -07:00
Dalton Hubble
3e9dc28a00 Update Prometheus from v2.8.0 to v2.8.1
* https://github.com/prometheus/prometheus/releases/tag/v2.8.1
2019-03-31 17:40:20 -07:00
Dalton Hubble
41a9d86bc3 Add NetworkPolicy to limit traffic into Prometheus
* Allow traffic from Grafana to Prometheus in monitoring
* Allow traffic from Prometheus to Prometheus in monitoring
* NetworkPolicy denies non-whitelisted traffic. Define policy
to allow other access
2019-03-23 21:38:34 -07:00
Dalton Hubble
36e31fc9fa Add liveness and readiness probes to Grafana
* https://github.com/grafana/grafana/issues/3302
2019-03-23 17:55:37 -07:00
Dalton Hubble
619a0370dc Update Grafana from v6.0.1 to v6.0.2
* https://github.com/grafana/grafana/releases/tag/v6.0.2
2019-03-21 23:41:25 -07:00
Dalton Hubble
6dd2731046 Set cpu/memory resources requests/limits for some addons
* Set resource requests and limits for Grafana and CLUO
* Set resource requests for Prometheus, but allow usage
to grow since needs vary widely
* Leave nginx without resource requests/limits for now,
it's typically well behaved
2019-03-20 00:15:08 -07:00
Dalton Hubble
aa630003a4 Refresh Prometheus rules and Grafana dashboards
* Refresh rules and dashboards from upstreams
* Organize dashboards and stay below the ConfigMap size
limit
2019-03-17 13:23:04 -07:00
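A rough way to check that a group of dashboards stays below the ConfigMap size limit is sketched here. The 1 MiB figure is the limit Kubernetes documents for ConfigMap data; the dashboard names and contents are hypothetical, and the helper only approximates how the API server measures object size.

```python
import json

# Assumption: Kubernetes documents a 1 MiB cap on the data held by a ConfigMap.
CONFIGMAP_LIMIT_BYTES = 1024 * 1024


def configmap_data_size(data):
    """Return the total UTF-8 size in bytes of all keys and values."""
    return sum(len(k.encode("utf-8")) + len(v.encode("utf-8")) for k, v in data.items())


def fits_in_configmap(data, limit=CONFIGMAP_LIMIT_BYTES):
    """Return True when the data would fit under the assumed limit."""
    return configmap_data_size(data) <= limit


# Hypothetical dashboards grouped into one ConfigMap.
dashboards = {
    "etcd.json": json.dumps({"title": "etcd", "panels": []}),
    "kubelet.json": json.dumps({"title": "Kubelet", "panels": []}),
}

if __name__ == "__main__":
    print(configmap_data_size(dashboards), fits_in_configmap(dashboards))
```

Splitting dashboards across several ConfigMaps keeps each one under the limit.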
Dalton Hubble
bf97a45b9d Remove heapster manifests from addons
* Heapster addon powers `kubectl top`
* In early Kubernetes, people legitimately used and expected
`kubectl top` to work, so the optional addon was provided
* Today the standards are different. Many better monitoring
tools exist that are also less coupled to Kubernetes. "kubectl
top" reliance on a non-core extension means it's not in scope
for minimal Kubernetes clusters. No more exceptionalism
* Finally, Heapster isn't that useful anymore. Its manifests
have no need for Typhoon-specific modification
* Look to prior releases if you still wish to apply heapster
2019-03-17 12:41:59 -07:00
Dalton Hubble
e0bee2e417 Update Prometheus from v2.7.2 to v2.8.0
* https://github.com/prometheus/prometheus/releases/tag/v2.8.0
2019-03-13 22:11:38 -07:00
Dalton Hubble
4201eb1efa Update Grafana from v6.0.0 to v6.0.1
* https://github.com/grafana/grafana/releases/tag/v6.0.1
2019-03-09 12:44:18 -08:00
Dalton Hubble
4d9a692424 Update Prometheus from v2.7.1 to v2.7.2
* https://github.com/prometheus/prometheus/releases/tag/v2.7.2
2019-03-04 23:08:12 -08:00
Dalton Hubble
a08adc92b5 Update nginx-ingress from v0.22.0 to v0.23.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.23.0
2019-03-01 01:18:54 -08:00
Dalton Hubble
4ff7fe2c29 Update Grafana dashboards from upstreams 2019-02-28 23:22:07 -08:00
Dalton Hubble
daee5a9d60 Update Grafana from v6.0.0-beta3 to v6.0.0
* https://github.com/grafana/grafana/releases/tag/v6.0.0
* http://docs.grafana.org/guides/whats-new-in-v6-0/
2019-02-25 21:43:43 -08:00
Dalton Hubble
d10c2b4cb9 Update Grafana from v6.0.0-beta2 to v6.0.0-beta3
* Update Grafana dashboards
2019-02-23 13:03:25 -08:00
Dalton Hubble
e483c81ce9 Improve Prometheus rules and alerts and Grafana dashboards
* Collate upstream rules, alerts, and dashboards and tune for use
in Typhoon
* Previously, a well-chosen (but older) set of rules, alerts, and
dashboards were maintained to reflect metric name changes
2019-02-18 12:19:23 -08:00
Dalton Hubble
6fa3b8a13f Upgrade Grafana to v6.0.0-beta2 and enable Explore UI
* Upgrade Grafana from v5.4.3 to v6.0.0-beta2
* Enable Grafana Explore UI while still using only the Viewer
role (inspect/edit without saving)
* http://docs.grafana.org/guides/whats-new-in-v6-0/
2019-02-17 13:26:42 -08:00
Dalton Hubble
170ef74eea Remove Nginx Ingress default backend
* nginx-ingress no longer requires a configured default-backend,
it will respond with its own 404 page starting in v0.21.0
* https://github.com/kubernetes/ingress-nginx/pull/3196
2019-02-16 14:18:15 -08:00
Dalton Hubble
b13a651cfe Drop metrics that are unset, high cardinality, or extraneous
* https://github.com/coreos/prometheus-operator/pull/2387
* https://github.com/coreos/prometheus-operator/pull/1959
2019-02-10 23:56:11 -08:00
Dalton Hubble
9c59f393a5 Add Kubernetes pod name to metrics discovered from service endpoints
* Prometheus queries from some upstreams use joins of node-exporter
and kube-state-metrics metrics by (namespace,pod). Add the Kubernetes
pod name to service endpoint metrics
* Rename the kubernetes_namespace field to namespace
* Honor labels since kube-state-metrics already include a `pod` field
that should not be overridden
2019-02-10 23:54:30 -08:00
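The effect of honoring labels can be illustrated with a simplified model of how Prometheus resolves a conflict between a label attached by service discovery and a label already present on a scraped metric. This is a sketch of the documented `honor_labels` behavior, not Prometheus code; the label values are hypothetical.

```python
def merge_labels(target_labels, scraped_labels, honor_labels):
    """Merge target labels with scraped labels, simplified.

    With honor_labels=True, a conflicting scraped label wins.
    With honor_labels=False, the target label wins and the scraped
    label is kept under an "exported_" prefix.
    """
    merged = dict(target_labels)
    for name, value in scraped_labels.items():
        if name in target_labels and not honor_labels:
            merged["exported_" + name] = value
        else:
            merged[name] = value
    return merged


# Hypothetical labels: the target is the kube-state-metrics pod itself,
# while the scraped metric describes some other pod.
target = {"namespace": "monitoring", "pod": "kube-state-metrics-abc"}
scraped = {"pod": "nginx-123", "phase": "Running"}

if __name__ == "__main__":
    print(merge_labels(target, scraped, honor_labels=True))
    print(merge_labels(target, scraped, honor_labels=False))
```

With labels honored, the `pod` field reported by kube-state-metrics survives, which is what joins by (namespace,pod) rely on.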
Dalton Hubble
3e4b3bfb04 Raise nginx-ingress liveness/readiness timeout
* Under heavy load, avoid timeouts causing nginx-ingress
restarts https://github.com/kubernetes/ingress-nginx/pull/3737
2019-02-09 12:53:09 -08:00
Dalton Hubble
949ce21fb2 Update Prometheus from v2.7.0 to v2.7.1
* https://github.com/prometheus/prometheus/releases/tag/v2.7.1
2019-02-02 00:13:24 -08:00
Dalton Hubble
130daeac26 Update Prometheus from v2.6.1 to v2.7.0 2019-01-29 22:31:20 -08:00
Dalton Hubble
f5ff003d0e Update node-exporter from v0.15.2 to v0.17.0
* node-exporter renamed multiple metrics that are reflected
in changes to Prometheus rules and Grafana dashboard expressions
2019-01-22 01:14:00 -08:00
Dalton Hubble
d697dd46dc Allow kube-state-metrics PodDisruptionBudget metrics
* Update kube-state-metrics ClusterRole to allow collecting
poddisruptionbudget metrics (exported as kube_poddisruptionbudget_*)
* https://github.com/kubernetes/kube-state-metrics/pull/551
* Bump addon-resizer from v1.7 to v1.8.4
2019-01-22 01:12:32 -08:00
Dalton Hubble
2f3097ebea Update nginx-ingress from v0.21.0 to v0.22.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.22.0
2019-01-16 23:01:22 -08:00
Dalton Hubble
67fb9602e7 Update Prometheus from v2.6.0 to v2.6.1
* https://github.com/prometheus/prometheus/releases/tag/v2.6.1
2019-01-15 21:13:40 -08:00
Dalton Hubble
c8a85fabe1 Update Grafana from v5.4.2 to v5.4.3
* https://github.com/grafana/grafana/releases/tag/v5.4.3
2019-01-15 21:13:16 -08:00
Dalton Hubble
1d27dc6528 Update kube-state-metrics exporter from v1.4.0 to v1.5.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.5.0
2019-01-12 14:24:57 -08:00
Dalton Hubble
ea8b0d1c84 Update Prometheus addon from v2.5.0 to v2.6.0
* https://github.com/prometheus/prometheus/releases/tag/v2.6.0
2018-12-27 07:35:12 -08:00
Dalton Hubble
b74bf11772 Update Grafana from v5.4.0 to v5.4.2
* https://github.com/grafana/grafana/releases/tag/v5.4.2
* https://github.com/grafana/grafana/releases/tag/v5.4.1
2018-12-15 12:39:03 -08:00
Dalton Hubble
991fb44c37 Update Grafana from v5.3.4 to v5.4.0
* https://github.com/grafana/grafana/releases/tag/v5.4.0
2018-12-06 01:33:50 -08:00
Dalton Hubble
b6016d0a26 Disable Grafana login form, admin user can't be disabled
* Example manifests aim to provide a read-only dashboard visible
to any users with network access (i.e. kubectl port-forward, LAN)
* Problem: Grafana always has an admin user, even with the user
management system disabled
* Disable the login form to prevent admin login
2018-11-28 22:04:08 -08:00
Dalton Hubble
872b11b948 Update nginx-ingress from v0.20.0 to v0.21.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.21.0
2018-11-26 21:57:34 -08:00
Dalton Hubble
c8c43f3991 Update Grafana from v5.3.2 to v5.3.4
* https://github.com/grafana/grafana/releases/tag/v5.3.3
* https://github.com/grafana/grafana/releases/tag/v5.3.4
2018-11-18 16:42:50 -08:00
Dalton Hubble
7de03a1279 Fix Prometheus etcd scrape config for DigitalOcean
* Kubelet uses a node's hostname as the node name, which isn't
resolvable on DigitalOcean. On DigitalOcean, the node name was
set to the internal IP until #337 switched to instead configuring
kube-apiserver to prefer the InternalIP for communication
* Explicitly configure etcd scrapes to target each controller by
internal IP and port 2381 (replace __address__)
2018-11-06 23:02:45 -08:00
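The `__address__` replacement described above amounts to rewriting each discovered controller target to its internal IP and port 2381. The sketch below models that step. It assumes the internal IP is available under Prometheus's `__meta_kubernetes_node_address_InternalIP` discovery label, and the example addresses are hypothetical.

```python
ETCD_METRICS_PORT = 2381  # port named in the commit message

# Assumption: node discovery exposes the internal IP under this meta label.
INTERNAL_IP_LABEL = "__meta_kubernetes_node_address_InternalIP"


def relabel_etcd_target(labels, port=ETCD_METRICS_PORT):
    """Return a copy of the target labels with __address__ replaced
    by "<internal IP>:<port>"."""
    relabeled = dict(labels)
    relabeled["__address__"] = f"{labels[INTERNAL_IP_LABEL]}:{port}"
    return relabeled


# Hypothetical discovered target whose hostname is not resolvable.
discovered = {
    "__address__": "controller-0:10250",
    INTERNAL_IP_LABEL: "10.132.0.5",
}

if __name__ == "__main__":
    print(relabel_etcd_target(discovered)["__address__"])
```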
Dalton Hubble
be9f7b87d6 Update Prometheus from v2.4.3 to v2.5.0
* https://github.com/prometheus/prometheus/releases/tag/v2.5.0
2018-11-06 22:16:12 -08:00
Dalton Hubble
884c8b39dc Update Grafana from v5.3.1 to v5.3.2
* https://github.com/grafana/grafana/releases/tag/v5.3.2
2018-10-28 19:44:22 -07:00
Dalton Hubble
bc750aec33 Configure Heapster to source metrics from Kubelet authenticated API
* Heapster can now get nodes (i.e. kubelets) from the apiserver and
source metrics from the Kubelet authenticated API (10250) instead of
the Kubelet HTTP read-only API (10255)
* https://github.com/kubernetes/heapster/blob/master/docs/source-configuration.md
* Use the heapster service account token via Kubelet bearer token
authn/authz.
* Permit Heapster to skip CA verification. The CA cert does not contain
IP SANs and cannot since nodes get random IPs that aren't known upfront.
Heapster obtains the node list from the apiserver, so the risk of
spoofing a node is limited. For the same reason, Prometheus scrapes
must skip CA verification for scraping Kubelets provided by the apiserver.
* https://github.com/poseidon/typhoon/blob/v1.12.1/addons/prometheus/config.yaml#L68
* Create a heapster ClusterRole to work around the default Kubernetes
`system:heapster` ClusterRole lacking the proper GET `nodes/stats`
access. See https://github.com/kubernetes/heapster/issues/1936
2018-10-18 21:03:01 -07:00
Dalton Hubble
0127ee82c1 Update nginx-ingress from v0.19.0 to v0.20.0 2018-10-16 21:35:29 -07:00
Dalton Hubble
a10d6977b8 Update Prometheus from v2.4.2 to v2.4.3
* https://github.com/prometheus/prometheus/releases/tag/v2.4.3
2018-10-16 21:29:41 -07:00
Dalton Hubble
05fe923c14 Update Grafana from v5.3.0 to v5.3.1
* https://github.com/grafana/grafana/releases/tag/v5.3.1
2018-10-16 21:23:44 -07:00
Dalton Hubble
5eb4078d68 Add docker/default seccomp to control plane and addons
* Annotate pods, deployments, and daemonsets to start containers
with the Docker runtime's default seccomp profile
* Overrides Kubernetes default behavior which started containers
with seccomp=unconfined
* https://docs.docker.com/engine/security/seccomp/#pass-a-profile-for-a-container
2018-10-16 20:07:29 -07:00
Dalton Hubble
8f0d2b5db4 Update Grafana from v5.2.4 to v5.3.0 2018-10-13 23:03:31 -07:00
Dalton Hubble
032a24133b Update Prometheus from v2.3.2 to v2.4.2
* https://github.com/prometheus/prometheus/releases/tag/v2.4.0
* https://github.com/prometheus/prometheus/releases/tag/v2.4.1
* https://github.com/prometheus/prometheus/releases/tag/v2.4.2
2018-09-21 22:27:11 -07:00
Dalton Hubble
dc03f7a4a9 Update nginx-ingress from 0.17.1 to 0.19.0
* If using --enable-ssl-passthrough or exposing TCP/UDP services,
be aware of https://github.com/kubernetes/ingress-nginx/pull/3038
* Workarounds until the fix merges are to stay on 0.17.1, use the
suggested development image, or revert to securityContext
`runAsNonRoot: false` for a while (less secure)
2018-09-08 17:57:01 -07:00
Dalton Hubble
1b8234eb91 Update Grafana from v5.2.2 to v5.2.4
* https://github.com/grafana/grafana/releases/tag/v5.2.3
* https://github.com/grafana/grafana/releases/tag/v5.2.4
2018-09-08 15:41:20 -07:00
Dalton Hubble
4ba090feb0 Update kube-state-metrics from v1.3.1 to v1.4.0 2018-08-29 09:37:50 -07:00
Dalton Hubble
4882fe1053 Add docs for Azure Ingress and worker pools
* Azure worker pools must be in the same region as
the cluster itself unfortunately
2018-08-27 23:30:56 -07:00
Becca Powell
49a9dc9b8b Fix typo in Prometheus alerting rules 2018-08-21 16:55:49 -07:00
Dalton Hubble
dbdc3fc850 Add nginx-ingress addon manifests for bare-metal 2018-08-11 12:14:23 -07:00
Dalton Hubble
e00f97c578 Update nginx-ingress from 0.16.2 to 0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.0
2018-08-08 00:45:20 -07:00
Dalton Hubble
e6720cf738 Update heapster from v1.5.3 to v1.5.4
* https://github.com/kubernetes/heapster/releases/tag/v1.5.4
2018-07-29 11:19:57 -07:00
Dalton Hubble
844f380b4e Update Grafana from v5.2.1 to v5.2.2
* https://github.com/grafana/grafana/releases/tag/v5.2.2
2018-07-29 11:12:56 -07:00
Dalton Hubble
02cd8eb8d3 Update Prometheus from v2.3.1 to v2.3.2
* https://github.com/prometheus/prometheus/releases/tag/v2.3.2
2018-07-14 14:25:49 -07:00
Dalton Hubble
84d6cfe7b3 Add Prometheus alert rule for inactive md devices
* node-exporter exposes metrics to Prometheus about total and
active md devices (e.g. disks in mdadm RAID arrays)
* Add alert that fires when a RAID disk fails or becomes inactive
for another reason
2018-07-10 00:20:30 -07:00
Dalton Hubble
f40f60b83c Update Nginx Ingress controller from 0.15.0 to 0.16.2
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.16.2
* https://github.com/kubernetes/ingress-nginx/blob/master/Changelog.md
2018-07-02 22:06:22 -07:00
Dalton Hubble
a3349b5c68 Update heapster from v1.5.2 to v1.5.3 2018-07-01 21:07:52 -07:00
Dalton Hubble
74dc6b0bf9 Update Grafana from 5.1.4 to 5.2.1
* http://docs.grafana.org/guides/whats-new-in-v5-2/
* https://github.com/grafana/grafana/releases/tag/v5.2.0
* https://github.com/grafana/grafana/releases/tag/v5.2.1
2018-07-01 20:55:34 -07:00
Dalton Hubble
2eaf04c68b Drop hostNetwork from nginx-ingress addon
* Both flannel and Calico support host port via `portmap`
* Allows writing NetworkPolicies that reference ingress pods in `from`
or `to`. HostNetwork pods were difficult to write network policy for
since they could circumvent the CNI network to communicate with pods on
the same node.
2018-06-22 00:46:41 -07:00
Dalton Hubble
d5de41e07a Update Grafana from 5.1.3 to 5.1.4
* https://github.com/grafana/grafana/releases/tag/v5.1.4
2018-06-19 21:45:15 -07:00
Dalton Hubble
05b99178ae Update prometheus from v2.3.0 to v2.3.1
* https://github.com/prometheus/prometheus/releases/tag/v2.3.1
2018-06-19 21:43:50 -07:00
Stephen Demos
18dd7ccc09 Update CLUO from v0.6.0 to v0.7.0 2018-06-14 22:32:36 -07:00
Dalton Hubble
cbe646fba6 Label namespaces to ease writing Network Policies 2018-06-09 11:45:11 -07:00
Dalton Hubble
c166b2ba33 Update prometheus from v2.2.1 to v2.3.0 2018-06-09 11:43:10 -07:00
Dalton Hubble
d32e6797ae Annotate Grafana so Prometheus scrapes metrics 2018-05-30 22:37:47 -07:00
Dalton Hubble
32a9a83190 Add Prometheus liveness and readiness probes 2018-05-30 22:34:07 -07:00
Dalton Hubble
28d0891729 Annotate nginx-ingress addon for Prometheus auto-discovery
* Add Google Cloud firewall rule to allow worker to worker access
to health and metrics
2018-05-19 13:13:14 -07:00
Dalton Hubble
714419342e Update nginx-ingress from 0.14.0 to 0.15.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.15.0
2018-05-17 21:42:55 -07:00
Dalton Hubble
3701c0b1fe Update Grafana from v5.1.2 to v5.1.3
* https://github.com/grafana/grafana/releases/tag/v5.1.3
2018-05-17 21:36:09 -07:00
Dalton Hubble
c2b719dc75 Configure Prometheus to scrape Kubelets directly
* Use Kubelet bearer token authn/authz to scrape metrics
* Drop RBAC permission from nodes/proxy to nodes/metrics
* Stop proxying kubelet scrapes through the apiserver, since
this required higher privilege (nodes/proxy) and can add
load to the apiserver on large clusters
2018-05-14 23:06:50 -07:00
Dalton Hubble
fb88113523 Disable default Google Analytics in Grafana addon
* It's come to my attention Grafana reports analytics data
by default. Typhoon's philosophy requires user permission
for data collection so the addon should have this disabled
* http://docs.grafana.org/installation/configuration/#analytics
2018-05-10 01:18:47 -07:00
Dalton Hubble
1854f5c104 Update Grafana from v5.1.1 to v5.1.2
* https://github.com/grafana/grafana/releases/tag/v5.1.2
2018-05-10 01:09:08 -07:00
Dalton Hubble
726b58b697 Update Grafana from v5.0.4 to v5.1.1
* https://github.com/grafana/grafana/releases/tag/v5.1.1
* https://github.com/grafana/grafana/releases/tag/v5.1.0
2018-05-07 22:05:19 -07:00
Dalton Hubble
a54e3c0da1 Fix Prometheus data dir to /var/lib/prometheus
* A data volume (emptyDir) is mounted to /var/lib/prometheus
* Users could swap emptyDir for any desired volume if data
persistence is desired. Prometheus previously defaulted to
keeping its data in ./data relative to /prometheus. Override
this behavior to store data in /var/lib/prometheus
2018-05-01 22:05:27 -07:00
Dalton Hubble
731a6ec23a Update nginx-ingress from 0.13.0 to 0.14.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.14.0
2018-04-28 13:10:03 -07:00
Dalton Hubble
e0d9e9979c Update nginx-ingress from 0.12.0 to 0.13.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.13.0
2018-04-18 21:12:09 -07:00
Dalton Hubble
9789881243 Update kube-state-metrics from v1.3.0 to v1.3.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.3.1
2018-04-15 17:10:02 -07:00
Dalton Hubble
6b08bde479 Use k8s.gcr.io instead of gcr.io/google_containers
* Kubernetes recommends using the alias to fetch images
from the nearest GCR regional mirror, to abstract the use
of GCR, and to drop names containing 'google'
* https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ
2018-04-08 12:57:52 -07:00
Dalton Hubble
f4b2396718 Return Prometheus deployment to be a worker workload
* Expose etcd metrics to workers so Prometheus can
run on a worker, rather than a controller
* Drop temporary firewall rules allowing Prometheus
to run on a controller and scrape targets
* Related to https://github.com/poseidon/typhoon/pull/175
2018-04-08 12:20:00 -07:00
Dalton Hubble
7186aa46da Update kube-state-metrics from v1.2.0 to v1.3.0
* https://github.com/kubernetes/kube-state-metrics/pull/412
* https://github.com/kubernetes/kube-state-metrics/pull/413
2018-04-04 21:04:13 -07:00
Dalton Hubble
d770393dbc Add etcd metrics, Prometheus scrapes, and Grafana dash
* Use etcd v3.3 --listen-metrics-urls to expose only metrics
data via http://0.0.0.0:2381 on controllers
* Add Prometheus discovery for etcd peers on controller nodes
* Temporarily drop two noisy Prometheus alerts
2018-04-03 20:31:00 -07:00
Dalton Hubble
b1e41dcb99 addons: Update from Grafana v4.6.3 to v5.0.4
This reverts commit c59a9c66b1.
2018-03-28 19:45:19 -07:00
Dalton Hubble
65a2751f77 addons: Update heapster from v1.5.1 to v1.5.2
* https://github.com/kubernetes/heapster/releases/tag/v1.5.2
2018-03-21 20:32:01 -07:00
Dalton Hubble
851bc1a3f8 Update nginx-ingress from 0.11.0 to 0.12.0 2018-03-19 23:17:17 -07:00
Dalton Hubble
46226a8015 Update Prometheus from 2.2.0 to 2.2.1 2018-03-18 15:56:44 -07:00
Dalton Hubble
c59a9c66b1 Revert "addons: Update from Grafana v4.6.3 to v5.0.0"
* Revert commit 9dcc255f8e.
* Grafana v5.0 is not compatible with Kubernetes v1.9.4. See
https://github.com/poseidon/typhoon/pull/162
2018-03-12 21:01:14 -07:00
Dalton Hubble
42708f9a70 Update Prometheus from v2.2.0-rc.1 to v2.2.0
* https://github.com/prometheus/prometheus/releases/tag/v2.2.0
2018-03-09 00:20:40 -08:00
Dalton Hubble
d54709f89c Update Grafana from v5.0.0 to 5.0.1
* https://github.com/grafana/grafana/releases/tag/v5.0.1
2018-03-09 00:20:40 -08:00
Dalton Hubble
9dcc255f8e addons: Update from Grafana v4.6.3 to v5.0.0 2018-03-09 00:20:40 -08:00
Dalton Hubble
9307e97c46 addons: Update Prometheus from v2.1.0 to v2.2.0
* Annotate Prometheus service to scrape metrics from
Prometheus itself (enables Prometheus* alerts)
* Update kube-state-metrics addon-resizer to 1.7
* Use port 8080 for kube-state-metrics
* Add PrometheusNotIngestingSamples alert rule
* Change K8SKubeletDown alert rule to fire when 10%
of kubelets are down, not 1%
  * https://github.com/coreos/prometheus-operator/pull/1032
2018-03-09 00:20:40 -08:00
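The change to the K8SKubeletDown threshold can be pictured as a ratio check over the `up` values of kubelet targets. This is a plain-Python sketch of that arithmetic rather than the actual alert rule expression; the strict "greater than" comparison and the sample values are assumptions.

```python
def kubelets_down_alert(up_values, threshold=0.10):
    """Return True when the fraction of kubelets reporting up == 0
    exceeds the threshold (10% by default)."""
    if not up_values:
        return False  # no targets discovered, nothing to compare
    down = sum(1 for value in up_values if value == 0)
    return down / len(up_values) > threshold


# Hypothetical 20-node cluster with one kubelet down (5%).
one_down = [0] + [1] * 19
# Same cluster with three kubelets down (15%).
three_down = [0] * 3 + [1] * 17

if __name__ == "__main__":
    print(kubelets_down_alert(one_down, threshold=0.01))
    print(kubelets_down_alert(one_down))
    print(kubelets_down_alert(three_down))
```

At a 1% threshold, a single kubelet going down in any cluster smaller than 100 nodes would fire the alert, which the 10% threshold avoids.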
Paul Saunders
86420fd507 Rename namespace manifests to be applied first
* Ensure kubectl apply -R creates manifests in the right order
2018-02-22 01:04:30 -08:00
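Why a rename can change the apply order is sketched below, assuming manifests in a directory are visited in lexical filename order. The filenames are hypothetical, not the ones used in the repository.

```python
def apply_order(filenames):
    """Return filenames in plain lexical order, the order assumed here
    for a recursive apply over a directory."""
    return sorted(filenames)


# Hypothetical names before and after the rename.
before = ["deployment.yaml", "namespace.yaml", "service.yaml"]
after = ["deployment.yaml", "0-namespace.yaml", "service.yaml"]

if __name__ == "__main__":
    print(apply_order(before))
    print(apply_order(after))
```

With a leading `0-`, the namespace manifest sorts ahead of the resources that need the namespace to exist.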
Dalton Hubble
5c383f4184 addons: Update nginx-ingress from 0.10.2 to 0.11.0 2018-02-21 23:54:12 -08:00
Dalton Hubble
de88fa5457 addons: Update Heapster from v1.5.0 to v1.5.1
* Switch to k8s.gcr.io vanity image name
* Add service account, Role, and ClusterRole for heapster
2018-02-15 10:57:47 -08:00
Stephen Augustus
d9a0183f3f addons/nginx-ingress: Fix typo in GCP selector name 2018-02-14 03:07:36 -05:00
Dalton Hubble
03d23bfde7 addons: Remove Kubernetes Dashboard manifests and docs
* Stop maintaining Kubernetes Dashboard manifests. Dashboard takes
an unusual approach to security and is often a security weak point.
* Recommendation: Use `kubectl` and avoid using the dashboard. If
you must use the dashboard, explore hardening and consider using an
authenticating proxy rather than the dashboard's auth features
2018-02-11 10:33:23 -08:00
Dalton Hubble
2c10d24113 addons: Switch to apps/v1 workload APIs
* Deployments now belong to the apps/v1 API group
* DaemonSets now belong to the apps/v1 API group
* RBAC types now belong to the rbac.authorization.k8s.io/v1 API group
2018-02-10 23:56:31 -08:00
Dalton Hubble
65321acad2 addons: Add grafana-watcher and bundle dashboards
* Add separate Grafana addons docs and screenshots
2018-01-28 01:01:30 -08:00
Dalton Hubble
064ce83f25 addons: Update Prometheus to v2.1.0
* Change service discovery to relabel jobs to align with
rule expressions in upstream examples
* Use a separate service account for prometheus instead
of granting roles to the namespace's default
* Use a separate service account for node-exporter
* Update node-exporter and kube-state-metrics exporters
2018-01-27 21:00:15 -08:00
Dalton Hubble
c3b0cdddf3 addons: Update nginx-ingress from v0.10.1 to v0.10.2 2018-01-26 17:27:36 -08:00
Dalton Hubble
211ec94c75 addons: Update CLUO from v0.5.0 to v0.6.0
* https://github.com/coreos/container-linux-update-operator/releases/tag/v0.6.0
2018-01-26 17:24:09 -08:00
Dalton Hubble
8aca5a089e addons: Update nginx-ingress to 0.10.1 2018-01-24 20:34:05 -08:00
Dalton Hubble
103f1e16d7 addons: Update nginx-ingress to 0.10.0 2018-01-23 09:28:37 -08:00
Dalton Hubble
bc967ddcd0 addons: Update CLUO to fix compatibility with Kubernetes 1.9
* Update CLUO from v0.4.1 to v0.5.0
* Earlier versions of CLUO fail to drain nodes on Kubernetes 1.9
so nodes drain one at a time repeatedly and Container Linux OS
updates are not applied to nodes.
* Check current OS versions via `kubectl get nodes --show-labels`
2018-01-19 08:33:26 -08:00
Dalton Hubble
996651c605 Update kube-state-metrics version and RBAC cluster role
* https://github.com/kubernetes/kube-state-metrics/pull/345
* https://github.com/kubernetes/kube-state-metrics/pull/334
2018-01-15 08:33:44 -08:00
Dalton Hubble
21e540159b addons: Update grafana from v4.6.2 to v4.6.3
* https://github.com/grafana/grafana/releases/tag/v4.6.3
2017-12-15 16:09:14 -08:00
Dalton Hubble
521a1f0fee addons: Update heapster from v1.4.3 to v1.5.0
* Rollback addon-resizer to 1.7 to address issues in large
clusters https://github.com/kubernetes/kubernetes/pull/52536
2017-12-11 23:34:25 -08:00
Dalton Hubble
7345cb6419 addons: Update nginx-ingress to 0.9.0 2017-12-11 00:48:15 -08:00
Dalton Hubble
a481d71d7d addons: Update nginx-ingress to 0.9.0-beta.19
* Undo rollback f00ecde854
* Port binding regression only occurs with --enable-ssl-passthrough,
which isn't used in these examples. See
https://github.com/kubernetes/ingress-nginx/issues/1788
2017-12-11 00:44:32 -08:00
Dalton Hubble
f00ecde854 Rollback nginx-ingress on GCE to 0.9.0-beta.17
* https://github.com/kubernetes/ingress-nginx/issues/1788
2017-12-02 14:06:22 -08:00
Dalton Hubble
65f006e6cc addons: Sync prometheus alerts to upstream
* https://github.com/coreos/prometheus-operator/pull/774
2017-12-01 23:24:08 -08:00
Dalton Hubble
8d3817e0ae addons: Update nginx-ingress to 0.9.0-beta.19
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.19
2017-12-01 22:32:33 -08:00
Dalton Hubble
63ab117205 addons: Add prometheus rules for DaemonSets
* https://github.com/coreos/prometheus-operator/pull/755
2017-11-16 23:51:21 -08:00
Dalton Hubble
1cd262e712 addons: Fix prometheus K8SApiServerLatency alert rule
* https://github.com/coreos/prometheus-operator/issues/751
2017-11-16 23:37:15 -08:00
Dalton Hubble
32bdda1b6c addons: Update Grafana from v4.6.1 to v4.6.2
* https://github.com/grafana/grafana/releases/tag/v4.6.2
2017-11-16 23:34:36 -08:00
Dalton Hubble
159443bae7 addons: Add better alerting rules to Prometheus manifests
* Adapt the coreos/prometheus-operator alerting rules for Typhoon,
https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus/manifests
* Add controller manager and scheduler shim services to let
prometheus discover them via service endpoints
* Fix several alert rules to use service endpoint discovery
* A few rules still don't do much, but they default to green
2017-11-10 20:57:47 -08:00
Dalton Hubble
119dc859d3 addons: Update nginx-ingress to 0.9.0-beta.17
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.17
2017-11-10 20:16:40 -08:00
Dalton Hubble
f570af9418 addons: Update from Prometheus v1.8.2 to v2.0.0 2017-11-08 22:48:23 -08:00
Dalton Hubble
8eaa72c1ca addons: Update nginx-ingress to 0.9.0-beta.16
* Image registry changed from gcr.io to quay.io
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.16
2017-11-06 23:15:15 -08:00
Dalton Hubble
10b977d54a addons: Set kube-state-metrics to have clusterIP None
* kube-state-metrics service exists to facilitate prometheus discovery
2017-11-05 17:54:09 -08:00
Dalton Hubble
b7a268fc45 addons: Add prometheus alertmanager flag
* Pass -alertmanager.url to work with a user's in-cluster
alertmanager deployment, if any
2017-11-05 15:50:46 -08:00
Dalton Hubble
279f36effd addons: Add grafana 4.6.1 and extend prometheus docs 2017-11-05 15:23:56 -08:00
Dalton Hubble
ae07a21e3d addons: Omit static resource requests/limits for kube-state-metrics
* Allow the addon-resizer to dynamically set resource values
* https://github.com/kubernetes/kube-state-metrics/pull/285
2017-11-04 14:41:04 -07:00
Dalton Hubble
0ab1ae3210 addons: Fix typo in kube-state-metrics strategy 2017-11-04 14:39:56 -07:00
Dalton Hubble
e32885c9cd addons: Update prometheus from v1.8.0 to v1.8.2
* https://github.com/prometheus/prometheus/releases/tag/v1.8.2
2017-11-04 11:00:39 -07:00
Dalton Hubble
8582e19077 Expand Nginx Ingress liveness and readiness probes
* Remove dnsPolicy: ClusterFirst
* https://github.com/kubernetes/ingress-nginx/pull/1584
2017-10-25 22:29:20 -07:00
Dalton Hubble
3727c40c6c Update Nginx Ingress defaultbackend from 1.0 to 1.4
* https://github.com/kubernetes/ingress-nginx/pull/1568
2017-10-25 22:16:23 -07:00
Dalton Hubble
b608f9c615 addons: Use service endpoints to scrape node-exporter 2017-10-24 22:59:00 -07:00
Dalton Hubble
ec1dbb853c addons: Include kube-state-metrics exporter manifests 2017-10-24 22:59:00 -07:00
Dalton Hubble
d046d45769 addons: Include Prometheus and node-exporter manifests 2017-10-24 22:58:59 -07:00
Dalton Hubble
a73f57fe4e Update CLUO from v0.4.0 to v0.4.1 2017-10-24 22:14:03 -07:00
Dalton Hubble
f86c00288f Add missing update-agent RBAC role to get pods
* Drain now gets pods, deletes pods, and waits for deletion
2017-10-20 01:21:46 -07:00
Dalton Hubble
a57b3cf973 Update CLUO addon to v0.4.0 and RBAC ClusterRole 2017-10-20 00:40:17 -07:00
Dalton Hubble
7b5ffd0085 Add Container Linux reboot-coordinator RBAC
* Add a reboot-coordinator namespace for CLUO components
* Define an RBAC ClusterRole for update-operator and update-agent
* Replace the older style, where CLUO ran in kube-system with
admin privilege
2017-10-14 19:35:06 -07:00
Dalton Hubble
11453bac91 Update heapster addon from v1.4.0 to v1.4.3
* Use normal name and phase labels
2017-10-14 19:07:37 -07:00
Dalton Hubble
dd0c61d1d9 Update Nginx Ingress controller addon to 0.9.0-beta.15 2017-10-14 18:30:58 -07:00
Dalton Hubble
f7f983c7da docs: Add docs and addons for Nginx AWS Ingress 2017-09-28 01:09:31 -07:00
Dalton Hubble
7c733bd314 Add Nginx Ingress controller addons and docs 2017-09-18 01:48:21 -07:00
Dalton Hubble
a2609c14c0 addons: Disable Google Analytics in CLUO 2017-08-27 21:06:49 -07:00
Dalton Hubble
564c0160bf Add heapster, dashboard, and CLUO addons 2017-08-27 17:20:29 -07:00