mirror of
https://github.com/puppetmaster/typhoon.git
synced 2025-08-15 07:04:57 +02:00
Compare commits
31 Commits
| Author | SHA1 | Date |
|---|---|---|
| | a37aff7f35 | |
| | 03d23bfde7 | |
| | 2c10d24113 | |
| | 82a616c70b | |
| | 2fa7dac247 | |
| | a41691b222 | |
| | 9034203d7a | |
| | d42f6d6b5d | |
| | 2fa1840c30 | |
| | 8e0b8d7e40 | |
| | a0cf527ccf | |
| | 65321acad2 | |
| | 064ce83f25 | |
| | c3b0cdddf3 | |
| | 211ec94c75 | |
| | 8aca5a089e | |
| | 3e6e4ea339 | |
| | 103f1e16d7 | |
| | 50dd3e3b82 | |
| | 3dc755994b | |
| | ddbfb2eee1 | |
| | 868265988b | |
| | 6adffcb778 | |
| | bc967ddcd0 | |
| | ef18f19ec4 | |
| | f5efcc1ff8 | |
| | 996651c605 | |
| | 38fa7dff1a | |
| | bbe295a3f1 | |
| | d8db296932 | |
| | 388ac08492 | |
2
.github/ISSUE_TEMPLATE.md
vendored
@@ -4,7 +4,7 @@
 
 ### Environment
 
-* Platform: bare-metal, google-cloud, digital-ocean
+* Platform: aws, bare-metal, google-cloud, digital-ocean
 * OS: container-linux, fedora-cloud
 * Terraform: `terraform version`
 * Plugins: Provider plugin versions
64
CHANGES.md
@@ -4,12 +4,74 @@ Notable changes between versions.
 
 ## Latest
 
+## v1.9.3
+
+* Kubernetes [v1.9.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v193)
+* Network improvements and fixes ([#104](https://github.com/poseidon/typhoon/pull/104))
+* Switch from Calico v2.6.6 to v3.0.2
+* Add Calico GlobalNetworkSet CRD
+* Update flannel from v0.9.0 to v0.10.0
+* Use separate service account for flannel
+* Update etcd from v3.2.14 to v3.2.15
+
+#### Addons
+
+* Update Prometheus from v2.0.0 to v2.1.0 ([#113](https://github.com/poseidon/typhoon/pull/113))
+* Improve alerting rules
+* Relabel discovered kubelet, endpoint, service, and apiserver scrapes
+* Use separate service accounts
+* Update node-exporter and kube-state-metrics
+* Include Grafana dashboards for Kubernetes admins ([#113](https://github.com/poseidon/typhoon/pull/113))
+* Add grafana-watcher to load bundled upstream dashboards
+* Update nginx-ingress from 0.9.0 to 0.10.2
+* Update CLUO from v0.5.0 to v0.6.0
+* Switch manifests to use `apps/v1` Deployments and Daemonsets ([#120](https://github.com/poseidon/typhoon/pull/120))
+* Remove Kubernetes Dashboard manifests ([#121](https://github.com/poseidon/typhoon/pull/121))
+
+#### Digital Ocean
+
+* Use new Droplet [types](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) which offer more CPU/memory, at lower cost. ([#105](https://github.com/poseidon/typhoon/pull/105))
+* A small Digital Ocean cluster costs less than $25 a month!
+
+## v1.9.2
+
+* Kubernetes [v1.9.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v192)
+* Add Terraform v0.11.x support
+* Add explicit "providers" section to modules for Terraform v0.11.x
+* Retain support for Terraform v0.10.4+
+* Add [migration guide](https://github.com/poseidon/typhoon/blob/master/docs/topics/maintenance.md) from Terraform v0.10.x to v0.11.x (**action required!**)
+* Update etcd from 3.2.13 to 3.2.14
+* Update calico from 2.6.5 to 2.6.6
+* Update kube-dns from v1.14.7 to v1.14.8
+* Use separate service account for kube-dns
+* Use kubernetes-incubator/bootkube v0.10.0
+
+#### Addons
+
+* Update CLUO to v0.5.0 to fix compatibility with Kubernetes 1.9 (**important**)
+* Earlier versions can't roll out Container Linux updates on Kubernetes 1.9 nodes ([cluo#163](https://github.com/coreos/container-linux-update-operator/issues/163))
+* Update kube-state-metrics from v1.1.0 to v1.2.0
+* Fix RBAC cluster role for kube-state-metrics
+
+#### Bare-Metal
+
+* Use per-node Container Linux install profiles ([#97](https://github.com/poseidon/typhoon/pull/97))
+* Allow Container Linux channel/version to be chosen per-cluster
+* Fix issue where cluster deletion could require `terraform apply` multiple times
+
+#### Digital Ocean
+
+* Relax `digitalocean` provider version constraint
+* Fix bug with `terraform plan` always showing a firewall diff to be applied ([#3](https://github.com/poseidon/typhoon/issues/3))
+
+## v1.9.1
+
 * Kubernetes [v1.9.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v191)
 * Update kube-dns from 1.14.5 to v1.14.7
 * Update etcd from 3.2.0 to 3.2.13
 * Update Calico from v2.6.4 to v2.6.5
 * Enable portmap to fix hostPort with Calico
-* Service account for controller-manager
+* Use separate service account for controller-manager
 
 ## v1.8.6
 
18
README.md
@@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
 
 ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
 
-* Kubernetes v1.9.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
+* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
 * Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
 * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
 * Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
@@ -45,6 +45,14 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
 module "google-cloud-yavin" {
   source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes"
 
+  providers = {
+    google = "google.default"
+    local = "local.default"
+    null = "null.default"
+    template = "template.default"
+    tls = "tls.default"
+  }
+
   # Google Cloud
   region = "us-central1"
   dns_zone = "example.com"
@@ -78,9 +86,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
 $ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
 $ kubectl get nodes
 NAME STATUS AGE VERSION
-yavin-controller-0.c.example-com.internal Ready 6m v1.9.1
-yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.1
-yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.1
+yavin-controller-0.c.example-com.internal Ready 6m v1.9.3
+yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.3
+yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.3
```
|
```
 
 List the pods.
@@ -127,4 +135,4 @@ Typhoon is not a product, trial, or free-tier. It is not run by a company, does
 
 Typhoon clusters will contain only [free](https://www.debian.org/intro/free) components. Cluster components will not collect data on users without their permission.
 
-*Disclosure: The author works for CoreOS and previously wrote Matchbox and original Tectonic for bare-metal and AWS. This project is not associated with CoreOS.*
+*Disclosure: The author works for Red Hat (prev CoreOS), but Typhoon is unassociated and maintained independently.*
@@ -1,5 +1,5 @@
+apiVersion: rbac.authorization.k8s.io/v1
 kind: ClusterRoleBinding
-apiVersion: rbac.authorization.k8s.io/v1beta1
 metadata:
   name: reboot-coordinator
 roleRef:
@@ -1,4 +1,4 @@
-apiVersion: rbac.authorization.k8s.io/v1beta1
+apiVersion: rbac.authorization.k8s.io/v1
 kind: ClusterRole
 metadata:
   name: reboot-coordinator
@@ -1,4 +1,4 @@
-apiVersion: extensions/v1beta1
+apiVersion: apps/v1
 kind: DaemonSet
 metadata:
   name: container-linux-update-agent
@@ -8,6 +8,9 @@ spec:
     type: RollingUpdate
     rollingUpdate:
       maxUnavailable: 1
+  selector:
+    matchLabels:
+      app: container-linux-update-agent
   template:
     metadata:
       labels:
@@ -15,7 +18,7 @@ spec:
     spec:
       containers:
       - name: update-agent
-        image: quay.io/coreos/container-linux-update-operator:v0.4.1
+        image: quay.io/coreos/container-linux-update-operator:v0.6.0
         command:
         - "/bin/update-agent"
         volumeMounts:
@@ -1,10 +1,13 @@
-apiVersion: extensions/v1beta1
+apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: container-linux-update-operator
   namespace: reboot-coordinator
 spec:
   replicas: 1
+  selector:
+    matchLabels:
+      app: container-linux-update-operator
   template:
     metadata:
       labels:
@@ -12,7 +15,7 @@ spec:
     spec:
       containers:
       - name: update-operator
-        image: quay.io/coreos/container-linux-update-operator:v0.4.1
+        image: quay.io/coreos/container-linux-update-operator:v0.6.0
         command:
         - "/bin/update-operator"
         env:
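The `apps/v1` hunks above pair each `apiVersion` bump with a new `spec.selector`. `apps/v1` Deployments and DaemonSets require an explicit selector that matches the pod template labels, whereas `extensions/v1beta1` defaulted it from the template. The following is a minimal Python sketch of that migration, assuming manifests are already loaded as plain dicts. `upgrade_manifest`, `API_VERSION_UPGRADES`, and the sample template label are hypothetical illustrations and not part of Typhoon.

```python
import copy

# (kind, old apiVersion) -> new apiVersion, mirroring the hunks in this diff.
API_VERSION_UPGRADES = {
    ("DaemonSet", "extensions/v1beta1"): "apps/v1",
    ("Deployment", "extensions/v1beta1"): "apps/v1",
    ("ClusterRole", "rbac.authorization.k8s.io/v1beta1"): "rbac.authorization.k8s.io/v1",
    ("ClusterRoleBinding", "rbac.authorization.k8s.io/v1beta1"): "rbac.authorization.k8s.io/v1",
}


def upgrade_manifest(manifest):
    """Return an upgraded copy of a manifest dict (hypothetical helper)."""
    upgraded = copy.deepcopy(manifest)
    key = (upgraded.get("kind"), upgraded.get("apiVersion"))
    new_version = API_VERSION_UPGRADES.get(key)
    if new_version is None:
        return upgraded  # nothing to migrate
    upgraded["apiVersion"] = new_version
    if new_version == "apps/v1":
        spec = upgraded.setdefault("spec", {})
        if "selector" not in spec:
            # apps/v1 no longer defaults the selector, so derive it
            # from the pod template labels.
            labels = spec.get("template", {}).get("metadata", {}).get("labels", {})
            if not labels:
                raise ValueError("apps/v1 needs a selector, but the pod template has no labels")
            spec["selector"] = {"matchLabels": dict(labels)}
    return upgraded


# Assumed input: the template label is not visible in the hunk above.
old = {
    "apiVersion": "extensions/v1beta1",
    "kind": "Deployment",
    "metadata": {"name": "container-linux-update-operator"},
    "spec": {
        "replicas": 1,
        "template": {
            "metadata": {"labels": {"app": "container-linux-update-operator"}},
        },
    },
}
new = upgrade_manifest(old)
print(new["apiVersion"])
print(new["spec"]["selector"])
```

Copying every template label into `matchLabels` is the simplest choice. Because an `apps/v1` selector is immutable after creation, a real migration may prefer a smaller, stable subset of labels.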
@@ -1,32 +0,0 @@
-apiVersion: extensions/v1beta1
-kind: Deployment
-metadata:
-  name: kubernetes-dashboard
-  namespace: kube-system
-spec:
-  replicas: 1
-  template:
-    metadata:
-      labels:
-        name: kubernetes-dashboard
-        phase: prod
-    spec:
-      containers:
-      - name: kubernetes-dashboard
-        image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.1
-        ports:
-        - name: http
-          containerPort: 9090
-        resources:
-          limits:
-            cpu: 100m
-            memory: 300Mi
-          requests:
-            cpu: 100m
-            memory: 100Mi
-        livenessProbe:
-          httpGet:
-            path: /
-            port: 9090
-          initialDelaySeconds: 30
-          timeoutSeconds: 30
@@ -1,15 +0,0 @@
-apiVersion: v1
-kind: Service
-metadata:
-  name: kubernetes-dashboard
-  namespace: kube-system
-spec:
-  type: ClusterIP
-  selector:
-    name: kubernetes-dashboard
-    phase: prod
-  ports:
-  - name: http
-    protocol: TCP
-    port: 80
-    targetPort: 9090
7499
addons/grafana/config.yaml
Normal file
@@ -0,0 +1,7499 @@
|
apiVersion: v1
|
||||||
|
kind: ConfigMap
|
||||||
|
metadata:
|
||||||
|
name: grafana-dashboards
|
||||||
|
namespace: monitoring
|
||||||
|
data:
|
||||||
|
deployment-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 1,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "200px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 8,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "cores",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$deployment_namespace\",pod_name=~\"$deployment_name.*\"}[3m]))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "CPU",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 9,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "GB",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "80%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(container_memory_usage_bytes{namespace=\"$deployment_namespace\",pod_name=~\"$deployment_name.*\"}) / 1024^3",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "Bps",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": false
|
||||||
|
},
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(container_network_transmit_bytes_total{namespace=\"$deployment_namespace\",pod_name=~\"$deployment_name.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{namespace=\"$deployment_namespace\",pod_name=~\"$deployment_name.*\"}[3m]))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "100px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": false
|
||||||
|
},
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_spec_replicas{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"metric": "kube_deployment_spec_replicas",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Desired Replicas",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "min(kube_deployment_status_replicas_available{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Available Replicas",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_status_observed_generation{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Observed Generation",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_metadata_generation{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Metadata Generation",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "350px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 1,
|
||||||
|
"isNew": true,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_status_replicas{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "current replicas",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "min(kube_deployment_status_replicas_available{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "available",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_status_replicas_unavailable{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "unavailable",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "min(kube_deployment_status_replicas_updated{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "updated",
|
||||||
|
"refId": "D",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_spec_replicas{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "desired",
|
||||||
|
"refId": "E",
|
||||||
|
"step": 30
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Replicas",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": true,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "none",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": [
|
||||||
|
{
|
||||||
|
"allValue": ".*",
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": "Namespace",
|
||||||
|
"multi": false,
|
||||||
|
"name": "deployment_namespace",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_deployment_metadata_generation, namespace)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": null,
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"allValue": null,
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": "Deployment",
|
||||||
|
"multi": false,
|
||||||
|
"name": "deployment_name",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_deployment_metadata_generation{namespace=\"$deployment_namespace\"}, deployment)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "deployment",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Deployment",
|
||||||
|
"version": 1
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
etcd-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"label": "prometheus",
|
||||||
|
"description": "",
|
||||||
|
"type": "datasource",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"__requires": [
|
||||||
|
{
|
||||||
|
"type": "grafana",
|
||||||
|
"id": "grafana",
|
||||||
|
"name": "Grafana",
|
||||||
|
"version": "4.5.2"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"type": "panel",
|
||||||
|
"id": "graph",
|
||||||
|
"name": "Graph",
|
||||||
|
"version": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"type": "datasource",
|
||||||
|
"id": "prometheus",
|
||||||
|
"name": "Prometheus",
|
||||||
|
"version": "1.0.0"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"type": "panel",
|
||||||
|
"id": "singlestat",
|
||||||
|
"name": "Singlestat",
|
||||||
|
"version": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"description": "etcd sample Grafana dashboard with Prometheus",
|
||||||
|
"editable": false,
|
||||||
|
"gnetId": null,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"id": null,
|
||||||
|
"links": [],
|
||||||
|
"refresh": false,
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"cacheTimeout": null,
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 28,
|
||||||
|
"interval": null,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"nullText": null,
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"tableColumn": "",
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(etcd_server_has_leader)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"metric": "etcd_server_has_leader",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "",
|
||||||
|
"title": "Up",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "200%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 23,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 5,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(grpc_server_started_total{grpc_type=\"unary\"}[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "RPC Rate",
|
||||||
|
"metric": "grpc_server_started_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(grpc_server_handled_total{grpc_type=\"unary\",grpc_code!=\"OK\"}[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "RPC Failed Rate",
|
||||||
|
"metric": "grpc_server_handled_total",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "RPC Rate",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "ops",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 41,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 4,
|
||||||
|
"stack": true,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(grpc_server_started_total{grpc_service=\"etcdserverpb.Watch\",grpc_type=\"bidi_stream\"}) - sum(grpc_server_handled_total{grpc_service=\"etcdserverpb.Watch\",grpc_type=\"bidi_stream\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Watch Streams",
|
||||||
|
"metric": "grpc_server_handled_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(grpc_server_started_total{grpc_service=\"etcdserverpb.Lease\",grpc_type=\"bidi_stream\"}) - sum(grpc_server_handled_total{grpc_service=\"etcdserverpb.Lease\",grpc_type=\"bidi_stream\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Lease Streams",
|
||||||
|
"metric": "grpc_server_handled_total",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Active Streams",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"repeat": null,
|
||||||
|
"repeatIteration": null,
|
||||||
|
"repeatRowId": null,
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"decimals": null,
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"grid": {},
|
||||||
|
"id": 1,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 4,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "etcd_debugging_mvcc_db_total_size_in_bytes",
|
||||||
|
"format": "time_series",
|
||||||
|
"hide": false,
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} DB Size",
|
||||||
|
"metric": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "DB Size",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"grid": {},
|
||||||
|
"id": 3,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 1,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 4,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": true,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) by (instance, le))",
|
||||||
|
"format": "time_series",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} WAL fsync",
|
||||||
|
"metric": "etcd_disk_wal_fsync_duration_seconds_bucket",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) by (instance, le))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} DB fsync",
|
||||||
|
"metric": "etcd_disk_backend_commit_duration_seconds_bucket",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Disk Sync Duration",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "s",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 29,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 4,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "process_resident_memory_bytes",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Resident Memory",
|
||||||
|
"metric": "process_resident_memory_bytes",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Memory",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"repeat": null,
|
||||||
|
"repeatIteration": null,
|
||||||
|
"repeatRowId": null,
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 5,
|
||||||
|
"id": 22,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 3,
|
||||||
|
"stack": true,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "rate(etcd_network_client_grpc_received_bytes_total[5m])",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Client Traffic In",
|
||||||
|
"metric": "etcd_network_client_grpc_received_bytes_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Client Traffic In",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "Bps",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 5,
|
||||||
|
"id": 21,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 3,
|
||||||
|
"stack": true,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "rate(etcd_network_client_grpc_sent_bytes_total[5m])",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Client Traffic Out",
|
||||||
|
"metric": "etcd_network_client_grpc_sent_bytes_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Client Traffic Out",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "Bps",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 20,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 3,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(etcd_network_peer_received_bytes_total[5m])) by (instance)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Peer Traffic In",
|
||||||
|
"metric": "etcd_network_peer_received_bytes_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Peer Traffic In",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "Bps",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"decimals": null,
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"grid": {},
|
||||||
|
"id": 16,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 3,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(etcd_network_peer_sent_bytes_total[5m])) by (instance)",
|
||||||
|
"format": "time_series",
|
||||||
|
"hide": false,
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Peer Traffic Out",
|
||||||
|
"metric": "etcd_network_peer_sent_bytes_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Peer Traffic Out",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "Bps",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"repeat": null,
|
||||||
|
"repeatIteration": null,
|
||||||
|
"repeatRowId": null,
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 40,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(etcd_server_proposals_failed_total[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Proposal Failure Rate",
|
||||||
|
"metric": "etcd_server_proposals_failed_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 2
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(etcd_server_proposals_pending)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Proposal Pending Total",
|
||||||
|
"metric": "etcd_server_proposals_pending",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 2
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(etcd_server_proposals_committed_total[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Proposal Commit Rate",
|
||||||
|
"metric": "etcd_server_proposals_committed_total",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 2
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(etcd_server_proposals_applied_total[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Proposal Apply Rate",
|
||||||
|
"refId": "D",
|
||||||
|
"step": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Raft Proposals",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"decimals": 0,
"editable": false,
"error": false,
"fill": 0,
"id": 19,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": false,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [],
"nullPointMode": "connected",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "changes(etcd_server_leader_changes_seen_total[1d])",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} Total Leader Elections Per Day",
"metric": "etcd_server_leader_changes_seen_total",
"refId": "A",
"step": 2
}
],
"thresholds": [],
"timeFrom": null,
"timeShift": null,
"title": "Total Leader Elections Per Day",
"tooltip": {
"msResolution": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "New row",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [],
"templating": {
"list": []
},
"time": {
"from": "now-15m",
"to": "now"
},
"timepicker": {
"now": true,
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "browser",
"title": "etcd",
"version": 4
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-capacity-planning-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
}
],
"annotations": {
"list": []
},
"editable": false,
"gnetId": 22,
"graphTooltip": 0,
"hideControls": false,
"links": [],
"refresh": false,
"rows": [
{
"collapse": false,
"editable": false,
"height": "250px",
"panels": [
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"error": false,
"fill": 1,
"grid": {
"threshold1Color": "rgba(216, 200, 27, 0.27)",
"threshold2Color": "rgba(234, 112, 112, 0.22)"
},
"id": 3,
"isNew": false,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false
},
"lines": true,
"linewidth": 2,
"links": [],
"nullPointMode": "connected",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(node_cpu{mode=\"idle\"}[2m])) * 100",
"hide": false,
"intervalFactor": 10,
"legendFormat": "",
"refId": "A",
"step": 50
}
],
"title": "Idle CPU",
"tooltip": {
"msResolution": false,
"shared": true,
"sort": 0,
"value_type": "cumulative"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"format": "percent",
"label": "cpu usage",
"logBase": 1,
"min": 0,
"show": true
},
{
"format": "short",
"logBase": 1,
"show": true
}
]
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"error": false,
"fill": 1,
"grid": {
"threshold1Color": "rgba(216, 200, 27, 0.27)",
"threshold2Color": "rgba(234, 112, 112, 0.22)"
},
"id": 9,
"isNew": false,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false
},
"lines": true,
"linewidth": 2,
"links": [],
"nullPointMode": "connected",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(node_load1)",
"intervalFactor": 4,
"legendFormat": "load 1m",
"refId": "A",
"step": 20,
"target": ""
},
{
"expr": "sum(node_load5)",
"intervalFactor": 4,
"legendFormat": "load 5m",
"refId": "B",
"step": 20,
"target": ""
},
{
"expr": "sum(node_load15)",
"intervalFactor": 4,
"legendFormat": "load 15m",
"refId": "C",
"step": 20,
"target": ""
}
],
"title": "System Load",
"tooltip": {
"msResolution": false,
"shared": true,
"sort": 0,
"value_type": "cumulative"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"format": "percentunit",
"logBase": 1,
"show": true
},
{
"format": "short",
"logBase": 1,
"show": true
}
]
}
],
"showTitle": false,
"title": "New Row",
"titleSize": "h6"
},
{
"collapse": false,
"editable": false,
"height": "250px",
"panels": [
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"error": false,
"fill": 1,
"grid": {
"threshold1Color": "rgba(216, 200, 27, 0.27)",
"threshold2Color": "rgba(234, 112, 112, 0.22)"
},
"id": 4,
"isNew": false,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false
},
"lines": true,
"linewidth": 2,
"links": [],
"nullPointMode": "connected",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "node_memory_SwapFree{instance=\"172.17.0.1:9100\",job=\"prometheus\"}",
"yaxis": 2
}
],
"spaceLength": 10,
"span": 9,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(node_memory_MemTotal) - sum(node_memory_MemFree) - sum(node_memory_Buffers) - sum(node_memory_Cached)",
"intervalFactor": 2,
"legendFormat": "memory usage",
"metric": "memo",
"refId": "A",
"step": 10,
"target": ""
},
{
"expr": "sum(node_memory_Buffers)",
"interval": "",
"intervalFactor": 2,
"legendFormat": "memory buffers",
"metric": "memo",
"refId": "B",
"step": 10,
"target": ""
},
{
"expr": "sum(node_memory_Cached)",
"interval": "",
"intervalFactor": 2,
"legendFormat": "memory cached",
"metric": "memo",
"refId": "C",
"step": 10,
"target": ""
},
{
"expr": "sum(node_memory_MemFree)",
"interval": "",
"intervalFactor": 2,
"legendFormat": "memory free",
"metric": "memo",
"refId": "D",
"step": 10,
"target": ""
}
],
"title": "Memory Usage",
"tooltip": {
"msResolution": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"format": "bytes",
"logBase": 1,
"min": "0",
"show": true
},
{
"format": "short",
"logBase": 1,
"show": true
}
]
},
{
"colorBackground": false,
"colorValue": false,
"colors": [
"rgba(50, 172, 45, 0.97)",
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"format": "percent",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": true,
"thresholdLabels": false,
"thresholdMarkers": true
},
"hideTimeOverride": false,
"id": 5,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"targets": [
{
"expr": "((sum(node_memory_MemTotal) - sum(node_memory_MemFree) - sum(node_memory_Buffers) - sum(node_memory_Cached)) / sum(node_memory_MemTotal)) * 100",
"intervalFactor": 2,
"metric": "",
"refId": "A",
"step": 60,
"target": ""
}
],
"thresholds": "80, 90",
"title": "Memory Usage",
"transparent": false,
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "avg"
}
],
"showTitle": false,
"title": "New Row",
"titleSize": "h6"
},
{
"collapse": false,
"editable": false,
"height": "246px",
"panels": [
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"error": false,
"fill": 1,
"grid": {
"threshold1Color": "rgba(216, 200, 27, 0.27)",
"threshold2Color": "rgba(234, 112, 112, 0.22)"
},
"id": 6,
"isNew": false,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false
},
"lines": true,
"linewidth": 2,
"links": [],
"nullPointMode": "connected",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "read",
"yaxis": 1
},
{
"alias": "{instance=\"172.17.0.1:9100\"}",
"yaxis": 2
},
{
"alias": "io time",
"yaxis": 2
}
],
"spaceLength": 10,
"span": 9,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(node_disk_bytes_read[5m]))",
"hide": false,
"intervalFactor": 4,
"legendFormat": "read",
"refId": "A",
"step": 20,
"target": ""
},
{
"expr": "sum(rate(node_disk_bytes_written[5m]))",
"intervalFactor": 4,
"legendFormat": "written",
"refId": "B",
"step": 20
},
{
"expr": "sum(rate(node_disk_io_time_ms[5m]))",
"intervalFactor": 4,
"legendFormat": "io time",
"refId": "C",
"step": 20
}
],
"title": "Disk I/O",
"tooltip": {
"msResolution": false,
"shared": true,
"sort": 0,
"value_type": "cumulative"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"format": "bytes",
"logBase": 1,
"show": true
},
{
"format": "ms",
"logBase": 1,
"show": true
}
]
},
{
"colorBackground": false,
"colorValue": false,
"colors": [
"rgba(50, 172, 45, 0.97)",
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"format": "percentunit",
"gauge": {
"maxValue": 1,
"minValue": 0,
"show": true,
"thresholdLabels": false,
"thresholdMarkers": true
},
"hideTimeOverride": false,
"id": 12,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"targets": [
{
"expr": "(sum(node_filesystem_size{device!=\"rootfs\"}) - sum(node_filesystem_free{device!=\"rootfs\"})) / sum(node_filesystem_size{device!=\"rootfs\"})",
"intervalFactor": 2,
"refId": "A",
"step": 60,
"target": ""
}
],
"thresholds": "0.75, 0.9",
"title": "Disk Space Usage",
"transparent": false,
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
}
],
"showTitle": false,
"title": "New Row",
"titleSize": "h6"
},
{
"collapse": false,
"editable": false,
"height": "250px",
"panels": [
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"error": false,
"fill": 1,
"grid": {
"threshold1Color": "rgba(216, 200, 27, 0.27)",
"threshold2Color": "rgba(234, 112, 112, 0.22)"
},
"id": 8,
"isNew": false,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false
},
"lines": true,
"linewidth": 2,
"links": [],
"nullPointMode": "connected",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "transmitted",
"yaxis": 2
}
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(node_network_receive_bytes{device!~\"lo\"}[5m]))",
"hide": false,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10,
"target": ""
}
],
"title": "Network Received",
"tooltip": {
"msResolution": false,
"shared": true,
"sort": 0,
"value_type": "cumulative"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"format": "bytes",
"logBase": 1,
"show": true
},
{
"format": "bytes",
"logBase": 1,
"show": true
}
]
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"error": false,
"fill": 1,
"grid": {
"threshold1Color": "rgba(216, 200, 27, 0.27)",
"threshold2Color": "rgba(234, 112, 112, 0.22)"
},
"id": 10,
"isNew": false,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false
},
"lines": true,
"linewidth": 2,
"links": [],
"nullPointMode": "connected",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "transmitted",
"yaxis": 2
}
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(node_network_transmit_bytes{device!~\"lo\"}[5m]))",
"hide": false,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10,
"target": ""
}
],
"title": "Network Transmitted",
"tooltip": {
"msResolution": false,
"shared": true,
"sort": 0,
"value_type": "cumulative"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"format": "bytes",
"logBase": 1,
"show": true
},
{
"format": "bytes",
"logBase": 1,
"show": true
}
]
}
],
"showTitle": false,
"title": "New Row",
"titleSize": "h6"
},
{
"collapse": false,
"editable": false,
"height": "276px",
"panels": [
{
"aliasColors": {},
"bars": false,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"error": false,
"fill": 1,
"grid": {
"threshold1Color": "rgba(216, 200, 27, 0.27)",
"threshold2Color": "rgba(234, 112, 112, 0.22)"
},
"id": 11,
"isNew": true,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false
},
"lines": true,
"linewidth": 2,
"links": [],
"nullPointMode": "connected",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 11,
"span": 9,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(kube_pod_info)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Current number of Pods",
"refId": "A",
"step": 10
},
{
"expr": "sum(kube_node_status_capacity_pods)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Maximum capacity of pods",
"refId": "B",
"step": 10
}
],
"title": "Cluster Pod Utilization",
"tooltip": {
"msResolution": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"logBase": 1,
"show": true
},
{
"format": "short",
"logBase": 1,
"show": true
}
]
},
{
"colorBackground": false,
"colorValue": false,
"colors": [
"rgba(50, 172, 45, 0.97)",
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"format": "percent",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": true,
"thresholdLabels": false,
"thresholdMarkers": true
},
"hideTimeOverride": false,
"id": 7,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"targets": [
{
"expr": "100 - (sum(kube_node_status_capacity_pods) - sum(kube_pod_info)) / sum(kube_node_status_capacity_pods) * 100",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 60,
"target": ""
}
],
"thresholds": "80, 90",
"title": "Pod Utilization",
"transparent": false,
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
}
],
"showTitle": false,
"title": "New Row",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"sharedCrosshair": false,
"style": "dark",
"tags": [],
"templating": {
"list": []
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "browser",
"title": "Kubernetes Capacity Planning",
"version": 4
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-cluster-health-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
}
],
"annotations": {
"list": []
},
"editable": false,
"graphTooltip": 0,
"hideControls": false,
"links": [],
"refresh": "10s",
"rows": [
{
"collapse": false,
"editable": false,
"height": "254px",
"panels": [
{
"colorBackground": false,
"colorValue": true,
"colors": [
"rgba(50, 172, 45, 0.97)",
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"hideTimeOverride": false,
"id": 1,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"targets": [
{
"expr": "sum(up{job=~\"apiserver|kube-scheduler|kube-controller-manager\"} == 0)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 600
}
],
"thresholds": "1, 3",
"title": "Control Plane Components Down",
"transparent": false,
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "Everything UP and healthy",
"value": "null"
},
{
"op": "=",
"text": "",
"value": ""
}
],
"valueName": "avg"
},
{
"colorBackground": false,
"colorValue": true,
"colors": [
"rgba(50, 172, 45, 0.97)",
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"editable": false,
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"hideTimeOverride": false,
"id": 2,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"targets": [
{
"expr": "sum(ALERTS{alertstate=\"firing\",alertname!=\"DeadMansSwitch\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 600
}
],
"thresholds": "1, 3",
"title": "Alerts Firing",
"transparent": false,
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "0",
"value": "null"
}
],
"valueName": "current"
},
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 3,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(ALERTS{alertstate=\"pending\",alertname!=\"DeadMansSwitch\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "3, 5",
|
||||||
|
"title": "Alerts Pending",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 4,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "count(increase(kube_pod_container_status_restarts[1h]) > 5)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Crashlooping Pods",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(kube_node_status_condition{condition=\"Ready\",status!=\"true\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Node Not Ready",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 6,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(kube_node_status_condition{condition=\"DiskPressure\",status=\"true\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Node Disk Pressure",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(kube_node_status_condition{condition=\"MemoryPressure\",status=\"true\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Node Memory Pressure",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 8,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(kube_node_spec_unschedulable)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Nodes Unschedulable",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Kubernetes Cluster Health",
|
||||||
|
"version": 9
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
kubernetes-cluster-status-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "129px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 6,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(up{job=~\"apiserver|kube-scheduler|kube-controller-manager\"} == 0)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Control Plane UP",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "UP",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "total"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 6,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(ALERTS{alertstate=\"firing\",alertname!=\"DeadMansSwitch\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "3, 5",
|
||||||
|
"title": "Alerts Firing",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": true,
|
||||||
|
"title": "Cluster Health",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "168px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 1,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"apiserver\"} == 1) / count(up{job=\"apiserver\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "API Servers UP",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"kube-controller-manager\"} == 1) / count(up{job=\"kube-controller-manager\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "Controller Managers UP",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"kube-scheduler\"} == 1) / count(up{job=\"kube-scheduler\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "Schedulers UP",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 4,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "count(increase(kube_pod_container_status_restarts{namespace=~\"kube-system|tectonic-system\"}[1h]) > 5)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Crashlooping Control Plane Pods",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": true,
|
||||||
|
"title": "Control Plane Status",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "158px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 8,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(100 - (avg by (instance) (rate(node_cpu{job=\"node-exporter\",mode=\"idle\"}[5m])) * 100)) / count(node_cpu{job=\"node-exporter\",mode=\"idle\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "CPU Utilization",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "((sum(node_memory_MemTotal) - sum(node_memory_MemFree) - sum(node_memory_Buffers) - sum(node_memory_Cached)) / sum(node_memory_MemTotal)) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Memory Utilization",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 9,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(node_filesystem_size{device!=\"rootfs\"}) - sum(node_filesystem_free{device!=\"rootfs\"})) / sum(node_filesystem_size{device!=\"rootfs\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Filesystem Utilization",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 10,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "100 - (sum(kube_node_status_capacity_pods) - sum(kube_pod_info)) / sum(kube_node_status_capacity_pods) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Pod Utilization",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": true,
|
||||||
|
"title": "Capacity Planning",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Kubernetes Cluster Status",
|
||||||
|
"version": 3
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
kubernetes-control-plane-status-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 1,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"apiserver\"} == 1) / sum(up{job=\"apiserver\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "API Servers UP",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"kube-controller-manager\"} == 1) / sum(up{job=\"kube-controller-manager\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "Controller Managers UP",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 3,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"kube-scheduler\"} == 1) / sum(up{job=\"kube-scheduler\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "Schedulers UP",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 4,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(sum by(instance) (rate(apiserver_request_count{code=~\"5..\"}[5m])) / sum by(instance) (rate(apiserver_request_count[5m]))) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "5, 10",
|
||||||
|
"title": "API Server Request Error Rate",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 7,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 1,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "null",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum by(verb) (rate(apiserver_latency_seconds:quantile[5m]) >= 0)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 30
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "API Server Request Latency",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 5,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 1,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "null",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "cluster:scheduler_e2e_scheduling_latency_seconds:quantile",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "End to End Scheduling Latency",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "dtdurations",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 1,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "null",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum by(instance) (rate(apiserver_request_count{code!~\"2..\"}[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Error Rate",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum by(instance) (rate(apiserver_request_count[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Request Rate",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 60
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "API Server Request Rates",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Kubernetes Control Plane Status",
|
||||||
|
"version": 3
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
kubernetes-resource-requests-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"refresh": false,
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "300px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"description": "This represents the total [CPU resource requests](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu) in the cluster.\nFor comparison the total [allocatable CPU cores](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) is also shown.",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 1,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 1,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "null",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 9,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "min(sum(kube_node_status_allocatable_cpu_cores) by (instance))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Allocatable CPU Cores",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "max(sum(kube_pod_container_resource_requests_cpu_cores) by (instance))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Requested CPU Cores",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "CPU Cores",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": "CPU Cores",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(sum(kube_pod_container_resource_requests_cpu_cores) by (instance)) / min(sum(kube_node_status_allocatable_cpu_cores) by (instance)) * 100",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 240
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "CPU Cores",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "CPU Cores",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "300px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"description": "This represents the total [memory resource requests](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-memory) in the cluster.\nFor comparison the total [allocatable memory](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) is also shown.",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 1,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "null",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 9,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "min(sum(kube_node_status_allocatable_memory_bytes) by (instance))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Allocatable Memory",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "max(sum(kube_pod_container_resource_requests_memory_bytes) by (instance))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Requested Memory",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"label": "Memory",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 4,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(sum(kube_pod_container_resource_requests_memory_bytes) by (instance)) / min(sum(kube_node_status_allocatable_memory_bytes) by (instance)) * 100",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 240
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Memory",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Memory",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-3h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Kubernetes Resource Requests",
|
||||||
|
"version": 2
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
nodes-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"description": "Dashboard to get an overview of one server",
|
||||||
|
"editable": false,
|
||||||
|
"gnetId": 22,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"refresh": false,
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "100 - (avg by (cpu) (irate(node_cpu{mode=\"idle\", instance=\"$server\"}[5m])) * 100)",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 10,
|
||||||
|
"legendFormat": "{{cpu}}",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 50
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "CPU Usage",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "percent",
|
||||||
|
"label": "cpu usage",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": 100,
|
||||||
|
"min": 0,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 9,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "node_load1{instance=\"$server\"}",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "load 1m",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "node_load5{instance=\"$server\"}",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "load 5m",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "node_load15{instance=\"$server\"}",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "load 15m",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "System Load",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 4,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "node_memory_SwapFree{instance=\"172.17.0.1:9100\",job=\"prometheus\"}",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 9,
|
||||||
|
"stack": true,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "node_memory_MemTotal{instance=\"$server\"} - node_memory_MemFree{instance=\"$server\"} - node_memory_Buffers{instance=\"$server\"} - node_memory_Cached{instance=\"$server\"}",
|
||||||
|
"hide": false,
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory used",
|
||||||
|
"metric": "",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 10
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "node_memory_Buffers{instance=\"$server\"}",
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory buffers",
|
||||||
|
"metric": "",
|
||||||
|
"refId": "E",
|
||||||
|
"step": 10
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "node_memory_Cached{instance=\"$server\"}",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory cached",
|
||||||
|
"metric": "",
|
||||||
|
"refId": "F",
|
||||||
|
"step": 10
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "node_memory_MemFree{instance=\"$server\"}",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory free",
|
||||||
|
"metric": "",
|
||||||
|
"refId": "D",
|
||||||
|
"step": 10
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory Usage",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"min": "0",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "((node_memory_MemTotal{instance=\"$server\"} - node_memory_MemFree{instance=\"$server\"} - node_memory_Buffers{instance=\"$server\"} - node_memory_Cached{instance=\"$server\"}) / node_memory_MemTotal{instance=\"$server\"}) * 100",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Memory Usage",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"isNew": true,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "read",
|
||||||
|
"yaxis": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"alias": "{instance=\"172.17.0.1:9100\"}",
|
||||||
|
"yaxis": 2
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"alias": "io time",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 9,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum by (instance) (rate(node_disk_bytes_read{instance=\"$server\"}[2m]))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "read",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum by (instance) (rate(node_disk_bytes_written{instance=\"$server\"}[2m]))",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "written",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum by (instance) (rate(node_disk_io_time_ms{instance=\"$server\"}[2m]))",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "io time",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Disk I/O",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "ms",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percentunit",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 1,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(node_filesystem_size{device!=\"rootfs\",instance=\"$server\"}) - sum(node_filesystem_free{device!=\"rootfs\",instance=\"$server\"})) / sum(node_filesystem_size{device!=\"rootfs\",instance=\"$server\"})",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "0.75, 0.9",
|
||||||
|
"title": "Disk Space Usage",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 8,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "transmitted",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "rate(node_network_receive_bytes{instance=\"$server\",device!~\"lo\"}[5m])",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{device}}",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 10,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network Received",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 10,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "transmitted",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "rate(node_network_transmit_bytes{instance=\"$server\",device!~\"lo\"}[5m])",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{device}}",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 10,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network Transmitted",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": [
|
||||||
|
{
|
||||||
|
"allValue": null,
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": null,
|
||||||
|
"multi": false,
|
||||||
|
"name": "server",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(node_boot_time, instance)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-1h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Nodes",
|
||||||
|
"version": 2
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
pods-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 1,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"refresh": false,
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 1,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": true,
|
||||||
|
"avg": true,
|
||||||
|
"current": true,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": true,
|
||||||
|
"show": true,
|
||||||
|
"total": false,
|
||||||
|
"values": true
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum by(container_name) (container_memory_usage_bytes{pod_name=\"$pod\", container_name=~\"$container\", container_name!=\"POD\"})",
|
||||||
|
"interval": "10s",
|
||||||
|
"intervalFactor": 1,
|
||||||
|
"legendFormat": "Current: {{ container_name }}",
|
||||||
|
"metric": "container_memory_usage_bytes",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 15
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "kube_pod_container_resource_requests_memory_bytes{pod=\"$pod\", container=~\"$container\"}",
|
||||||
|
"interval": "10s",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Requested: {{ container }}",
|
||||||
|
"metric": "kube_pod_container_resource_requests_memory_bytes",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "kube_pod_container_resource_limits_memory_bytes{pod=\"$pod\", container=~\"$container\"}",
|
||||||
|
"interval": "10s",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Limit: {{ container }}",
|
||||||
|
"metric": "kube_pod_container_resource_limits_memory_bytes",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory Usage",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": true,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 2,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": true,
|
||||||
|
"avg": true,
|
||||||
|
"current": true,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": true,
|
||||||
|
"show": true,
|
||||||
|
"total": false,
|
||||||
|
"values": true
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum by (container_name)(rate(container_cpu_usage_seconds_total{image!=\"\",container_name!=\"POD\",pod_name=\"$pod\"}[1m]))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{ container_name }}",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "kube_pod_container_resource_requests_cpu_cores{pod=\"$pod\", container=~\"$container\"}",
|
||||||
|
"interval": "10s",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Requested: {{ container }}",
|
||||||
|
"metric": "kube_pod_container_resource_requests_cpu_cores",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "kube_pod_container_resource_limits_cpu_cores{pod=\"$pod\", container=~\"$container\"}",
|
||||||
|
"interval": "10s",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Limit: {{ container }}",
|
||||||
|
"metric": "kube_pod_container_resource_limits_cpu_cores",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "CPU Usage",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": true,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": true,
|
||||||
|
"avg": true,
|
||||||
|
"current": true,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": true,
|
||||||
|
"show": true,
|
||||||
|
"total": false,
|
||||||
|
"values": true
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sort_desc(sum by (pod_name) (rate(container_network_receive_bytes_total{pod_name=\"$pod\"}[1m])))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{ pod_name }}",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 30
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network I/O",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": true,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": [
|
||||||
|
{
|
||||||
|
"allValue": ".*",
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": true,
|
||||||
|
"label": "Namespace",
|
||||||
|
"multi": false,
|
||||||
|
"name": "namespace",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_pod_info, namespace)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"allValue": null,
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": "Pod",
|
||||||
|
"multi": false,
|
||||||
|
"name": "pod",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_pod_info{namespace=~\"$namespace\"}, pod)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"allValue": ".*",
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": true,
|
||||||
|
"label": "Container",
|
||||||
|
"multi": false,
|
||||||
|
"name": "container",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\"}, container)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Pods",
|
||||||
|
"version": 1
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
statefulset-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 1,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "200px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 8,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "cores",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$statefulset_namespace\",pod_name=~\"$statefulset_name.*\"}[3m]))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "CPU",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 9,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "GB",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "80%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(container_memory_usage_bytes{namespace=\"$statefulset_namespace\",pod_name=~\"$statefulset_name.*\"}) / 1024^3",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "Bps",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": false
|
||||||
|
},
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(container_network_transmit_bytes_total{namespace=\"$statefulset_namespace\",pod_name=~\"$statefulset_name.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{namespace=\"$statefulset_namespace\",pod_name=~\"$statefulset_name.*\"}[3m]))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "100px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": false
|
||||||
|
},
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_statefulset_replicas{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"metric": "kube_statefulset_replicas",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Desired Replicas",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "min(kube_statefulset_status_replicas{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Available Replicas",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_statefulset_status_observed_generation{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Observed Generation",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_statefulset_metadata_generation{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Metadata Generation",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "350px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 1,
|
||||||
|
"isNew": true,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "min(kube_statefulset_status_replicas{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "available",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "max(kube_statefulset_replicas{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "desired",
|
||||||
|
"refId": "E",
|
||||||
|
"step": 30
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Replicas",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": true,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "none",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": [
|
||||||
|
{
|
||||||
|
"allValue": ".*",
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": "Namespace",
|
||||||
|
"multi": false,
|
||||||
|
"name": "statefulset_namespace",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_statefulset_metadata_generation, namespace)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": null,
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"allValue": null,
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": "StatefulSet",
|
||||||
|
"multi": false,
|
||||||
|
"name": "statefulset_name",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_statefulset_metadata_generation{namespace=\"$statefulset_namespace\"}, statefulset)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "statefulset",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "StatefulSet",
|
||||||
|
"version": 1
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
prometheus-datasource.json: |+
|
||||||
|
{
|
||||||
|
"access": "proxy",
|
||||||
|
"basicAuth": false,
|
||||||
|
"name": "prometheus",
|
||||||
|
"type": "prometheus",
|
||||||
|
"url": "http://prometheus.monitoring.svc"
|
||||||
|
}
|
||||||
|
---
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: apps/v1beta2
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: grafana
|
name: grafana
|
||||||
@ -41,6 +41,22 @@ spec:
|
|||||||
limits:
|
limits:
|
||||||
memory: 200Mi
|
memory: 200Mi
|
||||||
cpu: 200m
|
cpu: 200m
|
||||||
|
- name: grafana-watcher
|
||||||
|
image: quay.io/coreos/grafana-watcher:v0.0.8
|
||||||
|
args:
|
||||||
|
- '--watch-dir=/etc/grafana/dashboards'
|
||||||
|
- '--grafana-url=http://localhost:8080'
|
||||||
|
resources:
|
||||||
|
requests:
|
||||||
|
memory: "16Mi"
|
||||||
|
cpu: "50m"
|
||||||
|
limits:
|
||||||
|
memory: "32Mi"
|
||||||
|
cpu: "100m"
|
||||||
|
volumeMounts:
|
||||||
|
- name: dashboards
|
||||||
|
mountPath: /etc/grafana/dashboards
|
||||||
volumes:
|
volumes:
|
||||||
- name: grafana-storage
|
- name: dashboards
|
||||||
emptyDir: {}
|
configMap:
|
||||||
|
name: grafana-dashboards
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: apps/v1beta2
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: heapster
|
name: heapster
|
||||||
|
@ -1,10 +1,14 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: default-backend
|
name: default-backend
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
spec:
|
spec:
|
||||||
replicas: 1
|
replicas: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: default-backend
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: nginx-ingress-controller
|
name: nginx-ingress-controller
|
||||||
@ -8,6 +8,10 @@ spec:
|
|||||||
strategy:
|
strategy:
|
||||||
rollingUpdate:
|
rollingUpdate:
|
||||||
maxUnavailable: 1
|
maxUnavailable: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: nginx-ingress-controller
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
@ -19,7 +23,7 @@ spec:
|
|||||||
hostNetwork: true
|
hostNetwork: true
|
||||||
containers:
|
containers:
|
||||||
- name: nginx-ingress-controller
|
- name: nginx-ingress-controller
|
||||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.9.0
|
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.10.2
|
||||||
args:
|
args:
|
||||||
- /nginx-ingress-controller
|
- /nginx-ingress-controller
|
||||||
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRoleBinding
|
kind: ClusterRoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
roleRef:
|
roleRef:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRole
|
kind: ClusterRole
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: RoleBinding
|
kind: RoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: Role
|
kind: Role
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: DaemonSet
|
kind: DaemonSet
|
||||||
metadata:
|
metadata:
|
||||||
name: nginx-ingress-controller
|
name: nginx-ingress-controller
|
||||||
@ -8,6 +8,10 @@ spec:
|
|||||||
type: RollingUpdate
|
type: RollingUpdate
|
||||||
rollingUpdate:
|
rollingUpdate:
|
||||||
maxUnavailable: 1
|
maxUnavailable: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: nginx-ingress-controller
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
@ -19,7 +23,7 @@ spec:
|
|||||||
hostNetwork: true
|
hostNetwork: true
|
||||||
containers:
|
containers:
|
||||||
- name: nginx-ingress-controller
|
- name: nginx-ingress-controller
|
||||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.9.0
|
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.10.2
|
||||||
args:
|
args:
|
||||||
- /nginx-ingress-controller
|
- /nginx-ingress-controller
|
||||||
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
||||||
|
@ -1,10 +1,14 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: default-backend
|
name: default-backend
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
spec:
|
spec:
|
||||||
replicas: 1
|
replicas: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: default-backend
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRoleBinding
|
kind: ClusterRoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
roleRef:
|
roleRef:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRole
|
kind: ClusterRole
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: RoleBinding
|
kind: RoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: Role
|
kind: Role
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -1,10 +1,14 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: default-backend
|
name: default-backend
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
spec:
|
spec:
|
||||||
replicas: 1
|
replicas: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: default-backend
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: nginx-ingress-controller
|
name: nginx-ingress-controller
|
||||||
@ -8,6 +8,10 @@ spec:
|
|||||||
strategy:
|
strategy:
|
||||||
rollingUpdate:
|
rollingUpdate:
|
||||||
maxUnavailable: 1
|
maxUnavailable: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: nginx-ingess-controller
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
@ -19,7 +23,7 @@ spec:
|
|||||||
hostNetwork: true
|
hostNetwork: true
|
||||||
containers:
|
containers:
|
||||||
- name: nginx-ingress-controller
|
- name: nginx-ingress-controller
|
||||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.9.0
|
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.10.2
|
||||||
args:
|
args:
|
||||||
- /nginx-ingress-controller
|
- /nginx-ingress-controller
|
||||||
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRoleBinding
|
kind: ClusterRoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
roleRef:
|
roleRef:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRole
|
kind: ClusterRole
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: RoleBinding
|
kind: RoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: Role
|
kind: Role
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -39,7 +39,7 @@ data:
|
|||||||
tls_config:
|
tls_config:
|
||||||
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
|
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
|
||||||
# Using endpoints to discover kube-apiserver targets finds the pod IP
|
# Using endpoints to discover kube-apiserver targets finds the pod IP
|
||||||
# (host IP since apiserver is uses host network) which is not used in
|
# (host IP since apiserver uses host network) which is not used in
|
||||||
# the server certificate.
|
# the server certificate.
|
||||||
insecure_skip_verify: true
|
insecure_skip_verify: true
|
||||||
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
|
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
|
||||||
@ -51,6 +51,9 @@ data:
|
|||||||
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
|
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
|
||||||
action: keep
|
action: keep
|
||||||
regex: default;kubernetes;https
|
regex: default;kubernetes;https
|
||||||
|
- replacement: apiserver
|
||||||
|
action: replace
|
||||||
|
target_label: job
|
||||||
|
|
||||||
# Scrape config for node (i.e. kubelet) /metrics (e.g. 'kubelet_'). Explore
|
# Scrape config for node (i.e. kubelet) /metrics (e.g. 'kubelet_'). Explore
|
||||||
# metrics from a node by scraping kubelet (127.0.0.1:10255/metrics).
|
# metrics from a node by scraping kubelet (127.0.0.1:10255/metrics).
|
||||||
@ -59,7 +62,7 @@ data:
|
|||||||
# Kubernetes apiserver. This means it will work if Prometheus is running out of
|
# Kubernetes apiserver. This means it will work if Prometheus is running out of
|
||||||
# cluster, or can't connect to nodes for some other reason (e.g. because of
|
# cluster, or can't connect to nodes for some other reason (e.g. because of
|
||||||
# firewalling).
|
# firewalling).
|
||||||
- job_name: 'kubernetes-nodes'
|
- job_name: 'kubelet'
|
||||||
kubernetes_sd_configs:
|
kubernetes_sd_configs:
|
||||||
- role: node
|
- role: node
|
||||||
|
|
||||||
@ -149,7 +152,7 @@ data:
|
|||||||
target_label: kubernetes_namespace
|
target_label: kubernetes_namespace
|
||||||
- source_labels: [__meta_kubernetes_service_name]
|
- source_labels: [__meta_kubernetes_service_name]
|
||||||
action: replace
|
action: replace
|
||||||
target_label: kubernetes_name
|
target_label: job
|
||||||
|
|
||||||
# Example scrape config for probing services via the Blackbox Exporter.
|
# Example scrape config for probing services via the Blackbox Exporter.
|
||||||
#
|
#
|
||||||
@ -181,7 +184,7 @@ data:
|
|||||||
- source_labels: [__meta_kubernetes_namespace]
|
- source_labels: [__meta_kubernetes_namespace]
|
||||||
target_label: kubernetes_namespace
|
target_label: kubernetes_namespace
|
||||||
- source_labels: [__meta_kubernetes_service_name]
|
- source_labels: [__meta_kubernetes_service_name]
|
||||||
target_label: kubernetes_name
|
target_label: job
|
||||||
|
|
||||||
# Example scrape config for pods
|
# Example scrape config for pods
|
||||||
#
|
#
|
||||||
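The relabeling changes above write the `job` label in two ways: a constant `replacement: apiserver`, and a copy of `__meta_kubernetes_service_name`. Below is a minimal sketch of how a Prometheus `replace` relabel action behaves, assuming the documented defaults (separator `;`, regex `(.*)`, replacement `$1`). The function name `relabel_replace` is hypothetical, and the sketch omits details such as dropping a label whose resulting value is empty.

```python
import re


def relabel_replace(labels, target_label, source_labels=(), separator=";",
                    regex="(.*)", replacement="$1"):
    """Illustrative model of a Prometheus `replace` relabel action.

    Hypothetical helper, not Prometheus code. It joins the source label
    values, matches the fully anchored regex, and writes the expanded
    replacement into target_label. On no match, labels are unchanged.
    """
    value = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, value)
    if match is None:
        return dict(labels)
    # Translate Prometheus-style $1 / ${1} references into Python's \g<1>.
    template = re.sub(r"\$\{?(\w+)\}?", r"\\g<\1>", replacement)
    new_labels = dict(labels)
    new_labels[target_label] = match.expand(template)
    return new_labels


# Copy the service name into `job`, as in the service endpoints scrape config.
service_target = relabel_replace(
    {"__meta_kubernetes_service_name": "kube-scheduler"},
    target_label="job",
    source_labels=["__meta_kubernetes_service_name"],
)

# Set `job` to a constant, as in the apiserver scrape config.
apiserver_target = relabel_replace(
    {"job": "kubernetes-apiservers"},
    target_label="job",
    replacement="apiserver",
)
```

With `job` set this way, alert expressions can select on `up{job="kube-scheduler"}` or `up{job="apiserver"}` instead of a separate `kubernetes_name` label, which is what the rule changes in this diff do.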
|
@ -1,22 +1,24 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: prometheus
|
name: prometheus
|
||||||
namespace: monitoring
|
namespace: monitoring
|
||||||
spec:
|
spec:
|
||||||
replicas: 1
|
replicas: 1
|
||||||
strategy:
|
selector:
|
||||||
rollingUpdate:
|
matchLabels:
|
||||||
maxUnavailable: 1
|
name: prometheus
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
name: prometheus
|
name: prometheus
|
||||||
phase: prod
|
phase: prod
|
||||||
spec:
|
spec:
|
||||||
|
serviceAccountName: prometheus
|
||||||
containers:
|
containers:
|
||||||
- name: prometheus
|
- name: prometheus
|
||||||
image: quay.io/prometheus/prometheus:v2.0.0
|
image: quay.io/prometheus/prometheus:v2.1.0
|
||||||
args:
|
args:
|
||||||
- '--config.file=/etc/prometheus/prometheus.yaml'
|
- '--config.file=/etc/prometheus/prometheus.yaml'
|
||||||
ports:
|
ports:
|
||||||
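Moving a Deployment from `extensions/v1beta1` to `apps/v1` makes `spec.selector` mandatory, and its `matchLabels` must be satisfied by the pod template labels. The sketch below checks this on a manifest that has already been parsed into plain dictionaries. The helper name `selector_matches_template` is hypothetical, and the second manifest is an invented mismatch for illustration only.

```python
def selector_matches_template(deployment):
    """Return True if every selector.matchLabels pair is present in the
    pod template labels. Hypothetical helper for illustration only."""
    spec = deployment["spec"]
    match_labels = spec.get("selector", {}).get("matchLabels", {})
    template_labels = spec["template"]["metadata"].get("labels", {})
    if not match_labels:
        # apps/v1 requires an explicit, non-empty selector.
        return False
    return all(template_labels.get(key) == value
               for key, value in match_labels.items())


# Mirrors the prometheus Deployment in this diff.
prometheus = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "spec": {
        "selector": {"matchLabels": {"name": "prometheus", "phase": "prod"}},
        "template": {
            "metadata": {"labels": {"name": "prometheus", "phase": "prod"}},
        },
    },
}

# Invented example of a selector that does not match its template.
mismatched = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "spec": {
        "selector": {"matchLabels": {"name": "example-app"}},
        "template": {"metadata": {"labels": {"name": "exampel-app"}}},
    },
}

prometheus_ok = selector_matches_template(prometheus)
mismatched_ok = selector_matches_template(mismatched)
```

A check like this is cheap to run over rendered manifests before applying them, since a selector that does not match the template is rejected by the API server for `apps/v1` workloads.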
|
@ -12,7 +12,9 @@ rules:
|
|||||||
- replicationcontrollers
|
- replicationcontrollers
|
||||||
- limitranges
|
- limitranges
|
||||||
- persistentvolumeclaims
|
- persistentvolumeclaims
|
||||||
|
- persistentvolumes
|
||||||
- namespaces
|
- namespaces
|
||||||
|
- endpoints
|
||||||
verbs: ["list", "watch"]
|
verbs: ["list", "watch"]
|
||||||
- apiGroups: ["extensions"]
|
- apiGroups: ["extensions"]
|
||||||
resources:
|
resources:
|
||||||
@ -29,3 +31,7 @@ rules:
|
|||||||
- cronjobs
|
- cronjobs
|
||||||
- jobs
|
- jobs
|
||||||
verbs: ["list", "watch"]
|
verbs: ["list", "watch"]
|
||||||
|
- apiGroups: ["autoscaling"]
|
||||||
|
resources:
|
||||||
|
- horizontalpodautoscalers
|
||||||
|
verbs: ["list", "watch"]
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: apps/v1beta2
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: kube-state-metrics
|
name: kube-state-metrics
|
||||||
@ -22,7 +22,7 @@ spec:
|
|||||||
serviceAccountName: kube-state-metrics
|
serviceAccountName: kube-state-metrics
|
||||||
containers:
|
containers:
|
||||||
- name: kube-state-metrics
|
- name: kube-state-metrics
|
||||||
image: quay.io/coreos/kube-state-metrics:v1.1.0
|
image: quay.io/coreos/kube-state-metrics:v1.2.0
|
||||||
ports:
|
ports:
|
||||||
- name: metrics
|
- name: metrics
|
||||||
containerPort: 8080
|
containerPort: 8080
|
||||||
@ -54,8 +54,8 @@ spec:
|
|||||||
- /pod_nanny
|
- /pod_nanny
|
||||||
- --container=kube-state-metrics
|
- --container=kube-state-metrics
|
||||||
- --cpu=100m
|
- --cpu=100m
|
||||||
- --extra-cpu=1m
|
- --extra-cpu=2m
|
||||||
- --memory=100Mi
|
- --memory=150Mi
|
||||||
- --extra-memory=2Mi
|
- --extra-memory=30Mi
|
||||||
- --threshold=5
|
- --threshold=5
|
||||||
- --deployment=kube-state-metrics
|
- --deployment=kube-state-metrics
|
||||||
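The `pod_nanny` flags changed above set a base request plus a per-node increment. As a rough sketch, assuming the addon-resizer scales linearly with node count (base plus extra times nodes), the new values work out as follows. The function name `nanny_resources` is hypothetical, and the `--threshold` flag's effect on when the resizer actually applies a change is not modeled.

```python
def nanny_resources(nodes, cpu_m=100, extra_cpu_m=2,
                    memory_mi=150, extra_memory_mi=30):
    """Estimate kube-state-metrics resources for a cluster of `nodes` nodes.

    Defaults are the new flag values from this diff:
    --cpu=100m --extra-cpu=2m --memory=150Mi --extra-memory=30Mi.
    Assumes linear scaling (base + extra * nodes).
    """
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return {
        "cpu_millicores": cpu_m + extra_cpu_m * nodes,
        "memory_mebibytes": memory_mi + extra_memory_mi * nodes,
    }


ten_node_cluster = nanny_resources(10)
```

Under that assumption, a 10-node cluster would get 120m CPU and 450Mi memory, compared with 110m and 120Mi under the previous flags.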
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: apps/v1beta2
|
apiVersion: apps/v1
|
||||||
kind: DaemonSet
|
kind: DaemonSet
|
||||||
metadata:
|
metadata:
|
||||||
name: node-exporter
|
name: node-exporter
|
||||||
@ -18,11 +18,15 @@ spec:
|
|||||||
name: node-exporter
|
name: node-exporter
|
||||||
phase: prod
|
phase: prod
|
||||||
spec:
|
spec:
|
||||||
|
serviceAccountName: node-exporter
|
||||||
|
securityContext:
|
||||||
|
runAsNonRoot: true
|
||||||
|
runAsUser: 65534
|
||||||
hostNetwork: true
|
hostNetwork: true
|
||||||
hostPID: true
|
hostPID: true
|
||||||
containers:
|
containers:
|
||||||
- name: node-exporter
|
- name: node-exporter
|
||||||
image: quay.io/prometheus/node-exporter:v0.15.0
|
image: quay.io/prometheus/node-exporter:v0.15.2
|
||||||
args:
|
args:
|
||||||
- "--path.procfs=/host/proc"
|
- "--path.procfs=/host/proc"
|
||||||
- "--path.sysfs=/host/sys"
|
- "--path.sysfs=/host/sys"
|
||||||
@ -45,9 +49,8 @@ spec:
|
|||||||
mountPath: /host/sys
|
mountPath: /host/sys
|
||||||
readOnly: true
|
readOnly: true
|
||||||
tolerations:
|
tolerations:
|
||||||
- key: node-role.kubernetes.io/master
|
- effect: NoSchedule
|
||||||
operator: Exists
|
operator: Exists
|
||||||
effect: NoSchedule
|
|
||||||
volumes:
|
volumes:
|
||||||
- name: proc
|
- name: proc
|
||||||
hostPath:
|
hostPath:
|
||||||
|
@ -0,0 +1,5 @@
|
|||||||
|
apiVersion: v1
|
||||||
|
kind: ServiceAccount
|
||||||
|
metadata:
|
||||||
|
name: node-exporter
|
||||||
|
namespace: monitoring
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRoleBinding
|
kind: ClusterRoleBinding
|
||||||
metadata:
|
metadata:
|
||||||
name: prometheus
|
name: prometheus
|
||||||
@ -8,5 +8,5 @@ roleRef:
|
|||||||
name: prometheus
|
name: prometheus
|
||||||
subjects:
|
subjects:
|
||||||
- kind: ServiceAccount
|
- kind: ServiceAccount
|
||||||
name: default
|
name: prometheus
|
||||||
namespace: monitoring
|
namespace: monitoring
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRole
|
kind: ClusterRole
|
||||||
metadata:
|
metadata:
|
||||||
name: prometheus
|
name: prometheus
|
||||||
|
@ -4,8 +4,7 @@ metadata:
|
|||||||
name: prometheus-rules
|
name: prometheus-rules
|
||||||
namespace: monitoring
|
namespace: monitoring
|
||||||
data:
|
data:
|
||||||
# Rules adapted from those provided by coreos/prometheus-operator and SoundCloud
|
alertmanager.rules.yaml: |
|
||||||
alertmanager.rules.yaml: |+
|
|
||||||
groups:
|
groups:
|
||||||
- name: alertmanager.rules
|
- name: alertmanager.rules
|
||||||
rules:
|
rules:
|
||||||
@ -36,7 +35,7 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: Reloading Alertmanager's configuration has failed for {{ $labels.namespace
|
description: Reloading Alertmanager's configuration has failed for {{ $labels.namespace
|
||||||
}}/{{ $labels.pod}}.
|
}}/{{ $labels.pod}}.
|
||||||
etcd3.rules.yaml: |+
|
etcd3.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: ./etcd3.rules
|
- name: ./etcd3.rules
|
||||||
rules:
|
rules:
|
||||||
@ -65,8 +64,8 @@ data:
|
|||||||
changes within the last hour
|
changes within the last hour
|
||||||
summary: a high number of leader changes within the etcd cluster are happening
|
summary: a high number of leader changes within the etcd cluster are happening
|
||||||
- alert: HighNumberOfFailedGRPCRequests
|
- alert: HighNumberOfFailedGRPCRequests
|
||||||
expr: sum(rate(etcd_grpc_requests_failed_total{job="etcd"}[5m])) BY (grpc_method)
|
expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
|
||||||
/ sum(rate(etcd_grpc_total{job="etcd"}[5m])) BY (grpc_method) > 0.01
|
/ sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.01
|
||||||
for: 10m
|
for: 10m
|
||||||
labels:
|
labels:
|
||||||
severity: warning
|
severity: warning
|
||||||
@ -75,8 +74,8 @@ data:
|
|||||||
on etcd instance {{ $labels.instance }}'
|
on etcd instance {{ $labels.instance }}'
|
||||||
summary: a high number of gRPC requests are failing
|
summary: a high number of gRPC requests are failing
|
||||||
- alert: HighNumberOfFailedGRPCRequests
|
- alert: HighNumberOfFailedGRPCRequests
|
||||||
expr: sum(rate(etcd_grpc_requests_failed_total{job="etcd"}[5m])) BY (grpc_method)
|
expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
|
||||||
/ sum(rate(etcd_grpc_total{job="etcd"}[5m])) BY (grpc_method) > 0.05
|
/ sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.05
|
||||||
for: 5m
|
for: 5m
|
||||||
labels:
|
labels:
|
||||||
severity: critical
|
severity: critical
|
||||||
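The rewritten `HighNumberOfFailedGRPCRequests` expressions divide the rate of non-`OK` responses by the rate of all responses, grouped by `grpc_service` and `grpc_method`. The sketch below reproduces that arithmetic on already-computed per-second rates. The function name `failed_ratio_by_method` and the sample values are hypothetical. Unlike PromQL, it reports a ratio of 0.0 for methods with no failed series rather than omitting them.

```python
from collections import defaultdict


def failed_ratio_by_method(samples):
    """Compute failed/total request ratios per (grpc_service, grpc_method).

    `samples` is an iterable of (grpc_service, grpc_method, grpc_code, rate)
    tuples, where rate is a per-second rate over the window.
    """
    failed = defaultdict(float)
    total = defaultdict(float)
    for service, method, code, rate in samples:
        key = (service, method)
        total[key] += rate
        if code != "OK":
            failed[key] += rate
    return {key: failed[key] / total[key] for key in total if total[key] > 0}


# Invented sample rates: 2 failing requests per second out of 100.
ratios = failed_ratio_by_method([
    ("etcdserverpb.KV", "Range", "OK", 98.0),
    ("etcdserverpb.KV", "Range", "Unavailable", 2.0),
])

WARNING_THRESHOLD = 0.01   # from the rule with `for: 10m`
CRITICAL_THRESHOLD = 0.05  # from the rule with `for: 5m`
range_ratio = ratios[("etcdserverpb.KV", "Range")]
fires_warning = range_ratio > WARNING_THRESHOLD
fires_critical = range_ratio > CRITICAL_THRESHOLD
```

In this example the ratio exceeds the warning threshold but not the critical one, so only the `severity: warning` rule would become pending.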
@ -85,7 +84,7 @@ data:
|
|||||||
on etcd instance {{ $labels.instance }}'
|
on etcd instance {{ $labels.instance }}'
|
||||||
summary: a high number of gRPC requests are failing
|
summary: a high number of gRPC requests are failing
|
||||||
- alert: GRPCRequestsSlow
|
- alert: GRPCRequestsSlow
|
||||||
expr: histogram_quantile(0.99, rate(etcd_grpc_unary_requests_duration_seconds_bucket[5m]))
|
expr: histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job="etcd",grpc_type="unary"}[5m])) by (grpc_service, grpc_method, le))
|
||||||
> 0.15
|
> 0.15
|
||||||
for: 10m
|
for: 10m
|
||||||
labels:
|
labels:
|
||||||
@ -125,7 +124,7 @@ data:
|
|||||||
}} are slow
|
}} are slow
|
||||||
summary: slow HTTP requests
|
summary: slow HTTP requests
|
||||||
- alert: EtcdMemberCommunicationSlow
|
- alert: EtcdMemberCommunicationSlow
|
||||||
expr: histogram_quantile(0.99, rate(etcd_network_member_round_trip_time_seconds_bucket[5m]))
|
expr: histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket[5m]))
|
||||||
> 0.15
|
> 0.15
|
||||||
for: 10m
|
for: 10m
|
||||||
labels:
|
labels:
|
||||||
@ -160,7 +159,7 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: etcd instance {{ $labels.instance }} commit durations are high
|
description: etcd instance {{ $labels.instance }} commit durations are high
|
||||||
summary: high commit durations
|
summary: high commit durations
|
||||||
general.rules.yaml: |+
|
general.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: general.rules
|
- name: general.rules
|
||||||
rules:
|
rules:
|
||||||
@ -192,12 +191,12 @@ data:
|
|||||||
description: '{{ $labels.job }}: {{ $labels.namespace }}/{{ $labels.pod }} instance
|
description: '{{ $labels.job }}: {{ $labels.namespace }}/{{ $labels.pod }} instance
|
||||||
will exhaust in file/socket descriptors within the next hour'
|
will exhaust in file/socket descriptors within the next hour'
|
||||||
summary: file descriptors soon exhausted
|
summary: file descriptors soon exhausted
|
||||||
kube-controller-manager.rules.yaml: |+
|
kube-controller-manager.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: kube-controller-manager.rules
|
- name: kube-controller-manager.rules
|
||||||
rules:
|
rules:
|
||||||
- alert: K8SControllerManagerDown
|
- alert: K8SControllerManagerDown
|
||||||
expr: absent(up{kubernetes_name="kube-controller-manager"} == 1)
|
expr: absent(up{job="kube-controller-manager"} == 1)
|
||||||
for: 5m
|
for: 5m
|
||||||
labels:
|
labels:
|
||||||
severity: critical
|
severity: critical
|
||||||
@ -205,7 +204,7 @@ data:
|
|||||||
description: There is no running K8S controller manager. Deployments and replication
|
description: There is no running K8S controller manager. Deployments and replication
|
||||||
controllers are not making progress.
|
controllers are not making progress.
|
||||||
summary: Controller manager is down
|
summary: Controller manager is down
|
||||||
kube-scheduler.rules.yaml: |+
|
kube-scheduler.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: kube-scheduler.rules
|
- name: kube-scheduler.rules
|
||||||
rules:
|
rules:
|
||||||
@ -255,7 +254,7 @@ data:
|
|||||||
labels:
|
labels:
|
||||||
quantile: "0.5"
|
quantile: "0.5"
|
||||||
- alert: K8SSchedulerDown
|
- alert: K8SSchedulerDown
|
||||||
expr: absent(up{kubernetes_name="kube-scheduler"} == 1)
|
expr: absent(up{job="kube-scheduler"} == 1)
|
||||||
for: 5m
|
for: 5m
|
||||||
labels:
|
labels:
|
||||||
severity: critical
|
severity: critical
|
||||||
@ -263,7 +262,7 @@ data:
|
|||||||
description: There is no running K8S scheduler. New pods are not being assigned
|
description: There is no running K8S scheduler. New pods are not being assigned
|
||||||
to nodes.
|
to nodes.
|
||||||
summary: Scheduler is down
|
summary: Scheduler is down
|
||||||
kube-state-metrics.rules.yaml: |+
|
kube-state-metrics.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: kube-state-metrics.rules
|
- name: kube-state-metrics.rules
|
||||||
rules:
|
rules:
|
||||||
@ -274,7 +273,8 @@ data:
|
|||||||
severity: warning
|
severity: warning
|
||||||
annotations:
|
annotations:
|
||||||
description: Observed deployment generation does not match expected one for
|
description: Observed deployment generation does not match expected one for
|
||||||
deployment {{$labels.namespaces}}{{$labels.deployment}}
|
deployment {{$labels.namespaces}}/{{$labels.deployment}}
|
||||||
|
summary: Deployment is outdated
|
||||||
- alert: DeploymentReplicasNotUpdated
|
- alert: DeploymentReplicasNotUpdated
|
||||||
expr: ((kube_deployment_status_replicas_updated != kube_deployment_spec_replicas)
|
expr: ((kube_deployment_status_replicas_updated != kube_deployment_spec_replicas)
|
||||||
or (kube_deployment_status_replicas_available != kube_deployment_spec_replicas))
|
or (kube_deployment_status_replicas_available != kube_deployment_spec_replicas))
|
||||||
@ -284,8 +284,9 @@ data:
|
|||||||
severity: warning
|
severity: warning
|
||||||
annotations:
|
annotations:
|
||||||
description: Replicas are not updated and available for deployment {{$labels.namespaces}}/{{$labels.deployment}}
|
description: Replicas are not updated and available for deployment {{$labels.namespaces}}/{{$labels.deployment}}
|
||||||
|
summary: Deployment replicas are outdated
|
||||||
- alert: DaemonSetRolloutStuck
|
- alert: DaemonSetRolloutStuck
|
||||||
expr: kube_daemonset_status_current_number_ready / kube_daemonset_status_desired_number_scheduled
|
expr: kube_daemonset_status_number_ready / kube_daemonset_status_desired_number_scheduled
|
||||||
* 100 < 100
|
* 100 < 100
|
||||||
for: 15m
|
for: 15m
|
||||||
labels:
|
labels:
|
||||||
@ -293,6 +294,7 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: Only {{$value}}% of desired pods scheduled and ready for daemon
|
description: Only {{$value}}% of desired pods scheduled and ready for daemon
|
||||||
set {{$labels.namespaces}}/{{$labels.daemonset}}
|
set {{$labels.namespaces}}/{{$labels.daemonset}}
|
||||||
|
summary: DaemonSet is missing pods
|
||||||
- alert: K8SDaemonSetsNotScheduled
|
- alert: K8SDaemonSetsNotScheduled
|
||||||
expr: kube_daemonset_status_desired_number_scheduled - kube_daemonset_status_current_number_scheduled
|
expr: kube_daemonset_status_desired_number_scheduled - kube_daemonset_status_current_number_scheduled
|
||||||
> 0
|
> 0
|
||||||
@ -312,14 +314,15 @@ data:
|
|||||||
to run.
|
to run.
|
||||||
summary: Daemonsets are not scheduled correctly
|
summary: Daemonsets are not scheduled correctly
|
||||||
- alert: PodFrequentlyRestarting
|
- alert: PodFrequentlyRestarting
|
||||||
expr: increase(kube_pod_container_status_restarts[1h]) > 5
|
expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
|
||||||
for: 10m
|
for: 10m
|
||||||
labels:
|
labels:
|
||||||
severity: warning
|
severity: warning
|
||||||
annotations:
|
annotations:
|
||||||
description: Pod {{$labels.namespaces}}/{{$labels.pod}} is was restarted {{$value}}
|
description: Pod {{$labels.namespaces}}/{{$labels.pod}} is was restarted {{$value}}
|
||||||
times within the last hour
|
times within the last hour
|
||||||
kubelet.rules.yaml: |+
|
summary: Pod is restarting frequently
|
||||||
|
kubelet.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: kubelet.rules
|
- name: kubelet.rules
|
||||||
rules:
|
rules:
|
||||||
@ -342,14 +345,14 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: '{{ $value }}% of Kubernetes nodes are not ready'
|
description: '{{ $value }}% of Kubernetes nodes are not ready'
|
||||||
- alert: K8SKubeletDown
|
- alert: K8SKubeletDown
|
||||||
expr: count(up{job="kubernetes-nodes"} == 0) / count(up{job="kubernetes-nodes"}) * 100 > 3
|
expr: count(up{job="kubelet"} == 0) / count(up{job="kubelet"}) * 100 > 3
|
||||||
for: 1h
|
for: 1h
|
||||||
labels:
|
labels:
|
||||||
severity: warning
|
severity: warning
|
||||||
annotations:
|
annotations:
|
||||||
description: Prometheus failed to scrape {{ $value }}% of kubelets.
|
description: Prometheus failed to scrape {{ $value }}% of kubelets.
|
||||||
- alert: K8SKubeletDown
|
- alert: K8SKubeletDown
|
||||||
expr: (absent(up{job="kubernetes-nodes"} == 1) or count(up{job="kubernetes-nodes"} == 0) / count(up{job="kubernetes-nodes"}))
|
expr: (absent(up{job="kubelet"} == 1) or count(up{job="kubelet"} == 0) / count(up{job="kubelet"}))
|
||||||
* 100 > 1
|
* 100 > 1
|
||||||
for: 1h
|
for: 1h
|
||||||
labels:
|
labels:
|
||||||
@ -367,7 +370,7 @@ data:
|
|||||||
description: Kubelet {{$labels.instance}} is running {{$value}} pods, close
|
description: Kubelet {{$labels.instance}} is running {{$value}} pods, close
|
||||||
to the limit of 110
|
to the limit of 110
|
||||||
summary: Kubelet is close to pod limit
|
summary: Kubelet is close to pod limit
|
||||||
kubernetes.rules.yaml: |+
|
kubernetes.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: kubernetes.rules
|
- name: kubernetes.rules
|
||||||
rules:
|
rules:
|
||||||
@ -447,14 +450,28 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: API server returns errors for {{ $value }}% of requests
|
description: API server returns errors for {{ $value }}% of requests
|
||||||
- alert: K8SApiserverDown
|
- alert: K8SApiserverDown
|
||||||
expr: absent(up{job="kubernetes-apiservers"} == 1)
|
expr: absent(up{job="apiserver"} == 1)
|
||||||
for: 20m
|
for: 20m
|
||||||
labels:
|
labels:
|
||||||
severity: critical
|
severity: critical
|
||||||
annotations:
|
annotations:
|
||||||
description: No API servers are reachable or all have disappeared from service
|
description: No API servers are reachable or all have disappeared from service
|
||||||
discovery
|
discovery
|
||||||
node.rules.yaml: |+
|
|
||||||
|
- alert: K8sCertificateExpirationNotice
|
||||||
|
labels:
|
||||||
|
severity: warning
|
||||||
|
annotations:
|
||||||
|
description: Kubernetes API Certificate is expiring soon (less than 7 days)
|
||||||
|
expr: sum(apiserver_client_certificate_expiration_seconds_bucket{le="604800"}) > 0
|
||||||
|
|
||||||
|
- alert: K8sCertificateExpirationNotice
|
||||||
|
labels:
|
||||||
|
severity: critical
|
||||||
|
annotations:
|
||||||
|
description: Kubernetes API Certificate is expiring in less than 1 day
|
||||||
|
expr: sum(apiserver_client_certificate_expiration_seconds_bucket{le="86400"}) > 0
|
||||||
|
node.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: node.rules
|
- name: node.rules
|
||||||
rules:
|
rules:
|
||||||
@ -476,7 +493,7 @@ data:
|
|||||||
- record: cluster:node_cpu:ratio
|
- record: cluster:node_cpu:ratio
|
||||||
expr: cluster:node_cpu:rate5m / count(sum(node_cpu) BY (instance, cpu))
|
expr: cluster:node_cpu:rate5m / count(sum(node_cpu) BY (instance, cpu))
|
||||||
- alert: NodeExporterDown
|
- alert: NodeExporterDown
|
||||||
expr: absent(up{kubernetes_name="node-exporter"} == 1)
|
expr: absent(up{job="node-exporter"} == 1)
|
||||||
for: 10m
|
for: 10m
|
||||||
labels:
|
labels:
|
||||||
severity: warning
|
severity: warning
|
||||||
@ -499,7 +516,7 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: device {{$labels.device}} on node {{$labels.instance}} is running
|
description: device {{$labels.device}} on node {{$labels.instance}} is running
|
||||||
full within the next 2 hours (mounted at {{$labels.mountpoint}})
|
full within the next 2 hours (mounted at {{$labels.mountpoint}})
|
||||||
prometheus.rules.yaml: |+
|
prometheus.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: prometheus.rules
|
- name: prometheus.rules
|
||||||
rules:
|
rules:
|
||||||
@ -544,3 +561,30 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: Prometheus {{ $labels.namespace }}/{{ $labels.pod}} is not connected
|
description: Prometheus {{ $labels.namespace }}/{{ $labels.pod}} is not connected
|
||||||
to any Alertmanagers
|
to any Alertmanagers
|
||||||
|
- alert: PrometheusTSDBReloadsFailing
|
||||||
|
expr: increase(prometheus_tsdb_reloads_failures_total[2h]) > 0
|
||||||
|
for: 12h
|
||||||
|
labels:
|
||||||
|
severity: warning
|
||||||
|
annotations:
|
||||||
|
description: '{{$labels.job}} at {{$labels.instance}} had {{$value | humanize}}
|
||||||
|
reload failures over the last two hours.'
|
||||||
|
summary: Prometheus has issues reloading data blocks from disk
|
||||||
|
- alert: PrometheusTSDBCompactionsFailing
|
||||||
|
expr: increase(prometheus_tsdb_compactions_failed_total[2h]) > 0
|
||||||
|
for: 12h
|
||||||
|
labels:
|
||||||
|
severity: warning
|
||||||
|
annotations:
|
||||||
|
description: '{{$labels.job}} at {{$labels.instance}} had {{$value | humanize}}
|
||||||
|
compaction failures over the last two hours.'
|
||||||
|
summary: Prometheus has issues compacting sample blocks
|
||||||
|
- alert: PrometheusTSDBWALCorruptions
|
||||||
|
expr: tsdb_wal_corruptions_total > 0
|
||||||
|
for: 4h
|
||||||
|
labels:
|
||||||
|
severity: warning
|
||||||
|
annotations:
|
||||||
|
description: '{{$labels.job}} at {{$labels.instance}} has a corrupted write-ahead
|
||||||
|
log (WAL).'
|
||||||
|
summary: Prometheus write-ahead log is corrupted
|
||||||
|
5
addons/prometheus/service-account.yaml
Normal file
@ -0,0 +1,5 @@
|
|||||||
|
apiVersion: v1
|
||||||
|
kind: ServiceAccount
|
||||||
|
metadata:
|
||||||
|
name: prometheus
|
||||||
|
namespace: monitoring
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
|||||||
|
|
||||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||||
|
|
||||||
* Kubernetes v1.9.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
||||||
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||||
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
||||||
module "bootkube" {
|
module "bootkube" {
|
||||||
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=b83e321b350ac549c45ed6a05ffd8683336fb9f4"
|
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=203b90169ead2380f74cc64ea1f02c109806c9bc"
|
||||||
|
|
||||||
cluster_name = "${var.cluster_name}"
|
cluster_name = "${var.cluster_name}"
|
||||||
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
|
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
|
||||||
|
@ -7,7 +7,7 @@ systemd:
|
|||||||
- name: 40-etcd-cluster.conf
|
- name: 40-etcd-cluster.conf
|
||||||
contents: |
|
contents: |
|
||||||
[Service]
|
[Service]
|
||||||
Environment="ETCD_IMAGE_TAG=v3.2.13"
|
Environment="ETCD_IMAGE_TAG=v3.2.15"
|
||||||
Environment="ETCD_NAME=${etcd_name}"
|
Environment="ETCD_NAME=${etcd_name}"
|
||||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
||||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
||||||
@ -129,7 +129,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.3
|
||||||
- path: /etc/sysctl.d/max-user-watches.conf
|
- path: /etc/sysctl.d/max-user-watches.conf
|
||||||
filesystem: root
|
filesystem: root
|
||||||
contents:
|
contents:
|
||||||
@ -150,7 +150,7 @@ storage:
|
|||||||
# Move experimental manifests
|
# Move experimental manifests
|
||||||
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
||||||
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
||||||
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.9.1}"
|
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.10.0}"
|
||||||
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
||||||
exec /usr/bin/rkt run \
|
exec /usr/bin/rkt run \
|
||||||
--trust-keys-from-https \
|
--trust-keys-from-https \
|
||||||
|
@ -103,7 +103,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.3
|
||||||
- path: /etc/sysctl.d/max-user-watches.conf
|
- path: /etc/sysctl.d/max-user-watches.conf
|
||||||
filesystem: root
|
filesystem: root
|
||||||
contents:
|
contents:
|
||||||
@ -121,7 +121,7 @@ storage:
|
|||||||
--volume config,kind=host,source=/etc/kubernetes \
|
--volume config,kind=host,source=/etc/kubernetes \
|
||||||
--mount volume=config,target=/etc/kubernetes \
|
--mount volume=config,target=/etc/kubernetes \
|
||||||
--insecure-options=image \
|
--insecure-options=image \
|
||||||
docker://gcr.io/google_containers/hyperkube:v1.9.1 \
|
docker://gcr.io/google_containers/hyperkube:v1.9.3 \
|
||||||
--net=host \
|
--net=host \
|
||||||
--dns=host \
|
--dns=host \
|
||||||
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
||||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
|||||||
|
|
||||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||||
|
|
||||||
* Kubernetes v1.9.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
||||||
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||||
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
||||||
module "bootkube" {
|
module "bootkube" {
|
||||||
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=b83e321b350ac549c45ed6a05ffd8683336fb9f4"
|
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=203b90169ead2380f74cc64ea1f02c109806c9bc"
|
||||||
|
|
||||||
cluster_name = "${var.cluster_name}"
|
cluster_name = "${var.cluster_name}"
|
||||||
api_servers = ["${var.k8s_domain_name}"]
|
api_servers = ["${var.k8s_domain_name}"]
|
||||||
|
@ -7,7 +7,7 @@ systemd:
|
|||||||
- name: 40-etcd-cluster.conf
|
- name: 40-etcd-cluster.conf
|
||||||
contents: |
|
contents: |
|
||||||
[Service]
|
[Service]
|
||||||
Environment="ETCD_IMAGE_TAG=v3.2.13"
|
Environment="ETCD_IMAGE_TAG=v3.2.15"
|
||||||
Environment="ETCD_NAME=${etcd_name}"
|
Environment="ETCD_NAME=${etcd_name}"
|
||||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
|
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
|
||||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
|
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
|
||||||
@ -117,7 +117,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.3
|
||||||
- path: /etc/hostname
|
- path: /etc/hostname
|
||||||
filesystem: root
|
filesystem: root
|
||||||
mode: 0644
|
mode: 0644
|
||||||
@ -144,7 +144,7 @@ storage:
|
|||||||
# Move experimental manifests
|
# Move experimental manifests
|
||||||
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
||||||
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
||||||
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.9.1}"
|
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.10.0}"
|
||||||
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
||||||
exec /usr/bin/rkt run \
|
exec /usr/bin/rkt run \
|
||||||
--trust-keys-from-https \
|
--trust-keys-from-https \
|
||||||
|
@ -82,7 +82,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.3
|
||||||
- path: /etc/hostname
|
- path: /etc/hostname
|
||||||
filesystem: root
|
filesystem: root
|
||||||
mode: 0644
|
mode: 0644
|
||||||
|
@ -3,7 +3,7 @@ resource "matchbox_group" "container-linux-install" {
|
|||||||
count = "${length(var.controller_names) + length(var.worker_names)}"
|
count = "${length(var.controller_names) + length(var.worker_names)}"
|
||||||
|
|
||||||
name = "${format("container-linux-install-%s", element(concat(var.controller_names, var.worker_names), count.index))}"
|
name = "${format("container-linux-install-%s", element(concat(var.controller_names, var.worker_names), count.index))}"
|
||||||
profile = "${var.cached_install == "true" ? matchbox_profile.cached-container-linux-install.name : matchbox_profile.container-linux-install.name}"
|
profile = "${var.cached_install == "true" ? element(matchbox_profile.cached-container-linux-install.*.name, count.index) : element(matchbox_profile.container-linux-install.*.name, count.index)}"
|
||||||
|
|
||||||
selector {
|
selector {
|
||||||
mac = "${element(concat(var.controller_macs, var.worker_macs), count.index)}"
|
mac = "${element(concat(var.controller_macs, var.worker_macs), count.index)}"
|
||||||
|
@ -1,6 +1,8 @@
|
|||||||
// Container Linux Install profile (from release.core-os.net)
|
// Container Linux Install profile (from release.core-os.net)
|
||||||
resource "matchbox_profile" "container-linux-install" {
|
resource "matchbox_profile" "container-linux-install" {
|
||||||
name = "container-linux-install"
|
count = "${length(var.controller_names) + length(var.worker_names)}"
|
||||||
|
name = "${format("%s-container-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index))}"
|
||||||
|
|
||||||
kernel = "http://${var.container_linux_channel}.release.core-os.net/amd64-usr/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
|
kernel = "http://${var.container_linux_channel}.release.core-os.net/amd64-usr/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
|
||||||
|
|
||||||
initrd = [
|
initrd = [
|
||||||
@ -16,10 +18,12 @@ resource "matchbox_profile" "container-linux-install" {
|
|||||||
"${var.kernel_args}",
|
"${var.kernel_args}",
|
||||||
]
|
]
|
||||||
|
|
||||||
container_linux_config = "${data.template_file.container-linux-install-config.rendered}"
|
container_linux_config = "${element(data.template_file.container-linux-install-configs.*.rendered, count.index)}"
|
||||||
}
|
}
|
||||||
|
|
||||||
data "template_file" "container-linux-install-config" {
|
data "template_file" "container-linux-install-configs" {
|
||||||
|
count = "${length(var.controller_names) + length(var.worker_names)}"
|
||||||
|
|
||||||
template = "${file("${path.module}/cl/container-linux-install.yaml.tmpl")}"
|
template = "${file("${path.module}/cl/container-linux-install.yaml.tmpl")}"
|
||||||
|
|
||||||
vars {
|
vars {
|
||||||
@ -37,7 +41,9 @@ data "template_file" "container-linux-install-config" {
|
|||||||
// Container Linux Install profile (from matchbox /assets cache)
|
// Container Linux Install profile (from matchbox /assets cache)
|
||||||
// Note: Admin must have downloaded container_linux_version into matchbox assets.
|
// Note: Admin must have downloaded container_linux_version into matchbox assets.
|
||||||
resource "matchbox_profile" "cached-container-linux-install" {
|
resource "matchbox_profile" "cached-container-linux-install" {
|
||||||
name = "cached-container-linux-install"
|
count = "${length(var.controller_names) + length(var.worker_names)}"
|
||||||
|
name = "${format("%s-cached-container-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index))}"
|
||||||
|
|
||||||
kernel = "/assets/coreos/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
|
kernel = "/assets/coreos/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
|
||||||
|
|
||||||
initrd = [
|
initrd = [
|
||||||
@ -53,10 +59,12 @@ resource "matchbox_profile" "cached-container-linux-install" {
|
|||||||
"${var.kernel_args}",
|
"${var.kernel_args}",
|
||||||
]
|
]
|
||||||
|
|
||||||
container_linux_config = "${data.template_file.cached-container-linux-install-config.rendered}"
|
container_linux_config = "${element(data.template_file.cached-container-linux-install-configs.*.rendered, count.index)}"
|
||||||
}
|
}
|
||||||
|
|
||||||
data "template_file" "cached-container-linux-install-config" {
|
data "template_file" "cached-container-linux-install-configs" {
|
||||||
|
count = "${length(var.controller_names) + length(var.worker_names)}"
|
||||||
|
|
||||||
template = "${file("${path.module}/cl/container-linux-install.yaml.tmpl")}"
|
template = "${file("${path.module}/cl/container-linux-install.yaml.tmpl")}"
|
||||||
|
|
||||||
vars {
|
vars {
|
||||||
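The profile changes above replace the two shared profile names with one profile per machine, named `<cluster_name>-container-linux-install-<machine name>`. As a rough illustration only, the shell function below mimics what the `format(...)` and `element(concat(...), count.index)` expressions expand to. `profile_names` and the sample machine names are hypothetical and do not appear in the module.

```sh
#!/bin/sh
# Hypothetical helper (not part of the module): print one install profile
# name per machine, following the "<cluster>-container-linux-install-<machine>"
# pattern introduced in the diff.
# Usage: profile_names CLUSTER_NAME MACHINE_NAME...
profile_names() {
  cluster="$1"
  shift
  # Remaining arguments stand in for concat(controller_names, worker_names).
  for name in "$@"; do
    printf '%s-container-linux-install-%s\n' "$cluster" "$name"
  done
}
```

Calling `profile_names mercury node1 node2` prints `mercury-container-linux-install-node1` and `mercury-container-linux-install-node2`, one per line. The cached variant in the diff differs only by a `cached-` infix after the cluster name.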
|
@ -24,7 +24,7 @@ variable "ssh_authorized_key" {
|
|||||||
}
|
}
|
||||||
|
|
||||||
# Machines
|
# Machines
|
||||||
# Terraform's crude "type system" does properly support lists of maps so we do this.
|
# Terraform's crude "type system" does not properly support lists of maps so we do this.
|
||||||
|
|
||||||
variable "controller_names" {
|
variable "controller_names" {
|
||||||
type = "list"
|
type = "list"
|
||||||
|
@ -98,7 +98,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.3
|
||||||
- path: /etc/hostname
|
- path: /etc/hostname
|
||||||
filesystem: root
|
filesystem: root
|
||||||
mode: 0644
|
mode: 0644
|
||||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
|||||||
|
|
||||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||||
|
|
||||||
* Kubernetes v1.9.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
||||||
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||||
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
||||||
module "bootkube" {
|
module "bootkube" {
|
||||||
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=b83e321b350ac549c45ed6a05ffd8683336fb9f4"
|
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=203b90169ead2380f74cc64ea1f02c109806c9bc"
|
||||||
|
|
||||||
cluster_name = "${var.cluster_name}"
|
cluster_name = "${var.cluster_name}"
|
||||||
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
|
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
|
||||||
|
@ -7,7 +7,7 @@ systemd:
|
|||||||
- name: 40-etcd-cluster.conf
|
- name: 40-etcd-cluster.conf
|
||||||
contents: |
|
contents: |
|
||||||
[Service]
|
[Service]
|
||||||
Environment="ETCD_IMAGE_TAG=v3.2.13"
|
Environment="ETCD_IMAGE_TAG=v3.2.15"
|
||||||
Environment="ETCD_NAME=${etcd_name}"
|
Environment="ETCD_NAME=${etcd_name}"
|
||||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
||||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
||||||
@ -120,7 +120,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.3
|
||||||
- path: /etc/sysctl.d/max-user-watches.conf
|
- path: /etc/sysctl.d/max-user-watches.conf
|
||||||
filesystem: root
|
filesystem: root
|
||||||
contents:
|
contents:
|
||||||
@ -141,7 +141,7 @@ storage:
|
|||||||
# Move experimental manifests
|
# Move experimental manifests
|
||||||
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
||||||
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
||||||
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.9.1}"
|
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.10.0}"
|
||||||
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
||||||
exec /usr/bin/rkt run \
|
exec /usr/bin/rkt run \
|
||||||
--trust-keys-from-https \
|
--trust-keys-from-https \
|
||||||
|
@ -94,7 +94,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.3
|
||||||
- path: /etc/sysctl.d/max-user-watches.conf
|
- path: /etc/sysctl.d/max-user-watches.conf
|
||||||
filesystem: root
|
filesystem: root
|
||||||
contents:
|
contents:
|
||||||
@ -112,7 +112,7 @@ storage:
|
|||||||
--volume config,kind=host,source=/etc/kubernetes \
|
--volume config,kind=host,source=/etc/kubernetes \
|
||||||
--mount volume=config,target=/etc/kubernetes \
|
--mount volume=config,target=/etc/kubernetes \
|
||||||
--insecure-options=image \
|
--insecure-options=image \
|
||||||
docker://gcr.io/google_containers/hyperkube:v1.9.1 \
|
docker://gcr.io/google_containers/hyperkube:v1.9.3 \
|
||||||
--net=host \
|
--net=host \
|
||||||
--dns=host \
|
--dns=host \
|
||||||
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
||||||
|
@ -22,12 +22,12 @@ resource "digitalocean_firewall" "rules" {
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
protocol = "udp"
|
protocol = "udp"
|
||||||
port_range = "all"
|
port_range = "1-65535"
|
||||||
source_tags = ["${digitalocean_tag.controllers.name}", "${digitalocean_tag.workers.name}"]
|
source_tags = ["${digitalocean_tag.controllers.name}", "${digitalocean_tag.workers.name}"]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
protocol = "tcp"
|
protocol = "tcp"
|
||||||
port_range = "all"
|
port_range = "1-65535"
|
||||||
source_tags = ["${digitalocean_tag.controllers.name}", "${digitalocean_tag.workers.name}"]
|
source_tags = ["${digitalocean_tag.controllers.name}", "${digitalocean_tag.workers.name}"]
|
||||||
},
|
},
|
||||||
]
|
]
|
||||||
@ -35,17 +35,18 @@ resource "digitalocean_firewall" "rules" {
|
|||||||
# allow all outbound traffic
|
# allow all outbound traffic
|
||||||
outbound_rule = [
|
outbound_rule = [
|
||||||
{
|
{
|
||||||
protocol = "icmp"
|
protocol = "tcp"
|
||||||
|
port_range = "1-65535"
|
||||||
destination_addresses = ["0.0.0.0/0", "::/0"]
|
destination_addresses = ["0.0.0.0/0", "::/0"]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
protocol = "udp"
|
protocol = "udp"
|
||||||
port_range = "all"
|
port_range = "1-65535"
|
||||||
destination_addresses = ["0.0.0.0/0", "::/0"]
|
destination_addresses = ["0.0.0.0/0", "::/0"]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
protocol = "tcp"
|
protocol = "icmp"
|
||||||
port_range = "all"
|
port_range = "1-65535"
|
||||||
destination_addresses = ["0.0.0.0/0", "::/0"]
|
destination_addresses = ["0.0.0.0/0", "::/0"]
|
||||||
},
|
},
|
||||||
]
|
]
|
||||||
|
@ -5,7 +5,7 @@ terraform {
|
|||||||
}
|
}
|
||||||
|
|
||||||
provider "digitalocean" {
|
provider "digitalocean" {
|
||||||
version = "0.1.2"
|
version = "~> 0.1.2"
|
||||||
}
|
}
|
||||||
|
|
||||||
provider "local" {
|
provider "local" {
|
||||||
|
@ -27,8 +27,8 @@ variable "controller_count" {
|
|||||||
|
|
||||||
variable "controller_type" {
|
variable "controller_type" {
|
||||||
type = "string"
|
type = "string"
|
||||||
default = "2gb"
|
default = "s-2vcpu-2gb"
|
||||||
description = "Digital Ocean droplet size (e.g. 2gb (min), 4gb, 8gb)."
|
description = "Digital Ocean droplet size (e.g. s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb)."
|
||||||
}
|
}
|
||||||
|
|
||||||
variable "worker_count" {
|
variable "worker_count" {
|
||||||
@ -39,8 +39,8 @@ variable "worker_count" {
|
|||||||
|
|
||||||
variable "worker_type" {
|
variable "worker_type" {
|
||||||
type = "string"
|
type = "string"
|
||||||
default = "512mb"
|
default = "s-1vcpu-1gb"
|
||||||
description = "Digital Ocean droplet size (e.g. 512mb, 1gb, 2gb, 4gb)"
|
description = "Digital Ocean droplet size (e.g. s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb)"
|
||||||
}
|
}
|
||||||
|
|
||||||
variable "ssh_fingerprints" {
|
variable "ssh_fingerprints" {
|
||||||
@ -82,4 +82,3 @@ variable "cluster_domain_suffix" {
|
|||||||
type = "string"
|
type = "string"
|
||||||
default = "cluster.local"
|
default = "cluster.local"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -1,27 +0,0 @@
|
|||||||
# Kubernetes Dashboard
|
|
||||||
|
|
||||||
!!! warning
|
|
||||||
The Kubernetes Dashboard takes [unusual approaches](https://github.com/kubernetes/dashboard/wiki/Access-control#authorization-header) to security and is often a point of security escalations. We recommend you do don't deploy it and get familiar with `kubectl`, if possible.
|
|
||||||
|
|
||||||
The Kubernetes [Dashboard](https://github.com/kubernetes/dashboard) provides a web UI to manage a Kubernetes cluster for those who prefer an alternative to `kubectl`.
|
|
||||||
|
|
||||||
## Create
|
|
||||||
|
|
||||||
Create the dashboard deployment and service.
|
|
||||||
|
|
||||||
```
|
|
||||||
kubectl apply -f addons/dashboard -R
|
|
||||||
```
|
|
||||||
|
|
||||||
## Access
|
|
||||||
|
|
||||||
Use `kubectl` to authenticate to the apiserver and create a local port forward to the remote port on the dashboard pod.
|
|
||||||
|
|
||||||
```sh
|
|
||||||
kubectl get pods -n kube-system
|
|
||||||
kubectl port-forward POD [LOCAL_PORT:]REMOTE_PORT
|
|
||||||
kubectl port-forward kubernetes-dashboard-id 9090 -n kube-system
|
|
||||||
```
|
|
||||||
|
|
||||||
!!! tip
|
|
||||||
If you'd like to expose the Dashboard via Ingress and add authentication, use a suitable OAuth2 proxy sidecar and pick your favorite OAuth2 provider.
|
|
20
docs/addons/grafana.md
Normal file
@ -0,0 +1,20 @@
|
|||||||
|
## Grafana
|
||||||
|
|
||||||
|
Grafana can be used to build dashboards and visualizations that use Prometheus as the datasource. Create the grafana deployment and service.
|
||||||
|
|
||||||
|
```
|
||||||
|
kubectl apply -f addons/grafana -R
|
||||||
|
```
|
||||||
|
|
||||||
|
Use `kubectl` to authenticate to the apiserver and create a local port-forward to the Grafana pod.
|
||||||
|
|
||||||
|
```
|
||||||
|
kubectl port-forward grafana-POD-ID 8080 -n monitoring
|
||||||
|
```
|
||||||
|
|
||||||
|
Visit [127.0.0.1:8080](http://127.0.0.1:8080) to view the bundled dashboards.
|
||||||
|
|
||||||
|

|
||||||
|

|
||||||
|

|
||||||
|
|
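The new Grafana page leaves the pod name as a `grafana-POD-ID` placeholder. The two steps it describes (find the pod, then port-forward to it) could be combined roughly as follows. This is a sketch under assumptions, not part of the commit: `first_pod` and `grafana_forward` are hypothetical helper names, and matching the pod by the substring `grafana` assumes the pod name starts with `grafana-` as the placeholder suggests.

```sh
#!/bin/sh
# Hypothetical helper: print the name of the first pod in NAMESPACE whose
# name matches PATTERN. "kubectl get pods -o name" prints entries with a
# resource-type prefix ending in "/", which sed strips.
# Usage: first_pod NAMESPACE PATTERN
first_pod() {
  kubectl get pods -n "$1" -o name | grep "$2" | head -n 1 | sed 's|.*/||'
}

# Hypothetical helper: port-forward a local port (default 8080, as in the
# docs) to the Grafana pod in the monitoring namespace.
# Usage: grafana_forward [PORT]
grafana_forward() {
  pod="$(first_pod monitoring grafana)"
  if [ -z "$pod" ]; then
    echo "no grafana pod found in namespace monitoring" >&2
    return 1
  fi
  kubectl port-forward "$pod" "${1:-8080}" -n monitoring
}
```

After sourcing the file, `grafana_forward` runs the same `kubectl port-forward ... 8080 -n monitoring` command shown in the page, with the pod name filled in.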
@ -6,6 +6,5 @@ Every Typhoon cluster is verified to work well with several post-install addons.
|
|||||||
* Nginx [Ingress Controller](ingress.md)
|
* Nginx [Ingress Controller](ingress.md)
|
||||||
* [Heapster](heapster.md)
|
* [Heapster](heapster.md)
|
||||||
* [Prometheus](prometheus.md)
|
* [Prometheus](prometheus.md)
|
||||||
* [Grafana](prometheus.md#grafana)
|
* [Grafana](grafana.md)
|
||||||
* Kubernetes [Dashboard](dashboard.md)
|
|
||||||
|
|
||||||
|
@ -20,7 +20,7 @@ On Kubernetes clusters, Prometheus is run as a Deployment, configured with a Con
|
|||||||
kubectl apply -f addons/prometheus -R
|
kubectl apply -f addons/prometheus -R
|
||||||
```
|
```
|
||||||
|
|
||||||
The ConfigMap configures Prometheus to target apiserver endpoints, node metrics, cAdvisor metrics, and exporters. By default, data is kept in an `emptyDir` so it is persisted until the pod is rescheduled.
|
The ConfigMap configures Prometheus to discover apiservers, kubelets, cAdvisor, services, endpoints, and exporters. By default, data is kept in an `emptyDir` so it is persisted until the pod is rescheduled.
|
||||||
|
|
||||||
### Exporters
|
### Exporters
|
||||||
|
|
||||||
@ -32,7 +32,7 @@ Exporters expose metrics for 3rd-party systems that don't natively expose Promet
|
|||||||
|
|
||||||
### Queries and Alerts
|
### Queries and Alerts
|
||||||
|
|
||||||
Prometheus provides a simplistic UI for querying metrics and viewing alerts. Use `kubectl` to authenticate to the apiserver and create a local port-forward to the Prometheus pod.
|
Prometheus provides a basic UI for querying metrics and viewing alerts. Use `kubectl` to authenticate to the apiserver and create a local port-forward to the Prometheus pod.
|
||||||
|
|
||||||
```
|
```
|
||||||
kubectl get pods -n monitoring
|
kubectl get pods -n monitoring
|
||||||
@ -47,21 +47,4 @@ Visit [127.0.0.1:9090](http://127.0.0.1:9090) to query [expressions](http://127.
|
|||||||
<br/>
|
<br/>
|
||||||

|

|
||||||
|
|
||||||
## Grafana
|
Use [Grafana](/addons/grafana.md) to view or build dashboards that use Prometheus as the datasource.
|
||||||
|
|
||||||
Grafana can be used to build dashboards and rich visualizations that use Prometheus as the datasource. Create the grafana deployment and service.
|
|
||||||
|
|
||||||
```
|
|
||||||
kubectl apply -f addons/grafana -R
|
|
||||||
```
|
|
||||||
|
|
||||||
Use `kubectl` to authenticate to the apiserver and create a local port-forward to the Grafana pod.
|
|
||||||
|
|
||||||
```
|
|
||||||
kubectl port-forward grafana-POD-ID 8080 -n monitoring
|
|
||||||
```
|
|
||||||
|
|
||||||
Visit [127.0.0.1:8080](http://127.0.0.1:8080), add the prometheus data-source (http://prometheus.monitoring.svc.cluster.local), and import your desired dashboard (e.g. [Grafana Dashboard 315](https://grafana.com/dashboards/315)).
|
|
||||||
|
|
||||||

|
|
||||||
|
|
||||||
|
49
docs/aws.md
@ -1,6 +1,6 @@
|
|||||||
# AWS
|
# AWS
|
||||||
|
|
||||||
In this tutorial, we'll create a Kubernetes v1.9.1 cluster on AWS.
|
In this tutorial, we'll create a Kubernetes v1.9.3 cluster on AWS.
|
||||||
|
|
||||||
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a VPC, gateway, subnets, auto-scaling groups of controllers and workers, network load balancers for controllers and workers, and security groups will be created.
|
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a VPC, gateway, subnets, auto-scaling groups of controllers and workers, network load balancers for controllers and workers, and security groups will be created.
|
||||||
|
|
||||||
@ -10,15 +10,15 @@ Controllers and workers are provisioned to run a `kubelet`. A one-time [bootkube
|
|||||||
|
|
||||||
* AWS Account and IAM credentials
|
* AWS Account and IAM credentials
|
||||||
* AWS Route53 DNS Zone (registered Domain Name or delegated subdomain)
|
* AWS Route53 DNS Zone (registered Domain Name or delegated subdomain)
|
||||||
* Terraform v0.10.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
* Terraform v0.11.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
||||||
|
|
||||||
## Terraform Setup
|
## Terraform Setup
|
||||||
|
|
||||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.10.x on your system.
|
Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
$ terraform version
|
$ terraform version
|
||||||
Terraform v0.10.7
|
Terraform v0.11.1
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
||||||
@ -57,9 +57,32 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
provider "aws" {
|
provider "aws" {
|
||||||
|
version = "~> 1.5.0"
|
||||||
|
alias = "default"
|
||||||
|
|
||||||
region = "eu-central-1"
|
region = "eu-central-1"
|
||||||
shared_credentials_file = "/home/user/.config/aws/credentials"
|
shared_credentials_file = "/home/user/.config/aws/credentials"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
provider "local" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "null" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "template" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "tls" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Additional configuration options are described in the `aws` provider [docs](https://www.terraform.io/docs/providers/aws/).
|
Additional configuration options are described in the `aws` provider [docs](https://www.terraform.io/docs/providers/aws/).
|
||||||
@ -73,7 +96,15 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`.
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
module "aws-tempest" {
|
module "aws-tempest" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes"
|
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.9.3"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
aws = "aws.default"
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
cluster_name = "tempest"
|
cluster_name = "tempest"
|
||||||
|
|
||||||
@ -119,7 +150,7 @@ Get or update Terraform modules.
|
|||||||
$ terraform get # downloads missing modules
|
$ terraform get # downloads missing modules
|
||||||
$ terraform get --update # updates all modules
|
$ terraform get --update # updates all modules
|
||||||
Get: git::https://github.com/poseidon/typhoon (update)
|
Get: git::https://github.com/poseidon/typhoon (update)
|
||||||
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.9.1 (update)
|
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.10.0 (update)
|
||||||
```
|
```
|
||||||
|
|
||||||
Plan the resources to be created.
|
Plan the resources to be created.
|
||||||
@ -151,9 +182,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
|
|||||||
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
|
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
|
||||||
$ kubectl get nodes
|
$ kubectl get nodes
|
||||||
NAME STATUS AGE VERSION
|
NAME STATUS AGE VERSION
|
||||||
ip-10-0-12-221 Ready 34m v1.9.1
|
ip-10-0-12-221 Ready 34m v1.9.3
|
||||||
ip-10-0-19-112 Ready 34m v1.9.1
|
ip-10-0-19-112 Ready 34m v1.9.3
|
||||||
ip-10-0-4-22 Ready 34m v1.9.1
|
ip-10-0-4-22 Ready 34m v1.9.3
|
||||||
```
|
```
|
||||||
|
|
||||||
List the pods.
|
List the pods.
|
||||||
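The AWS tutorial now assumes Terraform v0.11.x. A small guard like the one below could catch an older binary before running `terraform init`. It is only a sketch: `is_terraform_0_11` is a hypothetical helper, and it assumes the first line of `terraform version` has the form `Terraform v0.11.1`, as in the tutorial's sample output.

```sh
#!/bin/sh
# Hypothetical helper: succeed only if the given version line reports a
# Terraform v0.11.x release.
# Usage: is_terraform_0_11 "$(terraform version | head -n 1)"
is_terraform_0_11() {
  case "$1" in
    "Terraform v0.11."*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller might run `is_terraform_0_11 "$(terraform version | head -n 1)" || echo "Terraform v0.11.x required" >&2` at the top of a provisioning script.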
|
@ -1,6 +1,6 @@
|
|||||||
# Bare-Metal
|
# Bare-Metal
|
||||||
|
|
||||||
In this tutorial, we'll network boot and provision a Kubernetes v1.9.1 cluster on bare-metal.
|
In this tutorial, we'll network boot and provision a Kubernetes v1.9.3 cluster on bare-metal.
|
||||||
|
|
||||||
First, we'll deploy a [Matchbox](https://github.com/coreos/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers.
|
First, we'll deploy a [Matchbox](https://github.com/coreos/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers.
|
||||||
|
|
||||||
@ -12,7 +12,7 @@ Controllers are provisioned as etcd peers and run `etcd-member` (etcd3) and `kub
|
|||||||
* PXE-enabled [network boot](https://coreos.com/matchbox/docs/latest/network-setup.html) environment
|
* PXE-enabled [network boot](https://coreos.com/matchbox/docs/latest/network-setup.html) environment
|
||||||
* Matchbox v0.6+ deployment with API enabled
|
* Matchbox v0.6+ deployment with API enabled
|
||||||
* Matchbox credentials `client.crt`, `client.key`, `ca.crt`
|
* Matchbox credentials `client.crt`, `client.key`, `ca.crt`
|
||||||
* Terraform v0.10.x and [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox) installed locally
|
* Terraform v0.11.x and [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox) installed locally
|
||||||
|
|
||||||
## Machines
|
## Machines
|
||||||
|
|
||||||
@ -109,11 +109,11 @@ Read about the [many ways](https://coreos.com/matchbox/docs/latest/network-setup
|
|||||||
|
|
||||||
## Terraform Setup
|
## Terraform Setup
|
||||||
|
|
||||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.10.x on your system.
|
Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
$ terraform version
|
$ terraform version
|
||||||
Terraform v0.10.7
|
Terraform v0.11.1
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox) plugin binary for your system.
|
Add the [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox) plugin binary for your system.
|
||||||
@ -149,6 +149,26 @@ provider "matchbox" {
|
|||||||
client_key = "${file("~/.config/matchbox/client.key")}"
|
client_key = "${file("~/.config/matchbox/client.key")}"
|
||||||
ca = "${file("~/.config/matchbox/ca.crt")}"
|
ca = "${file("~/.config/matchbox/ca.crt")}"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
provider "local" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "null" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "template" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "tls" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
## Cluster
|
## Cluster
|
||||||
@ -157,7 +177,14 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
module "bare-metal-mercury" {
|
module "bare-metal-mercury" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes"
|
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.9.3"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
# install
|
# install
|
||||||
matchbox_http_endpoint = "http://matchbox.example.com"
|
matchbox_http_endpoint = "http://matchbox.example.com"
|
||||||
@ -219,7 +246,7 @@ Get or update Terraform modules.
|
|||||||
$ terraform get # downloads missing modules
|
$ terraform get # downloads missing modules
|
||||||
$ terraform get --update # updates all modules
|
$ terraform get --update # updates all modules
|
||||||
Get: git::https://github.com/poseidon/typhoon (update)
|
Get: git::https://github.com/poseidon/typhoon (update)
|
||||||
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.9.1 (update)
|
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.10.0 (update)
|
||||||
```
|
```
|
||||||
|
|
||||||
Plan the resources to be created.
|
Plan the resources to be created.
|
||||||
@ -290,9 +317,9 @@ bootkube[5]: Tearing down temporary bootstrap control plane...
|
|||||||
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
|
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
|
||||||
$ kubectl get nodes
|
$ kubectl get nodes
|
||||||
NAME STATUS AGE VERSION
|
NAME STATUS AGE VERSION
|
||||||
node1.example.com Ready 11m v1.9.1
|
node1.example.com Ready 11m v1.9.3
|
||||||
node2.example.com Ready 11m v1.9.1
|
node2.example.com Ready 11m v1.9.3
|
||||||
node3.example.com Ready 11m v1.9.1
|
node3.example.com Ready 11m v1.9.3
|
||||||
```
|
```
|
||||||
|
|
||||||
List the pods.
|
List the pods.
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Digital Ocean
|
# Digital Ocean
|
||||||
|
|
||||||
In this tutorial, we'll create a Kubernetes v1.9.1 cluster on Digital Ocean.
|
In this tutorial, we'll create a Kubernetes v1.9.3 cluster on Digital Ocean.
|
||||||
|
|
||||||
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, firewall rules, DNS records, tags, and droplets for Kubernetes controllers and workers will be created.
|
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, firewall rules, DNS records, tags, and droplets for Kubernetes controllers and workers will be created.
|
||||||
|
|
||||||
@ -10,15 +10,15 @@ Controllers and workers are provisioned to run a `kubelet`. A one-time [bootkube
|
|||||||
|
|
||||||
* Digital Ocean Account and Token
|
* Digital Ocean Account and Token
|
||||||
* Digital Ocean Domain (registered Domain Name or delegated subdomain)
|
* Digital Ocean Domain (registered Domain Name or delegated subdomain)
|
||||||
* Terraform v0.10.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
* Terraform v0.11.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
||||||
|
|
||||||
## Terraform Setup
|
## Terraform Setup
|
||||||
|
|
||||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.10.x on your system.
|
Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
$ terraform version
|
$ terraform version
|
||||||
Terraform v0.10.7
|
Terraform v0.11.1
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
||||||
@ -58,7 +58,29 @@ Configure the DigitalOcean provider to use your token in a `providers.tf` file.
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
provider "digitalocean" {
|
provider "digitalocean" {
|
||||||
|
version = "0.1.3"
|
||||||
token = "${chomp(file("~/.config/digital-ocean/token"))}"
|
token = "${chomp(file("~/.config/digital-ocean/token"))}"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "local" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "null" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "template" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "tls" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@ -68,7 +90,15 @@ Define a Kubernetes cluster using the module `digital-ocean/container-linux/kube
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
module "digital-ocean-nemo" {
|
module "digital-ocean-nemo" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes"
|
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.9.3"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
digitalocean = "digitalocean.default"
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
region = "nyc3"
|
region = "nyc3"
|
||||||
dns_zone = "digital-ocean.example.com"
|
dns_zone = "digital-ocean.example.com"
|
||||||
@ -76,9 +106,9 @@ module "digital-ocean-nemo" {
|
|||||||
cluster_name = "nemo"
|
cluster_name = "nemo"
|
||||||
image = "coreos-stable"
|
image = "coreos-stable"
|
||||||
controller_count = 1
|
controller_count = 1
|
||||||
controller_type = "2gb"
|
controller_type = "s-2vcpu-2gb"
|
||||||
worker_count = 2
|
worker_count = 2
|
||||||
worker_type = "512mb"
|
worker_type = "s-1vcpu-1gb"
|
||||||
ssh_fingerprints = ["d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7"]
|
ssh_fingerprints = ["d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7"]
|
||||||
|
|
||||||
# output assets dir
|
# output assets dir
|
||||||
@ -114,7 +144,7 @@ Get or update Terraform modules.
|
|||||||
$ terraform get # downloads missing modules
|
$ terraform get # downloads missing modules
|
||||||
$ terraform get --update # updates all modules
|
$ terraform get --update # updates all modules
|
||||||
Get: git::https://github.com/poseidon/typhoon (update)
|
Get: git::https://github.com/poseidon/typhoon (update)
|
||||||
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.9.1 (update)
|
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.10.0 (update)
|
||||||
```
|
```
|
||||||
|
|
||||||
Plan the resources to be created.
|
Plan the resources to be created.
|
||||||
@ -147,9 +177,9 @@ In 3-6 minutes, the Kubernetes cluster will be ready.
|
|||||||
$ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig
|
$ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig
|
||||||
$ kubectl get nodes
|
$ kubectl get nodes
|
||||||
NAME STATUS AGE VERSION
|
NAME STATUS AGE VERSION
|
||||||
10.132.110.130 Ready 10m v1.9.1
|
10.132.110.130 Ready 10m v1.9.3
|
||||||
10.132.115.81 Ready 10m v1.9.1
|
10.132.115.81 Ready 10m v1.9.3
|
||||||
10.132.124.107 Ready 10m v1.9.1
|
10.132.124.107 Ready 10m v1.9.3
|
||||||
```
|
```
|
||||||
|
|
||||||
List the pods.
|
List the pods.
|
||||||
@ -232,16 +262,18 @@ If you uploaded an SSH key to DigitalOcean (not required), find the fingerprint
|
|||||||
|:-----|:------------|:--------|:--------|
|
|:-----|:------------|:--------|:--------|
|
||||||
| image | OS image for droplets | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
|
| image | OS image for droplets | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
|
||||||
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
|
||||||
| controller_type | Digital Ocean droplet size | 2gb | 2gb (min), 4gb, 8gb |
|
| controller_type | Digital Ocean droplet size | s-2vcpu-2gb | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... |
|
||||||
| worker_count | Number of workers | 1 | 3 |
|
| worker_count | Number of workers | 1 | 3 |
|
||||||
| worker_type | Digital Ocean droplet size | 512mb | 512mb, 1gb, 2gb, 4gb |
|
| worker_type | Digital Ocean droplet size | s-1vcpu-1gb | s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb, ... |
|
||||||
| networking | Choice of networking provider | "flannel" | "flannel" |
|
| networking | Choice of networking provider | "flannel" | "flannel" |
|
||||||
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||||
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||||
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
|
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
|
||||||
|
|
||||||
|
You can see all valid droplet sizes [on DigitalOcean's website](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) or by [using their `doctl` command-line tool](https://github.com/digitalocean/doctl) via `doctl compute size list`.
|
||||||
|
|
||||||
!!! warning
|
!!! warning
|
||||||
Do not choose a `controller_type` smaller than `2gb`. The `1gb` droplet is not sufficient for running a controller and bootstrapping will fail.
|
Do not choose a `controller_type` smaller than 2GB. Smaller droplets are not sufficient for running a controller and bootstrapping will fail.
|
||||||
|
|
||||||
!!! bug
|
!!! bug
|
||||||
Digital Ocean firewalls do not yet support the IP tunneling (IP in IP) protocol used by Calico. You can try using "calico" for `networking`, but it will only work if the cloud firewall is removed (unsafe).
|
Digital Ocean firewalls do not yet support the IP tunneling (IP in IP) protocol used by Calico. You can try using "calico" for `networking`, but it will only work if the cloud firewall is removed (unsafe).
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Google Cloud
|
# Google Cloud
|
||||||
|
|
||||||
In this tutorial, we'll create a Kubernetes v1.9.1 cluster on Google Compute Engine (not GKE).
|
In this tutorial, we'll create a Kubernetes v1.9.3 cluster on Google Compute Engine (not GKE).
|
||||||
|
|
||||||
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a network, firewall rules, managed instance groups of Kubernetes controllers and workers, network load balancers for controllers and workers, and health checks will be created.
|
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a network, firewall rules, managed instance groups of Kubernetes controllers and workers, network load balancers for controllers and workers, and health checks will be created.
|
||||||
|
|
||||||
@ -10,15 +10,15 @@ Controllers and workers are provisioned to run a `kubelet`. A one-time [bootkube
|
|||||||
|
|
||||||
* Google Cloud Account and Service Account
|
* Google Cloud Account and Service Account
|
||||||
* Google Cloud DNS Zone (registered Domain Name or delegated subdomain)
|
* Google Cloud DNS Zone (registered Domain Name or delegated subdomain)
|
||||||
* Terraform v0.10.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
* Terraform v0.11.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
||||||
|
|
||||||
## Terraform Setup
|
## Terraform Setup
|
||||||
|
|
||||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.10.x on your system.
|
Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
$ terraform version
|
$ terraform version
|
||||||
Terraform v0.10.7
|
Terraform v0.11.1
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
||||||
@ -57,10 +57,33 @@ Configure the Google Cloud provider to use your service account key, project-id,
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
provider "google" {
|
provider "google" {
|
||||||
|
version = "1.2"
|
||||||
|
alias = "default"
|
||||||
|
|
||||||
credentials = "${file("~/.config/google-cloud/terraform.json")}"
|
credentials = "${file("~/.config/google-cloud/terraform.json")}"
|
||||||
project = "project-id"
|
project = "project-id"
|
||||||
region = "us-central1"
|
region = "us-central1"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
provider "local" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "null" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "template" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "tls" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Additional configuration options are described in the `google` provider [docs](https://www.terraform.io/docs/providers/google/index.html).
|
Additional configuration options are described in the `google` provider [docs](https://www.terraform.io/docs/providers/google/index.html).
|
||||||
@ -74,7 +97,15 @@ Define a Kubernetes cluster using the module `google-cloud/container-linux/kuber
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
module "google-cloud-yavin" {
|
module "google-cloud-yavin" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes"
|
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.9.3"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
google = "google.default"
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
# Google Cloud
|
# Google Cloud
|
||||||
region = "us-central1"
|
region = "us-central1"
|
||||||
@ -120,7 +151,7 @@ Get or update Terraform modules.
|
|||||||
$ terraform get # downloads missing modules
|
$ terraform get # downloads missing modules
|
||||||
$ terraform get --update # updates all modules
|
$ terraform get --update # updates all modules
|
||||||
Get: git::https://github.com/poseidon/typhoon (update)
|
Get: git::https://github.com/poseidon/typhoon (update)
|
||||||
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.9.1 (update)
|
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.10.0 (update)
|
||||||
```
|
```
|
||||||
|
|
||||||
Plan the resources to be created.
|
Plan the resources to be created.
|
||||||
@ -154,9 +185,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
|
|||||||
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
|
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
|
||||||
$ kubectl get nodes
|
$ kubectl get nodes
|
||||||
NAME STATUS AGE VERSION
|
NAME STATUS AGE VERSION
|
||||||
yavin-controller-0.c.example-com.internal Ready 6m v1.9.1
|
yavin-controller-0.c.example-com.internal Ready 6m v1.9.3
|
||||||
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.1
|
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.3
|
||||||
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.1
|
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.3
|
||||||
```
|
```
|
||||||
|
|
||||||
List the pods.
|
List the pods.
|
||||||
|
BIN
docs/img/grafana-capacity.png
Normal file
BIN
docs/img/grafana-capacity.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 234 KiB |
BIN
docs/img/grafana-control-plane.png
Normal file
BIN
docs/img/grafana-control-plane.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 259 KiB |
Binary file not shown.
Before Width: | Height: | Size: 212 KiB |
BIN
docs/img/grafana-node.png
Normal file
BIN
docs/img/grafana-node.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 240 KiB |
Binary file not shown.
Before Width: | Height: | Size: 404 KiB After Width: | Height: | Size: 181 KiB |
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
|||||||
|
|
||||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||||
|
|
||||||
* Kubernetes v1.9.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
||||||
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||||
* Ready for Ingress, Dashboards, Metrics and other optional [addons](addons/overview.md)
|
* Ready for Ingress, Dashboards, Metrics and other optional [addons](addons/overview.md)
|
||||||
@ -45,6 +45,14 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
|
|||||||
module "google-cloud-yavin" {
|
module "google-cloud-yavin" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes"
|
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
google = "google.default"
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
# Google Cloud
|
# Google Cloud
|
||||||
region = "us-central1"
|
region = "us-central1"
|
||||||
dns_zone = "example.com"
|
dns_zone = "example.com"
|
||||||
@ -77,9 +85,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
|
|||||||
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
|
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
|
||||||
$ kubectl get nodes
|
$ kubectl get nodes
|
||||||
NAME STATUS AGE VERSION
|
NAME STATUS AGE VERSION
|
||||||
yavin-controller-0.c.example-com.internal Ready 6m v1.9.1
|
yavin-controller-0.c.example-com.internal Ready 6m v1.9.3
|
||||||
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.1
|
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.3
|
||||||
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.1
|
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.3
|
||||||
```
|
```
|
||||||
|
|
||||||
List the pods.
|
List the pods.
|
||||||
@ -118,4 +126,4 @@ Typhoon is not a product, trial, or free-tier. It is not run by a company, does
|
|||||||
|
|
||||||
Typhoon clusters will contain only [free](https://www.debian.org/intro/free) components. Cluster components will not collect data on users without their permission.
|
Typhoon clusters will contain only [free](https://www.debian.org/intro/free) components. Cluster components will not collect data on users without their permission.
|
||||||
|
|
||||||
*Disclosure: The author works for CoreOS and previously wrote Matchbox and original Tectonic for bare-metal and AWS. This project is not associated with CoreOS.*
|
*Disclosure: The author works for Red Hat (prev CoreOS), but Typhoon is unassociated and maintained independently.*
|
||||||
|
@ -18,7 +18,7 @@ module "google-cloud-yavin" {
|
|||||||
}
|
}
|
||||||
|
|
||||||
module "bare-metal-mercury" {
|
module "bare-metal-mercury" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.9.1"
|
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.9.3"
|
||||||
...
|
...
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
@ -127,3 +127,78 @@ Typhoon supports multi-controller clusters, so it is possible to upgrade a clust
|
|||||||
!!! warning
|
!!! warning
|
||||||
Typhoon does not support or document node replacement as an upgrade strategy. It limits Typhoon's ability to make infrastructure and architectural changes between tagged releases.
|
Typhoon does not support or document node replacement as an upgrade strategy. It limits Typhoon's ability to make infrastructure and architectural changes between tagged releases.
|
||||||
|
|
||||||
|
## Terraform v0.11.x
|
||||||
|
|
||||||
|
The upgrade from Terraform v0.10.x to v0.11.x introduced breaking changes in the provider and module inheritance relationship that you MUST be aware of when upgrading to the v0.11.x `terraform` binary. Terraform now allows multiple named (i.e. aliased) copies of a provider to exist (e.g. `aws.default`, `aws.somename`). Terraform now also requires that providers be explicitly passed to modules in order to satisfy module version constraints (which Typhoon modules define). Full details can be found in [typhoon#77](https://github.com/poseidon/typhoon/issues/77) and [hashicorp#16824](https://github.com/hashicorp/terraform/issues/16824).
|
||||||
|
|
||||||
|
In particular, after upgrading to the v0.11.x `terraform` binary, you'll notice:
|
||||||
|
|
||||||
|
* `terraform plan` does not succeed and prompts for variables when it didn't before
|
||||||
|
* `terraform plan` does not succeed and mentions "provider configuration block is required for all operations"
|
||||||
|
* `terraform apply` fails when you comment or remove a module usage in order to delete a cluster
|
||||||
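Before working through the symptoms above, it can help to confirm which `terraform` binary is actually on your `PATH`. The following is a minimal sketch, assuming the first line of `terraform version` looks like `Terraform v0.11.1` as in the tutorials; the helper name `is_terraform_0_11` is hypothetical and not part of Typhoon or Terraform.

```sh
# Hypothetical helper: succeeds only if the given version line is v0.11.x.
is_terraform_0_11() {
  case "$1" in
    "Terraform v0.11."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query the binary if it is installed.
if command -v terraform >/dev/null 2>&1; then
  first_line="$(terraform version | head -n 1)"
  if is_terraform_0_11 "$first_line"; then
    echo "ok: $first_line"
  else
    echo "unexpected version: $first_line"
  fi
fi
```

If the check reports a v0.10.x binary, the provider changes described in this section do not apply yet.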
|
|
||||||
|
### New users
|
||||||
|
|
||||||
|
New users can start with Terraform v0.11.x and follow the Typhoon docs without issue.
|
||||||
|
|
||||||
|
### Existing
|
||||||
|
|
||||||
|
Users who used modules to create clusters with Terraform v0.10.x and still manage those clusters via Terraform must explicitly add each provider used in `providers.tf`:
|
||||||
|
|
||||||
|
```
|
||||||
|
provider "local" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "null" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "template" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "tls" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Modify the `google`, `aws`, or `digitalocean` provider section to specify an explicit `alias` name.
|
||||||
|
|
||||||
|
```
|
||||||
|
provider "digitalocean" {
|
||||||
|
version = "0.1.2"
|
||||||
|
token = "${chomp(file("~/.config/digital-ocean/token"))}"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
!!! note
|
||||||
|
In these examples, we've chosen to name each provider "default", though the point of the Terraform changes is that other names are possible.
|
||||||
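To illustrate a name other than "default", here is a hedged sketch in Terraform v0.11.x syntax. The alias names `east` and `west` and the regions are assumptions chosen for illustration only, and the `local`, `null`, `template`, and `tls` providers are assumed to be declared with `alias = "default"` as shown earlier in this section.

```tf
# Two configurations of the same provider, told apart by alias.
# Alias names and regions are illustrative assumptions.
provider "aws" {
  alias  = "east"
  region = "us-east-1"
}

provider "aws" {
  alias  = "west"
  region = "us-west-2"
}

module "aws-cluster" {
  source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes"

  # Pass the "west" copy of the aws provider to this module.
  providers = {
    aws      = "aws.west"
    local    = "local.default"
    null     = "null.default"
    template = "template.default"
    tls      = "tls.default"
  }

  cluster_name = "somename"
  # remaining module variables omitted
}
```

Each module instance receives exactly the provider configurations named in its `providers` map, so a second cluster could be pointed at `aws.east` without affecting this one.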
|
|
||||||
|
Edit each instance (i.e. usage) of a module and explicitly pass the providers.
|
||||||
|
|
||||||
|
```
|
||||||
|
module "aws-cluster" {
|
||||||
|
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
aws = "aws.default"
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
|
cluster_name = "somename"
|
||||||
|
```
|
||||||
|
|
||||||
|
Re-run `terraform plan`. Plan will claim there are no changes to apply. Run `terraform apply` anyway as this will update Terraform state to be aware of the explicit provider versions.
|
||||||
|
|
||||||
|
### Verify
|
||||||
|
|
||||||
|
You should now be able to run `terraform plan` without errors. When you choose, you may comment or delete a module from Terraform configs and `terraform apply` should destroy the cluster correctly.
|
||||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
|||||||
|
|
||||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||||
|
|
||||||
* Kubernetes v1.9.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
||||||
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||||
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
||||||
module "bootkube" {
|
module "bootkube" {
|
||||||
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=b83e321b350ac549c45ed6a05ffd8683336fb9f4"
|
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=203b90169ead2380f74cc64ea1f02c109806c9bc"
|
||||||
|
|
||||||
cluster_name = "${var.cluster_name}"
|
cluster_name = "${var.cluster_name}"
|
||||||
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
|
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
|
||||||
|
@ -7,7 +7,7 @@ systemd:
|
|||||||
- name: 40-etcd-cluster.conf
|
- name: 40-etcd-cluster.conf
|
||||||
contents: |
|
contents: |
|
||||||
[Service]
|
[Service]
|
||||||
Environment="ETCD_IMAGE_TAG=v3.2.13"
|
Environment="ETCD_IMAGE_TAG=v3.2.15"
|
||||||
Environment="ETCD_NAME=${etcd_name}"
|
Environment="ETCD_NAME=${etcd_name}"
|
||||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
||||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
||||||
@ -130,7 +130,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.3
|
||||||
- path: /etc/sysctl.d/max-user-watches.conf
|
- path: /etc/sysctl.d/max-user-watches.conf
|
||||||
filesystem: root
|
filesystem: root
|
||||||
contents:
|
contents:
|
||||||
@ -151,7 +151,7 @@ storage:
|
|||||||
# Move experimental manifests
|
# Move experimental manifests
|
||||||
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
||||||
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
||||||
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.9.1}"
|
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.10.0}"
|
||||||
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
||||||
exec /usr/bin/rkt run \
|
exec /usr/bin/rkt run \
|
||||||
--trust-keys-from-https \
|
--trust-keys-from-https \
|
||||||
|
@ -30,7 +30,6 @@ resource "google_compute_firewall" "allow-apiserver" {
|
|||||||
target_tags = ["${var.cluster_name}-controller"]
|
target_tags = ["${var.cluster_name}-controller"]
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
resource "google_compute_firewall" "allow-ingress" {
|
resource "google_compute_firewall" "allow-ingress" {
|
||||||
name = "${var.cluster_name}-allow-ingress"
|
name = "${var.cluster_name}-allow-ingress"
|
||||||
network = "${google_compute_network.network.name}"
|
network = "${google_compute_network.network.name}"
|
||||||
|
@ -104,7 +104,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.3
|
||||||
- path: /etc/sysctl.d/max-user-watches.conf
|
- path: /etc/sysctl.d/max-user-watches.conf
|
||||||
filesystem: root
|
filesystem: root
|
||||||
contents:
|
contents:
|
||||||
@ -122,7 +122,7 @@ storage:
|
|||||||
--volume config,kind=host,source=/etc/kubernetes \
|
--volume config,kind=host,source=/etc/kubernetes \
|
||||||
--mount volume=config,target=/etc/kubernetes \
|
--mount volume=config,target=/etc/kubernetes \
|
||||||
--insecure-options=image \
|
--insecure-options=image \
|
||||||
docker://gcr.io/google_containers/hyperkube:v1.9.1 \
|
docker://gcr.io/google_containers/hyperkube:v1.9.3 \
|
||||||
--net=host \
|
--net=host \
|
||||||
--dns=host \
|
--dns=host \
|
||||||
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
||||||
|
@ -50,7 +50,7 @@ pages:
|
|||||||
- 'Heapster': 'addons/heapster.md'
|
- 'Heapster': 'addons/heapster.md'
|
||||||
- 'Nginx Ingress': 'addons/ingress.md'
|
- 'Nginx Ingress': 'addons/ingress.md'
|
||||||
- 'Prometheus': 'addons/prometheus.md'
|
- 'Prometheus': 'addons/prometheus.md'
|
||||||
- 'Dashboard': 'addons/dashboard.md'
|
- 'Grafana': 'addons/grafana.md'
|
||||||
- 'Topics':
|
- 'Topics':
|
||||||
- 'Maintenance': 'topics/maintenance.md'
|
- 'Maintenance': 'topics/maintenance.md'
|
||||||
- 'Hardware': 'topics/hardware.md'
|
- 'Hardware': 'topics/hardware.md'
|
||||||
|