mirror of
https://github.com/puppetmaster/typhoon.git
synced 2025-08-13 20:14:57 +02:00
Compare commits
79 Commits
Author | SHA1 | Date | |
---|---|---|---|
270d1ce357 | |||
ab87b6cea3 | |||
d621512dd6 | |||
c59a9c66b1 | |||
21f2cef12f | |||
931e311786 | |||
2592a0aad4 | |||
6c5e287c29 | |||
2a4595eeee | |||
8e7e6b9f7f | |||
35f3b1b28c | |||
9fb1e1a0e2 | |||
b61d6373c5 | |||
42708f9a70 | |||
d54709f89c | |||
0e688ef05a | |||
9dcc255f8e | |||
9307e97c46 | |||
c112ee3829 | |||
45b556c08f | |||
da6aafe816 | |||
cce4537487 | |||
73126eb7f8 | |||
160ae34e71 | |||
06d40c5b44 | |||
98985e5acd | |||
ea6bf9c9fb | |||
486fdb6968 | |||
a44cf0edbd | |||
983c7aa012 | |||
3d9683b6e8 | |||
0da7757ef4 | |||
04c6613ff3 | |||
92600efd11 | |||
66c64b4e45 | |||
13f3745093 | |||
c4914c326b | |||
461fd46986 | |||
ceb5555222 | |||
86420fd507 | |||
5c383f4184 | |||
22fa051002 | |||
c8313751d7 | |||
195d902ab6 | |||
c19a68b59b | |||
de88fa5457 | |||
d9a0183f3f | |||
7e24c67608 | |||
a37aff7f35 | |||
03d23bfde7 | |||
2c10d24113 | |||
82a616c70b | |||
2fa7dac247 | |||
a41691b222 | |||
9034203d7a | |||
d42f6d6b5d | |||
2fa1840c30 | |||
8e0b8d7e40 | |||
a0cf527ccf | |||
65321acad2 | |||
064ce83f25 | |||
c3b0cdddf3 | |||
211ec94c75 | |||
8aca5a089e | |||
3e6e4ea339 | |||
103f1e16d7 | |||
50dd3e3b82 | |||
3dc755994b | |||
ddbfb2eee1 | |||
868265988b | |||
6adffcb778 | |||
bc967ddcd0 | |||
ef18f19ec4 | |||
f5efcc1ff8 | |||
996651c605 | |||
38fa7dff1a | |||
bbe295a3f1 | |||
d8db296932 | |||
388ac08492 |
2
.github/ISSUE_TEMPLATE.md
vendored
2
.github/ISSUE_TEMPLATE.md
vendored
@ -4,7 +4,7 @@
|
|||||||
|
|
||||||
### Environment
|
### Environment
|
||||||
|
|
||||||
* Platform: bare-metal, google-cloud, digital-ocean
|
* Platform: aws, bare-metal, google-cloud, digital-ocean
|
||||||
* OS: container-linux, fedora-cloud
|
* OS: container-linux, fedora-cloud
|
||||||
* Terraform: `terraform version`
|
* Terraform: `terraform version`
|
||||||
* Plugins: Provider plugin versions
|
* Plugins: Provider plugin versions
|
||||||
|
115
CHANGES.md
115
CHANGES.md
@ -4,12 +4,125 @@ Notable changes between versions.
|
|||||||
|
|
||||||
## Latest
|
## Latest
|
||||||
|
|
||||||
|
## v1.9.4
|
||||||
|
|
||||||
|
* Kubernetes [v1.9.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v194)
|
||||||
|
* Secret, configMap, downward API, and projected volumes now read-only (breaking, [kubernetes#58720](https://github.com/kubernetes/kubernetes/pull/58720))
|
||||||
|
* Regressed `subPath` volume mounts (regression, [kubernetes#61076](https://github.com/kubernetes/kubernetes/issues/61076))
|
||||||
|
* Mitigated `subPath` [CVE-2017-1002101](https://github.com/kubernetes/kubernetes/issues/60813)
|
||||||
|
* Introduce [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) for AWS and Google Cloud for joining heterogeneous workers to existing clusters.
|
||||||
|
* Use new Network Load Balancers and cross zone load balancing on AWS
|
||||||
|
* Allow flexvolume plugins to be used on any Typhoon cluster (not just bare-metal)
|
||||||
|
* Upgrade etcd from v3.2.15 to v3.3.2
|
||||||
|
* Update Calico from v3.0.2 to v3.0.3
|
||||||
|
* Use kubernetes-incubator/bootkube v0.10.0
|
||||||
|
* [Recommend](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action recommended)
|
||||||
|
|
||||||
|
#### AWS
|
||||||
|
|
||||||
|
* Promote AWS platform to stable
|
||||||
|
* Allow groups of workers to be defined and joined to a cluster (i.e. worker pools) ([#150](https://github.com/poseidon/typhoon/pull/150))
|
||||||
|
* Replace the apiserver elastic load balancer with a network load balancer ([#136](https://github.com/poseidon/typhoon/pull/136))
|
||||||
|
* Replace the Ingress elastic load balancer with a network load balancer ([#141](https://github.com/poseidon/typhoon/pull/141))
|
||||||
|
* AWS [NLBs](https://aws.amazon.com/blogs/aws/new-network-load-balancer-effortless-scaling-to-millions-of-requests-per-second/) can handle millions of RPS with high throughput and low latency.
|
||||||
|
* Require `terraform-provider-aws` 1.7.0 or higher
|
||||||
|
* Enable NLB [cross-zone](https://aws.amazon.com/about-aws/whats-new/2018/02/network-load-balancer-now-supports-cross-zone-load-balancing/) load balancing ([#159](https://github.com/poseidon/typhoon/pull/159))
|
||||||
|
* Requests are automatically evenly distributed to targets regardless of AZ
|
||||||
|
* Require `terraform-provider-aws` 1.11.0 or higher
|
||||||
|
* Add kubelet `--volume-plugin-dir` flag to allow flexvolume plugins ([#142](https://github.com/poseidon/typhoon/pull/142))
|
||||||
|
* Fix controller and worker launch configs to ignore AMI changes ([#126](https://github.com/poseidon/typhoon/pull/126), [#158](https://github.com/poseidon/typhoon/pull/158))
|
||||||
|
|
||||||
|
#### Digital Ocean
|
||||||
|
|
||||||
|
* Add kubelet `--volume-plugin-dir` flag to allow flexvolume plugins ([#142](https://github.com/poseidon/typhoon/pull/142))
|
||||||
|
* Fix to pass `ssh_fingerprints` as a list to droplets ([#143](https://github.com/poseidon/typhoon/pull/143))
|
||||||
|
|
||||||
|
#### Google Cloud
|
||||||
|
|
||||||
|
* Allow groups of workers to be defined and joined to a cluster (i.e. worker pools) ([#148](https://github.com/poseidon/typhoon/pull/148))
|
||||||
|
* Add kubelet `--volume-plugin-dir` flag to allow flexvolume plugins ([#142](https://github.com/poseidon/typhoon/pull/142))
|
||||||
|
* Add `kubeconfig` variable to `controllers` and `workers` submodules ([#147](https://github.com/poseidon/typhoon/pull/147))
|
||||||
|
* Remove `kubeconfig_*` variables from `controllers` and `workers` submodules ([#147](https://github.com/poseidon/typhoon/pull/147))
|
||||||
|
* Allow initial experimentation with accelerators (i.e. GPUs) on workers ([#161](https://github.com/poseidon/typhoon/pull/161)) (unofficial)
|
||||||
|
* Require `terraform-provider-google` v1.6.0
|
||||||
|
|
||||||
|
#### Addons
|
||||||
|
|
||||||
|
* Update Prometheus from 2.1.0 to 2.2.0 ([#153](https://github.com/poseidon/typhoon/pull/153))
|
||||||
|
* Scrape Prometheus itself to enable alerts about Prometheus itself
|
||||||
|
* Adjust KubeletDown rule to fire when 10% of kubelets are down
|
||||||
|
* Update heapster from v1.5.0 to v1.5.1 ([#131](https://github.com/poseidon/typhoon/pull/131))
|
||||||
|
* Use separate service account
|
||||||
|
* Update nginx-ingress from 0.10.2 to 0.11.0
|
||||||
|
|
||||||
|
## v1.9.3
|
||||||
|
|
||||||
|
* Kubernetes [v1.9.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v193)
|
||||||
|
* Network improvements and fixes ([#104](https://github.com/poseidon/typhoon/pull/104))
|
||||||
|
* Switch from Calico v2.6.6 to v3.0.2
|
||||||
|
* Add Calico GlobalNetworkSet CRD
|
||||||
|
* Update flannel from v0.9.0 to v0.10.0
|
||||||
|
* Use separate service account for flannel
|
||||||
|
* Update etcd from v3.2.14 to v3.2.15
|
||||||
|
|
||||||
|
#### Digital Ocean
|
||||||
|
|
||||||
|
* Use new Droplet [types](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) which offer more CPU/memory, at lower cost. ([#105](https://github.com/poseidon/typhoon/pull/105))
|
||||||
|
* A small Digital Ocean cluster costs less than $25 a month!
|
||||||
|
|
||||||
|
#### Addons
|
||||||
|
|
||||||
|
* Update Prometheus from v2.0.0 to v2.1.0 ([#113](https://github.com/poseidon/typhoon/pull/113))
|
||||||
|
* Improve alerting rules
|
||||||
|
* Relabel discovered kubelet, endpoint, service, and apiserver scrapes
|
||||||
|
* Use separate service accounts
|
||||||
|
* Update node-exporter and kube-state-metrics
|
||||||
|
* Include Grafana dashboards for Kubernetes admins ([#113](https://github.com/poseidon/typhoon/pull/113))
|
||||||
|
* Add grafana-watcher to load bundled upstream dashboards
|
||||||
|
* Update nginx-ingress from 0.9.0 to 0.10.2
|
||||||
|
* Update CLUO from v0.5.0 to v0.6.0
|
||||||
|
* Switch manifests to use `apps/v1` Deployments and Daemonsets ([#120](https://github.com/poseidon/typhoon/pull/120))
|
||||||
|
* Remove Kubernetes Dashboard manifests ([#121](https://github.com/poseidon/typhoon/pull/121))
|
||||||
|
|
||||||
|
## v1.9.2
|
||||||
|
|
||||||
|
* Kubernetes [v1.9.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v192)
|
||||||
|
* Add Terraform v0.11.x support
|
||||||
|
* Add explicit "providers" section to modules for Terraform v0.11.x
|
||||||
|
* Retain support for Terraform v0.10.4+
|
||||||
|
* Add [migration guide](https://typhoon.psdn.io/topics/maintenance/#terraform-v011x) from Terraform v0.10.x to v0.11.x (**action required!**)
|
||||||
|
* Update etcd from 3.2.13 to 3.2.14
|
||||||
|
* Update calico from 2.6.5 to 2.6.6
|
||||||
|
* Update kube-dns from v1.14.7 to v1.14.8
|
||||||
|
* Use separate service account for kube-dns
|
||||||
|
* Use kubernetes-incubator/bootkube v0.10.0
|
||||||
|
|
||||||
|
#### Bare-Metal
|
||||||
|
|
||||||
|
* Use per-node Container Linux install profiles ([#97](https://github.com/poseidon/typhoon/pull/97))
|
||||||
|
* Allow Container Linux channel/version to be chosen per-cluster
|
||||||
|
* Fix issue where cluster deletion could require `terraform apply` multiple times
|
||||||
|
|
||||||
|
#### Digital Ocean
|
||||||
|
|
||||||
|
* Relax `digitalocean` provider version constraint
|
||||||
|
* Fix bug with `terraform plan` always showing a firewall diff to be applied ([#3](https://github.com/poseidon/typhoon/issues/3))
|
||||||
|
|
||||||
|
#### Addons
|
||||||
|
|
||||||
|
* Update CLUO to v0.5.0 to fix compatibility with Kubernetes 1.9 (**important**)
|
||||||
|
* Earlier versions can't roll out Container Linux updates on Kubernetes 1.9 nodes ([cluo#163](https://github.com/coreos/container-linux-update-operator/issues/163))
|
||||||
|
* Update kube-state-metrics from v1.1.0 to v1.2.0
|
||||||
|
* Fix RBAC cluster role for kube-state-metrics
|
||||||
|
|
||||||
|
## v1.9.1
|
||||||
|
|
||||||
* Kubernetes [v1.9.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v191)
|
* Kubernetes [v1.9.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v191)
|
||||||
* Update kube-dns from 1.14.5 to v1.14.7
|
* Update kube-dns from 1.14.5 to v1.14.7
|
||||||
* Update etcd from 3.2.0 to 3.2.13
|
* Update etcd from 3.2.0 to 3.2.13
|
||||||
* Update Calico from v2.6.4 to v2.6.5
|
* Update Calico from v2.6.4 to v2.6.5
|
||||||
* Enable portmap to fix hostPort with Calico
|
* Enable portmap to fix hostPort with Calico
|
||||||
* Service account for controller-manager
|
* Use separate service account for controller-manager
|
||||||
|
|
||||||
## v1.8.6
|
## v1.8.6
|
||||||
|
|
||||||
|
33
README.md
33
README.md
@ -11,10 +11,11 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
|||||||
|
|
||||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||||
|
|
||||||
* Kubernetes v1.9.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
* Kubernetes v1.9.4 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
||||||
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||||
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
|
||||||
|
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||||
|
|
||||||
## Modules
|
## Modules
|
||||||
|
|
||||||
@ -22,7 +23,7 @@ Typhoon provides a Terraform Module for each supported operating system and plat
|
|||||||
|
|
||||||
| Platform | Operating System | Terraform Module | Status |
|
| Platform | Operating System | Terraform Module | Status |
|
||||||
|---------------|------------------|------------------|--------|
|
|---------------|------------------|------------------|--------|
|
||||||
| AWS | Container Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | beta |
|
| AWS | Container Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable |
|
||||||
| Bare-Metal | Container Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable |
|
| Bare-Metal | Container Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable |
|
||||||
| Digital Ocean | Container Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | beta |
|
| Digital Ocean | Container Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | beta |
|
||||||
| Google Cloud | Container Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | beta |
|
| Google Cloud | Container Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | beta |
|
||||||
@ -43,13 +44,21 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
module "google-cloud-yavin" {
|
module "google-cloud-yavin" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes"
|
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.9.4"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
google = "google.default"
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
# Google Cloud
|
# Google Cloud
|
||||||
region = "us-central1"
|
region = "us-central1"
|
||||||
dns_zone = "example.com"
|
dns_zone = "example.com"
|
||||||
dns_zone_name = "example-zone"
|
dns_zone_name = "example-zone"
|
||||||
os_image = "coreos-stable-1576-5-0-v20180105"
|
os_image = "coreos-stable"
|
||||||
|
|
||||||
cluster_name = "yavin"
|
cluster_name = "yavin"
|
||||||
controller_count = 1
|
controller_count = 1
|
||||||
@ -78,9 +87,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
|
|||||||
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
|
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
|
||||||
$ kubectl get nodes
|
$ kubectl get nodes
|
||||||
NAME STATUS AGE VERSION
|
NAME STATUS AGE VERSION
|
||||||
yavin-controller-0.c.example-com.internal Ready 6m v1.9.1
|
yavin-controller-0.c.example-com.internal Ready 6m v1.9.4
|
||||||
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.1
|
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.4
|
||||||
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.1
|
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.4
|
||||||
```
|
```
|
||||||
|
|
||||||
List the pods.
|
List the pods.
|
||||||
@ -115,11 +124,11 @@ Typhoon is strict about minimalism, maturity, and scope. These are not in scope:
|
|||||||
|
|
||||||
Ask questions on the IRC #typhoon channel on [freenode.net](http://freenode.net/).
|
Ask questions on the IRC #typhoon channel on [freenode.net](http://freenode.net/).
|
||||||
|
|
||||||
## Background
|
## Motivation
|
||||||
|
|
||||||
Typhoon powers the author's cloud and colocation clusters. The project has evolved through operational experience and Kubernetes changes. Typhoon is shared under a free license to allow others to use the work freely and contribute to its upkeep.
|
Typhoon powers the author's cloud and colocation clusters. The project has evolved through operational experience and Kubernetes changes. Typhoon is shared under a free license to allow others to use the work freely and contribute to its upkeep.
|
||||||
|
|
||||||
Typhoon addresses real world needs, which you may share. It is honest about limitations or areas that aren't mature yet. It avoids buzzword bingo and hype. It does not aim to be the one-solution-fits-all distro. An ecosystem of free (or enterprise) Kubernetes distros is healthy.
|
Typhoon addresses real world needs, which you may share. It is honest about limitations or areas that aren't mature yet. It avoids buzzword bingo and hype. It does not aim to be the one-solution-fits-all distro. An ecosystem of Kubernetes distributions is healthy.
|
||||||
|
|
||||||
## Social Contract
|
## Social Contract
|
||||||
|
|
||||||
@ -127,4 +136,6 @@ Typhoon is not a product, trial, or free-tier. It is not run by a company, does
|
|||||||
|
|
||||||
Typhoon clusters will contain only [free](https://www.debian.org/intro/free) components. Cluster components will not collect data on users without their permission.
|
Typhoon clusters will contain only [free](https://www.debian.org/intro/free) components. Cluster components will not collect data on users without their permission.
|
||||||
|
|
||||||
*Disclosure: The author works for CoreOS and previously wrote Matchbox and original Tectonic for bare-metal and AWS. This project is not associated with CoreOS.*
|
## Donations
|
||||||
|
|
||||||
|
Typhoon does not accept money donations. Instead, we encourage you to donate to one of [these organizations](https://github.com/poseidon/typhoon/wiki/Donations) to show your appreciation.
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRoleBinding
|
kind: ClusterRoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: reboot-coordinator
|
name: reboot-coordinator
|
||||||
roleRef:
|
roleRef:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRole
|
kind: ClusterRole
|
||||||
metadata:
|
metadata:
|
||||||
name: reboot-coordinator
|
name: reboot-coordinator
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: DaemonSet
|
kind: DaemonSet
|
||||||
metadata:
|
metadata:
|
||||||
name: container-linux-update-agent
|
name: container-linux-update-agent
|
||||||
@ -8,6 +8,9 @@ spec:
|
|||||||
type: RollingUpdate
|
type: RollingUpdate
|
||||||
rollingUpdate:
|
rollingUpdate:
|
||||||
maxUnavailable: 1
|
maxUnavailable: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
app: container-linux-update-agent
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
@ -15,7 +18,7 @@ spec:
|
|||||||
spec:
|
spec:
|
||||||
containers:
|
containers:
|
||||||
- name: update-agent
|
- name: update-agent
|
||||||
image: quay.io/coreos/container-linux-update-operator:v0.4.1
|
image: quay.io/coreos/container-linux-update-operator:v0.6.0
|
||||||
command:
|
command:
|
||||||
- "/bin/update-agent"
|
- "/bin/update-agent"
|
||||||
volumeMounts:
|
volumeMounts:
|
||||||
|
@ -1,10 +1,13 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: container-linux-update-operator
|
name: container-linux-update-operator
|
||||||
namespace: reboot-coordinator
|
namespace: reboot-coordinator
|
||||||
spec:
|
spec:
|
||||||
replicas: 1
|
replicas: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
app: container-linux-update-operator
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
@ -12,7 +15,7 @@ spec:
|
|||||||
spec:
|
spec:
|
||||||
containers:
|
containers:
|
||||||
- name: update-operator
|
- name: update-operator
|
||||||
image: quay.io/coreos/container-linux-update-operator:v0.4.1
|
image: quay.io/coreos/container-linux-update-operator:v0.6.0
|
||||||
command:
|
command:
|
||||||
- "/bin/update-operator"
|
- "/bin/update-operator"
|
||||||
env:
|
env:
|
||||||
|
@ -1,32 +0,0 @@
|
|||||||
apiVersion: extensions/v1beta1
|
|
||||||
kind: Deployment
|
|
||||||
metadata:
|
|
||||||
name: kubernetes-dashboard
|
|
||||||
namespace: kube-system
|
|
||||||
spec:
|
|
||||||
replicas: 1
|
|
||||||
template:
|
|
||||||
metadata:
|
|
||||||
labels:
|
|
||||||
name: kubernetes-dashboard
|
|
||||||
phase: prod
|
|
||||||
spec:
|
|
||||||
containers:
|
|
||||||
- name: kubernetes-dashboard
|
|
||||||
image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.1
|
|
||||||
ports:
|
|
||||||
- name: http
|
|
||||||
containerPort: 9090
|
|
||||||
resources:
|
|
||||||
limits:
|
|
||||||
cpu: 100m
|
|
||||||
memory: 300Mi
|
|
||||||
requests:
|
|
||||||
cpu: 100m
|
|
||||||
memory: 100Mi
|
|
||||||
livenessProbe:
|
|
||||||
httpGet:
|
|
||||||
path: /
|
|
||||||
port: 9090
|
|
||||||
initialDelaySeconds: 30
|
|
||||||
timeoutSeconds: 30
|
|
@ -1,15 +0,0 @@
|
|||||||
apiVersion: v1
|
|
||||||
kind: Service
|
|
||||||
metadata:
|
|
||||||
name: kubernetes-dashboard
|
|
||||||
namespace: kube-system
|
|
||||||
spec:
|
|
||||||
type: ClusterIP
|
|
||||||
selector:
|
|
||||||
name: kubernetes-dashboard
|
|
||||||
phase: prod
|
|
||||||
ports:
|
|
||||||
- name: http
|
|
||||||
protocol: TCP
|
|
||||||
port: 80
|
|
||||||
targetPort: 9090
|
|
7499
addons/grafana/config.yaml
Normal file
7499
addons/grafana/config.yaml
Normal file
@ -0,0 +1,7499 @@
|
|||||||
|
apiVersion: v1
|
||||||
|
kind: ConfigMap
|
||||||
|
metadata:
|
||||||
|
name: grafana-dashboards
|
||||||
|
namespace: monitoring
|
||||||
|
data:
|
||||||
|
deployment-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 1,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "200px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 8,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "cores",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$deployment_namespace\",pod_name=~\"$deployment_name.*\"}[3m]))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "CPU",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 9,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "GB",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "80%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(container_memory_usage_bytes{namespace=\"$deployment_namespace\",pod_name=~\"$deployment_name.*\"}) / 1024^3",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "Bps",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": false
|
||||||
|
},
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(container_network_transmit_bytes_total{namespace=\"$deployment_namespace\",pod_name=~\"$deployment_name.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{namespace=\"$deployment_namespace\",pod_name=~\"$deployment_name.*\"}[3m]))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "100px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": false
|
||||||
|
},
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_spec_replicas{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"metric": "kube_deployment_spec_replicas",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Desired Replicas",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "min(kube_deployment_status_replicas_available{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Available Replicas",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_status_observed_generation{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Observed Generation",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_metadata_generation{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Metadata Generation",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "350px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 1,
|
||||||
|
"isNew": true,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_status_replicas{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "current replicas",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "min(kube_deployment_status_replicas_available{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "available",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_status_replicas_unavailable{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "unavailable",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "min(kube_deployment_status_replicas_updated{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "updated",
|
||||||
|
"refId": "D",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "max(kube_deployment_spec_replicas{deployment=\"$deployment_name\",namespace=\"$deployment_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "desired",
|
||||||
|
"refId": "E",
|
||||||
|
"step": 30
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Replicas",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": true,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "none",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": [
|
||||||
|
{
|
||||||
|
"allValue": ".*",
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": "Namespace",
|
||||||
|
"multi": false,
|
||||||
|
"name": "deployment_namespace",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_deployment_metadata_generation, namespace)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": null,
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"allValue": null,
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": "Deployment",
|
||||||
|
"multi": false,
|
||||||
|
"name": "deployment_name",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_deployment_metadata_generation{namespace=\"$deployment_namespace\"}, deployment)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "deployment",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Deployment",
|
||||||
|
"version": 1
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
etcd-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"label": "prometheus",
|
||||||
|
"description": "",
|
||||||
|
"type": "datasource",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"__requires": [
|
||||||
|
{
|
||||||
|
"type": "grafana",
|
||||||
|
"id": "grafana",
|
||||||
|
"name": "Grafana",
|
||||||
|
"version": "4.5.2"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"type": "panel",
|
||||||
|
"id": "graph",
|
||||||
|
"name": "Graph",
|
||||||
|
"version": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"type": "datasource",
|
||||||
|
"id": "prometheus",
|
||||||
|
"name": "Prometheus",
|
||||||
|
"version": "1.0.0"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"type": "panel",
|
||||||
|
"id": "singlestat",
|
||||||
|
"name": "Singlestat",
|
||||||
|
"version": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"description": "etcd sample Grafana dashboard with Prometheus",
|
||||||
|
"editable": false,
|
||||||
|
"gnetId": null,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"id": null,
|
||||||
|
"links": [],
|
||||||
|
"refresh": false,
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"cacheTimeout": null,
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 28,
|
||||||
|
"interval": null,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"nullText": null,
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"tableColumn": "",
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(etcd_server_has_leader)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"metric": "etcd_server_has_leader",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "",
|
||||||
|
"title": "Up",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "200%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 23,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 5,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(grpc_server_started_total{grpc_type=\"unary\"}[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "RPC Rate",
|
||||||
|
"metric": "grpc_server_started_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(grpc_server_handled_total{grpc_type=\"unary\",grpc_code!=\"OK\"}[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "RPC Failed Rate",
|
||||||
|
"metric": "grpc_server_handled_total",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "RPC Rate",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "ops",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 41,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 4,
|
||||||
|
"stack": true,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(grpc_server_started_total{grpc_service=\"etcdserverpb.Watch\",grpc_type=\"bidi_stream\"}) - sum(grpc_server_handled_total{grpc_service=\"etcdserverpb.Watch\",grpc_type=\"bidi_stream\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Watch Streams",
|
||||||
|
"metric": "grpc_server_handled_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(grpc_server_started_total{grpc_service=\"etcdserverpb.Lease\",grpc_type=\"bidi_stream\"}) - sum(grpc_server_handled_total{grpc_service=\"etcdserverpb.Lease\",grpc_type=\"bidi_stream\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Lease Streams",
|
||||||
|
"metric": "grpc_server_handled_total",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Active Streams",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"repeat": null,
|
||||||
|
"repeatIteration": null,
|
||||||
|
"repeatRowId": null,
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"decimals": null,
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"grid": {},
|
||||||
|
"id": 1,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 4,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "etcd_debugging_mvcc_db_total_size_in_bytes",
|
||||||
|
"format": "time_series",
|
||||||
|
"hide": false,
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} DB Size",
|
||||||
|
"metric": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "DB Size",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"grid": {},
|
||||||
|
"id": 3,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 1,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 4,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": true,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) by (instance, le))",
|
||||||
|
"format": "time_series",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} WAL fsync",
|
||||||
|
"metric": "etcd_disk_wal_fsync_duration_seconds_bucket",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) by (instance, le))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} DB fsync",
|
||||||
|
"metric": "etcd_disk_backend_commit_duration_seconds_bucket",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Disk Sync Duration",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "s",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 29,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 4,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "process_resident_memory_bytes",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Resident Memory",
|
||||||
|
"metric": "process_resident_memory_bytes",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Memory",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"repeat": null,
|
||||||
|
"repeatIteration": null,
|
||||||
|
"repeatRowId": null,
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 5,
|
||||||
|
"id": 22,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 3,
|
||||||
|
"stack": true,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "rate(etcd_network_client_grpc_received_bytes_total[5m])",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Client Traffic In",
|
||||||
|
"metric": "etcd_network_client_grpc_received_bytes_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Client Traffic In",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "Bps",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 5,
|
||||||
|
"id": 21,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 3,
|
||||||
|
"stack": true,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "rate(etcd_network_client_grpc_sent_bytes_total[5m])",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Client Traffic Out",
|
||||||
|
"metric": "etcd_network_client_grpc_sent_bytes_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Client Traffic Out",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "Bps",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 20,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 3,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(etcd_network_peer_received_bytes_total[5m])) by (instance)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Peer Traffic In",
|
||||||
|
"metric": "etcd_network_peer_received_bytes_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Peer Traffic In",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "Bps",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"decimals": null,
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"grid": {},
|
||||||
|
"id": 16,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 3,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(etcd_network_peer_sent_bytes_total[5m])) by (instance)",
|
||||||
|
"format": "time_series",
|
||||||
|
"hide": false,
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Peer Traffic Out",
|
||||||
|
"metric": "etcd_network_peer_sent_bytes_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 4
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Peer Traffic Out",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "Bps",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"repeat": null,
|
||||||
|
"repeatIteration": null,
|
||||||
|
"repeatRowId": null,
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 40,
|
||||||
|
"legend": {
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(etcd_server_proposals_failed_total[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Proposal Failure Rate",
|
||||||
|
"metric": "etcd_server_proposals_failed_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 2
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(etcd_server_proposals_pending)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Proposal Pending Total",
|
||||||
|
"metric": "etcd_server_proposals_pending",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 2
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(etcd_server_proposals_committed_total[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Proposal Commit Rate",
|
||||||
|
"metric": "etcd_server_proposals_committed_total",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 2
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(etcd_server_proposals_applied_total[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Proposal Apply Rate",
|
||||||
|
"refId": "D",
|
||||||
|
"step": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Raft Proposals",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"decimals": 0,
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 0,
|
||||||
|
"id": 19,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": false,
|
||||||
|
"total": false,
|
||||||
|
"values": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "changes(etcd_server_leader_changes_seen_total[1d])",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{instance}} Total Leader Elections Per Day",
|
||||||
|
"metric": "etcd_server_leader_changes_seen_total",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": [],
|
||||||
|
"timeFrom": null,
|
||||||
|
"timeShift": null,
|
||||||
|
"title": "Total Leader Elections Per Day",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"buckets": null,
|
||||||
|
"mode": "time",
|
||||||
|
"name": null,
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": null,
|
||||||
|
"logBase": 1,
|
||||||
|
"max": null,
|
||||||
|
"min": null,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"repeat": null,
|
||||||
|
"repeatIteration": null,
|
||||||
|
"repeatRowId": null,
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-15m",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"now": true,
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "etcd",
|
||||||
|
"version": 4
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
kubernetes-capacity-planning-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"gnetId": 22,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"refresh": false,
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(node_cpu{mode=\"idle\"}[2m])) * 100",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 10,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 50
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Idle CPU",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "percent",
|
||||||
|
"label": "cpu usage",
|
||||||
|
"logBase": 1,
|
||||||
|
"min": 0,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 9,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(node_load1)",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "load 1m",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(node_load5)",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "load 5m",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(node_load15)",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "load 15m",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "System Load",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "percentunit",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 4,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "node_memory_SwapFree{instance=\"172.17.0.1:9100\",job=\"prometheus\"}",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 9,
|
||||||
|
"stack": true,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(node_memory_MemTotal) - sum(node_memory_MemFree) - sum(node_memory_Buffers) - sum(node_memory_Cached)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory usage",
|
||||||
|
"metric": "memo",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 10,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(node_memory_Buffers)",
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory buffers",
|
||||||
|
"metric": "memo",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 10,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(node_memory_Cached)",
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory cached",
|
||||||
|
"metric": "memo",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 10,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(node_memory_MemFree)",
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory free",
|
||||||
|
"metric": "memo",
|
||||||
|
"refId": "D",
|
||||||
|
"step": 10,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory Usage",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"min": "0",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "((sum(node_memory_MemTotal) - sum(node_memory_MemFree) - sum(node_memory_Buffers) - sum(node_memory_Cached)) / sum(node_memory_MemTotal)) * 100",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"metric": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Memory Usage",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "246px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "read",
|
||||||
|
"yaxis": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"alias": "{instance=\"172.17.0.1:9100\"}",
|
||||||
|
"yaxis": 2
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"alias": "io time",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 9,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(node_disk_bytes_read[5m]))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "read",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(node_disk_bytes_written[5m]))",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "written",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(node_disk_io_time_ms[5m]))",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "io time",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Disk I/O",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "ms",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percentunit",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 1,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 12,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(node_filesystem_size{device!=\"rootfs\"}) - sum(node_filesystem_free{device!=\"rootfs\"})) / sum(node_filesystem_size{device!=\"rootfs\"})",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "0.75, 0.9",
|
||||||
|
"title": "Disk Space Usage",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 8,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "transmitted",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(node_network_receive_bytes{device!~\"lo\"}[5m]))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 10,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network Received",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 10,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "transmitted",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(node_network_transmit_bytes{device!~\"lo\"}[5m]))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 10,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network Transmitted",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "276px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 11,
|
||||||
|
"isNew": true,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 11,
|
||||||
|
"span": 9,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(kube_pod_info)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Current number of Pods",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 10
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum(kube_node_status_capacity_pods)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Maximum capacity of pods",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 10
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Cluster Pod Utilization",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "100 - (sum(kube_node_status_capacity_pods) - sum(kube_pod_info)) / sum(kube_node_status_capacity_pods) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Pod Utilization",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-1h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Kubernetes Capacity Planning",
|
||||||
|
"version": 4
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
kubernetes-cluster-health-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"refresh": "10s",
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "254px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 1,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(up{job=~\"apiserver|kube-scheduler|kube-controller-manager\"} == 0)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Control Plane Components Down",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "Everything UP and healthy",
|
||||||
|
"value": "null"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "",
|
||||||
|
"value": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(ALERTS{alertstate=\"firing\",alertname!=\"DeadMansSwitch\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Alerts Firing",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 3,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(ALERTS{alertstate=\"pending\",alertname!=\"DeadMansSwitch\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "3, 5",
|
||||||
|
"title": "Alerts Pending",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 4,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "count(increase(kube_pod_container_status_restarts[1h]) > 5)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Crashlooping Pods",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(kube_node_status_condition{condition=\"Ready\",status!=\"true\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Node Not Ready",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 6,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(kube_node_status_condition{condition=\"DiskPressure\",status=\"true\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Node Disk Pressure",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(kube_node_status_condition{condition=\"MemoryPressure\",status=\"true\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Node Memory Pressure",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 8,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(kube_node_spec_unschedulable)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Nodes Unschedulable",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Kubernetes Cluster Health",
|
||||||
|
"version": 9
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
kubernetes-cluster-status-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "129px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 6,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(up{job=~\"apiserver|kube-scheduler|kube-controller-manager\"} == 0)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Control Plane UP",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "UP",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "total"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 6,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(ALERTS{alertstate=\"firing\",alertname!=\"DeadMansSwitch\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "3, 5",
|
||||||
|
"title": "Alerts Firing",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": true,
|
||||||
|
"title": "Cluster Health",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "168px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 1,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"apiserver\"} == 1) / count(up{job=\"apiserver\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "API Servers UP",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"kube-controller-manager\"} == 1) / count(up{job=\"kube-controller-manager\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "Controller Managers UP",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"kube-scheduler\"} == 1) / count(up{job=\"kube-scheduler\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "Schedulers UP",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": true,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 4,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "count(increase(kube_pod_container_status_restarts{namespace=~\"kube-system|tectonic-system\"}[1h]) > 5)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "1, 3",
|
||||||
|
"title": "Crashlooping Control Plane Pods",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": true,
|
||||||
|
"title": "Control Plane Status",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "158px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 8,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(100 - (avg by (instance) (rate(node_cpu{job=\"node-exporter\",mode=\"idle\"}[5m])) * 100)) / count(node_cpu{job=\"node-exporter\",mode=\"idle\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "CPU Utilization",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "((sum(node_memory_MemTotal) - sum(node_memory_MemFree) - sum(node_memory_Buffers) - sum(node_memory_Cached)) / sum(node_memory_MemTotal)) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Memory Utilization",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 9,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(node_filesystem_size{device!=\"rootfs\"}) - sum(node_filesystem_free{device!=\"rootfs\"})) / sum(node_filesystem_size{device!=\"rootfs\"})",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Filesystem Utilization",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 10,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "100 - (sum(kube_node_status_capacity_pods) - sum(kube_pod_info)) / sum(kube_node_status_capacity_pods) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Pod Utilization",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": true,
|
||||||
|
"title": "Capacity Planning",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Kubernetes Cluster Status",
|
||||||
|
"version": 3
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
kubernetes-control-plane-status-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 1,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"apiserver\"} == 1) / sum(up{job=\"apiserver\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "API Servers UP",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"kube-controller-manager\"} == 1) / sum(up{job=\"kube-controller-manager\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "Controller Managers UP",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 3,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(up{job=\"kube-scheduler\"} == 1) / sum(up{job=\"kube-scheduler\"})) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "50, 80",
|
||||||
|
"title": "Schedulers UP",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 4,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(sum by(instance) (rate(apiserver_request_count{code=~\"5..\"}[5m])) / sum by(instance) (rate(apiserver_request_count[5m]))) * 100",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "5, 10",
|
||||||
|
"title": "API Server Request Error Rate",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "0",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 7,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 1,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "null",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum by(verb) (rate(apiserver_latency_seconds:quantile[5m]) >= 0)",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 30
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "API Server Request Latency",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 5,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 1,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "null",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "cluster:scheduler_e2e_scheduling_latency_seconds:quantile",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "End to End Scheduling Latency",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "dtdurations",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 1,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "null",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum by(instance) (rate(apiserver_request_count{code!~\"2..\"}[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Error Rate",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum by(instance) (rate(apiserver_request_count[5m]))",
|
||||||
|
"format": "time_series",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Request Rate",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 60
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "API Server Request Rates",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Kubernetes Control Plane Status",
|
||||||
|
"version": 3
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
kubernetes-resource-requests-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"refresh": false,
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "300px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"description": "This represents the total [CPU resource requests](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu) in the cluster.\nFor comparison the total [allocatable CPU cores](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) is also shown.",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 1,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 1,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "null",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 9,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "min(sum(kube_node_status_allocatable_cpu_cores) by (instance))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Allocatable CPU Cores",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "max(sum(kube_pod_container_resource_requests_cpu_cores) by (instance))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Requested CPU Cores",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "CPU Cores",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": "CPU Cores",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(sum(kube_pod_container_resource_requests_cpu_cores) by (instance)) / min(sum(kube_node_status_allocatable_cpu_cores) by (instance)) * 100",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 240
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "CPU Cores",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "CPU Cores",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "300px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"description": "This represents the total [memory resource requests](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-memory) in the cluster.\nFor comparison the total [allocatable memory](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) is also shown.",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 1,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "null",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 9,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "min(sum(kube_node_status_allocatable_memory_bytes) by (instance))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Allocatable Memory",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "max(sum(kube_pod_container_resource_requests_memory_bytes) by (instance))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Requested Memory",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"label": "Memory",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 4,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(sum(kube_pod_container_resource_requests_memory_bytes) by (instance)) / min(sum(kube_node_status_allocatable_memory_bytes) by (instance)) * 100",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 240
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Memory",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Memory",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-3h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Kubernetes Resource Requests",
|
||||||
|
"version": 2
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
nodes-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"description": "Dashboard to get an overview of one server",
|
||||||
|
"editable": false,
|
||||||
|
"gnetId": 22,
|
||||||
|
"graphTooltip": 0,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"refresh": false,
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "100 - (avg by (cpu) (irate(node_cpu{mode=\"idle\", instance=\"$server\"}[5m])) * 100)",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 10,
|
||||||
|
"legendFormat": "{{cpu}}",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 50
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Idle CPU",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "percent",
|
||||||
|
"label": "cpu usage",
|
||||||
|
"logBase": 1,
|
||||||
|
"max": 100,
|
||||||
|
"min": 0,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 9,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "node_load1{instance=\"$server\"}",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "load 1m",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "node_load5{instance=\"$server\"}",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "load 5m",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "node_load15{instance=\"$server\"}",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "load 15m",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "System Load",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "percentunit",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 4,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "node_memory_SwapFree{instance=\"172.17.0.1:9100\",job=\"prometheus\"}",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 9,
|
||||||
|
"stack": true,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "node_memory_MemTotal{instance=\"$server\"} - node_memory_MemFree{instance=\"$server\"} - node_memory_Buffers{instance=\"$server\"} - node_memory_Cached{instance=\"$server\"}",
|
||||||
|
"hide": false,
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory used",
|
||||||
|
"metric": "",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 10
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "node_memory_Buffers{instance=\"$server\"}",
|
||||||
|
"interval": "",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory buffers",
|
||||||
|
"metric": "",
|
||||||
|
"refId": "E",
|
||||||
|
"step": 10
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "node_memory_Cached{instance=\"$server\"}",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory cached",
|
||||||
|
"metric": "",
|
||||||
|
"refId": "F",
|
||||||
|
"step": 10
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "node_memory_MemFree{instance=\"$server\"}",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "memory free",
|
||||||
|
"metric": "",
|
||||||
|
"refId": "D",
|
||||||
|
"step": 10
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory Usage",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "individual"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"min": "0",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percent",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "((node_memory_MemTotal{instance=\"$server\"} - node_memory_MemFree{instance=\"$server\"} - node_memory_Buffers{instance=\"$server\"} - node_memory_Cached{instance=\"$server\"}) / node_memory_MemTotal{instance=\"$server\"}) * 100",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "80, 90",
|
||||||
|
"title": "Memory Usage",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"isNew": true,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "read",
|
||||||
|
"yaxis": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"alias": "{instance=\"172.17.0.1:9100\"}",
|
||||||
|
"yaxis": 2
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"alias": "io time",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 9,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum by (instance) (rate(node_disk_bytes_read{instance=\"$server\"}[2m]))",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "read",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 20,
|
||||||
|
"target": ""
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum by (instance) (rate(node_disk_bytes_written{instance=\"$server\"}[2m]))",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "written",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "sum by (instance) (rate(node_disk_io_time_ms{instance=\"$server\"}[2m]))",
|
||||||
|
"intervalFactor": 4,
|
||||||
|
"legendFormat": "io time",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Disk I/O",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "ms",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(50, 172, 45, 0.97)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(245, 54, 54, 0.9)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "percentunit",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 1,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": true,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"hideTimeOverride": false,
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "(sum(node_filesystem_size{device!=\"rootfs\",instance=\"$server\"}) - sum(node_filesystem_free{device!=\"rootfs\",instance=\"$server\"})) / sum(node_filesystem_size{device!=\"rootfs\",instance=\"$server\"})",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 60,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"thresholds": "0.75, 0.9",
|
||||||
|
"title": "Disk Space Usage",
|
||||||
|
"transparent": false,
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "current"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 8,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "transmitted",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "rate(node_network_receive_bytes{instance=\"$server\",device!~\"lo\"}[5m])",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{device}}",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 10,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network Received",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 10,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [
|
||||||
|
{
|
||||||
|
"alias": "transmitted",
|
||||||
|
"yaxis": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 6,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "rate(node_network_transmit_bytes{instance=\"$server\",device!~\"lo\"}[5m])",
|
||||||
|
"hide": false,
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{device}}",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 10,
|
||||||
|
"target": ""
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network Transmitted",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": false,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": [
|
||||||
|
{
|
||||||
|
"allValue": null,
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": null,
|
||||||
|
"multi": false,
|
||||||
|
"name": "server",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(node_boot_time, instance)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-1h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Nodes",
|
||||||
|
"version": 2
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
pods-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 1,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"refresh": false,
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 1,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": true,
|
||||||
|
"avg": true,
|
||||||
|
"current": true,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": true,
|
||||||
|
"show": true,
|
||||||
|
"total": false,
|
||||||
|
"values": true
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum by(container_name) (container_memory_usage_bytes{pod_name=\"$pod\", container_name=~\"$container\", container_name!=\"POD\"})",
|
||||||
|
"interval": "10s",
|
||||||
|
"intervalFactor": 1,
|
||||||
|
"legendFormat": "Current: {{ container_name }}",
|
||||||
|
"metric": "container_memory_usage_bytes",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 15
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "kube_pod_container_resource_requests_memory_bytes{pod=\"$pod\", container=~\"$container\"}",
|
||||||
|
"interval": "10s",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Requested: {{ container }}",
|
||||||
|
"metric": "kube_pod_container_resource_requests_memory_bytes",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "kube_pod_container_resource_limits_memory_bytes{pod=\"$pod\", container=~\"$container\"}",
|
||||||
|
"interval": "10s",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Limit: {{ container }}",
|
||||||
|
"metric": "kube_pod_container_resource_limits_memory_bytes",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory Usage",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": true,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 2,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": true,
|
||||||
|
"avg": true,
|
||||||
|
"current": true,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": true,
|
||||||
|
"show": true,
|
||||||
|
"total": false,
|
||||||
|
"values": true
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum by (container_name)(rate(container_cpu_usage_seconds_total{image!=\"\",container_name!=\"POD\",pod_name=\"$pod\"}[1m]))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{ container_name }}",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "kube_pod_container_resource_requests_cpu_cores{pod=\"$pod\", container=~\"$container\"}",
|
||||||
|
"interval": "10s",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Requested: {{ container }}",
|
||||||
|
"metric": "kube_pod_container_resource_requests_cpu_cores",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 20
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "kube_pod_container_resource_limits_cpu_cores{pod=\"$pod\", container=~\"$container\"}",
|
||||||
|
"interval": "10s",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "Limit: {{ container }}",
|
||||||
|
"metric": "kube_pod_container_resource_limits_memory_bytes",
|
||||||
|
"refId": "C",
|
||||||
|
"step": 20
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "CPU Usage",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": true,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "250px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"isNew": false,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": true,
|
||||||
|
"avg": true,
|
||||||
|
"current": true,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": true,
|
||||||
|
"show": true,
|
||||||
|
"total": false,
|
||||||
|
"values": true
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sort_desc(sum by (pod_name) (rate(container_network_receive_bytes_total{pod_name=\"$pod\"}[1m])))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "{{ pod_name }}",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 30
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network I/O",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": true,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "bytes",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "New Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": [
|
||||||
|
{
|
||||||
|
"allValue": ".*",
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": true,
|
||||||
|
"label": "Namespace",
|
||||||
|
"multi": false,
|
||||||
|
"name": "namespace",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_pod_info, namespace)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"allValue": null,
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": "Pod",
|
||||||
|
"multi": false,
|
||||||
|
"name": "pod",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_pod_info{namespace=~\"$namespace\"}, pod)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"allValue": ".*",
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": true,
|
||||||
|
"label": "Container",
|
||||||
|
"multi": false,
|
||||||
|
"name": "container",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\"}, container)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "Pods",
|
||||||
|
"version": 1
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
statefulset-dashboard.json: |+
|
||||||
|
{
|
||||||
|
"dashboard":
|
||||||
|
{
|
||||||
|
"__inputs": [
|
||||||
|
{
|
||||||
|
"description": "",
|
||||||
|
"label": "prometheus",
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"pluginName": "Prometheus",
|
||||||
|
"type": "datasource"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"annotations": {
|
||||||
|
"list": []
|
||||||
|
},
|
||||||
|
"editable": false,
|
||||||
|
"graphTooltip": 1,
|
||||||
|
"hideControls": false,
|
||||||
|
"links": [],
|
||||||
|
"rows": [
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "200px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 8,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "cores",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$statefulset_namespace\",pod_name=~\"$statefulset_name.*\"}[3m]))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "CPU",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 9,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "GB",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "80%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(container_memory_usage_bytes{namespace=\"$statefulset_namespace\",pod_name=~\"$statefulset_name.*\"}) / 1024^3",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Memory",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "110%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "Bps",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": false
|
||||||
|
},
|
||||||
|
"id": 7,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfix": "",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 4,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "sum(rate(container_network_transmit_bytes_total{namespace=\"$statefulset_namespace\",pod_name=~\"$statefulset_name.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{namespace=\"$statefulset_namespace\",pod_name=~\"$statefulset_name.*\"}[3m]))",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Network",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "100px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": false
|
||||||
|
},
|
||||||
|
"id": 5,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_statefulset_replicas{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"metric": "kube_statefulset_replicas",
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Desired Replicas",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 6,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "min(kube_statefulset_status_replicas{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Available Replicas",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 3,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_statefulset_status_observed_generation{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Observed Generation",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"colorBackground": false,
|
||||||
|
"colorValue": false,
|
||||||
|
"colors": [
|
||||||
|
"rgba(245, 54, 54, 0.9)",
|
||||||
|
"rgba(237, 129, 40, 0.89)",
|
||||||
|
"rgba(50, 172, 45, 0.97)"
|
||||||
|
],
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"format": "none",
|
||||||
|
"gauge": {
|
||||||
|
"maxValue": 100,
|
||||||
|
"minValue": 0,
|
||||||
|
"show": false,
|
||||||
|
"thresholdLabels": false,
|
||||||
|
"thresholdMarkers": true
|
||||||
|
},
|
||||||
|
"id": 2,
|
||||||
|
"links": [],
|
||||||
|
"mappingType": 1,
|
||||||
|
"mappingTypes": [
|
||||||
|
{
|
||||||
|
"name": "value to text",
|
||||||
|
"value": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "range to text",
|
||||||
|
"value": 2
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"maxDataPoints": 100,
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"postfixFontSize": "50%",
|
||||||
|
"prefix": "",
|
||||||
|
"prefixFontSize": "50%",
|
||||||
|
"rangeMaps": [
|
||||||
|
{
|
||||||
|
"from": "null",
|
||||||
|
"text": "N/A",
|
||||||
|
"to": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"span": 3,
|
||||||
|
"sparkline": {
|
||||||
|
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||||
|
"full": false,
|
||||||
|
"lineColor": "rgb(31, 120, 193)",
|
||||||
|
"show": false
|
||||||
|
},
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "max(kube_statefulset_metadata_generation{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"refId": "A",
|
||||||
|
"step": 600
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Metadata Generation",
|
||||||
|
"type": "singlestat",
|
||||||
|
"valueFontSize": "80%",
|
||||||
|
"valueMaps": [
|
||||||
|
{
|
||||||
|
"op": "=",
|
||||||
|
"text": "N/A",
|
||||||
|
"value": "null"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"valueName": "avg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"collapse": false,
|
||||||
|
"editable": false,
|
||||||
|
"height": "350px",
|
||||||
|
"panels": [
|
||||||
|
{
|
||||||
|
"aliasColors": {},
|
||||||
|
"bars": false,
|
||||||
|
"dashLength": 10,
|
||||||
|
"dashes": false,
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"editable": false,
|
||||||
|
"error": false,
|
||||||
|
"fill": 1,
|
||||||
|
"grid": {
|
||||||
|
"threshold1Color": "rgba(216, 200, 27, 0.27)",
|
||||||
|
"threshold2Color": "rgba(234, 112, 112, 0.22)"
|
||||||
|
},
|
||||||
|
"id": 1,
|
||||||
|
"isNew": true,
|
||||||
|
"legend": {
|
||||||
|
"alignAsTable": false,
|
||||||
|
"avg": false,
|
||||||
|
"current": false,
|
||||||
|
"hideEmpty": false,
|
||||||
|
"hideZero": false,
|
||||||
|
"max": false,
|
||||||
|
"min": false,
|
||||||
|
"rightSide": false,
|
||||||
|
"show": true,
|
||||||
|
"total": false
|
||||||
|
},
|
||||||
|
"lines": true,
|
||||||
|
"linewidth": 2,
|
||||||
|
"links": [],
|
||||||
|
"nullPointMode": "connected",
|
||||||
|
"percentage": false,
|
||||||
|
"pointradius": 5,
|
||||||
|
"points": false,
|
||||||
|
"renderer": "flot",
|
||||||
|
"seriesOverrides": [],
|
||||||
|
"spaceLength": 10,
|
||||||
|
"span": 12,
|
||||||
|
"stack": false,
|
||||||
|
"steppedLine": false,
|
||||||
|
"targets": [
|
||||||
|
{
|
||||||
|
"expr": "min(kube_statefulset_status_replicas{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "available",
|
||||||
|
"refId": "B",
|
||||||
|
"step": 30
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"expr": "max(kube_statefulset_replicas{statefulset=\"$statefulset_name\",namespace=\"$statefulset_namespace\"}) without (instance, pod)",
|
||||||
|
"intervalFactor": 2,
|
||||||
|
"legendFormat": "desired",
|
||||||
|
"refId": "E",
|
||||||
|
"step": 30
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"title": "Replicas",
|
||||||
|
"tooltip": {
|
||||||
|
"msResolution": true,
|
||||||
|
"shared": true,
|
||||||
|
"sort": 0,
|
||||||
|
"value_type": "cumulative"
|
||||||
|
},
|
||||||
|
"type": "graph",
|
||||||
|
"xaxis": {
|
||||||
|
"mode": "time",
|
||||||
|
"show": true,
|
||||||
|
"values": []
|
||||||
|
},
|
||||||
|
"yaxes": [
|
||||||
|
{
|
||||||
|
"format": "none",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": true
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"format": "short",
|
||||||
|
"label": "",
|
||||||
|
"logBase": 1,
|
||||||
|
"show": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"showTitle": false,
|
||||||
|
"title": "Dashboard Row",
|
||||||
|
"titleSize": "h6"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"schemaVersion": 14,
|
||||||
|
"sharedCrosshair": false,
|
||||||
|
"style": "dark",
|
||||||
|
"tags": [],
|
||||||
|
"templating": {
|
||||||
|
"list": [
|
||||||
|
{
|
||||||
|
"allValue": ".*",
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": "Namespace",
|
||||||
|
"multi": false,
|
||||||
|
"name": "statefulset_namespace",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_statefulset_metadata_generation, namespace)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": null,
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"allValue": null,
|
||||||
|
"current": {},
|
||||||
|
"datasource": "${DS_PROMETHEUS}",
|
||||||
|
"hide": 0,
|
||||||
|
"includeAll": false,
|
||||||
|
"label": "StatefulSet",
|
||||||
|
"multi": false,
|
||||||
|
"name": "statefulset_name",
|
||||||
|
"options": [],
|
||||||
|
"query": "label_values(kube_statefulset_metadata_generation{namespace=\"$statefulset_namespace\"}, statefulset)",
|
||||||
|
"refresh": 1,
|
||||||
|
"regex": "",
|
||||||
|
"sort": 0,
|
||||||
|
"tagValuesQuery": "",
|
||||||
|
"tags": [],
|
||||||
|
"tagsQuery": "statefulset",
|
||||||
|
"type": "query",
|
||||||
|
"useTags": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"time": {
|
||||||
|
"from": "now-6h",
|
||||||
|
"to": "now"
|
||||||
|
},
|
||||||
|
"timepicker": {
|
||||||
|
"refresh_intervals": [
|
||||||
|
"5s",
|
||||||
|
"10s",
|
||||||
|
"30s",
|
||||||
|
"1m",
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"30m",
|
||||||
|
"1h",
|
||||||
|
"2h",
|
||||||
|
"1d"
|
||||||
|
],
|
||||||
|
"time_options": [
|
||||||
|
"5m",
|
||||||
|
"15m",
|
||||||
|
"1h",
|
||||||
|
"6h",
|
||||||
|
"12h",
|
||||||
|
"24h",
|
||||||
|
"2d",
|
||||||
|
"7d",
|
||||||
|
"30d"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"timezone": "browser",
|
||||||
|
"title": "StatefulSet",
|
||||||
|
"version": 1
|
||||||
|
}
|
||||||
|
,
|
||||||
|
"inputs": [
|
||||||
|
{
|
||||||
|
"name": "DS_PROMETHEUS",
|
||||||
|
"pluginId": "prometheus",
|
||||||
|
"type": "datasource",
|
||||||
|
"value": "prometheus"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"overwrite": true
|
||||||
|
}
|
||||||
|
prometheus-datasource.json: |+
|
||||||
|
{
|
||||||
|
"access": "proxy",
|
||||||
|
"basicAuth": false,
|
||||||
|
"name": "prometheus",
|
||||||
|
"type": "prometheus",
|
||||||
|
"url": "http://prometheus.monitoring.svc"
|
||||||
|
}
|
||||||
|
---
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: apps/v1beta2
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: grafana
|
name: grafana
|
||||||
@ -41,6 +41,22 @@ spec:
|
|||||||
limits:
|
limits:
|
||||||
memory: 200Mi
|
memory: 200Mi
|
||||||
cpu: 200m
|
cpu: 200m
|
||||||
|
- name: grafana-watcher
|
||||||
|
image: quay.io/coreos/grafana-watcher:v0.0.8
|
||||||
|
args:
|
||||||
|
- '--watch-dir=/etc/grafana/dashboards'
|
||||||
|
- '--grafana-url=http://localhost:8080'
|
||||||
|
resources:
|
||||||
|
requests:
|
||||||
|
memory: "16Mi"
|
||||||
|
cpu: "50m"
|
||||||
|
limits:
|
||||||
|
memory: "32Mi"
|
||||||
|
cpu: "100m"
|
||||||
|
volumeMounts:
|
||||||
|
- name: dashboards
|
||||||
|
mountPath: /etc/grafana/dashboards
|
||||||
volumes:
|
volumes:
|
||||||
- name: grafana-storage
|
- name: dashboards
|
||||||
emptyDir: {}
|
configMap:
|
||||||
|
name: grafana-dashboards
|
||||||
|
12
addons/heapster/cluster-role-binding.yaml
Normal file
12
addons/heapster/cluster-role-binding.yaml
Normal file
@ -0,0 +1,12 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
|
kind: ClusterRoleBinding
|
||||||
|
metadata:
|
||||||
|
name: heapster
|
||||||
|
roleRef:
|
||||||
|
apiGroup: rbac.authorization.k8s.io
|
||||||
|
kind: ClusterRole
|
||||||
|
name: system:heapster
|
||||||
|
subjects:
|
||||||
|
- kind: ServiceAccount
|
||||||
|
name: heapster
|
||||||
|
namespace: kube-system
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: apps/v1beta2
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: heapster
|
name: heapster
|
||||||
@ -14,12 +14,11 @@ spec:
|
|||||||
labels:
|
labels:
|
||||||
name: heapster
|
name: heapster
|
||||||
phase: prod
|
phase: prod
|
||||||
annotations:
|
|
||||||
scheduler.alpha.kubernetes.io/critical-pod: ''
|
|
||||||
spec:
|
spec:
|
||||||
|
serviceAccountName: heapster
|
||||||
containers:
|
containers:
|
||||||
- name: heapster
|
- name: heapster
|
||||||
image: gcr.io/google_containers/heapster-amd64:v1.5.0
|
image: k8s.gcr.io/heapster-amd64:v1.5.1
|
||||||
command:
|
command:
|
||||||
- /heapster
|
- /heapster
|
||||||
- --source=kubernetes.summary_api:''
|
- --source=kubernetes.summary_api:''
|
||||||
@ -31,7 +30,7 @@ spec:
|
|||||||
initialDelaySeconds: 180
|
initialDelaySeconds: 180
|
||||||
timeoutSeconds: 5
|
timeoutSeconds: 5
|
||||||
- name: heapster-nanny
|
- name: heapster-nanny
|
||||||
image: gcr.io/google_containers/addon-resizer:1.7
|
image: k8s.gcr.io/addon-resizer:1.7
|
||||||
command:
|
command:
|
||||||
- /pod_nanny
|
- /pod_nanny
|
||||||
- --cpu=80m
|
- --cpu=80m
|
||||||
|
13
addons/heapster/role-binding.yaml
Normal file
13
addons/heapster/role-binding.yaml
Normal file
@ -0,0 +1,13 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
|
kind: RoleBinding
|
||||||
|
metadata:
|
||||||
|
name: heapster
|
||||||
|
namespace: kube-system
|
||||||
|
roleRef:
|
||||||
|
apiGroup: rbac.authorization.k8s.io
|
||||||
|
kind: Role
|
||||||
|
name: system:pod-nanny
|
||||||
|
subjects:
|
||||||
|
- kind: ServiceAccount
|
||||||
|
name: heapster
|
||||||
|
namespace: kube-system
|
19
addons/heapster/role.yaml
Normal file
19
addons/heapster/role.yaml
Normal file
@ -0,0 +1,19 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
|
kind: Role
|
||||||
|
metadata:
|
||||||
|
name: system:pod-nanny
|
||||||
|
namespace: kube-system
|
||||||
|
rules:
|
||||||
|
- apiGroups:
|
||||||
|
- ""
|
||||||
|
resources:
|
||||||
|
- pods
|
||||||
|
verbs:
|
||||||
|
- get
|
||||||
|
- apiGroups:
|
||||||
|
- "extensions"
|
||||||
|
resources:
|
||||||
|
- deployments
|
||||||
|
verbs:
|
||||||
|
- get
|
||||||
|
- update
|
5
addons/heapster/service-account.yaml
Normal file
5
addons/heapster/service-account.yaml
Normal file
@ -0,0 +1,5 @@
|
|||||||
|
apiVersion: v1
|
||||||
|
kind: ServiceAccount
|
||||||
|
metadata:
|
||||||
|
name: heapster
|
||||||
|
namespace: kube-system
|
@ -1,10 +1,14 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: default-backend
|
name: default-backend
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
spec:
|
spec:
|
||||||
replicas: 1
|
replicas: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: default-backend
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: nginx-ingress-controller
|
name: nginx-ingress-controller
|
||||||
@ -8,6 +8,10 @@ spec:
|
|||||||
strategy:
|
strategy:
|
||||||
rollingUpdate:
|
rollingUpdate:
|
||||||
maxUnavailable: 1
|
maxUnavailable: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: nginx-ingress-controller
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
@ -19,7 +23,7 @@ spec:
|
|||||||
hostNetwork: true
|
hostNetwork: true
|
||||||
containers:
|
containers:
|
||||||
- name: nginx-ingress-controller
|
- name: nginx-ingress-controller
|
||||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.9.0
|
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.11.0
|
||||||
args:
|
args:
|
||||||
- /nginx-ingress-controller
|
- /nginx-ingress-controller
|
||||||
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRoleBinding
|
kind: ClusterRoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
roleRef:
|
roleRef:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRole
|
kind: ClusterRole
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: RoleBinding
|
kind: RoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: Role
|
kind: Role
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: DaemonSet
|
kind: DaemonSet
|
||||||
metadata:
|
metadata:
|
||||||
name: nginx-ingress-controller
|
name: nginx-ingress-controller
|
||||||
@ -8,6 +8,10 @@ spec:
|
|||||||
type: RollingUpdate
|
type: RollingUpdate
|
||||||
rollingUpdate:
|
rollingUpdate:
|
||||||
maxUnavailable: 1
|
maxUnavailable: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: nginx-ingress-controller
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
@ -19,7 +23,7 @@ spec:
|
|||||||
hostNetwork: true
|
hostNetwork: true
|
||||||
containers:
|
containers:
|
||||||
- name: nginx-ingress-controller
|
- name: nginx-ingress-controller
|
||||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.9.0
|
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.11.0
|
||||||
args:
|
args:
|
||||||
- /nginx-ingress-controller
|
- /nginx-ingress-controller
|
||||||
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
||||||
|
@ -1,10 +1,14 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: default-backend
|
name: default-backend
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
spec:
|
spec:
|
||||||
replicas: 1
|
replicas: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: default-backend
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRoleBinding
|
kind: ClusterRoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
roleRef:
|
roleRef:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRole
|
kind: ClusterRole
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: RoleBinding
|
kind: RoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: Role
|
kind: Role
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -1,10 +1,14 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: default-backend
|
name: default-backend
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
spec:
|
spec:
|
||||||
replicas: 1
|
replicas: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: default-backend
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: nginx-ingress-controller
|
name: nginx-ingress-controller
|
||||||
@ -8,6 +8,10 @@ spec:
|
|||||||
strategy:
|
strategy:
|
||||||
rollingUpdate:
|
rollingUpdate:
|
||||||
maxUnavailable: 1
|
maxUnavailable: 1
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
name: nginx-ingress-controller
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
@ -19,7 +23,7 @@ spec:
|
|||||||
hostNetwork: true
|
hostNetwork: true
|
||||||
containers:
|
containers:
|
||||||
- name: nginx-ingress-controller
|
- name: nginx-ingress-controller
|
||||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.9.0
|
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.11.0
|
||||||
args:
|
args:
|
||||||
- /nginx-ingress-controller
|
- /nginx-ingress-controller
|
||||||
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
- --default-backend-service=$(POD_NAMESPACE)/default-backend
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRoleBinding
|
kind: ClusterRoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
roleRef:
|
roleRef:
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRole
|
kind: ClusterRole
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: RoleBinding
|
kind: RoleBinding
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: Role
|
kind: Role
|
||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
|
||||||
metadata:
|
metadata:
|
||||||
name: ingress
|
name: ingress
|
||||||
namespace: ingress
|
namespace: ingress
|
||||||
|
@ -39,7 +39,7 @@ data:
|
|||||||
tls_config:
|
tls_config:
|
||||||
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
|
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
|
||||||
# Using endpoints to discover kube-apiserver targets finds the pod IP
|
# Using endpoints to discover kube-apiserver targets finds the pod IP
|
||||||
# (host IP since apiserver is uses host network) which is not used in
|
# (host IP since apiserver uses host network) which is not used in
|
||||||
# the server certificate.
|
# the server certificate.
|
||||||
insecure_skip_verify: true
|
insecure_skip_verify: true
|
||||||
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
|
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
|
||||||
@ -51,6 +51,9 @@ data:
|
|||||||
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
|
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
|
||||||
action: keep
|
action: keep
|
||||||
regex: default;kubernetes;https
|
regex: default;kubernetes;https
|
||||||
|
- replacement: apiserver
|
||||||
|
action: replace
|
||||||
|
target_label: job
|
||||||
|
|
||||||
# Scrape config for node (i.e. kubelet) /metrics (e.g. 'kubelet_'). Explore
|
# Scrape config for node (i.e. kubelet) /metrics (e.g. 'kubelet_'). Explore
|
||||||
# metrics from a node by scraping kubelet (127.0.0.1:10255/metrics).
|
# metrics from a node by scraping kubelet (127.0.0.1:10255/metrics).
|
||||||
@ -59,7 +62,7 @@ data:
|
|||||||
# Kubernetes apiserver. This means it will work if Prometheus is running out of
|
# Kubernetes apiserver. This means it will work if Prometheus is running out of
|
||||||
# cluster, or can't connect to nodes for some other reason (e.g. because of
|
# cluster, or can't connect to nodes for some other reason (e.g. because of
|
||||||
# firewalling).
|
# firewalling).
|
||||||
- job_name: 'kubernetes-nodes'
|
- job_name: 'kubelet'
|
||||||
kubernetes_sd_configs:
|
kubernetes_sd_configs:
|
||||||
- role: node
|
- role: node
|
||||||
|
|
||||||
@ -149,7 +152,7 @@ data:
|
|||||||
target_label: kubernetes_namespace
|
target_label: kubernetes_namespace
|
||||||
- source_labels: [__meta_kubernetes_service_name]
|
- source_labels: [__meta_kubernetes_service_name]
|
||||||
action: replace
|
action: replace
|
||||||
target_label: kubernetes_name
|
target_label: job
|
||||||
|
|
||||||
# Example scrape config for probing services via the Blackbox Exporter.
|
# Example scrape config for probing services via the Blackbox Exporter.
|
||||||
#
|
#
|
||||||
@ -181,7 +184,7 @@ data:
|
|||||||
- source_labels: [__meta_kubernetes_namespace]
|
- source_labels: [__meta_kubernetes_namespace]
|
||||||
target_label: kubernetes_namespace
|
target_label: kubernetes_namespace
|
||||||
- source_labels: [__meta_kubernetes_service_name]
|
- source_labels: [__meta_kubernetes_service_name]
|
||||||
target_label: kubernetes_name
|
target_label: job
|
||||||
|
|
||||||
# Example scrape config for pods
|
# Example scrape config for pods
|
||||||
#
|
#
|
||||||
|
@ -1,22 +1,24 @@
|
|||||||
apiVersion: extensions/v1beta1
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: prometheus
|
name: prometheus
|
||||||
namespace: monitoring
|
namespace: monitoring
|
||||||
spec:
|
spec:
|
||||||
replicas: 1
|
replicas: 1
|
||||||
strategy:
|
selector:
|
||||||
rollingUpdate:
|
matchLabels:
|
||||||
maxUnavailable: 1
|
name: prometheus
|
||||||
|
phase: prod
|
||||||
template:
|
template:
|
||||||
metadata:
|
metadata:
|
||||||
labels:
|
labels:
|
||||||
name: prometheus
|
name: prometheus
|
||||||
phase: prod
|
phase: prod
|
||||||
spec:
|
spec:
|
||||||
|
serviceAccountName: prometheus
|
||||||
containers:
|
containers:
|
||||||
- name: prometheus
|
- name: prometheus
|
||||||
image: quay.io/prometheus/prometheus:v2.0.0
|
image: quay.io/prometheus/prometheus:v2.2.0
|
||||||
args:
|
args:
|
||||||
- '--config.file=/etc/prometheus/prometheus.yaml'
|
- '--config.file=/etc/prometheus/prometheus.yaml'
|
||||||
ports:
|
ports:
|
||||||
|
@ -12,7 +12,9 @@ rules:
|
|||||||
- replicationcontrollers
|
- replicationcontrollers
|
||||||
- limitranges
|
- limitranges
|
||||||
- persistentvolumeclaims
|
- persistentvolumeclaims
|
||||||
|
- persistentvolumes
|
||||||
- namespaces
|
- namespaces
|
||||||
|
- endpoints
|
||||||
verbs: ["list", "watch"]
|
verbs: ["list", "watch"]
|
||||||
- apiGroups: ["extensions"]
|
- apiGroups: ["extensions"]
|
||||||
resources:
|
resources:
|
||||||
@ -29,3 +31,7 @@ rules:
|
|||||||
- cronjobs
|
- cronjobs
|
||||||
- jobs
|
- jobs
|
||||||
verbs: ["list", "watch"]
|
verbs: ["list", "watch"]
|
||||||
|
- apiGroups: ["autoscaling"]
|
||||||
|
resources:
|
||||||
|
- horizontalpodautoscalers
|
||||||
|
verbs: ["list", "watch"]
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: apps/v1beta2
|
apiVersion: apps/v1
|
||||||
kind: Deployment
|
kind: Deployment
|
||||||
metadata:
|
metadata:
|
||||||
name: kube-state-metrics
|
name: kube-state-metrics
|
||||||
@ -22,7 +22,7 @@ spec:
|
|||||||
serviceAccountName: kube-state-metrics
|
serviceAccountName: kube-state-metrics
|
||||||
containers:
|
containers:
|
||||||
- name: kube-state-metrics
|
- name: kube-state-metrics
|
||||||
image: quay.io/coreos/kube-state-metrics:v1.1.0
|
image: quay.io/coreos/kube-state-metrics:v1.2.0
|
||||||
ports:
|
ports:
|
||||||
- name: metrics
|
- name: metrics
|
||||||
containerPort: 8080
|
containerPort: 8080
|
||||||
@ -33,7 +33,7 @@ spec:
|
|||||||
initialDelaySeconds: 5
|
initialDelaySeconds: 5
|
||||||
timeoutSeconds: 5
|
timeoutSeconds: 5
|
||||||
- name: addon-resizer
|
- name: addon-resizer
|
||||||
image: gcr.io/google_containers/addon-resizer:1.0
|
image: gcr.io/google_containers/addon-resizer:1.7
|
||||||
resources:
|
resources:
|
||||||
limits:
|
limits:
|
||||||
cpu: 100m
|
cpu: 100m
|
||||||
|
@ -15,5 +15,5 @@ spec:
|
|||||||
ports:
|
ports:
|
||||||
- name: metrics
|
- name: metrics
|
||||||
protocol: TCP
|
protocol: TCP
|
||||||
port: 80
|
port: 8080
|
||||||
targetPort: 8080
|
targetPort: 8080
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: apps/v1beta2
|
apiVersion: apps/v1
|
||||||
kind: DaemonSet
|
kind: DaemonSet
|
||||||
metadata:
|
metadata:
|
||||||
name: node-exporter
|
name: node-exporter
|
||||||
@ -18,11 +18,15 @@ spec:
|
|||||||
name: node-exporter
|
name: node-exporter
|
||||||
phase: prod
|
phase: prod
|
||||||
spec:
|
spec:
|
||||||
|
serviceAccountName: node-exporter
|
||||||
|
securityContext:
|
||||||
|
runAsNonRoot: true
|
||||||
|
runAsUser: 65534
|
||||||
hostNetwork: true
|
hostNetwork: true
|
||||||
hostPID: true
|
hostPID: true
|
||||||
containers:
|
containers:
|
||||||
- name: node-exporter
|
- name: node-exporter
|
||||||
image: quay.io/prometheus/node-exporter:v0.15.0
|
image: quay.io/prometheus/node-exporter:v0.15.2
|
||||||
args:
|
args:
|
||||||
- "--path.procfs=/host/proc"
|
- "--path.procfs=/host/proc"
|
||||||
- "--path.sysfs=/host/sys"
|
- "--path.sysfs=/host/sys"
|
||||||
@ -45,9 +49,8 @@ spec:
|
|||||||
mountPath: /host/sys
|
mountPath: /host/sys
|
||||||
readOnly: true
|
readOnly: true
|
||||||
tolerations:
|
tolerations:
|
||||||
- key: node-role.kubernetes.io/master
|
- effect: NoSchedule
|
||||||
operator: Exists
|
operator: Exists
|
||||||
effect: NoSchedule
|
|
||||||
volumes:
|
volumes:
|
||||||
- name: proc
|
- name: proc
|
||||||
hostPath:
|
hostPath:
|
||||||
|
@ -0,0 +1,5 @@
|
|||||||
|
apiVersion: v1
|
||||||
|
kind: ServiceAccount
|
||||||
|
metadata:
|
||||||
|
name: node-exporter
|
||||||
|
namespace: monitoring
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRoleBinding
|
kind: ClusterRoleBinding
|
||||||
metadata:
|
metadata:
|
||||||
name: prometheus
|
name: prometheus
|
||||||
@ -8,5 +8,5 @@ roleRef:
|
|||||||
name: prometheus
|
name: prometheus
|
||||||
subjects:
|
subjects:
|
||||||
- kind: ServiceAccount
|
- kind: ServiceAccount
|
||||||
name: default
|
name: prometheus
|
||||||
namespace: monitoring
|
namespace: monitoring
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
kind: ClusterRole
|
kind: ClusterRole
|
||||||
metadata:
|
metadata:
|
||||||
name: prometheus
|
name: prometheus
|
||||||
|
@ -4,8 +4,7 @@ metadata:
|
|||||||
name: prometheus-rules
|
name: prometheus-rules
|
||||||
namespace: monitoring
|
namespace: monitoring
|
||||||
data:
|
data:
|
||||||
# Rules adapted from those provided by coreos/prometheus-operator and SoundCloud
|
alertmanager.rules.yaml: |
|
||||||
alertmanager.rules.yaml: |+
|
|
||||||
groups:
|
groups:
|
||||||
- name: alertmanager.rules
|
- name: alertmanager.rules
|
||||||
rules:
|
rules:
|
||||||
@ -36,7 +35,7 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: Reloading Alertmanager's configuration has failed for {{ $labels.namespace
|
description: Reloading Alertmanager's configuration has failed for {{ $labels.namespace
|
||||||
}}/{{ $labels.pod}}.
|
}}/{{ $labels.pod}}.
|
||||||
etcd3.rules.yaml: |+
|
etcd3.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: ./etcd3.rules
|
- name: ./etcd3.rules
|
||||||
rules:
|
rules:
|
||||||
@ -65,8 +64,8 @@ data:
|
|||||||
changes within the last hour
|
changes within the last hour
|
||||||
summary: a high number of leader changes within the etcd cluster are happening
|
summary: a high number of leader changes within the etcd cluster are happening
|
||||||
- alert: HighNumberOfFailedGRPCRequests
|
- alert: HighNumberOfFailedGRPCRequests
|
||||||
expr: sum(rate(etcd_grpc_requests_failed_total{job="etcd"}[5m])) BY (grpc_method)
|
expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
|
||||||
/ sum(rate(etcd_grpc_total{job="etcd"}[5m])) BY (grpc_method) > 0.01
|
/ sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.01
|
||||||
for: 10m
|
for: 10m
|
||||||
labels:
|
labels:
|
||||||
severity: warning
|
severity: warning
|
||||||
@ -75,8 +74,8 @@ data:
|
|||||||
on etcd instance {{ $labels.instance }}'
|
on etcd instance {{ $labels.instance }}'
|
||||||
summary: a high number of gRPC requests are failing
|
summary: a high number of gRPC requests are failing
|
||||||
- alert: HighNumberOfFailedGRPCRequests
|
- alert: HighNumberOfFailedGRPCRequests
|
||||||
expr: sum(rate(etcd_grpc_requests_failed_total{job="etcd"}[5m])) BY (grpc_method)
|
expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
|
||||||
/ sum(rate(etcd_grpc_total{job="etcd"}[5m])) BY (grpc_method) > 0.05
|
/ sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.05
|
||||||
for: 5m
|
for: 5m
|
||||||
labels:
|
labels:
|
||||||
severity: critical
|
severity: critical
|
||||||
@ -85,7 +84,7 @@ data:
|
|||||||
on etcd instance {{ $labels.instance }}'
|
on etcd instance {{ $labels.instance }}'
|
||||||
summary: a high number of gRPC requests are failing
|
summary: a high number of gRPC requests are failing
|
||||||
- alert: GRPCRequestsSlow
|
- alert: GRPCRequestsSlow
|
||||||
expr: histogram_quantile(0.99, rate(etcd_grpc_unary_requests_duration_seconds_bucket[5m]))
|
expr: histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job="etcd",grpc_type="unary"}[5m])) by (grpc_service, grpc_method, le))
|
||||||
> 0.15
|
> 0.15
|
||||||
for: 10m
|
for: 10m
|
||||||
labels:
|
labels:
|
||||||
@ -125,7 +124,7 @@ data:
|
|||||||
}} are slow
|
}} are slow
|
||||||
summary: slow HTTP requests
|
summary: slow HTTP requests
|
||||||
- alert: EtcdMemberCommunicationSlow
|
- alert: EtcdMemberCommunicationSlow
|
||||||
expr: histogram_quantile(0.99, rate(etcd_network_member_round_trip_time_seconds_bucket[5m]))
|
expr: histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket[5m]))
|
||||||
> 0.15
|
> 0.15
|
||||||
for: 10m
|
for: 10m
|
||||||
labels:
|
labels:
|
||||||
@ -160,7 +159,7 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: etcd instance {{ $labels.instance }} commit durations are high
|
description: etcd instance {{ $labels.instance }} commit durations are high
|
||||||
summary: high commit durations
|
summary: high commit durations
|
||||||
general.rules.yaml: |+
|
general.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: general.rules
|
- name: general.rules
|
||||||
rules:
|
rules:
|
||||||
@ -192,12 +191,12 @@ data:
|
|||||||
description: '{{ $labels.job }}: {{ $labels.namespace }}/{{ $labels.pod }} instance
|
description: '{{ $labels.job }}: {{ $labels.namespace }}/{{ $labels.pod }} instance
|
||||||
will exhaust in file/socket descriptors within the next hour'
|
will exhaust in file/socket descriptors within the next hour'
|
||||||
summary: file descriptors soon exhausted
|
summary: file descriptors soon exhausted
|
||||||
kube-controller-manager.rules.yaml: |+
|
kube-controller-manager.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: kube-controller-manager.rules
|
- name: kube-controller-manager.rules
|
||||||
rules:
|
rules:
|
||||||
- alert: K8SControllerManagerDown
|
- alert: K8SControllerManagerDown
|
||||||
expr: absent(up{kubernetes_name="kube-controller-manager"} == 1)
|
expr: absent(up{job="kube-controller-manager"} == 1)
|
||||||
for: 5m
|
for: 5m
|
||||||
labels:
|
labels:
|
||||||
severity: critical
|
severity: critical
|
||||||
@ -205,7 +204,7 @@ data:
|
|||||||
description: There is no running K8S controller manager. Deployments and replication
|
description: There is no running K8S controller manager. Deployments and replication
|
||||||
controllers are not making progress.
|
controllers are not making progress.
|
||||||
summary: Controller manager is down
|
summary: Controller manager is down
|
||||||
kube-scheduler.rules.yaml: |+
|
kube-scheduler.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: kube-scheduler.rules
|
- name: kube-scheduler.rules
|
||||||
rules:
|
rules:
|
||||||
@ -255,7 +254,7 @@ data:
|
|||||||
labels:
|
labels:
|
||||||
quantile: "0.5"
|
quantile: "0.5"
|
||||||
- alert: K8SSchedulerDown
|
- alert: K8SSchedulerDown
|
||||||
expr: absent(up{kubernetes_name="kube-scheduler"} == 1)
|
expr: absent(up{job="kube-scheduler"} == 1)
|
||||||
for: 5m
|
for: 5m
|
||||||
labels:
|
labels:
|
||||||
severity: critical
|
severity: critical
|
||||||
@ -263,7 +262,7 @@ data:
|
|||||||
description: There is no running K8S scheduler. New pods are not being assigned
|
description: There is no running K8S scheduler. New pods are not being assigned
|
||||||
to nodes.
|
to nodes.
|
||||||
summary: Scheduler is down
|
summary: Scheduler is down
|
||||||
kube-state-metrics.rules.yaml: |+
|
kube-state-metrics.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: kube-state-metrics.rules
|
- name: kube-state-metrics.rules
|
||||||
rules:
|
rules:
|
||||||
@ -274,7 +273,8 @@ data:
|
|||||||
severity: warning
|
severity: warning
|
||||||
annotations:
|
annotations:
|
||||||
description: Observed deployment generation does not match expected one for
|
description: Observed deployment generation does not match expected one for
|
||||||
deployment {{$labels.namespaces}}{{$labels.deployment}}
|
deployment {{$labels.namespaces}}/{{$labels.deployment}}
|
||||||
|
summary: Deployment is outdated
|
||||||
- alert: DeploymentReplicasNotUpdated
|
- alert: DeploymentReplicasNotUpdated
|
||||||
expr: ((kube_deployment_status_replicas_updated != kube_deployment_spec_replicas)
|
expr: ((kube_deployment_status_replicas_updated != kube_deployment_spec_replicas)
|
||||||
or (kube_deployment_status_replicas_available != kube_deployment_spec_replicas))
|
or (kube_deployment_status_replicas_available != kube_deployment_spec_replicas))
|
||||||
@ -284,8 +284,9 @@ data:
|
|||||||
severity: warning
|
severity: warning
|
||||||
annotations:
|
annotations:
|
||||||
description: Replicas are not updated and available for deployment {{$labels.namespaces}}/{{$labels.deployment}}
|
description: Replicas are not updated and available for deployment {{$labels.namespaces}}/{{$labels.deployment}}
|
||||||
|
summary: Deployment replicas are outdated
|
||||||
- alert: DaemonSetRolloutStuck
|
- alert: DaemonSetRolloutStuck
|
||||||
expr: kube_daemonset_status_current_number_ready / kube_daemonset_status_desired_number_scheduled
|
expr: kube_daemonset_status_number_ready / kube_daemonset_status_desired_number_scheduled
|
||||||
* 100 < 100
|
* 100 < 100
|
||||||
for: 15m
|
for: 15m
|
||||||
labels:
|
labels:
|
||||||
@ -293,6 +294,7 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: Only {{$value}}% of desired pods scheduled and ready for daemon
|
description: Only {{$value}}% of desired pods scheduled and ready for daemon
|
||||||
set {{$labels.namespaces}}/{{$labels.daemonset}}
|
set {{$labels.namespaces}}/{{$labels.daemonset}}
|
||||||
|
summary: DaemonSet is missing pods
|
||||||
- alert: K8SDaemonSetsNotScheduled
|
- alert: K8SDaemonSetsNotScheduled
|
||||||
expr: kube_daemonset_status_desired_number_scheduled - kube_daemonset_status_current_number_scheduled
|
expr: kube_daemonset_status_desired_number_scheduled - kube_daemonset_status_current_number_scheduled
|
||||||
> 0
|
> 0
|
||||||
@ -312,14 +314,15 @@ data:
|
|||||||
to run.
|
to run.
|
||||||
summary: Daemonsets are not scheduled correctly
|
summary: Daemonsets are not scheduled correctly
|
||||||
- alert: PodFrequentlyRestarting
|
- alert: PodFrequentlyRestarting
|
||||||
expr: increase(kube_pod_container_status_restarts[1h]) > 5
|
expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
|
||||||
for: 10m
|
for: 10m
|
||||||
labels:
|
labels:
|
||||||
severity: warning
|
severity: warning
|
||||||
annotations:
|
annotations:
|
||||||
description: Pod {{$labels.namespaces}}/{{$labels.pod}} is was restarted {{$value}}
|
description: Pod {{$labels.namespaces}}/{{$labels.pod}} is was restarted {{$value}}
|
||||||
times within the last hour
|
times within the last hour
|
||||||
kubelet.rules.yaml: |+
|
summary: Pod is restarting frequently
|
||||||
|
kubelet.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: kubelet.rules
|
- name: kubelet.rules
|
||||||
rules:
|
rules:
|
||||||
@ -342,15 +345,15 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: '{{ $value }}% of Kubernetes nodes are not ready'
|
description: '{{ $value }}% of Kubernetes nodes are not ready'
|
||||||
- alert: K8SKubeletDown
|
- alert: K8SKubeletDown
|
||||||
expr: count(up{job="kubernetes-nodes"} == 0) / count(up{job="kubernetes-nodes"}) * 100 > 3
|
expr: count(up{job="kubelet"} == 0) / count(up{job="kubelet"}) * 100 > 3
|
||||||
for: 1h
|
for: 1h
|
||||||
labels:
|
labels:
|
||||||
severity: warning
|
severity: warning
|
||||||
annotations:
|
annotations:
|
||||||
description: Prometheus failed to scrape {{ $value }}% of kubelets.
|
description: Prometheus failed to scrape {{ $value }}% of kubelets.
|
||||||
- alert: K8SKubeletDown
|
- alert: K8SKubeletDown
|
||||||
expr: (absent(up{job="kubernetes-nodes"} == 1) or count(up{job="kubernetes-nodes"} == 0) / count(up{job="kubernetes-nodes"}))
|
expr: (absent(up{job="kubelet"} == 1) or count(up{job="kubelet"} == 0) / count(up{job="kubelet"}))
|
||||||
* 100 > 1
|
* 100 > 10
|
||||||
for: 1h
|
for: 1h
|
||||||
labels:
|
labels:
|
||||||
severity: critical
|
severity: critical
|
||||||
@ -367,7 +370,7 @@ data:
|
|||||||
description: Kubelet {{$labels.instance}} is running {{$value}} pods, close
|
description: Kubelet {{$labels.instance}} is running {{$value}} pods, close
|
||||||
to the limit of 110
|
to the limit of 110
|
||||||
summary: Kubelet is close to pod limit
|
summary: Kubelet is close to pod limit
|
||||||
kubernetes.rules.yaml: |+
|
kubernetes.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: kubernetes.rules
|
- name: kubernetes.rules
|
||||||
rules:
|
rules:
|
||||||
@ -447,14 +450,28 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: API server returns errors for {{ $value }}% of requests
|
description: API server returns errors for {{ $value }}% of requests
|
||||||
- alert: K8SApiserverDown
|
- alert: K8SApiserverDown
|
||||||
expr: absent(up{job="kubernetes-apiservers"} == 1)
|
expr: absent(up{job="apiserver"} == 1)
|
||||||
for: 20m
|
for: 20m
|
||||||
labels:
|
labels:
|
||||||
severity: critical
|
severity: critical
|
||||||
annotations:
|
annotations:
|
||||||
description: No API servers are reachable or all have disappeared from service
|
description: No API servers are reachable or all have disappeared from service
|
||||||
discovery
|
discovery
|
||||||
node.rules.yaml: |+
|
|
||||||
|
- alert: K8sCertificateExpirationNotice
|
||||||
|
labels:
|
||||||
|
severity: warning
|
||||||
|
annotations:
|
||||||
|
description: Kubernetes API Certificate is expiring soon (less than 7 days)
|
||||||
|
expr: sum(apiserver_client_certificate_expiration_seconds_bucket{le="604800"}) > 0
|
||||||
|
|
||||||
|
- alert: K8sCertificateExpirationNotice
|
||||||
|
labels:
|
||||||
|
severity: critical
|
||||||
|
annotations:
|
||||||
|
description: Kubernetes API Certificate is expiring in less than 1 day
|
||||||
|
expr: sum(apiserver_client_certificate_expiration_seconds_bucket{le="86400"}) > 0
|
||||||
|
node.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: node.rules
|
- name: node.rules
|
||||||
rules:
|
rules:
|
||||||
@ -476,7 +493,7 @@ data:
|
|||||||
- record: cluster:node_cpu:ratio
|
- record: cluster:node_cpu:ratio
|
||||||
expr: cluster:node_cpu:rate5m / count(sum(node_cpu) BY (instance, cpu))
|
expr: cluster:node_cpu:rate5m / count(sum(node_cpu) BY (instance, cpu))
|
||||||
- alert: NodeExporterDown
|
- alert: NodeExporterDown
|
||||||
expr: absent(up{kubernetes_name="node-exporter"} == 1)
|
expr: absent(up{job="node-exporter"} == 1)
|
||||||
for: 10m
|
for: 10m
|
||||||
labels:
|
labels:
|
||||||
severity: warning
|
severity: warning
|
||||||
@ -499,7 +516,7 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: device {{$labels.device}} on node {{$labels.instance}} is running
|
description: device {{$labels.device}} on node {{$labels.instance}} is running
|
||||||
full within the next 2 hours (mounted at {{$labels.mountpoint}})
|
full within the next 2 hours (mounted at {{$labels.mountpoint}})
|
||||||
prometheus.rules.yaml: |+
|
prometheus.rules.yaml: |
|
||||||
groups:
|
groups:
|
||||||
- name: prometheus.rules
|
- name: prometheus.rules
|
||||||
rules:
|
rules:
|
||||||
@ -544,3 +561,38 @@ data:
|
|||||||
annotations:
|
annotations:
|
||||||
description: Prometheus {{ $labels.namespace }}/{{ $labels.pod}} is not connected
|
description: Prometheus {{ $labels.namespace }}/{{ $labels.pod}} is not connected
|
||||||
to any Alertmanagers
|
to any Alertmanagers
|
||||||
|
- alert: PrometheusTSDBReloadsFailing
|
||||||
|
expr: increase(prometheus_tsdb_reloads_failures_total[2h]) > 0
|
||||||
|
for: 12h
|
||||||
|
labels:
|
||||||
|
severity: warning
|
||||||
|
annotations:
|
||||||
|
description: '{{$labels.job}} at {{$labels.instance}} had {{$value | humanize}}
|
||||||
|
reload failures over the last four hours.'
|
||||||
|
summary: Prometheus has issues reloading data blocks from disk
|
||||||
|
- alert: PrometheusTSDBCompactionsFailing
|
||||||
|
expr: increase(prometheus_tsdb_compactions_failed_total[2h]) > 0
|
||||||
|
for: 12h
|
||||||
|
labels:
|
||||||
|
severity: warning
|
||||||
|
annotations:
|
||||||
|
description: '{{$labels.job}} at {{$labels.instance}} had {{$value | humanize}}
|
||||||
|
compaction failures over the last four hours.'
|
||||||
|
summary: Prometheus has issues compacting sample blocks
|
||||||
|
- alert: PrometheusTSDBWALCorruptions
|
||||||
|
expr: tsdb_wal_corruptions_total > 0
|
||||||
|
for: 4h
|
||||||
|
labels:
|
||||||
|
severity: warning
|
||||||
|
annotations:
|
||||||
|
description: '{{$labels.job}} at {{$labels.instance}} has a corrupted write-ahead
|
||||||
|
log (WAL).'
|
||||||
|
summary: Prometheus write-ahead log is corrupted
|
||||||
|
- alert: PrometheusNotIngestingSamples
|
||||||
|
expr: rate(prometheus_tsdb_head_samples_appended_total[5m]) <= 0
|
||||||
|
for: 10m
|
||||||
|
labels:
|
||||||
|
severity: warning
|
||||||
|
annotations:
|
||||||
|
description: "Prometheus {{ $labels.namespace }}/{{ $labels.pod}} isn't ingesting samples."
|
||||||
|
summary: "Prometheus isn't ingesting samples"
|
||||||
|
5
addons/prometheus/service-account.yaml
Normal file
5
addons/prometheus/service-account.yaml
Normal file
@ -0,0 +1,5 @@
|
|||||||
|
apiVersion: v1
|
||||||
|
kind: ServiceAccount
|
||||||
|
metadata:
|
||||||
|
name: prometheus
|
||||||
|
namespace: monitoring
|
@ -3,6 +3,8 @@ kind: Service
|
|||||||
metadata:
|
metadata:
|
||||||
name: prometheus
|
name: prometheus
|
||||||
namespace: monitoring
|
namespace: monitoring
|
||||||
|
annotations:
|
||||||
|
prometheus.io/scrape: 'true'
|
||||||
spec:
|
spec:
|
||||||
type: ClusterIP
|
type: ClusterIP
|
||||||
selector:
|
selector:
|
||||||
|
@ -11,10 +11,11 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
|||||||
|
|
||||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||||
|
|
||||||
* Kubernetes v1.9.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
* Kubernetes v1.9.4 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
||||||
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||||
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/)
|
||||||
|
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||||
|
|
||||||
## Docs
|
## Docs
|
||||||
|
|
||||||
|
69
aws/container-linux/kubernetes/apiserver.tf
Normal file
69
aws/container-linux/kubernetes/apiserver.tf
Normal file
@ -0,0 +1,69 @@
|
|||||||
|
# kube-apiserver Network Load Balancer DNS Record
|
||||||
|
resource "aws_route53_record" "apiserver" {
|
||||||
|
zone_id = "${var.dns_zone_id}"
|
||||||
|
|
||||||
|
name = "${format("%s.%s.", var.cluster_name, var.dns_zone)}"
|
||||||
|
type = "A"
|
||||||
|
|
||||||
|
# AWS recommends their special "alias" records for ELBs
|
||||||
|
alias {
|
||||||
|
name = "${aws_lb.apiserver.dns_name}"
|
||||||
|
zone_id = "${aws_lb.apiserver.zone_id}"
|
||||||
|
evaluate_target_health = true
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
# Network Load Balancer for apiservers
|
||||||
|
resource "aws_lb" "apiserver" {
|
||||||
|
name = "${var.cluster_name}-apiserver"
|
||||||
|
load_balancer_type = "network"
|
||||||
|
internal = false
|
||||||
|
|
||||||
|
subnets = ["${aws_subnet.public.*.id}"]
|
||||||
|
|
||||||
|
enable_cross_zone_load_balancing = true
|
||||||
|
}
|
||||||
|
|
||||||
|
# Forward HTTP traffic to controllers
|
||||||
|
resource "aws_lb_listener" "apiserver-https" {
|
||||||
|
load_balancer_arn = "${aws_lb.apiserver.arn}"
|
||||||
|
protocol = "TCP"
|
||||||
|
port = "443"
|
||||||
|
|
||||||
|
default_action {
|
||||||
|
type = "forward"
|
||||||
|
target_group_arn = "${aws_lb_target_group.controllers.arn}"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
# Target group of controllers
|
||||||
|
resource "aws_lb_target_group" "controllers" {
|
||||||
|
name = "${var.cluster_name}-controllers"
|
||||||
|
vpc_id = "${aws_vpc.network.id}"
|
||||||
|
target_type = "instance"
|
||||||
|
|
||||||
|
protocol = "TCP"
|
||||||
|
port = 443
|
||||||
|
|
||||||
|
# Kubelet HTTP health check
|
||||||
|
health_check {
|
||||||
|
protocol = "TCP"
|
||||||
|
port = 443
|
||||||
|
|
||||||
|
# NLBs required to use same healthy and unhealthy thresholds
|
||||||
|
healthy_threshold = 3
|
||||||
|
unhealthy_threshold = 3
|
||||||
|
|
||||||
|
# Interval between health checks required to be 10 or 30
|
||||||
|
interval = 10
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
# Attach controller instances to apiserver NLB
|
||||||
|
resource "aws_lb_target_group_attachment" "controllers" {
|
||||||
|
count = "${var.controller_count}"
|
||||||
|
|
||||||
|
target_group_arn = "${aws_lb_target_group.controllers.arn}"
|
||||||
|
target_id = "${element(aws_instance.controllers.*.id, count.index)}"
|
||||||
|
port = 443
|
||||||
|
}
|
@ -1,6 +1,6 @@
|
|||||||
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
||||||
module "bootkube" {
|
module "bootkube" {
|
||||||
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=b83e321b350ac549c45ed6a05ffd8683336fb9f4"
|
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=c5fc93d95fe4993511656cdd6372afbd1307f08f"
|
||||||
|
|
||||||
cluster_name = "${var.cluster_name}"
|
cluster_name = "${var.cluster_name}"
|
||||||
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
|
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
|
||||||
|
@ -7,7 +7,7 @@ systemd:
|
|||||||
- name: 40-etcd-cluster.conf
|
- name: 40-etcd-cluster.conf
|
||||||
contents: |
|
contents: |
|
||||||
[Service]
|
[Service]
|
||||||
Environment="ETCD_IMAGE_TAG=v3.2.13"
|
Environment="ETCD_IMAGE_TAG=v3.3.2"
|
||||||
Environment="ETCD_NAME=${etcd_name}"
|
Environment="ETCD_NAME=${etcd_name}"
|
||||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
||||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
||||||
@ -66,6 +66,7 @@ systemd:
|
|||||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
|
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
|
||||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
|
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
|
||||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||||
|
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||||
@ -81,7 +82,8 @@ systemd:
|
|||||||
--network-plugin=cni \
|
--network-plugin=cni \
|
||||||
--node-labels=node-role.kubernetes.io/master \
|
--node-labels=node-role.kubernetes.io/master \
|
||||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||||
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule
|
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
|
||||||
|
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||||
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
|
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
|
||||||
Restart=always
|
Restart=always
|
||||||
RestartSec=10
|
RestartSec=10
|
||||||
@ -107,29 +109,14 @@ storage:
|
|||||||
mode: 0644
|
mode: 0644
|
||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
apiVersion: v1
|
${kubeconfig}
|
||||||
kind: Config
|
|
||||||
clusters:
|
|
||||||
- name: local
|
|
||||||
cluster:
|
|
||||||
server: ${kubeconfig_server}
|
|
||||||
certificate-authority-data: ${kubeconfig_ca_cert}
|
|
||||||
users:
|
|
||||||
- name: kubelet
|
|
||||||
user:
|
|
||||||
client-certificate-data: ${kubeconfig_kubelet_cert}
|
|
||||||
client-key-data: ${kubeconfig_kubelet_key}
|
|
||||||
contexts:
|
|
||||||
- context:
|
|
||||||
cluster: local
|
|
||||||
user: kubelet
|
|
||||||
- path: /etc/kubernetes/kubelet.env
|
- path: /etc/kubernetes/kubelet.env
|
||||||
filesystem: root
|
filesystem: root
|
||||||
mode: 0644
|
mode: 0644
|
||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.4
|
||||||
- path: /etc/sysctl.d/max-user-watches.conf
|
- path: /etc/sysctl.d/max-user-watches.conf
|
||||||
filesystem: root
|
filesystem: root
|
||||||
contents:
|
contents:
|
||||||
@ -150,7 +137,7 @@ storage:
|
|||||||
# Move experimental manifests
|
# Move experimental manifests
|
||||||
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
||||||
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
||||||
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.9.1}"
|
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
|
||||||
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
||||||
exec /usr/bin/rkt run \
|
exec /usr/bin/rkt run \
|
||||||
--trust-keys-from-https \
|
--trust-keys-from-https \
|
||||||
|
@ -36,6 +36,10 @@ resource "aws_instance" "controllers" {
|
|||||||
associate_public_ip_address = true
|
associate_public_ip_address = true
|
||||||
subnet_id = "${element(aws_subnet.public.*.id, count.index)}"
|
subnet_id = "${element(aws_subnet.public.*.id, count.index)}"
|
||||||
vpc_security_group_ids = ["${aws_security_group.controller.id}"]
|
vpc_security_group_ids = ["${aws_security_group.controller.id}"]
|
||||||
|
|
||||||
|
lifecycle {
|
||||||
|
ignore_changes = ["ami"]
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
# Controller Container Linux Config
|
# Controller Container Linux Config
|
||||||
@ -52,13 +56,10 @@ data "template_file" "controller_config" {
|
|||||||
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
|
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
|
||||||
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
|
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
|
||||||
|
|
||||||
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
|
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
|
||||||
ssh_authorized_key = "${var.ssh_authorized_key}"
|
ssh_authorized_key = "${var.ssh_authorized_key}"
|
||||||
cluster_domain_suffix = "${var.cluster_domain_suffix}"
|
cluster_domain_suffix = "${var.cluster_domain_suffix}"
|
||||||
kubeconfig_ca_cert = "${module.bootkube.ca_cert}"
|
kubeconfig = "${indent(10, module.bootkube.kubeconfig)}"
|
||||||
kubeconfig_kubelet_cert = "${module.bootkube.kubelet_cert}"
|
|
||||||
kubeconfig_kubelet_key = "${module.bootkube.kubelet_key}"
|
|
||||||
kubeconfig_server = "${module.bootkube.server}"
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -78,185 +79,3 @@ data "ct_config" "controller_ign" {
|
|||||||
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
|
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
|
||||||
pretty_print = false
|
pretty_print = false
|
||||||
}
|
}
|
||||||
|
|
||||||
# Security Group (instance firewall)
|
|
||||||
|
|
||||||
resource "aws_security_group" "controller" {
|
|
||||||
name = "${var.cluster_name}-controller"
|
|
||||||
description = "${var.cluster_name} controller security group"
|
|
||||||
|
|
||||||
vpc_id = "${aws_vpc.network.id}"
|
|
||||||
|
|
||||||
tags = "${map("Name", "${var.cluster_name}-controller")}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-icmp" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "icmp"
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
cidr_blocks = ["0.0.0.0/0"]
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-ssh" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 22
|
|
||||||
to_port = 22
|
|
||||||
cidr_blocks = ["0.0.0.0/0"]
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-apiserver" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 443
|
|
||||||
to_port = 443
|
|
||||||
cidr_blocks = ["0.0.0.0/0"]
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-etcd" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 2379
|
|
||||||
to_port = 2380
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-flannel" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "udp"
|
|
||||||
from_port = 8472
|
|
||||||
to_port = 8472
|
|
||||||
source_security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-flannel-self" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "udp"
|
|
||||||
from_port = 8472
|
|
||||||
to_port = 8472
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-node-exporter" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 9100
|
|
||||||
to_port = 9100
|
|
||||||
source_security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-kubelet-self" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 10250
|
|
||||||
to_port = 10250
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-kubelet-read" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 10255
|
|
||||||
to_port = 10255
|
|
||||||
source_security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-kubelet-read-self" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 10255
|
|
||||||
to_port = 10255
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-bgp" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 179
|
|
||||||
to_port = 179
|
|
||||||
source_security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-bgp-self" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 179
|
|
||||||
to_port = 179
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-ipip" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = 4
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
source_security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-ipip-self" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = 4
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-ipip-legacy" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = 94
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
source_security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-ipip-legacy-self" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = 94
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "controller-egress" {
|
|
||||||
security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
|
|
||||||
type = "egress"
|
|
||||||
protocol = "-1"
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
cidr_blocks = ["0.0.0.0/0"]
|
|
||||||
ipv6_cidr_blocks = ["::/0"]
|
|
||||||
}
|
|
||||||
|
@ -1,43 +0,0 @@
|
|||||||
# kube-apiserver Network Load Balancer DNS Record
|
|
||||||
resource "aws_route53_record" "apiserver" {
|
|
||||||
zone_id = "${var.dns_zone_id}"
|
|
||||||
|
|
||||||
name = "${format("%s.%s.", var.cluster_name, var.dns_zone)}"
|
|
||||||
type = "A"
|
|
||||||
|
|
||||||
# AWS recommends their special "alias" records for ELBs
|
|
||||||
alias {
|
|
||||||
name = "${aws_elb.apiserver.dns_name}"
|
|
||||||
zone_id = "${aws_elb.apiserver.zone_id}"
|
|
||||||
evaluate_target_health = true
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
# Controller Network Load Balancer
|
|
||||||
resource "aws_elb" "apiserver" {
|
|
||||||
name = "${var.cluster_name}-apiserver"
|
|
||||||
subnets = ["${aws_subnet.public.*.id}"]
|
|
||||||
security_groups = ["${aws_security_group.controller.id}"]
|
|
||||||
|
|
||||||
listener {
|
|
||||||
lb_port = 443
|
|
||||||
lb_protocol = "tcp"
|
|
||||||
instance_port = 443
|
|
||||||
instance_protocol = "tcp"
|
|
||||||
}
|
|
||||||
|
|
||||||
instances = ["${aws_instance.controllers.*.id}"]
|
|
||||||
|
|
||||||
# Kubelet HTTP health check
|
|
||||||
health_check {
|
|
||||||
target = "SSL:443"
|
|
||||||
healthy_threshold = 2
|
|
||||||
unhealthy_threshold = 4
|
|
||||||
timeout = 5
|
|
||||||
interval = 6
|
|
||||||
}
|
|
||||||
|
|
||||||
idle_timeout = 3600
|
|
||||||
connection_draining = true
|
|
||||||
connection_draining_timeout = 300
|
|
||||||
}
|
|
@ -1,32 +0,0 @@
|
|||||||
# Ingress Network Load Balancer
|
|
||||||
resource "aws_elb" "ingress" {
|
|
||||||
name = "${var.cluster_name}-ingress"
|
|
||||||
subnets = ["${aws_subnet.public.*.id}"]
|
|
||||||
security_groups = ["${aws_security_group.worker.id}"]
|
|
||||||
|
|
||||||
listener {
|
|
||||||
lb_port = 80
|
|
||||||
lb_protocol = "tcp"
|
|
||||||
instance_port = 80
|
|
||||||
instance_protocol = "tcp"
|
|
||||||
}
|
|
||||||
|
|
||||||
listener {
|
|
||||||
lb_port = 443
|
|
||||||
lb_protocol = "tcp"
|
|
||||||
instance_port = 443
|
|
||||||
instance_protocol = "tcp"
|
|
||||||
}
|
|
||||||
|
|
||||||
# Ingress Controller HTTP health check
|
|
||||||
health_check {
|
|
||||||
target = "HTTP:10254/healthz"
|
|
||||||
healthy_threshold = 2
|
|
||||||
unhealthy_threshold = 4
|
|
||||||
timeout = 5
|
|
||||||
interval = 6
|
|
||||||
}
|
|
||||||
|
|
||||||
connection_draining = true
|
|
||||||
connection_draining_timeout = 300
|
|
||||||
}
|
|
@ -1,4 +1,25 @@
|
|||||||
output "ingress_dns_name" {
|
output "ingress_dns_name" {
|
||||||
value = "${aws_elb.ingress.dns_name}"
|
value = "${module.workers.ingress_dns_name}"
|
||||||
description = "DNS name of the ELB for distributing traffic to Ingress controllers"
|
description = "DNS name of the network load balancer for distributing traffic to Ingress controllers"
|
||||||
|
}
|
||||||
|
|
||||||
|
# Outputs for worker pools
|
||||||
|
|
||||||
|
output "vpc_id" {
|
||||||
|
value = "${aws_vpc.network.id}"
|
||||||
|
description = "ID of the VPC for creating worker instances"
|
||||||
|
}
|
||||||
|
|
||||||
|
output "subnet_ids" {
|
||||||
|
value = ["${aws_subnet.public.*.id}"]
|
||||||
|
description = "List of subnet IDs for creating worker instances"
|
||||||
|
}
|
||||||
|
|
||||||
|
output "worker_security_groups" {
|
||||||
|
value = ["${aws_security_group.worker.id}"]
|
||||||
|
description = "List of worker security group IDs"
|
||||||
|
}
|
||||||
|
|
||||||
|
output "kubeconfig" {
|
||||||
|
value = "${module.bootkube.kubeconfig}"
|
||||||
}
|
}
|
||||||
|
@ -5,7 +5,7 @@ terraform {
|
|||||||
}
|
}
|
||||||
|
|
||||||
provider "aws" {
|
provider "aws" {
|
||||||
version = "~> 1.0"
|
version = "~> 1.11"
|
||||||
}
|
}
|
||||||
|
|
||||||
provider "local" {
|
provider "local" {
|
||||||
|
385
aws/container-linux/kubernetes/security.tf
Normal file
385
aws/container-linux/kubernetes/security.tf
Normal file
@ -0,0 +1,385 @@
|
|||||||
|
# Security Groups (instance firewalls)
|
||||||
|
|
||||||
|
# Controller security group
|
||||||
|
|
||||||
|
resource "aws_security_group" "controller" {
|
||||||
|
name = "${var.cluster_name}-controller"
|
||||||
|
description = "${var.cluster_name} controller security group"
|
||||||
|
|
||||||
|
vpc_id = "${aws_vpc.network.id}"
|
||||||
|
|
||||||
|
tags = "${map("Name", "${var.cluster_name}-controller")}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-icmp" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "icmp"
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
cidr_blocks = ["0.0.0.0/0"]
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-ssh" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 22
|
||||||
|
to_port = 22
|
||||||
|
cidr_blocks = ["0.0.0.0/0"]
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-apiserver" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 443
|
||||||
|
to_port = 443
|
||||||
|
cidr_blocks = ["0.0.0.0/0"]
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-etcd" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 2379
|
||||||
|
to_port = 2380
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-flannel" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "udp"
|
||||||
|
from_port = 8472
|
||||||
|
to_port = 8472
|
||||||
|
source_security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-flannel-self" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "udp"
|
||||||
|
from_port = 8472
|
||||||
|
to_port = 8472
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-node-exporter" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 9100
|
||||||
|
to_port = 9100
|
||||||
|
source_security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-kubelet-self" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 10250
|
||||||
|
to_port = 10250
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-kubelet-read" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 10255
|
||||||
|
to_port = 10255
|
||||||
|
source_security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-kubelet-read-self" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 10255
|
||||||
|
to_port = 10255
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-bgp" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 179
|
||||||
|
to_port = 179
|
||||||
|
source_security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-bgp-self" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 179
|
||||||
|
to_port = 179
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-ipip" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = 4
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
source_security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-ipip-self" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = 4
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-ipip-legacy" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = 94
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
source_security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-ipip-legacy-self" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = 94
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "controller-egress" {
|
||||||
|
security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
|
||||||
|
type = "egress"
|
||||||
|
protocol = "-1"
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
cidr_blocks = ["0.0.0.0/0"]
|
||||||
|
ipv6_cidr_blocks = ["::/0"]
|
||||||
|
}
|
||||||
|
|
||||||
|
# Worker security group
|
||||||
|
|
||||||
|
resource "aws_security_group" "worker" {
|
||||||
|
name = "${var.cluster_name}-worker"
|
||||||
|
description = "${var.cluster_name} worker security group"
|
||||||
|
|
||||||
|
vpc_id = "${aws_vpc.network.id}"
|
||||||
|
|
||||||
|
tags = "${map("Name", "${var.cluster_name}-worker")}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-icmp" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "icmp"
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
cidr_blocks = ["0.0.0.0/0"]
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-ssh" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 22
|
||||||
|
to_port = 22
|
||||||
|
cidr_blocks = ["0.0.0.0/0"]
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-http" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 80
|
||||||
|
to_port = 80
|
||||||
|
cidr_blocks = ["0.0.0.0/0"]
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-https" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 443
|
||||||
|
to_port = 443
|
||||||
|
cidr_blocks = ["0.0.0.0/0"]
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-flannel" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "udp"
|
||||||
|
from_port = 8472
|
||||||
|
to_port = 8472
|
||||||
|
source_security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-flannel-self" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "udp"
|
||||||
|
from_port = 8472
|
||||||
|
to_port = 8472
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-node-exporter" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 9100
|
||||||
|
to_port = 9100
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "ingress-health" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 10254
|
||||||
|
to_port = 10254
|
||||||
|
cidr_blocks = ["0.0.0.0/0"]
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-kubelet" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 10250
|
||||||
|
to_port = 10250
|
||||||
|
source_security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-kubelet-self" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 10250
|
||||||
|
to_port = 10250
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-kubelet-read" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 10255
|
||||||
|
to_port = 10255
|
||||||
|
source_security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-kubelet-read-self" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 10255
|
||||||
|
to_port = 10255
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-bgp" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 179
|
||||||
|
to_port = 179
|
||||||
|
source_security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-bgp-self" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = "tcp"
|
||||||
|
from_port = 179
|
||||||
|
to_port = 179
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-ipip" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = 4
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
source_security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-ipip-self" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = 4
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-ipip-legacy" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = 94
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
source_security_group_id = "${aws_security_group.controller.id}"
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-ipip-legacy-self" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "ingress"
|
||||||
|
protocol = 94
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
self = true
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_security_group_rule" "worker-egress" {
|
||||||
|
security_group_id = "${aws_security_group.worker.id}"
|
||||||
|
|
||||||
|
type = "egress"
|
||||||
|
protocol = "-1"
|
||||||
|
from_port = 0
|
||||||
|
to_port = 0
|
||||||
|
cidr_blocks = ["0.0.0.0/0"]
|
||||||
|
ipv6_cidr_blocks = ["::/0"]
|
||||||
|
}
|
@ -1,275 +1,19 @@
|
|||||||
# Workers AutoScaling Group
|
module "workers" {
|
||||||
resource "aws_autoscaling_group" "workers" {
|
source = "workers"
|
||||||
name = "${var.cluster_name}-worker ${aws_launch_configuration.worker.name}"
|
name = "${var.cluster_name}"
|
||||||
load_balancers = ["${aws_elb.ingress.id}"]
|
|
||||||
|
|
||||||
# count
|
# AWS
|
||||||
desired_capacity = "${var.worker_count}"
|
vpc_id = "${aws_vpc.network.id}"
|
||||||
min_size = "${var.worker_count}"
|
subnet_ids = ["${aws_subnet.public.*.id}"]
|
||||||
max_size = "${var.worker_count + 2}"
|
|
||||||
default_cooldown = 30
|
|
||||||
health_check_grace_period = 30
|
|
||||||
|
|
||||||
# network
|
|
||||||
vpc_zone_identifier = ["${aws_subnet.public.*.id}"]
|
|
||||||
|
|
||||||
# template
|
|
||||||
launch_configuration = "${aws_launch_configuration.worker.name}"
|
|
||||||
|
|
||||||
lifecycle {
|
|
||||||
# override the default destroy and replace update behavior
|
|
||||||
create_before_destroy = true
|
|
||||||
ignore_changes = ["image_id"]
|
|
||||||
}
|
|
||||||
|
|
||||||
tags = [{
|
|
||||||
key = "Name"
|
|
||||||
value = "${var.cluster_name}-worker"
|
|
||||||
propagate_at_launch = true
|
|
||||||
}]
|
|
||||||
}
|
|
||||||
|
|
||||||
# Worker template
|
|
||||||
resource "aws_launch_configuration" "worker" {
|
|
||||||
image_id = "${data.aws_ami.coreos.image_id}"
|
|
||||||
instance_type = "${var.worker_type}"
|
|
||||||
|
|
||||||
user_data = "${data.ct_config.worker_ign.rendered}"
|
|
||||||
|
|
||||||
# storage
|
|
||||||
root_block_device {
|
|
||||||
volume_type = "standard"
|
|
||||||
volume_size = "${var.disk_size}"
|
|
||||||
}
|
|
||||||
|
|
||||||
# network
|
|
||||||
security_groups = ["${aws_security_group.worker.id}"]
|
security_groups = ["${aws_security_group.worker.id}"]
|
||||||
|
count = "${var.worker_count}"
|
||||||
|
instance_type = "${var.worker_type}"
|
||||||
|
os_channel = "${var.os_channel}"
|
||||||
|
disk_size = "${var.disk_size}"
|
||||||
|
|
||||||
lifecycle {
|
# configuration
|
||||||
// Override the default destroy and replace update behavior
|
kubeconfig = "${module.bootkube.kubeconfig}"
|
||||||
create_before_destroy = true
|
ssh_authorized_key = "${var.ssh_authorized_key}"
|
||||||
}
|
service_cidr = "${var.service_cidr}"
|
||||||
}
|
cluster_domain_suffix = "${var.cluster_domain_suffix}"
|
||||||
|
|
||||||
# Worker Container Linux Config
|
|
||||||
data "template_file" "worker_config" {
|
|
||||||
template = "${file("${path.module}/cl/worker.yaml.tmpl")}"
|
|
||||||
|
|
||||||
vars = {
|
|
||||||
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
|
|
||||||
k8s_etcd_service_ip = "${cidrhost(var.service_cidr, 15)}"
|
|
||||||
ssh_authorized_key = "${var.ssh_authorized_key}"
|
|
||||||
cluster_domain_suffix = "${var.cluster_domain_suffix}"
|
|
||||||
kubeconfig_ca_cert = "${module.bootkube.ca_cert}"
|
|
||||||
kubeconfig_kubelet_cert = "${module.bootkube.kubelet_cert}"
|
|
||||||
kubeconfig_kubelet_key = "${module.bootkube.kubelet_key}"
|
|
||||||
kubeconfig_server = "${module.bootkube.server}"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
data "ct_config" "worker_ign" {
|
|
||||||
content = "${data.template_file.worker_config.rendered}"
|
|
||||||
pretty_print = false
|
|
||||||
}
|
|
||||||
|
|
||||||
# Security Group (instance firewall)
|
|
||||||
|
|
||||||
resource "aws_security_group" "worker" {
|
|
||||||
name = "${var.cluster_name}-worker"
|
|
||||||
description = "${var.cluster_name} worker security group"
|
|
||||||
|
|
||||||
vpc_id = "${aws_vpc.network.id}"
|
|
||||||
|
|
||||||
tags = "${map("Name", "${var.cluster_name}-worker")}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-icmp" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "icmp"
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
cidr_blocks = ["0.0.0.0/0"]
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-ssh" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 22
|
|
||||||
to_port = 22
|
|
||||||
cidr_blocks = ["0.0.0.0/0"]
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-http" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 80
|
|
||||||
to_port = 80
|
|
||||||
cidr_blocks = ["0.0.0.0/0"]
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-https" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 443
|
|
||||||
to_port = 443
|
|
||||||
cidr_blocks = ["0.0.0.0/0"]
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-flannel" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "udp"
|
|
||||||
from_port = 8472
|
|
||||||
to_port = 8472
|
|
||||||
source_security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-flannel-self" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "udp"
|
|
||||||
from_port = 8472
|
|
||||||
to_port = 8472
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-node-exporter" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 9100
|
|
||||||
to_port = 9100
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-kubelet" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 10250
|
|
||||||
to_port = 10250
|
|
||||||
source_security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-kubelet-self" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 10250
|
|
||||||
to_port = 10250
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-kubelet-read" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 10255
|
|
||||||
to_port = 10255
|
|
||||||
source_security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-kubelet-read-self" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 10255
|
|
||||||
to_port = 10255
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "ingress-health-self" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 10254
|
|
||||||
to_port = 10254
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-bgp" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 179
|
|
||||||
to_port = 179
|
|
||||||
source_security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-bgp-self" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = "tcp"
|
|
||||||
from_port = 179
|
|
||||||
to_port = 179
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-ipip" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = 4
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
source_security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-ipip-self" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = 4
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-ipip-legacy" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = 94
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
source_security_group_id = "${aws_security_group.controller.id}"
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-ipip-legacy-self" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "ingress"
|
|
||||||
protocol = 94
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
self = true
|
|
||||||
}
|
|
||||||
|
|
||||||
resource "aws_security_group_rule" "worker-egress" {
|
|
||||||
security_group_id = "${aws_security_group.worker.id}"
|
|
||||||
|
|
||||||
type = "egress"
|
|
||||||
protocol = "-1"
|
|
||||||
from_port = 0
|
|
||||||
to_port = 0
|
|
||||||
cidr_blocks = ["0.0.0.0/0"]
|
|
||||||
ipv6_cidr_blocks = ["::/0"]
|
|
||||||
}
|
}
|
||||||
|
19
aws/container-linux/kubernetes/workers/ami.tf
Normal file
19
aws/container-linux/kubernetes/workers/ami.tf
Normal file
@ -0,0 +1,19 @@
|
|||||||
|
data "aws_ami" "coreos" {
|
||||||
|
most_recent = true
|
||||||
|
owners = ["595879546273"]
|
||||||
|
|
||||||
|
filter {
|
||||||
|
name = "architecture"
|
||||||
|
values = ["x86_64"]
|
||||||
|
}
|
||||||
|
|
||||||
|
filter {
|
||||||
|
name = "virtualization-type"
|
||||||
|
values = ["hvm"]
|
||||||
|
}
|
||||||
|
|
||||||
|
filter {
|
||||||
|
name = "name"
|
||||||
|
values = ["CoreOS-${var.os_channel}-*"]
|
||||||
|
}
|
||||||
|
}
|
@ -42,6 +42,7 @@ systemd:
|
|||||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
|
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
|
||||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
|
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
|
||||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||||
|
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||||
@ -56,7 +57,8 @@ systemd:
|
|||||||
--lock-file=/var/run/lock/kubelet.lock \
|
--lock-file=/var/run/lock/kubelet.lock \
|
||||||
--network-plugin=cni \
|
--network-plugin=cni \
|
||||||
--node-labels=node-role.kubernetes.io/node \
|
--node-labels=node-role.kubernetes.io/node \
|
||||||
--pod-manifest-path=/etc/kubernetes/manifests
|
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||||
|
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||||
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
|
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
|
||||||
Restart=always
|
Restart=always
|
||||||
RestartSec=5
|
RestartSec=5
|
||||||
@ -81,29 +83,14 @@ storage:
|
|||||||
mode: 0644
|
mode: 0644
|
||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
apiVersion: v1
|
${kubeconfig}
|
||||||
kind: Config
|
|
||||||
clusters:
|
|
||||||
- name: local
|
|
||||||
cluster:
|
|
||||||
server: ${kubeconfig_server}
|
|
||||||
certificate-authority-data: ${kubeconfig_ca_cert}
|
|
||||||
users:
|
|
||||||
- name: kubelet
|
|
||||||
user:
|
|
||||||
client-certificate-data: ${kubeconfig_kubelet_cert}
|
|
||||||
client-key-data: ${kubeconfig_kubelet_key}
|
|
||||||
contexts:
|
|
||||||
- context:
|
|
||||||
cluster: local
|
|
||||||
user: kubelet
|
|
||||||
- path: /etc/kubernetes/kubelet.env
|
- path: /etc/kubernetes/kubelet.env
|
||||||
filesystem: root
|
filesystem: root
|
||||||
mode: 0644
|
mode: 0644
|
||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.4
|
||||||
- path: /etc/sysctl.d/max-user-watches.conf
|
- path: /etc/sysctl.d/max-user-watches.conf
|
||||||
filesystem: root
|
filesystem: root
|
||||||
contents:
|
contents:
|
||||||
@ -121,7 +108,7 @@ storage:
|
|||||||
--volume config,kind=host,source=/etc/kubernetes \
|
--volume config,kind=host,source=/etc/kubernetes \
|
||||||
--mount volume=config,target=/etc/kubernetes \
|
--mount volume=config,target=/etc/kubernetes \
|
||||||
--insecure-options=image \
|
--insecure-options=image \
|
||||||
docker://gcr.io/google_containers/hyperkube:v1.9.1 \
|
docker://gcr.io/google_containers/hyperkube:v1.9.4 \
|
||||||
--net=host \
|
--net=host \
|
||||||
--dns=host \
|
--dns=host \
|
||||||
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
82
aws/container-linux/kubernetes/workers/ingress.tf
Normal file
82
aws/container-linux/kubernetes/workers/ingress.tf
Normal file
@ -0,0 +1,82 @@
|
|||||||
|
# Network Load Balancer for Ingress
|
||||||
|
resource "aws_lb" "ingress" {
|
||||||
|
name = "${var.name}-ingress"
|
||||||
|
load_balancer_type = "network"
|
||||||
|
internal = false
|
||||||
|
|
||||||
|
subnets = ["${var.subnet_ids}"]
|
||||||
|
|
||||||
|
enable_cross_zone_load_balancing = true
|
||||||
|
}
|
||||||
|
|
||||||
|
# Forward HTTP traffic to workers
|
||||||
|
resource "aws_lb_listener" "ingress-http" {
|
||||||
|
load_balancer_arn = "${aws_lb.ingress.arn}"
|
||||||
|
protocol = "TCP"
|
||||||
|
port = 80
|
||||||
|
|
||||||
|
default_action {
|
||||||
|
type = "forward"
|
||||||
|
target_group_arn = "${aws_lb_target_group.workers-http.arn}"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
# Forward HTTPS traffic to workers
|
||||||
|
resource "aws_lb_listener" "ingress-https" {
|
||||||
|
load_balancer_arn = "${aws_lb.ingress.arn}"
|
||||||
|
protocol = "TCP"
|
||||||
|
port = 443
|
||||||
|
|
||||||
|
default_action {
|
||||||
|
type = "forward"
|
||||||
|
target_group_arn = "${aws_lb_target_group.workers-https.arn}"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
# Network Load Balancer target groups of instances
|
||||||
|
|
||||||
|
resource "aws_lb_target_group" "workers-http" {
|
||||||
|
name = "${var.name}-workers-http"
|
||||||
|
vpc_id = "${var.vpc_id}"
|
||||||
|
target_type = "instance"
|
||||||
|
|
||||||
|
protocol = "TCP"
|
||||||
|
port = 80
|
||||||
|
|
||||||
|
# Ingress Controller HTTP health check
|
||||||
|
health_check {
|
||||||
|
protocol = "HTTP"
|
||||||
|
port = 10254
|
||||||
|
path = "/healthz"
|
||||||
|
|
||||||
|
# NLBs required to use same healthy and unhealthy thresholds
|
||||||
|
healthy_threshold = 3
|
||||||
|
unhealthy_threshold = 3
|
||||||
|
|
||||||
|
# Interval between health checks required to be 10 or 30
|
||||||
|
interval = 10
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
resource "aws_lb_target_group" "workers-https" {
|
||||||
|
name = "${var.name}-workers-https"
|
||||||
|
vpc_id = "${var.vpc_id}"
|
||||||
|
target_type = "instance"
|
||||||
|
|
||||||
|
protocol = "TCP"
|
||||||
|
port = 443
|
||||||
|
|
||||||
|
# Ingress Controller HTTP health check
|
||||||
|
health_check {
|
||||||
|
protocol = "HTTP"
|
||||||
|
port = 10254
|
||||||
|
path = "/healthz"
|
||||||
|
|
||||||
|
# NLBs required to use same healthy and unhealthy thresholds
|
||||||
|
healthy_threshold = 3
|
||||||
|
unhealthy_threshold = 3
|
||||||
|
|
||||||
|
# Interval between health checks required to be 10 or 30
|
||||||
|
interval = 10
|
||||||
|
}
|
||||||
|
}
|
4
aws/container-linux/kubernetes/workers/outputs.tf
Normal file
4
aws/container-linux/kubernetes/workers/outputs.tf
Normal file
@ -0,0 +1,4 @@
|
|||||||
|
output "ingress_dns_name" {
|
||||||
|
value = "${aws_lb.ingress.dns_name}"
|
||||||
|
description = "DNS name of the network load balancer for distributing traffic to Ingress controllers"
|
||||||
|
}
|
73
aws/container-linux/kubernetes/workers/variables.tf
Normal file
73
aws/container-linux/kubernetes/workers/variables.tf
Normal file
@ -0,0 +1,73 @@
|
|||||||
|
variable "name" {
|
||||||
|
type = "string"
|
||||||
|
description = "Unique name instance group"
|
||||||
|
}
|
||||||
|
|
||||||
|
variable "vpc_id" {
|
||||||
|
type = "string"
|
||||||
|
description = "ID of the VPC for creating instances"
|
||||||
|
}
|
||||||
|
|
||||||
|
variable "subnet_ids" {
|
||||||
|
type = "list"
|
||||||
|
description = "List of subnet IDs for creating instances"
|
||||||
|
}
|
||||||
|
|
||||||
|
variable "security_groups" {
|
||||||
|
type = "list"
|
||||||
|
description = "List of security group IDs"
|
||||||
|
}
|
||||||
|
|
||||||
|
# instances
|
||||||
|
|
||||||
|
variable "count" {
|
||||||
|
type = "string"
|
||||||
|
default = "1"
|
||||||
|
description = "Number of instances"
|
||||||
|
}
|
||||||
|
|
||||||
|
variable "instance_type" {
|
||||||
|
type = "string"
|
||||||
|
default = "t2.small"
|
||||||
|
description = "EC2 instance type"
|
||||||
|
}
|
||||||
|
|
||||||
|
variable "os_channel" {
|
||||||
|
type = "string"
|
||||||
|
default = "stable"
|
||||||
|
description = "Container Linux AMI channel (stable, beta, alpha)"
|
||||||
|
}
|
||||||
|
|
||||||
|
variable "disk_size" {
|
||||||
|
type = "string"
|
||||||
|
default = "40"
|
||||||
|
description = "Size of the disk in GB"
|
||||||
|
}
|
||||||
|
|
||||||
|
# configuration
|
||||||
|
|
||||||
|
variable "kubeconfig" {
|
||||||
|
type = "string"
|
||||||
|
description = "Generated Kubelet kubeconfig"
|
||||||
|
}
|
||||||
|
|
||||||
|
variable "ssh_authorized_key" {
|
||||||
|
type = "string"
|
||||||
|
description = "SSH public key for user 'core'"
|
||||||
|
}
|
||||||
|
|
||||||
|
variable "service_cidr" {
|
||||||
|
description = <<EOD
|
||||||
|
CIDR IPv4 range to assign Kubernetes services.
|
||||||
|
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
|
||||||
|
EOD
|
||||||
|
|
||||||
|
type = "string"
|
||||||
|
default = "10.3.0.0/16"
|
||||||
|
}
|
||||||
|
|
||||||
|
variable "cluster_domain_suffix" {
|
||||||
|
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
|
||||||
|
type = "string"
|
||||||
|
default = "cluster.local"
|
||||||
|
}
|
74
aws/container-linux/kubernetes/workers/workers.tf
Normal file
74
aws/container-linux/kubernetes/workers/workers.tf
Normal file
@ -0,0 +1,74 @@
|
|||||||
|
# Workers AutoScaling Group
|
||||||
|
resource "aws_autoscaling_group" "workers" {
|
||||||
|
name = "${var.name}-worker ${aws_launch_configuration.worker.name}"
|
||||||
|
|
||||||
|
# count
|
||||||
|
desired_capacity = "${var.count}"
|
||||||
|
min_size = "${var.count}"
|
||||||
|
max_size = "${var.count + 2}"
|
||||||
|
default_cooldown = 30
|
||||||
|
health_check_grace_period = 30
|
||||||
|
|
||||||
|
# network
|
||||||
|
vpc_zone_identifier = ["${var.subnet_ids}"]
|
||||||
|
|
||||||
|
# template
|
||||||
|
launch_configuration = "${aws_launch_configuration.worker.name}"
|
||||||
|
|
||||||
|
# target groups to which instances should be added
|
||||||
|
target_group_arns = [
|
||||||
|
"${aws_lb_target_group.workers-http.id}",
|
||||||
|
"${aws_lb_target_group.workers-https.id}",
|
||||||
|
]
|
||||||
|
|
||||||
|
lifecycle {
|
||||||
|
# override the default destroy and replace update behavior
|
||||||
|
create_before_destroy = true
|
||||||
|
}
|
||||||
|
|
||||||
|
tags = [{
|
||||||
|
key = "Name"
|
||||||
|
value = "${var.name}-worker"
|
||||||
|
propagate_at_launch = true
|
||||||
|
}]
|
||||||
|
}
|
||||||
|
|
||||||
|
# Worker template
|
||||||
|
resource "aws_launch_configuration" "worker" {
|
||||||
|
image_id = "${data.aws_ami.coreos.image_id}"
|
||||||
|
instance_type = "${var.instance_type}"
|
||||||
|
|
||||||
|
user_data = "${data.ct_config.worker_ign.rendered}"
|
||||||
|
|
||||||
|
# storage
|
||||||
|
root_block_device {
|
||||||
|
volume_type = "standard"
|
||||||
|
volume_size = "${var.disk_size}"
|
||||||
|
}
|
||||||
|
|
||||||
|
# network
|
||||||
|
security_groups = ["${var.security_groups}"]
|
||||||
|
|
||||||
|
lifecycle {
|
||||||
|
// Override the default destroy and replace update behavior
|
||||||
|
create_before_destroy = true
|
||||||
|
ignore_changes = ["image_id"]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
# Worker Container Linux Config
|
||||||
|
data "template_file" "worker_config" {
|
||||||
|
template = "${file("${path.module}/cl/worker.yaml.tmpl")}"
|
||||||
|
|
||||||
|
vars = {
|
||||||
|
kubeconfig = "${indent(10, var.kubeconfig)}"
|
||||||
|
ssh_authorized_key = "${var.ssh_authorized_key}"
|
||||||
|
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
|
||||||
|
cluster_domain_suffix = "${var.cluster_domain_suffix}"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
data "ct_config" "worker_ign" {
|
||||||
|
content = "${data.template_file.worker_config.rendered}"
|
||||||
|
pretty_print = false
|
||||||
|
}
|
@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
|||||||
|
|
||||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||||
|
|
||||||
* Kubernetes v1.9.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
* Kubernetes v1.9.4 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
||||||
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||||
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||||
|
|
||||||
## Docs
|
## Docs
|
||||||
|
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
||||||
module "bootkube" {
|
module "bootkube" {
|
||||||
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=b83e321b350ac549c45ed6a05ffd8683336fb9f4"
|
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=c5fc93d95fe4993511656cdd6372afbd1307f08f"
|
||||||
|
|
||||||
cluster_name = "${var.cluster_name}"
|
cluster_name = "${var.cluster_name}"
|
||||||
api_servers = ["${var.k8s_domain_name}"]
|
api_servers = ["${var.k8s_domain_name}"]
|
||||||
|
@ -7,7 +7,7 @@ systemd:
|
|||||||
- name: 40-etcd-cluster.conf
|
- name: 40-etcd-cluster.conf
|
||||||
contents: |
|
contents: |
|
||||||
[Service]
|
[Service]
|
||||||
Environment="ETCD_IMAGE_TAG=v3.2.13"
|
Environment="ETCD_IMAGE_TAG=v3.3.2"
|
||||||
Environment="ETCD_NAME=${etcd_name}"
|
Environment="ETCD_NAME=${etcd_name}"
|
||||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
|
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
|
||||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
|
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
|
||||||
@ -117,7 +117,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.4
|
||||||
- path: /etc/hostname
|
- path: /etc/hostname
|
||||||
filesystem: root
|
filesystem: root
|
||||||
mode: 0644
|
mode: 0644
|
||||||
@ -144,7 +144,7 @@ storage:
|
|||||||
# Move experimental manifests
|
# Move experimental manifests
|
||||||
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
||||||
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
||||||
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.9.1}"
|
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
|
||||||
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
||||||
exec /usr/bin/rkt run \
|
exec /usr/bin/rkt run \
|
||||||
--trust-keys-from-https \
|
--trust-keys-from-https \
|
||||||
|
@ -82,7 +82,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.4
|
||||||
- path: /etc/hostname
|
- path: /etc/hostname
|
||||||
filesystem: root
|
filesystem: root
|
||||||
mode: 0644
|
mode: 0644
|
||||||
|
@ -3,7 +3,7 @@ resource "matchbox_group" "container-linux-install" {
|
|||||||
count = "${length(var.controller_names) + length(var.worker_names)}"
|
count = "${length(var.controller_names) + length(var.worker_names)}"
|
||||||
|
|
||||||
name = "${format("container-linux-install-%s", element(concat(var.controller_names, var.worker_names), count.index))}"
|
name = "${format("container-linux-install-%s", element(concat(var.controller_names, var.worker_names), count.index))}"
|
||||||
profile = "${var.cached_install == "true" ? matchbox_profile.cached-container-linux-install.name : matchbox_profile.container-linux-install.name}"
|
profile = "${var.cached_install == "true" ? element(matchbox_profile.cached-container-linux-install.*.name, count.index) : element(matchbox_profile.container-linux-install.*.name, count.index)}"
|
||||||
|
|
||||||
selector {
|
selector {
|
||||||
mac = "${element(concat(var.controller_macs, var.worker_macs), count.index)}"
|
mac = "${element(concat(var.controller_macs, var.worker_macs), count.index)}"
|
||||||
|
@ -1,6 +1,8 @@
|
|||||||
// Container Linux Install profile (from release.core-os.net)
|
// Container Linux Install profile (from release.core-os.net)
|
||||||
resource "matchbox_profile" "container-linux-install" {
|
resource "matchbox_profile" "container-linux-install" {
|
||||||
name = "container-linux-install"
|
count = "${length(var.controller_names) + length(var.worker_names)}"
|
||||||
|
name = "${format("%s-container-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index))}"
|
||||||
|
|
||||||
kernel = "http://${var.container_linux_channel}.release.core-os.net/amd64-usr/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
|
kernel = "http://${var.container_linux_channel}.release.core-os.net/amd64-usr/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
|
||||||
|
|
||||||
initrd = [
|
initrd = [
|
||||||
@ -16,10 +18,12 @@ resource "matchbox_profile" "container-linux-install" {
|
|||||||
"${var.kernel_args}",
|
"${var.kernel_args}",
|
||||||
]
|
]
|
||||||
|
|
||||||
container_linux_config = "${data.template_file.container-linux-install-config.rendered}"
|
container_linux_config = "${element(data.template_file.container-linux-install-configs.*.rendered, count.index)}"
|
||||||
}
|
}
|
||||||
|
|
||||||
data "template_file" "container-linux-install-config" {
|
data "template_file" "container-linux-install-configs" {
|
||||||
|
count = "${length(var.controller_names) + length(var.worker_names)}"
|
||||||
|
|
||||||
template = "${file("${path.module}/cl/container-linux-install.yaml.tmpl")}"
|
template = "${file("${path.module}/cl/container-linux-install.yaml.tmpl")}"
|
||||||
|
|
||||||
vars {
|
vars {
|
||||||
@ -37,7 +41,9 @@ data "template_file" "container-linux-install-config" {
|
|||||||
// Container Linux Install profile (from matchbox /assets cache)
|
// Container Linux Install profile (from matchbox /assets cache)
|
||||||
// Note: Admin must have downloaded container_linux_version into matchbox assets.
|
// Note: Admin must have downloaded container_linux_version into matchbox assets.
|
||||||
resource "matchbox_profile" "cached-container-linux-install" {
|
resource "matchbox_profile" "cached-container-linux-install" {
|
||||||
name = "cached-container-linux-install"
|
count = "${length(var.controller_names) + length(var.worker_names)}"
|
||||||
|
name = "${format("%s-cached-container-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index))}"
|
||||||
|
|
||||||
kernel = "/assets/coreos/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
|
kernel = "/assets/coreos/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
|
||||||
|
|
||||||
initrd = [
|
initrd = [
|
||||||
@ -53,10 +59,12 @@ resource "matchbox_profile" "cached-container-linux-install" {
|
|||||||
"${var.kernel_args}",
|
"${var.kernel_args}",
|
||||||
]
|
]
|
||||||
|
|
||||||
container_linux_config = "${data.template_file.cached-container-linux-install-config.rendered}"
|
container_linux_config = "${element(data.template_file.cached-container-linux-install-configs.*.rendered, count.index)}"
|
||||||
}
|
}
|
||||||
|
|
||||||
data "template_file" "cached-container-linux-install-config" {
|
data "template_file" "cached-container-linux-install-configs" {
|
||||||
|
count = "${length(var.controller_names) + length(var.worker_names)}"
|
||||||
|
|
||||||
template = "${file("${path.module}/cl/container-linux-install.yaml.tmpl")}"
|
template = "${file("${path.module}/cl/container-linux-install.yaml.tmpl")}"
|
||||||
|
|
||||||
vars {
|
vars {
|
||||||
|
@ -24,7 +24,7 @@ variable "ssh_authorized_key" {
|
|||||||
}
|
}
|
||||||
|
|
||||||
# Machines
|
# Machines
|
||||||
# Terraform's crude "type system" does properly support lists of maps so we do this.
|
# Terraform's crude "type system" does not properly support lists of maps so we do this.
|
||||||
|
|
||||||
variable "controller_names" {
|
variable "controller_names" {
|
||||||
type = "list"
|
type = "list"
|
||||||
|
@ -98,7 +98,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.4
|
||||||
- path: /etc/hostname
|
- path: /etc/hostname
|
||||||
filesystem: root
|
filesystem: root
|
||||||
mode: 0644
|
mode: 0644
|
||||||
|
@ -12,9 +12,6 @@ resource "matchbox_group" "workers" {
|
|||||||
domain_name = "${element(var.worker_domains, count.index)}"
|
domain_name = "${element(var.worker_domains, count.index)}"
|
||||||
etcd_endpoints = "${join(",", formatlist("%s:2379", var.controller_domains))}"
|
etcd_endpoints = "${join(",", formatlist("%s:2379", var.controller_domains))}"
|
||||||
|
|
||||||
# TODO
|
|
||||||
etcd_on_host = "true"
|
|
||||||
k8s_etcd_service_ip = "10.3.0.15"
|
|
||||||
k8s_dns_service_ip = "${var.kube_dns_service_ip}"
|
k8s_dns_service_ip = "${var.kube_dns_service_ip}"
|
||||||
cluster_domain_suffix = "${var.cluster_domain_suffix}"
|
cluster_domain_suffix = "${var.cluster_domain_suffix}"
|
||||||
ssh_authorized_key = "${var.ssh_authorized_key}"
|
ssh_authorized_key = "${var.ssh_authorized_key}"
|
||||||
|
@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
|||||||
|
|
||||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||||
|
|
||||||
* Kubernetes v1.9.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
* Kubernetes v1.9.4 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
|
||||||
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||||
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||||
|
|
||||||
## Docs
|
## Docs
|
||||||
|
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
# Self-hosted Kubernetes assets (kubeconfig, manifests)
|
||||||
module "bootkube" {
|
module "bootkube" {
|
||||||
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=b83e321b350ac549c45ed6a05ffd8683336fb9f4"
|
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=c5fc93d95fe4993511656cdd6372afbd1307f08f"
|
||||||
|
|
||||||
cluster_name = "${var.cluster_name}"
|
cluster_name = "${var.cluster_name}"
|
||||||
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
|
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
|
||||||
|
@ -7,7 +7,7 @@ systemd:
|
|||||||
- name: 40-etcd-cluster.conf
|
- name: 40-etcd-cluster.conf
|
||||||
contents: |
|
contents: |
|
||||||
[Service]
|
[Service]
|
||||||
Environment="ETCD_IMAGE_TAG=v3.2.13"
|
Environment="ETCD_IMAGE_TAG=v3.3.2"
|
||||||
Environment="ETCD_NAME=${etcd_name}"
|
Environment="ETCD_NAME=${etcd_name}"
|
||||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
||||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
||||||
@ -77,6 +77,7 @@ systemd:
|
|||||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
|
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
|
||||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
|
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
|
||||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||||
|
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||||
@ -93,7 +94,8 @@ systemd:
|
|||||||
--network-plugin=cni \
|
--network-plugin=cni \
|
||||||
--node-labels=node-role.kubernetes.io/master \
|
--node-labels=node-role.kubernetes.io/master \
|
||||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||||
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule
|
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
|
||||||
|
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||||
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
|
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
|
||||||
Restart=always
|
Restart=always
|
||||||
RestartSec=10
|
RestartSec=10
|
||||||
@ -120,7 +122,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.4
|
||||||
- path: /etc/sysctl.d/max-user-watches.conf
|
- path: /etc/sysctl.d/max-user-watches.conf
|
||||||
filesystem: root
|
filesystem: root
|
||||||
contents:
|
contents:
|
||||||
@ -141,7 +143,7 @@ storage:
|
|||||||
# Move experimental manifests
|
# Move experimental manifests
|
||||||
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
|
||||||
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
|
||||||
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.9.1}"
|
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
|
||||||
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
|
||||||
exec /usr/bin/rkt run \
|
exec /usr/bin/rkt run \
|
||||||
--trust-keys-from-https \
|
--trust-keys-from-https \
|
||||||
|
@ -53,6 +53,7 @@ systemd:
|
|||||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
|
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
|
||||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
|
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
|
||||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||||
|
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||||
@ -68,7 +69,8 @@ systemd:
|
|||||||
--lock-file=/var/run/lock/kubelet.lock \
|
--lock-file=/var/run/lock/kubelet.lock \
|
||||||
--network-plugin=cni \
|
--network-plugin=cni \
|
||||||
--node-labels=node-role.kubernetes.io/node \
|
--node-labels=node-role.kubernetes.io/node \
|
||||||
--pod-manifest-path=/etc/kubernetes/manifests
|
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||||
|
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||||
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
|
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
|
||||||
Restart=always
|
Restart=always
|
||||||
RestartSec=5
|
RestartSec=5
|
||||||
@ -94,7 +96,7 @@ storage:
|
|||||||
contents:
|
contents:
|
||||||
inline: |
|
inline: |
|
||||||
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
|
||||||
KUBELET_IMAGE_TAG=v1.9.1
|
KUBELET_IMAGE_TAG=v1.9.4
|
||||||
- path: /etc/sysctl.d/max-user-watches.conf
|
- path: /etc/sysctl.d/max-user-watches.conf
|
||||||
filesystem: root
|
filesystem: root
|
||||||
contents:
|
contents:
|
||||||
@ -112,7 +114,7 @@ storage:
|
|||||||
--volume config,kind=host,source=/etc/kubernetes \
|
--volume config,kind=host,source=/etc/kubernetes \
|
||||||
--mount volume=config,target=/etc/kubernetes \
|
--mount volume=config,target=/etc/kubernetes \
|
||||||
--insecure-options=image \
|
--insecure-options=image \
|
||||||
docker://gcr.io/google_containers/hyperkube:v1.9.1 \
|
docker://gcr.io/google_containers/hyperkube:v1.9.4 \
|
||||||
--net=host \
|
--net=host \
|
||||||
--dns=host \
|
--dns=host \
|
||||||
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
||||||
|
@ -45,7 +45,7 @@ resource "digitalocean_droplet" "controllers" {
|
|||||||
private_networking = true
|
private_networking = true
|
||||||
|
|
||||||
user_data = "${element(data.ct_config.controller_ign.*.rendered, count.index)}"
|
user_data = "${element(data.ct_config.controller_ign.*.rendered, count.index)}"
|
||||||
ssh_keys = "${var.ssh_fingerprints}"
|
ssh_keys = ["${var.ssh_fingerprints}"]
|
||||||
|
|
||||||
tags = [
|
tags = [
|
||||||
"${digitalocean_tag.controllers.id}",
|
"${digitalocean_tag.controllers.id}",
|
||||||
|
@ -22,12 +22,12 @@ resource "digitalocean_firewall" "rules" {
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
protocol = "udp"
|
protocol = "udp"
|
||||||
port_range = "all"
|
port_range = "1-65535"
|
||||||
source_tags = ["${digitalocean_tag.controllers.name}", "${digitalocean_tag.workers.name}"]
|
source_tags = ["${digitalocean_tag.controllers.name}", "${digitalocean_tag.workers.name}"]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
protocol = "tcp"
|
protocol = "tcp"
|
||||||
port_range = "all"
|
port_range = "1-65535"
|
||||||
source_tags = ["${digitalocean_tag.controllers.name}", "${digitalocean_tag.workers.name}"]
|
source_tags = ["${digitalocean_tag.controllers.name}", "${digitalocean_tag.workers.name}"]
|
||||||
},
|
},
|
||||||
]
|
]
|
||||||
@ -35,17 +35,18 @@ resource "digitalocean_firewall" "rules" {
|
|||||||
# allow all outbound traffic
|
# allow all outbound traffic
|
||||||
outbound_rule = [
|
outbound_rule = [
|
||||||
{
|
{
|
||||||
protocol = "icmp"
|
protocol = "tcp"
|
||||||
|
port_range = "1-65535"
|
||||||
destination_addresses = ["0.0.0.0/0", "::/0"]
|
destination_addresses = ["0.0.0.0/0", "::/0"]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
protocol = "udp"
|
protocol = "udp"
|
||||||
port_range = "all"
|
port_range = "1-65535"
|
||||||
destination_addresses = ["0.0.0.0/0", "::/0"]
|
destination_addresses = ["0.0.0.0/0", "::/0"]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
protocol = "tcp"
|
protocol = "icmp"
|
||||||
port_range = "all"
|
port_range = "1-65535"
|
||||||
destination_addresses = ["0.0.0.0/0", "::/0"]
|
destination_addresses = ["0.0.0.0/0", "::/0"]
|
||||||
},
|
},
|
||||||
]
|
]
|
||||||
|
@ -5,7 +5,7 @@ terraform {
|
|||||||
}
|
}
|
||||||
|
|
||||||
provider "digitalocean" {
|
provider "digitalocean" {
|
||||||
version = "0.1.2"
|
version = "~> 0.1.2"
|
||||||
}
|
}
|
||||||
|
|
||||||
provider "local" {
|
provider "local" {
|
||||||
|
@ -27,8 +27,8 @@ variable "controller_count" {
|
|||||||
|
|
||||||
variable "controller_type" {
|
variable "controller_type" {
|
||||||
type = "string"
|
type = "string"
|
||||||
default = "2gb"
|
default = "s-2vcpu-2gb"
|
||||||
description = "Digital Ocean droplet size (e.g. 2gb (min), 4gb, 8gb)."
|
description = "Digital Ocean droplet size (e.g. s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb)."
|
||||||
}
|
}
|
||||||
|
|
||||||
variable "worker_count" {
|
variable "worker_count" {
|
||||||
@ -39,8 +39,8 @@ variable "worker_count" {
|
|||||||
|
|
||||||
variable "worker_type" {
|
variable "worker_type" {
|
||||||
type = "string"
|
type = "string"
|
||||||
default = "512mb"
|
default = "s-1vcpu-1gb"
|
||||||
description = "Digital Ocean droplet size (e.g. 512mb, 1gb, 2gb, 4gb)"
|
description = "Digital Ocean droplet size (e.g. s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb)"
|
||||||
}
|
}
|
||||||
|
|
||||||
variable "ssh_fingerprints" {
|
variable "ssh_fingerprints" {
|
||||||
@ -82,4 +82,3 @@ variable "cluster_domain_suffix" {
|
|||||||
type = "string"
|
type = "string"
|
||||||
default = "cluster.local"
|
default = "cluster.local"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -26,7 +26,7 @@ resource "digitalocean_droplet" "workers" {
|
|||||||
private_networking = true
|
private_networking = true
|
||||||
|
|
||||||
user_data = "${data.ct_config.worker_ign.rendered}"
|
user_data = "${data.ct_config.worker_ign.rendered}"
|
||||||
ssh_keys = "${var.ssh_fingerprints}"
|
ssh_keys = ["${var.ssh_fingerprints}"]
|
||||||
|
|
||||||
tags = [
|
tags = [
|
||||||
"${digitalocean_tag.workers.id}",
|
"${digitalocean_tag.workers.id}",
|
||||||
@ -44,7 +44,6 @@ data "template_file" "worker_config" {
|
|||||||
|
|
||||||
vars = {
|
vars = {
|
||||||
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
|
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
|
||||||
k8s_etcd_service_ip = "${cidrhost(var.service_cidr, 15)}"
|
|
||||||
cluster_domain_suffix = "${var.cluster_domain_suffix}"
|
cluster_domain_suffix = "${var.cluster_domain_suffix}"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -18,7 +18,7 @@ kubectl apply -f addons/cluo -R
|
|||||||
$ kubectl get nodes --show-labels
|
$ kubectl get nodes --show-labels
|
||||||
...
|
...
|
||||||
container-linux-update.v1.coreos.com/group=stable
|
container-linux-update.v1.coreos.com/group=stable
|
||||||
container-linux-update.v1.coreos.com/version=1576.5.0
|
container-linux-update.v1.coreos.com/version=1632.3.0
|
||||||
```
|
```
|
||||||
|
|
||||||
`update-operator` ensures one node reboots at a time and that pods are drained prior to reboot.
|
`update-operator` ensures one node reboots at a time and that pods are drained prior to reboot.
|
||||||
|
@ -1,27 +0,0 @@
|
|||||||
# Kubernetes Dashboard
|
|
||||||
|
|
||||||
!!! warning
|
|
||||||
The Kubernetes Dashboard takes [unusual approaches](https://github.com/kubernetes/dashboard/wiki/Access-control#authorization-header) to security and is often a point of security escalations. We recommend you do don't deploy it and get familiar with `kubectl`, if possible.
|
|
||||||
|
|
||||||
The Kubernetes [Dashboard](https://github.com/kubernetes/dashboard) provides a web UI to manage a Kubernetes cluster for those who prefer an alternative to `kubectl`.
|
|
||||||
|
|
||||||
## Create
|
|
||||||
|
|
||||||
Create the dashboard deployment and service.
|
|
||||||
|
|
||||||
```
|
|
||||||
kubectl apply -f addons/dashboard -R
|
|
||||||
```
|
|
||||||
|
|
||||||
## Access
|
|
||||||
|
|
||||||
Use `kubectl` to authenticate to the apiserver and create a local port forward to the remote port on the dashboard pod.
|
|
||||||
|
|
||||||
```sh
|
|
||||||
kubectl get pods -n kube-system
|
|
||||||
kubectl port-forward POD [LOCAL_PORT:]REMOTE_PORT
|
|
||||||
kubectl port-forward kubernetes-dashboard-id 9090 -n kube-system
|
|
||||||
```
|
|
||||||
|
|
||||||
!!! tip
|
|
||||||
If you'd like to expose the Dashboard via Ingress and add authentication, use a suitable OAuth2 proxy sidecar and pick your favorite OAuth2 provider.
|
|
20
docs/addons/grafana.md
Normal file
20
docs/addons/grafana.md
Normal file
@ -0,0 +1,20 @@
|
|||||||
|
## Grafana
|
||||||
|
|
||||||
|
Grafana can be used to build dashboards and visualizations that use Prometheus as the datasource. Create the grafana deployment and service.
|
||||||
|
|
||||||
|
```
|
||||||
|
kubectl apply -f addons/grafana -R
|
||||||
|
```
|
||||||
|
|
||||||
|
Use `kubectl` to authenticate to the apiserver and create a local port-forward to the Grafana pod.
|
||||||
|
|
||||||
|
```
|
||||||
|
kubectl port-forward grafana-POD-ID 8080 -n monitoring
|
||||||
|
```
|
||||||
|
|
||||||
|
Visit [127.0.0.1:8080](http://127.0.0.1:8080) to view the bundled dashboards.
|
||||||
|
|
||||||
|

|
||||||
|

|
||||||
|

|
||||||
|
|
@ -6,6 +6,5 @@ Every Typhoon cluster is verified to work well with several post-install addons.
|
|||||||
* Nginx [Ingress Controller](ingress.md)
|
* Nginx [Ingress Controller](ingress.md)
|
||||||
* [Heapster](heapster.md)
|
* [Heapster](heapster.md)
|
||||||
* [Prometheus](prometheus.md)
|
* [Prometheus](prometheus.md)
|
||||||
* [Grafana](prometheus.md#grafana)
|
* [Grafana](grafana.md)
|
||||||
* Kubernetes [Dashboard](dashboard.md)
|
|
||||||
|
|
||||||
|
@ -20,7 +20,7 @@ On Kubernetes clusters, Prometheus is run as a Deployment, configured with a Con
|
|||||||
kubectl apply -f addons/prometheus -R
|
kubectl apply -f addons/prometheus -R
|
||||||
```
|
```
|
||||||
|
|
||||||
The ConfigMap configures Prometheus to target apiserver endpoints, node metrics, cAdvisor metrics, and exporters. By default, data is kept in an `emptyDir` so it is persisted until the pod is rescheduled.
|
The ConfigMap configures Prometheus to discover apiservers, kubelets, cAdvisor, services, endpoints, and exporters. By default, data is kept in an `emptyDir` so it is persisted until the pod is rescheduled.
|
||||||
|
|
||||||
### Exporters
|
### Exporters
|
||||||
|
|
||||||
@ -32,7 +32,7 @@ Exporters expose metrics for 3rd-party systems that don't natively expose Promet
|
|||||||
|
|
||||||
### Queries and Alerts
|
### Queries and Alerts
|
||||||
|
|
||||||
Prometheus provides a simplistic UI for querying metrics and viewing alerts. Use `kubectl` to authenticate to the apiserver and create a local port-forward to the Prometheus pod.
|
Prometheus provides a basic UI for querying metrics and viewing alerts. Use `kubectl` to authenticate to the apiserver and create a local port-forward to the Prometheus pod.
|
||||||
|
|
||||||
```
|
```
|
||||||
kubectl get pods -n monitoring
|
kubectl get pods -n monitoring
|
||||||
@ -47,21 +47,4 @@ Visit [127.0.0.1:9090](http://127.0.0.1:9090) to query [expressions](http://127.
|
|||||||
<br/>
|
<br/>
|
||||||

|

|
||||||
|
|
||||||
## Grafana
|
Use [Grafana](/addons/grafana.md) to view or build dashboards that use Prometheus as the datasource.
|
||||||
|
|
||||||
Grafana can be used to build dashboards and rich visualizations that use Prometheus as the datasource. Create the grafana deployment and service.
|
|
||||||
|
|
||||||
```
|
|
||||||
kubectl apply -f addons/grafana -R
|
|
||||||
```
|
|
||||||
|
|
||||||
Use `kubectl` to authenticate to the apiserver and create a local port-forward to the Grafana pod.
|
|
||||||
|
|
||||||
```
|
|
||||||
kubectl port-forward grafana-POD-ID 8080 -n monitoring
|
|
||||||
```
|
|
||||||
|
|
||||||
Visit [127.0.0.1:8080](http://127.0.0.1:8080), add the prometheus data-source (http://prometheus.monitoring.svc.cluster.local), and import your desired dashboard (e.g. [Grafana Dashboard 315](https://grafana.com/dashboards/315)).
|
|
||||||
|
|
||||||

|
|
||||||
|
|
||||||
|
6
docs/advanced/overview.md
Normal file
6
docs/advanced/overview.md
Normal file
@ -0,0 +1,6 @@
|
|||||||
|
# Advanced
|
||||||
|
|
||||||
|
Typhoon clusters offer several advanced features for skilled users.
|
||||||
|
|
||||||
|
* [Customization](customization.md)
|
||||||
|
* [Worker Pools](worker-pools.md)
|
149
docs/advanced/worker-pools.md
Normal file
149
docs/advanced/worker-pools.md
Normal file
@ -0,0 +1,149 @@
|
|||||||
|
# Worker Pools
|
||||||
|
|
||||||
|
Typhoon AWS and Google Cloud allow additional groups of workers to be defined and joined to a cluster. For example, add worker pools of instances with different types, disk sizes, Container Linux channels, or preemptibility modes.
|
||||||
|
|
||||||
|
Internal Terraform Modules:
|
||||||
|
|
||||||
|
* `aws/container-linux/kubernetes/workers`
|
||||||
|
* `google-cloud/container-linux/kubernetes/workers`
|
||||||
|
|
||||||
|
## AWS
|
||||||
|
|
||||||
|
Create a cluster following the AWS [tutorial](../aws.md#cluster). Define a worker pool using the AWS internal `workers` module.
|
||||||
|
|
||||||
|
```tf
|
||||||
|
module "tempest-worker-pool" {
|
||||||
|
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.9.4"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
aws = "aws.default"
|
||||||
|
}
|
||||||
|
|
||||||
|
# AWS
|
||||||
|
vpc_id = "${module.aws-tempest.vpc_id}"
|
||||||
|
subnet_ids = "${module.aws-tempest.subnet_ids}"
|
||||||
|
security_groups = "${module.aws-tempest.worker_security_groups}"
|
||||||
|
|
||||||
|
# configuration
|
||||||
|
name = "tempest-worker-pool"
|
||||||
|
kubeconfig = "${module.aws-tempest.kubeconfig}"
|
||||||
|
ssh_authorized_key = "${var.ssh_authorized_key}"
|
||||||
|
|
||||||
|
count = 2
|
||||||
|
instance_type = "m5.large"
|
||||||
|
os_channel = "beta"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Apply the change.
|
||||||
|
|
||||||
|
```
|
||||||
|
terraform apply
|
||||||
|
```
|
||||||
|
|
||||||
|
Verify an auto-scaling group of workers join the cluster within a few minutes.
|
||||||
|
|
||||||
|
### Variables
|
||||||
|
|
||||||
|
The AWS internal `workers` module supports a number of [variables](https://github.com/poseidon/typhoon/blob/master/aws/container-linux/kubernetes/workers/variables.tf).
|
||||||
|
|
||||||
|
#### Required
|
||||||
|
|
||||||
|
| Name | Description | Example |
|
||||||
|
|:-----|:------------|:--------|
|
||||||
|
| vpc_id | Must be set to `vpc_id` output by cluster | "${module.cluster.vpc_id}" |
|
||||||
|
| subnet_ids | Must be set to `subnet_ids` output by cluster | "${module.cluster.subnet_ids}" |
|
||||||
|
| security_groups | Must be set to `worker_security_groups` output by cluster | "${module.cluster.worker_security_groups}" |
|
||||||
|
| name | Unique name (distinct from cluster name) | "tempest-m5s" |
|
||||||
|
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
|
||||||
|
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
|
||||||
|
|
||||||
|
#### Optional
|
||||||
|
|
||||||
|
| Name | Description | Default | Example |
|
||||||
|
|:-----|:------------|:--------|:--------|
|
||||||
|
| count | Number of instances | 1 | 3 |
|
||||||
|
| instance_type | EC2 instance type | "t2.small" | "t2.medium" |
|
||||||
|
| os_channel | Container Linux AMI channel | stable| "beta", "alpha" |
|
||||||
|
| disk_size | Size of the disk in GB | 40 | 100 |
|
||||||
|
| service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||||
|
| cluster_domain_suffix | Must match `cluster_domain_suffix` of cluster | "cluster.local" | "k8s.example.com" |
|
||||||
|
|
||||||
|
Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/).
|
||||||
|
|
||||||
|
## Google Cloud
|
||||||
|
|
||||||
|
Create a cluster following the Google Cloud [tutorial](../google-cloud.md#cluster). Define a worker pool using the Google Cloud internal `workers` module.
|
||||||
|
|
||||||
|
```tf
|
||||||
|
module "yavin-worker-pool" {
|
||||||
|
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.9.4"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
google = "google.default"
|
||||||
|
}
|
||||||
|
|
||||||
|
# Google Cloud
|
||||||
|
region = "us-central1"
|
||||||
|
network = "${module.google-cloud-yavin.network_name}"
|
||||||
|
cluster_name = "yavin"
|
||||||
|
|
||||||
|
# configuration
|
||||||
|
name = "yavin-16x"
|
||||||
|
kubeconfig = "${module.google-cloud-yavin.kubeconfig}"
|
||||||
|
ssh_authorized_key = "${var.ssh_authorized_key}"
|
||||||
|
|
||||||
|
count = 2
|
||||||
|
machine_type = "n1-standard-16"
|
||||||
|
os_image = "coreos-beta"
|
||||||
|
preemptible = true
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Apply the change.
|
||||||
|
|
||||||
|
```
|
||||||
|
terraform apply
|
||||||
|
```
|
||||||
|
|
||||||
|
Verify a managed instance group of workers joins the cluster within a few minutes.
|
||||||
|
|
||||||
|
```
|
||||||
|
$ kubectl get nodes
|
||||||
|
NAME STATUS AGE VERSION
|
||||||
|
yavin-controller-0.c.example-com.internal Ready 6m v1.9.4
|
||||||
|
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.4
|
||||||
|
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.4
|
||||||
|
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.9.4
|
||||||
|
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.9.4
|
||||||
|
```
|
||||||
|
|
||||||
|
### Variables
|
||||||
|
|
||||||
|
The Google Cloud internal `workers` module supports a number of [variables](https://github.com/poseidon/typhoon/blob/master/google-cloud/container-linux/kubernetes/workers/variables.tf).
|
||||||
|
|
||||||
|
#### Required
|
||||||
|
|
||||||
|
| Name | Description | Example |
|
||||||
|
|:-----|:------------|:--------|
|
||||||
|
| region | Must be set to `region` of cluster | "us-central1" |
|
||||||
|
| network | Must be set to `network_name` output by cluster | "${module.cluster.network_name}" |
|
||||||
|
| name | Unique name (distinct from cluster name) | "yavin-16x" |
|
||||||
|
| cluster_name | Must be set to `cluster_name` of cluster | "yavin" |
|
||||||
|
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
|
||||||
|
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
|
||||||
|
|
||||||
|
#### Optional
|
||||||
|
|
||||||
|
| Name | Description | Default | Example |
|
||||||
|
|:-----|:------------|:--------|:--------|
|
||||||
|
| count | Number of instances | 1 | 3 |
|
||||||
|
| machine_type | Compute instance machine type | "n1-standard-1" | See below |
|
||||||
|
| os_image | OS image for compute instances | "coreos-stable" | "coreos-alpha", "coreos-beta" |
|
||||||
|
| disk_size | Size of the disk in GB | 40 | 100 |
|
||||||
|
| preemptible | If true, Compute Engine will terminate instances randomly within 24 hours | false | true |
|
||||||
|
| service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||||
|
| cluster_domain_suffix | Must match `cluster_domain_suffix` of cluster | "cluster.local" | "k8s.example.com" |
|
||||||
|
|
||||||
|
Check the list of valid [machine types](https://cloud.google.com/compute/docs/machine-types).
|
||||||
|
|
59
docs/aws.md
59
docs/aws.md
@ -1,6 +1,6 @@
|
|||||||
# AWS
|
# AWS
|
||||||
|
|
||||||
In this tutorial, we'll create a Kubernetes v1.9.1 cluster on AWS.
|
In this tutorial, we'll create a Kubernetes v1.9.4 cluster on AWS.
|
||||||
|
|
||||||
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a VPC, gateway, subnets, auto-scaling groups of controllers and workers, network load balancers for controllers and workers, and security groups will be created.
|
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a VPC, gateway, subnets, auto-scaling groups of controllers and workers, network load balancers for controllers and workers, and security groups will be created.
|
||||||
|
|
||||||
@ -10,23 +10,23 @@ Controllers and workers are provisioned to run a `kubelet`. A one-time [bootkube
|
|||||||
|
|
||||||
* AWS Account and IAM credentials
|
* AWS Account and IAM credentials
|
||||||
* AWS Route53 DNS Zone (registered Domain Name or delegated subdomain)
|
* AWS Route53 DNS Zone (registered Domain Name or delegated subdomain)
|
||||||
* Terraform v0.10.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
* Terraform v0.11.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
||||||
|
|
||||||
## Terraform Setup
|
## Terraform Setup
|
||||||
|
|
||||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.10.x on your system.
|
Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
$ terraform version
|
$ terraform version
|
||||||
Terraform v0.10.7
|
Terraform v0.11.1
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.0/terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
|
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.1/terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
|
||||||
tar xzf terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
|
tar xzf terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
|
||||||
sudo mv terraform-provider-ct-v0.2.0-linux-amd64/terraform-provider-ct /usr/local/bin/
|
sudo mv terraform-provider-ct-v0.2.1-linux-amd64/terraform-provider-ct /usr/local/bin/
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the plugin to your `~/.terraformrc`.
|
Add the plugin to your `~/.terraformrc`.
|
||||||
@ -57,9 +57,32 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
provider "aws" {
|
provider "aws" {
|
||||||
|
version = "~> 1.5.0"
|
||||||
|
alias = "default"
|
||||||
|
|
||||||
region = "eu-central-1"
|
region = "eu-central-1"
|
||||||
shared_credentials_file = "/home/user/.config/aws/credentials"
|
shared_credentials_file = "/home/user/.config/aws/credentials"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
provider "local" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "null" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "template" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "tls" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Additional configuration options are described in the `aws` provider [docs](https://www.terraform.io/docs/providers/aws/).
|
Additional configuration options are described in the `aws` provider [docs](https://www.terraform.io/docs/providers/aws/).
|
||||||
@ -73,8 +96,16 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`.
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
module "aws-tempest" {
|
module "aws-tempest" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes"
|
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.9.4"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
aws = "aws.default"
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
cluster_name = "tempest"
|
cluster_name = "tempest"
|
||||||
|
|
||||||
# AWS
|
# AWS
|
||||||
@ -119,7 +150,7 @@ Get or update Terraform modules.
|
|||||||
$ terraform get # downloads missing modules
|
$ terraform get # downloads missing modules
|
||||||
$ terraform get --update # updates all modules
|
$ terraform get --update # updates all modules
|
||||||
Get: git::https://github.com/poseidon/typhoon (update)
|
Get: git::https://github.com/poseidon/typhoon (update)
|
||||||
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.9.1 (update)
|
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
|
||||||
```
|
```
|
||||||
|
|
||||||
Plan the resources to be created.
|
Plan the resources to be created.
|
||||||
@ -151,9 +182,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
|
|||||||
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
|
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
|
||||||
$ kubectl get nodes
|
$ kubectl get nodes
|
||||||
NAME STATUS AGE VERSION
|
NAME STATUS AGE VERSION
|
||||||
ip-10-0-12-221 Ready 34m v1.9.1
|
ip-10-0-12-221 Ready 34m v1.9.4
|
||||||
ip-10-0-19-112 Ready 34m v1.9.1
|
ip-10-0-19-112 Ready 34m v1.9.4
|
||||||
ip-10-0-4-22 Ready 34m v1.9.1
|
ip-10-0-4-22 Ready 34m v1.9.4
|
||||||
```
|
```
|
||||||
|
|
||||||
List the pods.
|
List the pods.
|
||||||
@ -179,10 +210,10 @@ kube-system pod-checkpointer-4kxtl-ip-10-0-12-221 1/1 Running 0
|
|||||||
|
|
||||||
## Going Further
|
## Going Further
|
||||||
|
|
||||||
Learn about [version pinning](concepts.md#versioning), [maintenance](topics/maintenance.md), and [addons](addons/overview.md).
|
Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md).
|
||||||
|
|
||||||
!!! note
|
!!! note
|
||||||
On Container Linux clusters, install the `container-linux-update-operator` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
||||||
|
|
||||||
## Variables
|
## Variables
|
||||||
|
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Bare-Metal
|
# Bare-Metal
|
||||||
|
|
||||||
In this tutorial, we'll network boot and provision a Kubernetes v1.9.1 cluster on bare-metal.
|
In this tutorial, we'll network boot and provision a Kubernetes v1.9.4 cluster on bare-metal.
|
||||||
|
|
||||||
First, we'll deploy a [Matchbox](https://github.com/coreos/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers.
|
First, we'll deploy a [Matchbox](https://github.com/coreos/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers.
|
||||||
|
|
||||||
@ -12,7 +12,7 @@ Controllers are provisioned as etcd peers and run `etcd-member` (etcd3) and `kub
|
|||||||
* PXE-enabled [network boot](https://coreos.com/matchbox/docs/latest/network-setup.html) environment
|
* PXE-enabled [network boot](https://coreos.com/matchbox/docs/latest/network-setup.html) environment
|
||||||
* Matchbox v0.6+ deployment with API enabled
|
* Matchbox v0.6+ deployment with API enabled
|
||||||
* Matchbox credentials `client.crt`, `client.key`, `ca.crt`
|
* Matchbox credentials `client.crt`, `client.key`, `ca.crt`
|
||||||
* Terraform v0.10.x and [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox) installed locally
|
* Terraform v0.11.x and [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox) installed locally
|
||||||
|
|
||||||
## Machines
|
## Machines
|
||||||
|
|
||||||
@ -109,11 +109,11 @@ Read about the [many ways](https://coreos.com/matchbox/docs/latest/network-setup
|
|||||||
|
|
||||||
## Terraform Setup
|
## Terraform Setup
|
||||||
|
|
||||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.10.x on your system.
|
Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
$ terraform version
|
$ terraform version
|
||||||
Terraform v0.10.7
|
Terraform v0.11.1
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox) plugin binary for your system.
|
Add the [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox) plugin binary for your system.
|
||||||
@ -149,6 +149,26 @@ provider "matchbox" {
|
|||||||
client_key = "${file("~/.config/matchbox/client.key")}"
|
client_key = "${file("~/.config/matchbox/client.key")}"
|
||||||
ca = "${file("~/.config/matchbox/ca.crt")}"
|
ca = "${file("~/.config/matchbox/ca.crt")}"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
provider "local" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "null" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "template" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "tls" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
## Cluster
|
## Cluster
|
||||||
@ -157,12 +177,19 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
module "bare-metal-mercury" {
|
module "bare-metal-mercury" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes"
|
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.9.4"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
# install
|
# install
|
||||||
matchbox_http_endpoint = "http://matchbox.example.com"
|
matchbox_http_endpoint = "http://matchbox.example.com"
|
||||||
container_linux_channel = "stable"
|
container_linux_channel = "stable"
|
||||||
container_linux_version = "1576.5.0"
|
container_linux_version = "1632.3.0"
|
||||||
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
|
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
|
||||||
|
|
||||||
# cluster
|
# cluster
|
||||||
@ -219,7 +246,7 @@ Get or update Terraform modules.
|
|||||||
$ terraform get # downloads missing modules
|
$ terraform get # downloads missing modules
|
||||||
$ terraform get --update # updates all modules
|
$ terraform get --update # updates all modules
|
||||||
Get: git::https://github.com/poseidon/typhoon (update)
|
Get: git::https://github.com/poseidon/typhoon (update)
|
||||||
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.9.1 (update)
|
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
|
||||||
```
|
```
|
||||||
|
|
||||||
Plan the resources to be created.
|
Plan the resources to be created.
|
||||||
@ -232,6 +259,7 @@ Plan: 55 to add, 0 to change, 0 to destroy.
|
|||||||
Apply the changes. Terraform will generate bootkube assets to `asset_dir` and create Matchbox profiles (e.g. controller, worker) and matching rules via the Matchbox API.
|
Apply the changes. Terraform will generate bootkube assets to `asset_dir` and create Matchbox profiles (e.g. controller, worker) and matching rules via the Matchbox API.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
|
$ terraform apply
|
||||||
module.bare-metal-mercury.null_resource.copy-kubeconfig.0: Provisioning with 'file'...
|
module.bare-metal-mercury.null_resource.copy-kubeconfig.0: Provisioning with 'file'...
|
||||||
module.bare-metal-mercury.null_resource.copy-etcd-secrets.0: Provisioning with 'file'...
|
module.bare-metal-mercury.null_resource.copy-etcd-secrets.0: Provisioning with 'file'...
|
||||||
module.bare-metal-mercury.null_resource.copy-kubeconfig.0: Still creating... (10s elapsed)
|
module.bare-metal-mercury.null_resource.copy-kubeconfig.0: Still creating... (10s elapsed)
|
||||||
@ -290,9 +318,9 @@ bootkube[5]: Tearing down temporary bootstrap control plane...
|
|||||||
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
|
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
|
||||||
$ kubectl get nodes
|
$ kubectl get nodes
|
||||||
NAME STATUS AGE VERSION
|
NAME STATUS AGE VERSION
|
||||||
node1.example.com Ready 11m v1.9.1
|
node1.example.com Ready 11m v1.9.4
|
||||||
node2.example.com Ready 11m v1.9.1
|
node2.example.com Ready 11m v1.9.4
|
||||||
node3.example.com Ready 11m v1.9.1
|
node3.example.com Ready 11m v1.9.4
|
||||||
```
|
```
|
||||||
|
|
||||||
List the pods.
|
List the pods.
|
||||||
@ -307,7 +335,6 @@ kube-system kube-apiserver-7336w 1/1 Running 0
|
|||||||
kube-system kube-controller-manager-3271970485-b9chx 1/1 Running 0 11m
|
kube-system kube-controller-manager-3271970485-b9chx 1/1 Running 0 11m
|
||||||
kube-system kube-controller-manager-3271970485-v30js 1/1 Running 1 11m
|
kube-system kube-controller-manager-3271970485-v30js 1/1 Running 1 11m
|
||||||
kube-system kube-dns-1187388186-mx9rt 3/3 Running 0 11m
|
kube-system kube-dns-1187388186-mx9rt 3/3 Running 0 11m
|
||||||
kube-system kube-etcd-network-checkpointer-q24f7 1/1 Running 0 11m
|
|
||||||
kube-system kube-proxy-50sd4 1/1 Running 0 11m
|
kube-system kube-proxy-50sd4 1/1 Running 0 11m
|
||||||
kube-system kube-proxy-bczhp 1/1 Running 0 11m
|
kube-system kube-proxy-bczhp 1/1 Running 0 11m
|
||||||
kube-system kube-proxy-mp2fw 1/1 Running 0 11m
|
kube-system kube-proxy-mp2fw 1/1 Running 0 11m
|
||||||
@ -319,10 +346,10 @@ kube-system pod-checkpointer-wf65d-node1.example.com 1/1 Running 0
|
|||||||
|
|
||||||
## Going Further
|
## Going Further
|
||||||
|
|
||||||
Learn about [version pinning](concepts.md#versioning), [maintenance](topics/maintenance.md), and [addons](addons/overview.md).
|
Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md).
|
||||||
|
|
||||||
!!! note
|
!!! note
|
||||||
On Container Linux clusters, install the `container-linux-update-operator` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
||||||
|
|
||||||
## Variables
|
## Variables
|
||||||
|
|
||||||
@ -332,7 +359,7 @@ Learn about [version pinning](concepts.md#versioning), [maintenance](topics/main
|
|||||||
|:-----|:------------|:--------|
|
|:-----|:------------|:--------|
|
||||||
| matchbox_http_endpoint | Matchbox HTTP read-only endpoint | http://matchbox.example.com:8080 |
|
| matchbox_http_endpoint | Matchbox HTTP read-only endpoint | http://matchbox.example.com:8080 |
|
||||||
| container_linux_channel | Container Linux channel | stable, beta, alpha |
|
| container_linux_channel | Container Linux channel | stable, beta, alpha |
|
||||||
| container_linux_version | Container Linux version of the kernel/initrd to PXE and the image to install | 1576.5.0 |
|
| container_linux_version | Container Linux version of the kernel/initrd to PXE and the image to install | 1632.3.0 |
|
||||||
| cluster_name | Cluster name | mercury |
|
| cluster_name | Cluster name | mercury |
|
||||||
| k8s_domain_name | FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint | "myk8s.example.com" |
|
| k8s_domain_name | FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint | "myk8s.example.com" |
|
||||||
| ssh_authorized_key | SSH public key for ~/.ssh/authorized_keys | "ssh-rsa AAAAB3Nz..." |
|
| ssh_authorized_key | SSH public key for ~/.ssh/authorized_keys | "ssh-rsa AAAAB3Nz..." |
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Digital Ocean
|
# Digital Ocean
|
||||||
|
|
||||||
In this tutorial, we'll create a Kubernetes v1.9.1 cluster on Digital Ocean.
|
In this tutorial, we'll create a Kubernetes v1.9.4 cluster on Digital Ocean.
|
||||||
|
|
||||||
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, firewall rules, DNS records, tags, and droplets for Kubernetes controllers and workers will be created.
|
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, firewall rules, DNS records, tags, and droplets for Kubernetes controllers and workers will be created.
|
||||||
|
|
||||||
@ -10,23 +10,23 @@ Controllers and workers are provisioned to run a `kubelet`. A one-time [bootkube
|
|||||||
|
|
||||||
* Digital Ocean Account and Token
|
* Digital Ocean Account and Token
|
||||||
* Digital Ocean Domain (registered Domain Name or delegated subdomain)
|
* Digital Ocean Domain (registered Domain Name or delegated subdomain)
|
||||||
* Terraform v0.10.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
* Terraform v0.11.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
||||||
|
|
||||||
## Terraform Setup
|
## Terraform Setup
|
||||||
|
|
||||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.10.x on your system.
|
Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
$ terraform version
|
$ terraform version
|
||||||
Terraform v0.10.7
|
Terraform v0.11.1
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.0/terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
|
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.1/terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
|
||||||
tar xzf terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
|
tar xzf terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
|
||||||
sudo mv terraform-provider-ct-v0.2.0-linux-amd64/terraform-provider-ct /usr/local/bin/
|
sudo mv terraform-provider-ct-v0.2.1-linux-amd64/terraform-provider-ct /usr/local/bin/
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the plugin to your `~/.terraformrc`.
|
Add the plugin to your `~/.terraformrc`.
|
||||||
@ -58,7 +58,29 @@ Configure the DigitalOcean provider to use your token in a `providers.tf` file.
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
provider "digitalocean" {
|
provider "digitalocean" {
|
||||||
|
version = "0.1.3"
|
||||||
token = "${chomp(file("~/.config/digital-ocean/token"))}"
|
token = "${chomp(file("~/.config/digital-ocean/token"))}"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "local" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "null" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "template" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "tls" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@ -68,7 +90,15 @@ Define a Kubernetes cluster using the module `digital-ocean/container-linux/kube
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
module "digital-ocean-nemo" {
|
module "digital-ocean-nemo" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes"
|
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.9.4"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
digitalocean = "digitalocean.default"
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
region = "nyc3"
|
region = "nyc3"
|
||||||
dns_zone = "digital-ocean.example.com"
|
dns_zone = "digital-ocean.example.com"
|
||||||
@ -76,9 +106,9 @@ module "digital-ocean-nemo" {
|
|||||||
cluster_name = "nemo"
|
cluster_name = "nemo"
|
||||||
image = "coreos-stable"
|
image = "coreos-stable"
|
||||||
controller_count = 1
|
controller_count = 1
|
||||||
controller_type = "2gb"
|
controller_type = "s-2vcpu-2gb"
|
||||||
worker_count = 2
|
worker_count = 2
|
||||||
worker_type = "512mb"
|
worker_type = "s-1vcpu-1gb"
|
||||||
ssh_fingerprints = ["d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7"]
|
ssh_fingerprints = ["d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7"]
|
||||||
|
|
||||||
# output assets dir
|
# output assets dir
|
||||||
@ -114,7 +144,7 @@ Get or update Terraform modules.
|
|||||||
$ terraform get # downloads missing modules
|
$ terraform get # downloads missing modules
|
||||||
$ terraform get --update # updates all modules
|
$ terraform get --update # updates all modules
|
||||||
Get: git::https://github.com/poseidon/typhoon (update)
|
Get: git::https://github.com/poseidon/typhoon (update)
|
||||||
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.9.1 (update)
|
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
|
||||||
```
|
```
|
||||||
|
|
||||||
Plan the resources to be created.
|
Plan the resources to be created.
|
||||||
@ -147,9 +177,9 @@ In 3-6 minutes, the Kubernetes cluster will be ready.
|
|||||||
$ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig
|
$ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig
|
||||||
$ kubectl get nodes
|
$ kubectl get nodes
|
||||||
NAME STATUS AGE VERSION
|
NAME STATUS AGE VERSION
|
||||||
10.132.110.130 Ready 10m v1.9.1
|
10.132.110.130 Ready 10m v1.9.4
|
||||||
10.132.115.81 Ready 10m v1.9.1
|
10.132.115.81 Ready 10m v1.9.4
|
||||||
10.132.124.107 Ready 10m v1.9.1
|
10.132.124.107 Ready 10m v1.9.4
|
||||||
```
|
```
|
||||||
|
|
||||||
List the pods.
|
List the pods.
|
||||||
@ -174,10 +204,10 @@ kube-system pod-checkpointer-pr1lq-10.132.115.81 1/1 Running 0
|
|||||||
|
|
||||||
## Going Further
|
## Going Further
|
||||||
|
|
||||||
Learn about [version pinning](concepts.md#versioning), [maintenance](topics/maintenance.md), and [addons](addons/overview.md).
|
Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md).
|
||||||
|
|
||||||
!!! note
|
!!! note
|
||||||
On Container Linux clusters, install the `container-linux-update-operator` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
||||||
|
|
||||||
## Variables
|
## Variables
|
||||||
|
|
||||||
@ -213,8 +243,8 @@ resource "digitalocean_domain" "zone-for-clusters" {
|
|||||||
DigitalOcean droplets are created with your SSH public key "fingerprint" (i.e. MD5 hash) to allow access. If your SSH public key is at `~/.ssh/id_rsa`, find the fingerprint with,
|
DigitalOcean droplets are created with your SSH public key "fingerprint" (i.e. MD5 hash) to allow access. If your SSH public key is at `~/.ssh/id_rsa`, find the fingerprint with,
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
ssh-keygen -lf ~/.ssh/id_rsa.pub | awk '{print $2}'
|
ssh-keygen -E md5 -lf ~/.ssh/id_rsa.pub | awk '{print $2}'
|
||||||
d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7
|
MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7
|
||||||
```
|
```
|
||||||
|
|
||||||
If you use `ssh-agent` (e.g. Yubikey for SSH), find the fingerprint with,
|
If you use `ssh-agent` (e.g. Yubikey for SSH), find the fingerprint with,
|
||||||
@ -224,7 +254,7 @@ ssh-add -l -E md5
|
|||||||
2048 MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7 cardno:000603633110 (RSA)
|
2048 MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7 cardno:000603633110 (RSA)
|
||||||
```
|
```
|
||||||
|
|
||||||
If you uploaded an SSH key to DigitalOcean (not required), find the fingerprint under Settings -> Security. Finally, if you don't have an SSH key, [create one now](https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/).
|
Digital Ocean requires the SSH public key be uploaded to your account, so you may also find the fingerprint under Settings -> Security. Finally, if you don't have an SSH key, [create one now](https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/).
|
||||||
|
|
||||||
### Optional
|
### Optional
|
||||||
|
|
||||||
@ -232,16 +262,18 @@ If you uploaded an SSH key to DigitalOcean (not required), find the fingerprint
|
|||||||
|:-----|:------------|:--------|:--------|
|
|:-----|:------------|:--------|:--------|
|
||||||
| image | OS image for droplets | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
|
| image | OS image for droplets | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
|
||||||
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
|
||||||
| controller_type | Digital Ocean droplet size | 2gb | 2gb (min), 4gb, 8gb |
|
| controller_type | Digital Ocean droplet size | s-2vcpu-2gb | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... |
|
||||||
| worker_count | Number of workers | 1 | 3 |
|
| worker_count | Number of workers | 1 | 3 |
|
||||||
| worker_type | Digital Ocean droplet size | 512mb | 512mb, 1gb, 2gb, 4gb |
|
| worker_type | Digital Ocean droplet size | s-1vcpu-1gb | s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb, ... |
|
||||||
| networking | Choice of networking provider | "flannel" | "flannel" |
|
| networking | Choice of networking provider | "flannel" | "flannel" |
|
||||||
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||||
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||||
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
|
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
|
||||||
|
|
||||||
|
Check the list of valid [droplet types](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) or use `doctl compute size list`.
|
||||||
|
|
||||||
!!! warning
|
!!! warning
|
||||||
Do not choose a `controller_type` smaller than `2gb`. The `1gb` droplet is not sufficient for running a controller and bootstrapping will fail.
|
Do not choose a `controller_type` smaller than 2GB. Smaller droplets are not sufficient for running a controller and bootstrapping will fail.
|
||||||
|
|
||||||
!!! bug
|
!!! bug
|
||||||
Digital Ocean firewalls do not yet support the IP tunneling (IP in IP) protocol used by Calico. You can try using "calico" for `networking`, but it will only work if the cloud firewall is removed (unsafe).
|
Digital Ocean firewalls do not yet support the IP tunneling (IP in IP) protocol used by Calico. You can try using "calico" for `networking`, but it will only work if the cloud firewall is removed (unsafe).
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
# Google Cloud
|
# Google Cloud
|
||||||
|
|
||||||
In this tutorial, we'll create a Kubernetes v1.9.1 cluster on Google Compute Engine (not GKE).
|
In this tutorial, we'll create a Kubernetes v1.9.4 cluster on Google Compute Engine (not GKE).
|
||||||
|
|
||||||
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a network, firewall rules, managed instance groups of Kubernetes controllers and workers, network load balancers for controllers and workers, and health checks will be created.
|
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a network, firewall rules, managed instance groups of Kubernetes controllers and workers, network load balancers for controllers and workers, and health checks will be created.
|
||||||
|
|
||||||
@ -10,23 +10,23 @@ Controllers and workers are provisioned to run a `kubelet`. A one-time [bootkube
|
|||||||
|
|
||||||
* Google Cloud Account and Service Account
|
* Google Cloud Account and Service Account
|
||||||
* Google Cloud DNS Zone (registered Domain Name or delegated subdomain)
|
* Google Cloud DNS Zone (registered Domain Name or delegated subdomain)
|
||||||
* Terraform v0.10.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
* Terraform v0.11.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
|
||||||
|
|
||||||
## Terraform Setup
|
## Terraform Setup
|
||||||
|
|
||||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.10.x on your system.
|
Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
$ terraform version
|
$ terraform version
|
||||||
Terraform v0.10.7
|
Terraform v0.11.1
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.0/terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
|
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.1/terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
|
||||||
tar xzf terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
|
tar xzf terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
|
||||||
sudo mv terraform-provider-ct-v0.2.0-linux-amd64/terraform-provider-ct /usr/local/bin/
|
sudo mv terraform-provider-ct-v0.2.1-linux-amd64/terraform-provider-ct /usr/local/bin/
|
||||||
```
|
```
|
||||||
|
|
||||||
Add the plugin to your `~/.terraformrc`.
|
Add the plugin to your `~/.terraformrc`.
|
||||||
@ -57,10 +57,33 @@ Configure the Google Cloud provider to use your service account key, project-id,
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
provider "google" {
|
provider "google" {
|
||||||
|
version = "1.2"
|
||||||
|
alias = "default"
|
||||||
|
|
||||||
credentials = "${file("~/.config/google-cloud/terraform.json")}"
|
credentials = "${file("~/.config/google-cloud/terraform.json")}"
|
||||||
project = "project-id"
|
project = "project-id"
|
||||||
region = "us-central1"
|
region = "us-central1"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
provider "local" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "null" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "template" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
provider "tls" {
|
||||||
|
version = "~> 1.0"
|
||||||
|
alias = "default"
|
||||||
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Additional configuration options are described in the `google` provider [docs](https://www.terraform.io/docs/providers/google/index.html).
|
Additional configuration options are described in the `google` provider [docs](https://www.terraform.io/docs/providers/google/index.html).
|
||||||
@ -74,13 +97,21 @@ Define a Kubernetes cluster using the module `google-cloud/container-linux/kuber
|
|||||||
|
|
||||||
```tf
|
```tf
|
||||||
module "google-cloud-yavin" {
|
module "google-cloud-yavin" {
|
||||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes"
|
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.9.4"
|
||||||
|
|
||||||
|
providers = {
|
||||||
|
google = "google.default"
|
||||||
|
local = "local.default"
|
||||||
|
null = "null.default"
|
||||||
|
template = "template.default"
|
||||||
|
tls = "tls.default"
|
||||||
|
}
|
||||||
|
|
||||||
# Google Cloud
|
# Google Cloud
|
||||||
region = "us-central1"
|
region = "us-central1"
|
||||||
dns_zone = "example.com"
|
dns_zone = "example.com"
|
||||||
dns_zone_name = "example-zone"
|
dns_zone_name = "example-zone"
|
||||||
os_image = "coreos-stable-1576-5-0-v20180105"
|
os_image = "coreos-stable"
|
||||||
|
|
||||||
cluster_name = "yavin"
|
cluster_name = "yavin"
|
||||||
controller_count = 1
|
controller_count = 1
|
||||||
@ -120,7 +151,7 @@ Get or update Terraform modules.
|
|||||||
$ terraform get # downloads missing modules
|
$ terraform get # downloads missing modules
|
||||||
$ terraform get --update # updates all modules
|
$ terraform get --update # updates all modules
|
||||||
Get: git::https://github.com/poseidon/typhoon (update)
|
Get: git::https://github.com/poseidon/typhoon (update)
|
||||||
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.9.1 (update)
|
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
|
||||||
```
|
```
|
||||||
|
|
||||||
Plan the resources to be created.
|
Plan the resources to be created.
|
||||||
@ -154,9 +185,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
|
|||||||
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
|
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
|
||||||
$ kubectl get nodes
|
$ kubectl get nodes
|
||||||
NAME STATUS AGE VERSION
|
NAME STATUS AGE VERSION
|
||||||
yavin-controller-0.c.example-com.internal Ready 6m v1.9.1
|
yavin-controller-0.c.example-com.internal Ready 6m v1.9.4
|
||||||
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.1
|
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.4
|
||||||
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.1
|
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.4
|
||||||
```
|
```
|
||||||
|
|
||||||
List the pods.
|
List the pods.
|
||||||
@ -181,10 +212,10 @@ kube-system pod-checkpointer-l6lrt 1/1 Running 0
|
|||||||
|
|
||||||
## Going Further
|
## Going Further
|
||||||
|
|
||||||
Learn about [version pinning](concepts.md#versioning), [maintenance](topics/maintenance.md), and [addons](addons/overview.md).
|
Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md).
|
||||||
|
|
||||||
!!! note
|
!!! note
|
||||||
On Container Linux clusters, install the `container-linux-update-operator` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
||||||
|
|
||||||
## Variables
|
## Variables
|
||||||
|
|
||||||
@ -197,7 +228,7 @@ Learn about [version pinning](concepts.md#versioning), [maintenance](topics/main
|
|||||||
| dns_zone | Google Cloud DNS zone | "google-cloud.example.com" |
|
| dns_zone | Google Cloud DNS zone | "google-cloud.example.com" |
|
||||||
| dns_zone_name | Google Cloud DNS zone name | "example-zone" |
|
| dns_zone_name | Google Cloud DNS zone name | "example-zone" |
|
||||||
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
|
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
|
||||||
| os_image | OS image for compute instances | "coreos-stable-1576-5-0-v20180105" |
|
| os_image | OS image for compute instances | "coreos-stable" |
|
||||||
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/yavin" |
|
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/yavin" |
|
||||||
|
|
||||||
Check the list of valid [regions](https://cloud.google.com/compute/docs/regions-zones/regions-zones) and list Container Linux [images](https://cloud.google.com/compute/docs/images) with `gcloud compute images list | grep coreos`.
|
Check the list of valid [regions](https://cloud.google.com/compute/docs/regions-zones/regions-zones) and list Container Linux [images](https://cloud.google.com/compute/docs/images) with `gcloud compute images list | grep coreos`.
|
||||||
@ -226,7 +257,7 @@ resource "google_dns_managed_zone" "zone-for-clusters" {
|
|||||||
| machine_type | Machine type for compute instances | "n1-standard-1" | See below |
|
| machine_type | Machine type for compute instances | "n1-standard-1" | See below |
|
||||||
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
|
||||||
| worker_count | Number of workers | 1 | 3 |
|
| worker_count | Number of workers | 1 | 3 |
|
||||||
| worker_preemptible | If enabled, Compute Engine will terminate controllers randomly within 24 hours | false | true |
|
| worker_preemptible | If enabled, Compute Engine will terminate workers randomly within 24 hours | false | true |
|
||||||
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
|
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
|
||||||
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||||
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||||
|
BIN
docs/img/grafana-capacity.png
Normal file
BIN
docs/img/grafana-capacity.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 234 KiB |
BIN
docs/img/grafana-control-plane.png
Normal file
BIN
docs/img/grafana-control-plane.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 259 KiB |
Binary file not shown.
Before Width: | Height: | Size: 212 KiB |
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user