mirror of
https://github.com/puppetmaster/typhoon.git
synced 2025-08-03 14:51:34 +02:00
Compare commits
53 Commits
Author | SHA1 | Date | |
---|---|---|---|
444363be2d | |||
bc7ad25c60 | |||
e838d4dc3d | |||
979c092ef6 | |||
db8e94bb4b | |||
eb093af9ed | |||
36096f844d | |||
d236628e53 | |||
577b927a2b | |||
000c11edf6 | |||
29b16c3fc0 | |||
0c7a879bc4 | |||
1e654c9e4e | |||
28ee693e6b | |||
8c7d95aefd | |||
d45dfdbf91 | |||
d7e0536838 | |||
8dd221a57c | |||
f17bb4cf61 | |||
44f1fe620a | |||
a504264e24 | |||
88cf7273dc | |||
58def65a09 | |||
cd7fd29194 | |||
aafa38476a | |||
9a07f1d30b | |||
c87db3ef37 | |||
342380cfa4 | |||
5e70d7e2c8 | |||
aab071309f | |||
f6ce12766b | |||
e1d6ab2f24 | |||
8b3d41d6a0 | |||
ccee5d3d89 | |||
8aefd4f082 | |||
78e6409bd0 | |||
2aef42d4f6 | |||
b7d67757de | |||
26f5d2d753 | |||
cd0a28904e | |||
618f8b30fd | |||
264d23a1b5 | |||
f96e91f225 | |||
efd4a0319d | |||
6df6bf904a | |||
5fba20d358 | |||
a8d3d3bb12 | |||
9ea6d2c245 | |||
507aac9b78 | |||
dfd2a0ec23 | |||
e3bf7d8f9b | |||
49050320ce | |||
74e025c9e4 |
101
CHANGES.md
101
CHANGES.md
@ -4,6 +4,107 @@ Notable changes between versions.
|
||||
|
||||
## Latest
|
||||
|
||||
* Kubernetes [v1.19.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1192)
|
||||
* Update flannel from v0.12.0 to v0.13.0-rc2 ([#216](https://github.com/poseidon/terraform-render-bootstrap/pull/216))
|
||||
* Update flannel-cni from v0.4.0 to v0.4.1
|
||||
* Update CNI plugins from v0.8.6 to v0.8.7
|
||||
|
||||
### Addons
|
||||
|
||||
* Refresh Prometheus rules/alerts and Grafana dashboards ([#831](https://github.com/poseidon/typhoon/pull/831))
|
||||
* Reduce apiserver metrics cardinality for non-core APIs ([#830](https://github.com/poseidon/typhoon/pull/830))
|
||||
|
||||
## v1.19.1
|
||||
|
||||
* Kubernetes [v1.19.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1191)
|
||||
* Change control plane seccomp annotations to GA `seccompProfile` ([#822](https://github.com/poseidon/typhoon/pull/822))
|
||||
* Update Cilium from v1.8.2 to [v1.8.3](https://github.com/cilium/cilium/releases/tag/v1.8.3)
|
||||
* Promote Cilium from experimental to general availability ([#827](https://github.com/poseidon/typhoon/pull/827))
|
||||
* Update Calico from v1.15.2 to [v1.15.3](https://github.com/projectcalico/calico/releases/tag/v3.15.3)
|
||||
|
||||
### Fedora CoreOS
|
||||
|
||||
* Update Fedora CoreOS Config version from v1.0.0 to v1.1.0
|
||||
* Require any [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customizations to update to v1.1.0
|
||||
|
||||
### Addons
|
||||
|
||||
* Update IngressClass resources to `networking.k8s.io/v1` ([#824](https://github.com/poseidon/typhoon/pull/824))
|
||||
* Update Prometheus from v2.20.0 to [v2.21.0](https://github.com/prometheus/prometheus/releases/tag/v2.21.0)
|
||||
* Remove Kubernetes node name labelmap `relabel_config` from etcd, Kubelet, and CAdvisor scrape config ([#828](https://github.com/poseidon/typhoon/pull/828))
|
||||
|
||||
## v1.19.0
|
||||
|
||||
* Kubernetes [v1.19.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1190)
|
||||
* Update etcd from v3.4.10 to [v3.4.12](https://github.com/etcd-io/etcd/releases/tag/v3.4.12)
|
||||
* Update Calico from v3.15.1 to [v3.15.2](https://docs.projectcalico.org/v3.15/release-notes/)
|
||||
|
||||
### Fedora CoreOS
|
||||
|
||||
* Fix race condition during bootstrap of multi-controller clusters ([#808](https://github.com/poseidon/typhoon/pull/808))
|
||||
* Fix SELinux label of bootstrap-secrets on non-bootstrap controllers
|
||||
|
||||
### Addons
|
||||
|
||||
* Introduce [fleetlock](https://github.com/poseidon/fleetlock) for Fedora CoreOS reboot coordination ([#814](https://github.com/poseidon/typhoon/pull/814))
|
||||
* Update nginx-ingress from v0.34.1 to [v0.35.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.35.0)
|
||||
* Repository changed to `k8s.gcr.io/ingress-nginx/controller`
|
||||
* Update Grafana from v7.1.3 to [v7.1.5](https://github.com/grafana/grafana/releases/tag/v7.1.5)
|
||||
|
||||
## v1.18.8
|
||||
|
||||
* Kubernetes [v1.18.8](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1188)
|
||||
* Migrate from Terraform v0.12.x to v0.13.x ([#804](https://github.com/poseidon/typhoon/pull/804)) (**action required**)
|
||||
* Recommend Terraform v0.13.x ([migration guide](https://typhoon.psdn.io/topics/maintenance/#terraform-versions))
|
||||
* Support automatic install of poseidon's provider plugins ([poseidon/ct](https://registry.terraform.io/providers/poseidon/ct/latest), [poseidon/matchbox](https://registry.terraform.io/providers/poseidon/matchbox/latest))
|
||||
* Require Terraform v0.12.26+ (migration compatibility)
|
||||
* Require `terraform-provider-ct` v0.6.1
|
||||
* Require `terraform-provider-matchbox` v0.4.1
|
||||
* Update etcd from v3.4.9 to [v3.4.10](https://github.com/etcd-io/etcd/releases/tag/v3.4.10)
|
||||
* Update CoreDNS from v1.6.7 to [v1.7.0](https://coredns.io/2020/06/15/coredns-1.7.0-release/)
|
||||
* Update Cilium from v1.8.1 to [v1.8.2](https://github.com/cilium/cilium/releases/tag/v1.8.2)
|
||||
* Update [coreos/flannel-cni](https://github.com/coreos/flannel-cni) to [poseidon/flannel-cni](https://github.com/poseidon/flannel-cni) ([#798](https://github.com/poseidon/typhoon/pull/798))
|
||||
* Update CNI plugins and fix CVEs with Flannel CNI (non-default)
|
||||
* Transition to a poseidon maintained container image
|
||||
|
||||
### AWS
|
||||
|
||||
* Allow `terraform-provider-aws` v3.0+ ([#803](https://github.com/poseidon/typhoon/pull/803))
|
||||
* Recommend updating `terraform-provider-aws` to v3.0+
|
||||
* Continue to allow v2.23+, no v3.x specific features are used
|
||||
|
||||
### DigitalOcean
|
||||
|
||||
* Require `terraform-provider-digitalocean` v1.21+ for Terraform v0.13.x (unenforced)
|
||||
* Require `terraform-provider-digitalocean` v1.20+ for Terraform v0.12.x
|
||||
|
||||
### Fedora CoreOS
|
||||
|
||||
* Fix support for Flannel with Fedora CoreOS ([#795](https://github.com/poseidon/typhoon/pull/795))
|
||||
* Configure `flannel.1` link to select its own MAC address to solve flannel
|
||||
pod-to-pod traffic drops starting with default link changes in Fedora CoreOS
|
||||
32.20200629.3.0 ([details](https://github.com/coreos/fedora-coreos-tracker/issues/574#issuecomment-665487296))
|
||||
|
||||
#### Addons
|
||||
|
||||
* Update Prometheus from v2.19.2 to [v2.20.0](https://github.com/prometheus/prometheus/releases/tag/v2.20.0)
|
||||
* Update Grafana from v7.0.6 to [v7.1.3](https://github.com/grafana/grafana/releases/tag/v7.1.3)
|
||||
|
||||
## v1.18.6
|
||||
|
||||
* Kubernetes [v1.18.6](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1186)
|
||||
* Update Calico from v3.15.0 to [v3.15.1](https://docs.projectcalico.org/v3.15/release-notes/)
|
||||
* Update Cilium from v1.8.0 to [v1.8.1](https://github.com/cilium/cilium/releases/tag/v1.8.1)
|
||||
|
||||
#### Addons
|
||||
|
||||
* Update nginx-ingress from v0.33.0 to [v0.34.1](https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.34.1)
|
||||
* [ingress-nginx](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.0) will publish images only to gcr.io
|
||||
* Update Prometheus from v2.19.1 to [v2.19.2](https://github.com/prometheus/prometheus/releases/tag/v2.19.2)
|
||||
* Update Grafana from v7.0.4 to [v7.0.6](https://github.com/grafana/grafana/releases/tag/v7.0.6)
|
||||
|
||||
## v1.18.5
|
||||
|
||||
* Kubernetes [v1.18.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1185)
|
||||
* Add Cilium v1.8.0 as a (experimental) CNI provider option ([#760](https://github.com/poseidon/typhoon/pull/760))
|
||||
* Set `networking` to "cilium" to enable
|
||||
|
12
README.md
12
README.md
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, CSI, or other [addons](https://typhoon.psdn.io/addons/overview/)
|
||||
@ -54,7 +54,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
|
||||
|
||||
```tf
|
||||
module "yavin" {
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.19.2"
|
||||
|
||||
# Google Cloud
|
||||
cluster_name = "yavin"
|
||||
@ -93,9 +93,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
|
||||
$ kubectl get nodes
|
||||
NAME ROLES STATUS AGE VERSION
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.18.5
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.18.5
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.18.5
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.19.2
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.19.2
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
|
@ -49,6 +49,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
},
|
||||
@ -72,7 +73,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(coredns_dns_request_count_total{instance=~\"$instance\"}[5m])) by (proto)",
|
||||
"expr": "sum(rate(coredns_dns_requests_total{instance=~\"$instance\"}[5m])) by (proto)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{proto}}",
|
||||
@ -140,6 +141,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
},
|
||||
@ -163,7 +165,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(coredns_dns_request_type_count_total{instance=~\"$instance\"}[5m])) by (type)",
|
||||
"expr": "sum(rate(coredns_dns_requests_total{instance=~\"$instance\"}[5m])) by (type)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{type}}",
|
||||
@ -231,6 +233,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
},
|
||||
@ -254,7 +257,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(coredns_dns_request_count_total{instance=~\"$instance\"}[5m])) by (zone)",
|
||||
"expr": "sum(rate(coredns_dns_requests_total{instance=~\"$instance\"}[5m])) by (zone)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{zone}}",
|
||||
@ -335,6 +338,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -440,6 +444,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -463,7 +468,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(coredns_dns_response_rcode_count_total{instance=~\"$instance\"}[5m])) by (rcode)",
|
||||
"expr": "sum(rate(coredns_dns_responses_total{instance=~\"$instance\"}[5m])) by (rcode)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{rcode}}",
|
||||
@ -544,6 +549,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -649,6 +655,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -767,6 +774,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -790,7 +798,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(coredns_cache_size{instance=~\"$instance\"}) by (type)",
|
||||
"expr": "sum(coredns_cache_entries{instance=~\"$instance\"}) by (type)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{type}}",
|
||||
@ -858,6 +866,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
|
@ -11,7 +11,6 @@ data:
|
||||
"editable": true,
|
||||
"gnetId": null,
|
||||
"hideControls": false,
|
||||
"id": 6,
|
||||
"links": [
|
||||
|
||||
],
|
||||
@ -343,7 +342,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "etcd_debugging_mvcc_db_total_size_in_bytes{job=\"$cluster\"}",
|
||||
"expr": "etcd_mvcc_db_total_size_in_bytes{job=\"$cluster\"}",
|
||||
"hide": false,
|
||||
"interval": "",
|
||||
"intervalFactor": 2,
|
||||
|
@ -172,7 +172,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(kubelet_running_pod_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
|
||||
"expr": "sum(kubelet_running_pods{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}}",
|
||||
@ -256,7 +256,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(kubelet_running_container_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
|
||||
"expr": "sum(kubelet_running_containers{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}}",
|
||||
@ -565,6 +565,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -656,6 +657,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -760,6 +762,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -864,6 +867,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -962,6 +966,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1075,6 +1080,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1168,6 +1174,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1274,6 +1281,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1378,6 +1386,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1469,6 +1478,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1574,6 +1584,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1665,6 +1676,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1769,6 +1781,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1873,6 +1886,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1998,6 +2012,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -2021,7 +2036,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_latency_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, verb, url, le))",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, verb, url, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} {{verb}} {{url}}",
|
||||
@ -2102,6 +2117,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -2193,6 +2209,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -2284,6 +2301,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -2470,7 +2488,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Kubelet",
|
||||
"uid": "3138fa155d5915769fbded898ac09fd9",
|
||||
"version": 0
|
||||
@ -2607,6 +2625,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -2698,6 +2717,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -2802,6 +2822,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -2893,6 +2914,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -2997,6 +3019,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -3109,6 +3132,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -3132,7 +3156,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_latency_seconds_bucket{job=\"kube-proxy\",instance=~\"$instance\",verb=\"POST\"}[5m])) by (verb, url, le))",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-proxy\",instance=~\"$instance\",verb=\"POST\"}[5m])) by (verb, url, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{verb}} {{url}}",
|
||||
@ -3213,6 +3237,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -3236,7 +3261,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_latency_seconds_bucket{job=\"kube-proxy\", instance=~\"$instance\", verb=\"GET\"}[5m])) by (verb, url, le))",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-proxy\", instance=~\"$instance\", verb=\"GET\"}[5m])) by (verb, url, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{verb}} {{url}}",
|
||||
@ -3317,6 +3342,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -3408,6 +3434,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -3499,6 +3526,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -3659,7 +3687,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Proxy",
|
||||
"uid": "632e265de029684c40b21cb76bca4f94",
|
||||
"version": 0
|
||||
|
@ -31,6 +31,7 @@ data:
|
||||
"fill": 1,
|
||||
"format": "percentunit",
|
||||
"id": 1,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -686,6 +687,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 0,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down to pods",
|
||||
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
|
||||
"pattern": "Value #A",
|
||||
@ -704,6 +706,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 0,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down to workloads",
|
||||
"linkUrl": "./d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
|
||||
"pattern": "Value #B",
|
||||
@ -722,6 +725,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -740,6 +744,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -758,6 +763,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -776,6 +782,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -794,6 +801,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #G",
|
||||
@ -812,6 +820,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down to pods",
|
||||
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
|
||||
"pattern": "namespace",
|
||||
@ -839,7 +848,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "count(mixin_pod_workload{cluster=\"$cluster\"}) by (namespace)",
|
||||
"expr": "sum(kube_pod_owner{cluster=\"$cluster\"}) by (namespace)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -848,7 +857,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "count(avg(mixin_pod_workload{cluster=\"$cluster\"}) by (workload, namespace)) by (namespace)",
|
||||
"expr": "count(avg(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\"}) by (workload, namespace)) by (namespace)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1105,6 +1114,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 0,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down to pods",
|
||||
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
|
||||
"pattern": "Value #A",
|
||||
@ -1123,6 +1133,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 0,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down to workloads",
|
||||
"linkUrl": "./d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
|
||||
"pattern": "Value #B",
|
||||
@ -1141,6 +1152,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -1159,6 +1171,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -1177,6 +1190,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -1195,6 +1209,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -1213,6 +1228,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #G",
|
||||
@ -1231,6 +1247,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down to pods",
|
||||
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
|
||||
"pattern": "namespace",
|
||||
@ -1258,7 +1275,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "count(mixin_pod_workload{cluster=\"$cluster\"}) by (namespace)",
|
||||
"expr": "sum(kube_pod_owner{cluster=\"$cluster\"}) by (namespace)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1267,7 +1284,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "count(avg(mixin_pod_workload{cluster=\"$cluster\"}) by (workload, namespace)) by (namespace)",
|
||||
"expr": "count(avg(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\"}) by (workload, namespace)) by (namespace)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1384,6 +1401,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 11,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1426,6 +1444,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -1444,6 +1463,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -1462,6 +1482,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -1480,6 +1501,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -1498,6 +1520,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -1516,6 +1539,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -1534,6 +1558,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down to pods",
|
||||
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
|
||||
"pattern": "namespace",
|
||||
@ -2472,33 +2497,6 @@ data:
|
||||
"regex": "",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": "cluster",
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(node_cpu_seconds_total, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
@ -2557,7 +2555,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Compute Resources / Cluster",
|
||||
"uid": "efa86fd1d0c121a26444b636a3f509a8",
|
||||
"version": 0
|
||||
@ -2789,7 +2787,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"})",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"})",
|
||||
"format": "time_series",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2873,7 +2871,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"})",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"})",
|
||||
"format": "time_series",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -3115,6 +3113,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -3133,6 +3132,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -3151,6 +3151,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -3169,6 +3170,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -3187,6 +3189,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -3205,6 +3208,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
|
||||
"pattern": "pod",
|
||||
@ -3387,7 +3391,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}) by (pod)",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3515,6 +3519,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -3533,6 +3538,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -3551,6 +3557,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -3569,6 +3576,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -3587,6 +3595,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -3605,6 +3614,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -3623,6 +3633,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #G",
|
||||
@ -3641,6 +3652,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #H",
|
||||
@ -3659,6 +3671,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
|
||||
"pattern": "pod",
|
||||
@ -3686,7 +3699,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -3704,7 +3717,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"}) by (pod)",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -3722,7 +3735,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"}) by (pod)",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -3821,6 +3834,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 9,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -3863,6 +3877,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -3881,6 +3896,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -3899,6 +3915,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -3917,6 +3934,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -3935,6 +3953,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -3953,6 +3972,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -3971,6 +3991,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down to pods",
|
||||
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
|
||||
"pattern": "pod",
|
||||
@ -4798,7 +4819,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Compute Resources / Namespace (Pods)",
|
||||
"uid": "85a562078cdf77779eaa1add43ccec1e",
|
||||
"version": 0
|
||||
@ -4861,7 +4882,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -4973,6 +4994,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -4991,6 +5013,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -5009,6 +5032,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -5027,6 +5051,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -5045,6 +5070,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -5063,6 +5089,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "pod",
|
||||
@ -5090,7 +5117,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5099,7 +5126,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
|
||||
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5108,7 +5135,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=~\"$node\"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5117,7 +5144,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
|
||||
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5126,7 +5153,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=~\"$node\"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5226,7 +5253,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\", container!=\"\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=~\"$node\", container!=\"\"}) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -5338,6 +5365,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -5356,6 +5384,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -5374,6 +5403,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -5392,6 +5422,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -5410,6 +5441,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -5428,6 +5460,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -5446,6 +5479,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #G",
|
||||
@ -5464,6 +5498,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #H",
|
||||
@ -5482,6 +5517,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "pod",
|
||||
@ -5509,7 +5545,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=~\"$node\",container!=\"\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5518,7 +5554,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
|
||||
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5527,7 +5563,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{node=\"$node\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=~\"$node\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{node=~\"$node\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5536,7 +5572,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
|
||||
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5545,7 +5581,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{node=\"$node\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=~\"$node\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{node=~\"$node\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5554,7 +5590,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_rss{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_rss{cluster=\"$cluster\", node=~\"$node\",container!=\"\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5563,7 +5599,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_cache{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_cache{cluster=\"$cluster\", node=~\"$node\",container!=\"\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5572,7 +5608,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_swap{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
|
||||
"expr": "sum(node_namespace_pod_container:container_memory_swap{cluster=\"$cluster\", node=~\"$node\",container!=\"\"}) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5691,7 +5727,7 @@ data:
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"multi": true,
|
||||
"name": "node",
|
||||
"options": [
|
||||
|
||||
@ -5739,7 +5775,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Compute Resources / Node (Pods)",
|
||||
"uid": "200ac8fdbfbb74b39aff88118e4d1c2c",
|
||||
"version": 0
|
||||
|
@ -189,7 +189,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(increase(container_cpu_cfs_throttled_periods_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", cluster=\"$cluster\"}[5m])) by (container) /sum(increase(container_cpu_cfs_periods_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", cluster=\"$cluster\"}[5m])) by (container)",
|
||||
"expr": "sum(increase(container_cpu_cfs_throttled_periods_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\", cluster=\"$cluster\"}[5m])) by (container) /sum(increase(container_cpu_cfs_periods_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\", cluster=\"$cluster\"}[5m])) by (container)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{container}}",
|
||||
@ -203,7 +203,7 @@ data:
|
||||
"fill": true,
|
||||
"line": true,
|
||||
"op": "gt",
|
||||
"value": 1,
|
||||
"value": 0.80000000000000004,
|
||||
"yaxis": "left"
|
||||
}
|
||||
],
|
||||
@ -308,6 +308,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -326,6 +327,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -344,6 +346,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -362,6 +365,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -380,6 +384,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -398,6 +403,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "container",
|
||||
@ -580,7 +586,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\", image!=\"\"}) by (container)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{container}}",
|
||||
@ -708,6 +714,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -726,6 +733,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -744,6 +752,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -762,6 +771,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -780,6 +790,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -798,6 +809,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -816,6 +828,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #G",
|
||||
@ -834,6 +847,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #H",
|
||||
@ -852,6 +866,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "container",
|
||||
@ -879,7 +894,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\", image!=\"\"}) by (container)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -897,7 +912,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", image!=\"\"}) by (container) / sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -915,7 +930,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}) by (container) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\", image!=\"\"}) by (container) / sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1014,6 +1029,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 6,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1112,6 +1128,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 7,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1210,6 +1227,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 8,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1308,6 +1326,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 9,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1406,6 +1425,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 10,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1504,6 +1524,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 11,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1724,7 +1745,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Compute Resources / Pod",
|
||||
"uid": "6581e46e4e5c7ba40a07646395ef7b23",
|
||||
"version": 0
|
||||
@ -1787,7 +1808,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -1899,6 +1920,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -1917,6 +1939,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -1935,6 +1958,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -1953,6 +1977,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -1971,6 +1996,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -1989,6 +2015,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
|
||||
"pattern": "pod",
|
||||
@ -2016,7 +2043,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2025,7 +2052,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2034,7 +2061,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2043,7 +2070,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2052,7 +2079,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2152,7 +2179,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -2264,6 +2291,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -2282,6 +2310,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -2300,6 +2329,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -2318,6 +2348,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -2336,6 +2367,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -2354,6 +2386,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
|
||||
"pattern": "pod",
|
||||
@ -2381,7 +2414,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2390,7 +2423,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2399,7 +2432,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2408,7 +2441,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2417,7 +2450,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2489,6 +2522,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 5,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -2531,6 +2565,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -2549,6 +2584,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -2567,6 +2603,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -2585,6 +2622,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -2603,6 +2641,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -2621,6 +2660,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -2639,6 +2679,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
|
||||
"pattern": "pod",
|
||||
@ -2666,7 +2707,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2675,7 +2716,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2684,7 +2725,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2693,7 +2734,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2702,7 +2743,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2711,7 +2752,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2811,7 +2852,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -2909,7 +2950,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3007,7 +3048,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3105,7 +3146,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3203,7 +3244,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3301,7 +3342,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3399,7 +3440,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3497,7 +3538,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3646,7 +3687,7 @@ data:
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}, workload)",
|
||||
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\"}, workload)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
@ -3673,7 +3714,7 @@ data:
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\"}, workload_type)",
|
||||
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\"}, workload_type)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
@ -3716,7 +3757,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Compute Resources / Workload",
|
||||
"uid": "a164a7f0339f99e89cea5cb47e9be617",
|
||||
"version": 0
|
||||
@ -3798,7 +3839,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}} - {{workload_type}}",
|
||||
@ -3926,6 +3967,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 0,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -3944,6 +3986,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -3962,6 +4005,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -3980,6 +4024,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -3998,6 +4043,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -4016,6 +4062,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -4034,6 +4081,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2",
|
||||
"pattern": "workload",
|
||||
@ -4052,6 +4100,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "workload_type",
|
||||
@ -4079,7 +4128,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "count(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload, workload_type)",
|
||||
"expr": "count(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload, workload_type)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4088,7 +4137,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4097,7 +4146,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4106,7 +4155,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4115,7 +4164,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4124,7 +4173,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4243,7 +4292,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}} - {{workload_type}}",
|
||||
@ -4371,6 +4420,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 0,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -4389,6 +4439,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -4407,6 +4458,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -4425,6 +4477,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -4443,6 +4496,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -4461,6 +4515,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -4479,6 +4534,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2",
|
||||
"pattern": "workload",
|
||||
@ -4497,6 +4553,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "workload_type",
|
||||
@ -4524,7 +4581,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "count(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload, workload_type)",
|
||||
"expr": "count(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload, workload_type)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4533,7 +4590,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4542,7 +4599,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4551,7 +4608,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4560,7 +4617,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4569,7 +4626,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4641,6 +4698,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 5,
|
||||
"interval": "1m",
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -4683,6 +4741,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -4701,6 +4760,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -4719,6 +4779,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #C",
|
||||
@ -4737,6 +4798,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #D",
|
||||
@ -4755,6 +4817,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #E",
|
||||
@ -4773,6 +4836,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #F",
|
||||
@ -4791,6 +4855,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": true,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down to pods",
|
||||
"linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$type",
|
||||
"pattern": "workload",
|
||||
@ -4809,6 +4874,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "workload_type",
|
||||
@ -4836,7 +4902,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4845,7 +4911,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4854,7 +4920,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4863,7 +4929,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4872,7 +4938,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4881,7 +4947,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4981,7 +5047,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5079,7 +5145,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5177,7 +5243,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5275,7 +5341,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5373,7 +5439,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5471,7 +5537,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5569,7 +5635,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5667,7 +5733,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod) \ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5757,7 +5823,7 @@ data:
|
||||
"value": "deployment"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"definition": "label_values(mixin_pod_workload{namespace=~\"$namespace\", workload=~\".+\"}, workload_type)",
|
||||
"definition": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\".+\"}, workload_type)",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
@ -5766,7 +5832,7 @@ data:
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(mixin_pod_workload{namespace=~\"$namespace\", workload=~\".+\"}, workload_type)",
|
||||
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\".+\"}, workload_type)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"skipUrlSync": false,
|
||||
@ -5864,7 +5930,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Compute Resources / Namespace (Workloads)",
|
||||
"uid": "a87fb0d919ec0ea5f6543124e16c42a5",
|
||||
"version": 0
|
||||
|
@ -20,6 +20,24 @@ data:
|
||||
"id": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"panels": [
|
||||
{
|
||||
"content": "The SLO (service level objective) and other metrics displayed on this dashboard are for informational purposes only.",
|
||||
"datasource": null,
|
||||
"description": "The SLO (service level objective) and other metrics displayed on this dashboard are for informational purposes only.",
|
||||
"gridPos": {
|
||||
"h": 2,
|
||||
"w": 24,
|
||||
"x": 0,
|
||||
"y": 0
|
||||
},
|
||||
"id": 2,
|
||||
"mode": "markdown",
|
||||
"span": 12,
|
||||
"title": "Notice",
|
||||
"type": "text"
|
||||
}
|
||||
],
|
||||
"refresh": "10s",
|
||||
"rows": [
|
||||
@ -37,7 +55,9 @@ data:
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"decimals": 3,
|
||||
"description": "How many percent of requests (both read and write) in 30 days have been answered successfully and fast enough?",
|
||||
"format": "percentunit",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
@ -48,7 +68,7 @@ data:
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 2,
|
||||
"id": 3,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
@ -78,7 +98,7 @@ data:
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 2,
|
||||
"span": 4,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
@ -88,7 +108,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(up{job=\"apiserver\", cluster=\"$cluster\"})",
|
||||
"expr": "apiserver_request:availability30d{verb=\"all\", cluster=\"$cluster\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
@ -96,7 +116,7 @@ data:
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Up",
|
||||
"title": "Availability (30d) > 99.000%",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
@ -109,7 +129,7 @@ data:
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "min"
|
||||
"valueName": "avg"
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
@ -119,11 +139,13 @@ data:
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"decimals": 3,
|
||||
"description": "How much error budget is left looking at our 0.990% availability gurantees?",
|
||||
"fill": 10,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 3,
|
||||
"id": 4,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
@ -132,6 +154,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -150,37 +173,16 @@ data:
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 5,
|
||||
"span": 8,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"2..\", cluster=\"$cluster\"}[5m]))",
|
||||
"expr": "100 * (apiserver_request:availability30d{verb=\"all\", cluster=\"$cluster\"} - 0.990000)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "2xx",
|
||||
"legendFormat": "errorbudget",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"3..\", cluster=\"$cluster\"}[5m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "3xx",
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"4..\", cluster=\"$cluster\"}[5m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "4xx",
|
||||
"refId": "C"
|
||||
},
|
||||
{
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"5..\", cluster=\"$cluster\"}[5m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "5xx",
|
||||
"refId": "D"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -188,7 +190,7 @@ data:
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "RPC Rate",
|
||||
"title": "ErrorBudget (30d) > 99.000%",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
@ -206,7 +208,8 @@ data:
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "ops",
|
||||
"decimals": 3,
|
||||
"format": "percentunit",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
@ -214,7 +217,215 @@ data:
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "ops",
|
||||
"decimals": 3,
|
||||
"format": "percentunit",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"decimals": 3,
|
||||
"description": "How many percent of read requests (LIST,GET) in 30 days have been answered successfully and fast enough?",
|
||||
"format": "percentunit",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 5,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 3,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": false
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "apiserver_request:availability30d{verb=\"read\", cluster=\"$cluster\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Read Availability (30d)",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "N/A",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "avg"
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"description": "How many read requests (LIST,GET) per second do the apiservers get by code?",
|
||||
"fill": 10,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 6,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
{
|
||||
"alias": "/2../i",
|
||||
"color": "#56A64B"
|
||||
},
|
||||
{
|
||||
"alias": "/3../i",
|
||||
"color": "#F2CC0C"
|
||||
},
|
||||
{
|
||||
"alias": "/4../i",
|
||||
"color": "#3274D9"
|
||||
},
|
||||
{
|
||||
"alias": "/5../i",
|
||||
"color": "#E02F44"
|
||||
}
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 3,
|
||||
"stack": true,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum by (code) (code_resource:apiserver_request_total:rate5m{verb=\"read\", cluster=\"$cluster\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{ code }}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Read SLI - Requests",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "reqps",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "reqps",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
@ -231,21 +442,23 @@ data:
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"description": "How many percent of read requests (LIST,GET) per second are returned with errors (5xx)?",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 4,
|
||||
"id": 7,
|
||||
"legend": {
|
||||
"alignAsTable": true,
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": true,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -262,15 +475,15 @@ data:
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 5,
|
||||
"span": 3,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\", instance=~\"$instance\", verb!=\"WATCH\", cluster=\"$cluster\"}[5m])) by (verb, le))",
|
||||
"expr": "sum by (resource) (code_resource:apiserver_request_total:rate5m{verb=\"read\",code=~\"5..\", cluster=\"$cluster\"}) / sum by (resource) (code_resource:apiserver_request_total:rate5m{verb=\"read\", cluster=\"$cluster\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{verb}}",
|
||||
"legendFormat": "{{ resource }}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -279,7 +492,493 @@ data:
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Request duration 99th quantile",
|
||||
"title": "Read SLI - Errors",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "percentunit",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "percentunit",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"description": "How many seconds is the 99th percentile for reading (LIST|GET) a given resource?",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 8,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 3,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{verb=\"read\", cluster=\"$cluster\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{ resource }}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Read SLI - Duration",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "s",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "s",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"decimals": 3,
|
||||
"description": "How many percent of write requests (POST|PUT|PATCH|DELETE) in 30 days have been answered successfully and fast enough?",
|
||||
"format": "percentunit",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 9,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 3,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": false
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "apiserver_request:availability30d{verb=\"write\", cluster=\"$cluster\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Write Availability (30d)",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "N/A",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "avg"
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"description": "How many write requests (POST|PUT|PATCH|DELETE) per second do the apiservers get by code?",
|
||||
"fill": 10,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 10,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
{
|
||||
"alias": "/2../i",
|
||||
"color": "#56A64B"
|
||||
},
|
||||
{
|
||||
"alias": "/3../i",
|
||||
"color": "#F2CC0C"
|
||||
},
|
||||
{
|
||||
"alias": "/4../i",
|
||||
"color": "#3274D9"
|
||||
},
|
||||
{
|
||||
"alias": "/5../i",
|
||||
"color": "#E02F44"
|
||||
}
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 3,
|
||||
"stack": true,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum by (code) (code_resource:apiserver_request_total:rate5m{verb=\"write\", cluster=\"$cluster\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{ code }}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Write SLI - Requests",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "reqps",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "reqps",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"description": "How many percent of write requests (POST|PUT|PATCH|DELETE) per second are returned with errors (5xx)?",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 11,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 3,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum by (resource) (code_resource:apiserver_request_total:rate5m{verb=\"write\",code=~\"5..\", cluster=\"$cluster\"}) / sum by (resource) (code_resource:apiserver_request_total:rate5m{verb=\"write\", cluster=\"$cluster\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{ resource }}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Write SLI - Errors",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "percentunit",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "percentunit",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"description": "How many seconds is the 99th percentile for writing (POST|PUT|PATCH|DELETE) a given resource?",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 12,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 3,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{verb=\"write\", cluster=\"$cluster\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{ resource }}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Write SLI - Duration",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
@ -339,7 +1038,7 @@ data:
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 5,
|
||||
"id": 13,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
@ -348,6 +1047,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": false,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -430,7 +1130,7 @@ data:
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 6,
|
||||
"id": 14,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
@ -439,6 +1139,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": false,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -521,7 +1222,7 @@ data:
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 7,
|
||||
"id": 15,
|
||||
"legend": {
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
@ -530,6 +1231,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -625,307 +1327,7 @@ data:
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 8,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 4,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "etcd_helper_cache_entry_total{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "ETCD Cache Entry Total",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 9,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 4,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(etcd_helper_cache_hit_total{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} hit",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "sum(rate(etcd_helper_cache_miss_total{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} miss",
|
||||
"refId": "B"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "ETCD Cache Hit/Miss Rate",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "ops",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "ops",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 10,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 4,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99,sum(rate(etcd_request_cache_get_duration_seconds_bucket{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} get",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "histogram_quantile(0.99,sum(rate(etcd_request_cache_add_duration_seconds_bucket{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} miss",
|
||||
"refId": "B"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "ETCD Cache Duration 99th Quantile",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "s",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "s",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 11,
|
||||
"id": 16,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
@ -934,6 +1336,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1016,7 +1419,7 @@ data:
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 12,
|
||||
"id": 17,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
@ -1025,6 +1428,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1107,7 +1511,7 @@ data:
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 13,
|
||||
"id": 18,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
@ -1116,6 +1520,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1222,20 +1627,19 @@ data:
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"label": "cluster",
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(apiserver_request_total, cluster)",
|
||||
"refresh": 1,
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
@ -1303,7 +1707,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / API server",
|
||||
"uid": "09ec8aa1e996d6ffcd6817bbaff4db1b",
|
||||
"version": 0
|
||||
@ -1440,6 +1844,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1544,6 +1949,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1648,6 +2054,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1752,6 +2159,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1864,6 +2272,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1887,7 +2296,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_latency_seconds_bucket{job=\"kube-controller-manager\", instance=~\"$instance\", verb=\"POST\"}[5m])) by (verb, url, le))",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-controller-manager\", instance=~\"$instance\", verb=\"POST\"}[5m])) by (verb, url, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{verb}} {{url}}",
|
||||
@ -1968,6 +2377,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -1991,7 +2401,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_latency_seconds_bucket{job=\"kube-controller-manager\", instance=~\"$instance\", verb=\"GET\"}[5m])) by (verb, url, le))",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-controller-manager\", instance=~\"$instance\", verb=\"GET\"}[5m])) by (verb, url, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{verb}} {{url}}",
|
||||
@ -2072,6 +2482,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -2163,6 +2574,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -2254,6 +2666,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -2414,7 +2827,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Controller Manager",
|
||||
"uid": "72e0e05bef5099e5f049b05fdc429ed4",
|
||||
"version": 0
|
||||
@ -2467,6 +2880,7 @@ data:
|
||||
"min": true,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -2662,6 +3076,7 @@ data:
|
||||
"min": true,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -2965,7 +3380,7 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Persistent Volumes",
|
||||
"uid": "919b92a8e8041bd567af9edab12c840c",
|
||||
"version": 0
|
||||
@ -3102,6 +3517,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -3214,6 +3630,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -3339,6 +3756,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -3451,6 +3869,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -3474,7 +3893,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_latency_seconds_bucket{job=\"kube-scheduler\", instance=~\"$instance\", verb=\"POST\"}[5m])) by (verb, url, le))",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-scheduler\", instance=~\"$instance\", verb=\"POST\"}[5m])) by (verb, url, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{verb}} {{url}}",
|
||||
@ -3555,6 +3974,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": true
|
||||
},
|
||||
@ -3578,7 +3998,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_latency_seconds_bucket{job=\"kube-scheduler\", instance=~\"$instance\", verb=\"GET\"}[5m])) by (verb, url, le))",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-scheduler\", instance=~\"$instance\", verb=\"GET\"}[5m])) by (verb, url, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{verb}} {{url}}",
|
||||
@ -3659,6 +4079,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -3750,6 +4171,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -3841,6 +4263,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -4001,11 +4424,916 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / Scheduler",
|
||||
"uid": "2e6b6a3b4bddf1427b3a55aa1311c656",
|
||||
"version": 0
|
||||
}
|
||||
statefulset.json: |-
|
||||
{
|
||||
"__inputs": [
|
||||
|
||||
],
|
||||
"__requires": [
|
||||
|
||||
],
|
||||
"annotations": {
|
||||
"list": [
|
||||
|
||||
]
|
||||
},
|
||||
"editable": false,
|
||||
"gnetId": null,
|
||||
"graphTooltip": 0,
|
||||
"hideControls": false,
|
||||
"id": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 2,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "cores",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 4,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": true
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "CPU",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
},
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 3,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "GB",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 4,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": true
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}) / 1024^3",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Memory",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
},
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 4,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "Bps",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 4,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": true
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=\"$namespace\",pod=~\"$statefulset.*\"}[3m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Network",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"height": "100px",
|
||||
"panels": [
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 5,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 3,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": false
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "max(kube_statefulset_replicas{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", statefulset=\"$statefulset\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Desired Replicas",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
},
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 6,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 3,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": false
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "min(kube_statefulset_status_replicas_current{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", statefulset=\"$statefulset\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Replicas of current version",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
},
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 7,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 3,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": false
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "max(kube_statefulset_status_observed_generation{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", statefulset=\"$statefulset\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Observed Generation",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
},
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 8,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 3,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": false
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "max(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Metadata Generation",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 9,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "max(kube_statefulset_replicas{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "replicas specified",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "max(kube_statefulset_status_replicas{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "replicas created",
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"expr": "min(kube_statefulset_status_replicas_ready{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "ready",
|
||||
"refId": "C"
|
||||
},
|
||||
{
|
||||
"expr": "min(kube_statefulset_status_replicas_current{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "replicas of current version",
|
||||
"refId": "D"
|
||||
},
|
||||
{
|
||||
"expr": "min(kube_statefulset_status_replicas_updated{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "updated",
|
||||
"refId": "E"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Replicas",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
}
|
||||
],
|
||||
"schemaVersion": 14,
|
||||
"style": "dark",
|
||||
"tags": [
|
||||
"kubernetes-mixin"
|
||||
],
|
||||
"templating": {
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
"name": "datasource",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "prometheus",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": "cluster",
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_statefulset_metadata_generation, cluster)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "Namespace",
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "Name",
|
||||
"multi": false,
|
||||
"name": "statefulset",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\"}, statefulset)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
}
|
||||
]
|
||||
},
|
||||
"time": {
|
||||
"from": "now-1h",
|
||||
"to": "now"
|
||||
},
|
||||
"timepicker": {
|
||||
"refresh_intervals": [
|
||||
"5s",
|
||||
"10s",
|
||||
"30s",
|
||||
"1m",
|
||||
"5m",
|
||||
"15m",
|
||||
"30m",
|
||||
"1h",
|
||||
"2h",
|
||||
"1d"
|
||||
],
|
||||
"time_options": [
|
||||
"5m",
|
||||
"15m",
|
||||
"1h",
|
||||
"6h",
|
||||
"12h",
|
||||
"24h",
|
||||
"2d",
|
||||
"7d",
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "UTC",
|
||||
"title": "Kubernetes / StatefulSets",
|
||||
"uid": "a31c1f46e6f727cb37c0d731a7245005",
|
||||
"version": 0
|
||||
}
|
||||
kind: ConfigMap
|
||||
metadata:
|
||||
name: grafana-dashboards-k8s
|
||||
|
@ -308,6 +308,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
},
|
||||
@ -399,6 +400,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
},
|
||||
@ -503,6 +505,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
},
|
||||
@ -621,6 +624,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
},
|
||||
@ -719,6 +723,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
},
|
||||
@ -810,6 +815,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": "true",
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
},
|
||||
|
@ -48,6 +48,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -140,6 +141,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -265,6 +267,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -471,6 +474,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -586,6 +590,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -704,6 +709,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -796,6 +802,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
|
@ -48,6 +48,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -71,10 +72,10 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(\n prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} \n- \n ignoring(queue) group_right(instance) prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}\n)\n",
|
||||
"expr": "(\n prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} \n- \n ignoring(remote_name, url) group_right(instance) prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}\n)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -139,6 +140,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -162,10 +164,10 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(\n rate(prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) \n- \n ignoring (queue) group_right(instance) rate(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n)\n",
|
||||
"expr": "(\n rate(prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) \n- \n ignoring (remote_name, url) group_right(instance) rate(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -243,6 +245,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -266,10 +269,10 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(\n prometheus_remote_storage_samples_in_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n ignoring(queue) group_right(instance) rate(prometheus_remote_storage_succeeded_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) \n- \n rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n",
|
||||
"expr": "rate(\n prometheus_remote_storage_samples_in_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n ignoring(remote_name, url) group_right(instance) rate(prometheus_remote_storage_succeeded_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -347,6 +350,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -374,7 +378,7 @@ data:
|
||||
"expr": "prometheus_remote_storage_shards{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -439,6 +443,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -465,7 +470,7 @@ data:
|
||||
"expr": "prometheus_remote_storage_shards_max{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -530,6 +535,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -556,7 +562,7 @@ data:
|
||||
"expr": "prometheus_remote_storage_shards_min{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -621,6 +627,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -647,7 +654,7 @@ data:
|
||||
"expr": "prometheus_remote_storage_shards_desired{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -725,6 +732,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -751,7 +759,7 @@ data:
|
||||
"expr": "prometheus_remote_storage_shard_capacity{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -816,6 +824,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -842,7 +851,7 @@ data:
|
||||
"expr": "prometheus_remote_storage_pending_samples{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -920,6 +929,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1011,6 +1021,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1037,7 +1048,7 @@ data:
|
||||
"expr": "prometheus_wal_watcher_current_segment{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{consumer}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -1115,6 +1126,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1141,7 +1153,7 @@ data:
|
||||
"expr": "rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -1206,6 +1218,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1232,7 +1245,7 @@ data:
|
||||
"expr": "rate(prometheus_remote_storage_failed_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -1297,6 +1310,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1323,7 +1337,7 @@ data:
|
||||
"expr": "rate(prometheus_remote_storage_retried_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -1388,6 +1402,7 @@ data:
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"sideWidth": null,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@ -1414,7 +1429,7 @@ data:
|
||||
"expr": "rate(prometheus_remote_storage_enqueue_retries_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
@ -1567,11 +1582,11 @@ data:
|
||||
"includeAll": true,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "queue",
|
||||
"name": "url",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(prometheus_remote_storage_shards{cluster=~\"$cluster\", instance=~\"$instance\"}, queue)",
|
||||
"query": "label_values(prometheus_remote_storage_shards{cluster=~\"$cluster\", instance=~\"$instance\"}, url)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 0,
|
||||
@ -1690,6 +1705,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #A",
|
||||
@ -1708,6 +1724,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "Value #B",
|
||||
@ -1726,6 +1743,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "instance",
|
||||
@ -1744,6 +1762,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "job",
|
||||
@ -1762,6 +1781,7 @@ data:
|
||||
"dateFormat": "YYYY-MM-DD HH:mm:ss",
|
||||
"decimals": 2,
|
||||
"link": false,
|
||||
"linkTargetBlank": false,
|
||||
"linkTooltip": "Drill down",
|
||||
"linkUrl": "",
|
||||
"pattern": "version",
|
||||
@ -2814,7 +2834,7 @@ data:
|
||||
]
|
||||
},
|
||||
"timezone": "utc",
|
||||
"title": "Prometheus",
|
||||
"title": "Prometheus Overview",
|
||||
"uid": "",
|
||||
"version": 0
|
||||
}
|
||||
|
@ -18,12 +18,13 @@ spec:
|
||||
labels:
|
||||
name: grafana
|
||||
phase: prod
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
securityContext:
|
||||
seccompProfile:
|
||||
type: RuntimeDefault
|
||||
containers:
|
||||
- name: grafana
|
||||
image: docker.io/grafana/grafana:7.0.4
|
||||
image: docker.io/grafana/grafana:7.1.5
|
||||
env:
|
||||
- name: GF_PATHS_CONFIG
|
||||
value: "/etc/grafana/custom.ini"
|
||||
|
@ -1,4 +1,4 @@
|
||||
apiVersion: networking.k8s.io/v1beta1
|
||||
apiVersion: networking.k8s.io/v1
|
||||
kind: IngressClass
|
||||
metadata:
|
||||
name: public
|
||||
|
@ -17,12 +17,13 @@ spec:
|
||||
labels:
|
||||
name: nginx-ingress-controller
|
||||
phase: prod
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
securityContext:
|
||||
seccompProfile:
|
||||
type: RuntimeDefault
|
||||
containers:
|
||||
- name: nginx-ingress-controller
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.33.0
|
||||
image: k8s.gcr.io/ingress-nginx/controller:v0.35.0
|
||||
args:
|
||||
- /nginx-ingress-controller
|
||||
- --ingress-class=public
|
||||
@ -47,7 +48,6 @@ spec:
|
||||
containerPort: 10254
|
||||
hostPort: 10254
|
||||
livenessProbe:
|
||||
failureThreshold: 3
|
||||
httpGet:
|
||||
path: /healthz
|
||||
port: 10254
|
||||
@ -55,15 +55,16 @@ spec:
|
||||
initialDelaySeconds: 10
|
||||
periodSeconds: 10
|
||||
successThreshold: 1
|
||||
failureThreshold: 3
|
||||
timeoutSeconds: 5
|
||||
readinessProbe:
|
||||
failureThreshold: 3
|
||||
httpGet:
|
||||
path: /healthz
|
||||
port: 10254
|
||||
scheme: HTTP
|
||||
periodSeconds: 10
|
||||
successThreshold: 1
|
||||
failureThreshold: 3
|
||||
timeoutSeconds: 5
|
||||
lifecycle:
|
||||
preStop:
|
||||
|
@ -1,4 +1,4 @@
|
||||
apiVersion: networking.k8s.io/v1beta1
|
||||
apiVersion: networking.k8s.io/v1
|
||||
kind: IngressClass
|
||||
metadata:
|
||||
name: public
|
||||
|
@ -17,12 +17,13 @@ spec:
|
||||
labels:
|
||||
name: nginx-ingress-controller
|
||||
phase: prod
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
securityContext:
|
||||
seccompProfile:
|
||||
type: RuntimeDefault
|
||||
containers:
|
||||
- name: nginx-ingress-controller
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.33.0
|
||||
image: k8s.gcr.io/ingress-nginx/controller:v0.35.0
|
||||
args:
|
||||
- /nginx-ingress-controller
|
||||
- --ingress-class=public
|
||||
@ -47,7 +48,6 @@ spec:
|
||||
containerPort: 10254
|
||||
hostPort: 10254
|
||||
livenessProbe:
|
||||
failureThreshold: 3
|
||||
httpGet:
|
||||
path: /healthz
|
||||
port: 10254
|
||||
@ -55,15 +55,16 @@ spec:
|
||||
initialDelaySeconds: 10
|
||||
periodSeconds: 10
|
||||
successThreshold: 1
|
||||
failureThreshold: 3
|
||||
timeoutSeconds: 5
|
||||
readinessProbe:
|
||||
failureThreshold: 3
|
||||
httpGet:
|
||||
path: /healthz
|
||||
port: 10254
|
||||
scheme: HTTP
|
||||
periodSeconds: 10
|
||||
successThreshold: 1
|
||||
failureThreshold: 3
|
||||
timeoutSeconds: 5
|
||||
lifecycle:
|
||||
preStop:
|
||||
|
@ -1,4 +1,4 @@
|
||||
apiVersion: networking.k8s.io/v1beta1
|
||||
apiVersion: networking.k8s.io/v1
|
||||
kind: IngressClass
|
||||
metadata:
|
||||
name: public
|
||||
|
@ -1,7 +1,7 @@
|
||||
apiVersion: apps/v1
|
||||
kind: Deployment
|
||||
metadata:
|
||||
name: ingress-controller-public
|
||||
name: nginx-ingress-controller
|
||||
namespace: ingress
|
||||
spec:
|
||||
replicas: 2
|
||||
@ -10,19 +10,20 @@ spec:
|
||||
maxUnavailable: 1
|
||||
selector:
|
||||
matchLabels:
|
||||
name: ingress-controller-public
|
||||
name: nginx-ingress-controller
|
||||
phase: prod
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
name: ingress-controller-public
|
||||
name: nginx-ingress-controller
|
||||
phase: prod
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
securityContext:
|
||||
seccompProfile:
|
||||
type: RuntimeDefault
|
||||
containers:
|
||||
- name: nginx-ingress-controller
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.33.0
|
||||
image: k8s.gcr.io/ingress-nginx/controller:v0.35.0
|
||||
args:
|
||||
- /nginx-ingress-controller
|
||||
- --ingress-class=public
|
||||
@ -76,4 +77,3 @@ spec:
|
||||
runAsUser: 101 # www-data
|
||||
restartPolicy: Always
|
||||
terminationGracePeriodSeconds: 300
|
||||
|
||||
|
@ -1,4 +1,4 @@
|
||||
apiVersion: networking.k8s.io/v1beta1
|
||||
apiVersion: networking.k8s.io/v1
|
||||
kind: IngressClass
|
||||
metadata:
|
||||
name: public
|
||||
|
@ -17,12 +17,13 @@ spec:
|
||||
labels:
|
||||
name: nginx-ingress-controller
|
||||
phase: prod
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
securityContext:
|
||||
seccompProfile:
|
||||
type: RuntimeDefault
|
||||
containers:
|
||||
- name: nginx-ingress-controller
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.33.0
|
||||
image: k8s.gcr.io/ingress-nginx/controller:v0.35.0
|
||||
args:
|
||||
- /nginx-ingress-controller
|
||||
- --ingress-class=public
|
||||
@ -47,7 +48,6 @@ spec:
|
||||
containerPort: 10254
|
||||
hostPort: 10254
|
||||
livenessProbe:
|
||||
failureThreshold: 3
|
||||
httpGet:
|
||||
path: /healthz
|
||||
port: 10254
|
||||
@ -55,15 +55,16 @@ spec:
|
||||
initialDelaySeconds: 10
|
||||
periodSeconds: 10
|
||||
successThreshold: 1
|
||||
failureThreshold: 3
|
||||
timeoutSeconds: 5
|
||||
readinessProbe:
|
||||
failureThreshold: 3
|
||||
httpGet:
|
||||
path: /healthz
|
||||
port: 10254
|
||||
scheme: HTTP
|
||||
periodSeconds: 10
|
||||
successThreshold: 1
|
||||
failureThreshold: 3
|
||||
timeoutSeconds: 5
|
||||
lifecycle:
|
||||
preStop:
|
||||
|
@ -1,4 +1,4 @@
|
||||
apiVersion: networking.k8s.io/v1beta1
|
||||
apiVersion: networking.k8s.io/v1
|
||||
kind: IngressClass
|
||||
metadata:
|
||||
name: public
|
||||
|
@ -17,12 +17,13 @@ spec:
|
||||
labels:
|
||||
name: nginx-ingress-controller
|
||||
phase: prod
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
securityContext:
|
||||
seccompProfile:
|
||||
type: RuntimeDefault
|
||||
containers:
|
||||
- name: nginx-ingress-controller
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.33.0
|
||||
image: k8s.gcr.io/ingress-nginx/controller:v0.35.0
|
||||
args:
|
||||
- /nginx-ingress-controller
|
||||
- --ingress-class=public
|
||||
@ -47,7 +48,6 @@ spec:
|
||||
containerPort: 10254
|
||||
hostPort: 10254
|
||||
livenessProbe:
|
||||
failureThreshold: 3
|
||||
httpGet:
|
||||
path: /healthz
|
||||
port: 10254
|
||||
@ -55,15 +55,16 @@ spec:
|
||||
initialDelaySeconds: 10
|
||||
periodSeconds: 10
|
||||
successThreshold: 1
|
||||
failureThreshold: 3
|
||||
timeoutSeconds: 5
|
||||
readinessProbe:
|
||||
failureThreshold: 3
|
||||
httpGet:
|
||||
path: /healthz
|
||||
port: 10254
|
||||
scheme: HTTP
|
||||
periodSeconds: 10
|
||||
successThreshold: 1
|
||||
failureThreshold: 3
|
||||
timeoutSeconds: 5
|
||||
lifecycle:
|
||||
preStop:
|
||||
|
@ -34,7 +34,7 @@ data:
|
||||
- job_name: 'kubernetes-apiservers'
|
||||
kubernetes_sd_configs:
|
||||
- role: endpoints
|
||||
|
||||
|
||||
scheme: https
|
||||
tls_config:
|
||||
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
|
||||
@ -68,13 +68,16 @@ data:
|
||||
- source_labels: [__name__, group]
|
||||
regex: apiserver_request_duration_seconds_bucket;.+
|
||||
action: drop
|
||||
- source_labels: [__name__, group]
|
||||
regex: apiserver_request_duration_seconds_count;.+
|
||||
action: drop
|
||||
|
||||
# Scrape config for node (i.e. kubelet) /metrics (e.g. 'kubelet_'). Explore
|
||||
# metrics from a node by scraping kubelet (127.0.0.1:10250/metrics).
|
||||
- job_name: 'kubelet'
|
||||
kubernetes_sd_configs:
|
||||
- role: node
|
||||
|
||||
|
||||
scheme: https
|
||||
tls_config:
|
||||
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
|
||||
@ -82,10 +85,6 @@ data:
|
||||
insecure_skip_verify: true
|
||||
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
|
||||
|
||||
relabel_configs:
|
||||
- action: labelmap
|
||||
regex: __meta_kubernetes_node_name
|
||||
|
||||
# Scrape config for Kubelet cAdvisor. Explore metrics from a node by
|
||||
# scraping kubelet (127.0.0.1:10250/metrics/cadvisor).
|
||||
- job_name: 'kubernetes-cadvisor'
|
||||
@ -100,9 +99,6 @@ data:
|
||||
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
|
||||
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
|
||||
|
||||
relabel_configs:
|
||||
- action: labelmap
|
||||
regex: __meta_kubernetes_node_name
|
||||
metric_relabel_configs:
|
||||
- source_labels: [__name__, image]
|
||||
action: drop
|
||||
@ -121,13 +117,11 @@ data:
|
||||
- source_labels: [__meta_kubernetes_node_label_node_kubernetes_io_controller]
|
||||
action: keep
|
||||
regex: 'true'
|
||||
- action: labelmap
|
||||
regex: __meta_kubernetes_node_name
|
||||
- source_labels: [__meta_kubernetes_node_address_InternalIP]
|
||||
action: replace
|
||||
target_label: __address__
|
||||
replacement: '${1}:2381'
|
||||
|
||||
|
||||
# Scrape config for service endpoints.
|
||||
#
|
||||
# The relabeling allows the actual service scrape endpoint to be configured
|
||||
@ -172,7 +166,7 @@ data:
|
||||
- source_labels: [__meta_kubernetes_service_name]
|
||||
action: replace
|
||||
target_label: job
|
||||
|
||||
|
||||
metric_relabel_configs:
|
||||
- source_labels: [__name__]
|
||||
action: drop
|
||||
|
@ -14,13 +14,14 @@ spec:
|
||||
labels:
|
||||
name: prometheus
|
||||
phase: prod
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
securityContext:
|
||||
seccompProfile:
|
||||
type: RuntimeDefault
|
||||
serviceAccountName: prometheus
|
||||
containers:
|
||||
- name: prometheus
|
||||
image: quay.io/prometheus/prometheus:v2.19.1
|
||||
image: quay.io/prometheus/prometheus:v2.21.0
|
||||
args:
|
||||
- --web.listen-address=0.0.0.0:9090
|
||||
- --config.file=/etc/prometheus/prometheus.yaml
|
||||
|
@ -18,9 +18,10 @@ spec:
|
||||
labels:
|
||||
name: kube-state-metrics
|
||||
phase: prod
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
securityContext:
|
||||
seccompProfile:
|
||||
type: RuntimeDefault
|
||||
serviceAccountName: kube-state-metrics
|
||||
containers:
|
||||
- name: kube-state-metrics
|
||||
|
@ -17,13 +17,13 @@ spec:
|
||||
labels:
|
||||
name: node-exporter
|
||||
phase: prod
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
serviceAccountName: node-exporter
|
||||
securityContext:
|
||||
runAsNonRoot: true
|
||||
runAsUser: 65534
|
||||
seccompProfile:
|
||||
type: RuntimeDefault
|
||||
hostNetwork: true
|
||||
hostPID: true
|
||||
containers:
|
||||
|
@ -11,8 +11,8 @@ data:
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": members are down ({{ $value }})."
|
||||
},
|
||||
"expr": "max by (job) (\n sum by (job) (up{job=~\".*etcd.*\"} == bool 0)\nor\n count by (job,endpoint) (\n sum by (job,endpoint,To) (rate(etcd_network_peer_sent_failures_total{job=~\".*etcd.*\"}[3m])) > 0.01\n )\n)\n> 0\n",
|
||||
"for": "3m",
|
||||
"expr": "max without (endpoint) (\n sum without (instance) (up{job=~\".*etcd.*\"} == bool 0)\nor\n count without (To) (\n sum without (instance) (rate(etcd_network_peer_sent_failures_total{job=~\".*etcd.*\"}[120s])) > 0.01\n )\n)\n> 0\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
@ -22,7 +22,7 @@ data:
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": insufficient members ({{ $value }})."
|
||||
},
|
||||
"expr": "sum(up{job=~\".*etcd.*\"} == bool 1) by (job) < ((count(up{job=~\".*etcd.*\"}) by (job) + 1) / 2)\n",
|
||||
"expr": "sum(up{job=~\".*etcd.*\"} == bool 1) without (instance) < ((count(up{job=~\".*etcd.*\"}) without (instance) + 1) / 2)\n",
|
||||
"for": "3m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
@ -44,18 +44,40 @@ data:
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": {{ $value }} leader changes within the last 15 minutes. Frequent elections may be a sign of insufficient resources, high network latency, or disruptions by other components and should be investigated."
|
||||
},
|
||||
"expr": "increase((max by (job) (etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}) or 0*absent(etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}))[15m:1m]) >= 3\n",
|
||||
"expr": "increase((max without (instance) (etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}) or 0*absent(etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}))[15m:1m]) >= 4\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "etcdHighNumberOfFailedGRPCRequests",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": {{ $value }}% of requests for {{ $labels.grpc_method }} failed on etcd instance {{ $labels.instance }}."
|
||||
},
|
||||
"expr": "100 * sum(rate(grpc_server_handled_total{job=~\".*etcd.*\", grpc_code!=\"OK\"}[5m])) without (grpc_type, grpc_code)\n /\nsum(rate(grpc_server_handled_total{job=~\".*etcd.*\"}[5m])) without (grpc_type, grpc_code)\n > 1\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "etcdHighNumberOfFailedGRPCRequests",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": {{ $value }}% of requests for {{ $labels.grpc_method }} failed on etcd instance {{ $labels.instance }}."
|
||||
},
|
||||
"expr": "100 * sum(rate(grpc_server_handled_total{job=~\".*etcd.*\", grpc_code!=\"OK\"}[5m])) without (grpc_type, grpc_code)\n /\nsum(rate(grpc_server_handled_total{job=~\".*etcd.*\"}[5m])) without (grpc_type, grpc_code)\n > 5\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "etcdGRPCRequestsSlow",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": gRPC requests to {{ $labels.grpc_method }} are taking {{ $value }}s on etcd instance {{ $labels.instance }}."
|
||||
},
|
||||
"expr": "histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job=~\".*etcd.*\", grpc_type=\"unary\"}[5m])) by (job, instance, grpc_service, grpc_method, le))\n> 0.15\n",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job=~\".*etcd.*\", grpc_type=\"unary\"}[5m])) without(grpc_type))\n> 0.15\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
@ -110,7 +132,7 @@ data:
|
||||
"annotations": {
|
||||
"message": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}"
|
||||
},
|
||||
"expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) BY (method) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nBY (method) > 0.01\n",
|
||||
"expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) without (code) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nwithout (code) > 0.01\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -121,7 +143,7 @@ data:
|
||||
"annotations": {
|
||||
"message": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}."
|
||||
},
|
||||
"expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) BY (method) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nBY (method) > 0.05\n",
|
||||
"expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) without (code) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nwithout (code) > 0.05\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
@ -145,112 +167,137 @@ data:
|
||||
kube.yaml: |-
|
||||
{
|
||||
"groups": [
|
||||
{
|
||||
"name": "kube-apiserver-error",
|
||||
"rules": [
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[5m]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate5m"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[30m]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate30m"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[1h]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate1h"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[2h]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate2h"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[6h]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate6h"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[1d]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate1d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[3d]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate3d"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate5m{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate5m{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate5m"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate30m{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate30m{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate30m"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate1h{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate1h{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate1h"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate2h{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate2h{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate2h"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate6h{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate6h{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate6h"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate1d{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate1d{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate1d"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate3d{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate3d{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate3d"
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "kube-apiserver.rules",
|
||||
"rules": [
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[1d]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[1d]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[1d]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[1d]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[1d]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[1d]))\n",
|
||||
"labels": {
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "apiserver_request:burnrate1d"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[1h]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[1h]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[1h]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[1h]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[1h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[1h]))\n",
|
||||
"labels": {
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "apiserver_request:burnrate1h"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[2h]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[2h]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[2h]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[2h]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[2h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[2h]))\n",
|
||||
"labels": {
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "apiserver_request:burnrate2h"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[30m]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[30m]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[30m]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[30m]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[30m]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[30m]))\n",
|
||||
"labels": {
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "apiserver_request:burnrate30m"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[3d]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[3d]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[3d]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[3d]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[3d]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[3d]))\n",
|
||||
"labels": {
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "apiserver_request:burnrate3d"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[5m]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[5m]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[5m]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[5m]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[5m]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[5m]))\n",
|
||||
"labels": {
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "apiserver_request:burnrate5m"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[6h]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[6h]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[6h]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[6h]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[6h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[6h]))\n",
|
||||
"labels": {
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "apiserver_request:burnrate6h"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[1d]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[1d]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[1d]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[1d]))\n",
|
||||
"labels": {
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "apiserver_request:burnrate1d"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[1h]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[1h]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[1h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[1h]))\n",
|
||||
"labels": {
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "apiserver_request:burnrate1h"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[2h]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[2h]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[2h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[2h]))\n",
|
||||
"labels": {
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "apiserver_request:burnrate2h"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[30m]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[30m]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[30m]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[30m]))\n",
|
||||
"labels": {
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "apiserver_request:burnrate30m"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[3d]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[3d]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[3d]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[3d]))\n",
|
||||
"labels": {
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "apiserver_request:burnrate3d"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[5m]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[5m]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[5m]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[5m]))\n",
|
||||
"labels": {
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "apiserver_request:burnrate5m"
|
||||
},
|
||||
{
|
||||
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[6h]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[6h]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[6h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[6h]))\n",
|
||||
"labels": {
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "apiserver_request:burnrate6h"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code,resource) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[5m]))\n",
|
||||
"labels": {
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "code_resource:apiserver_request_total:rate5m"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code,resource) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[5m]))\n",
|
||||
"labels": {
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "code_resource:apiserver_request_total:rate5m"
|
||||
},
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum by (le, resource) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\"}[5m]))) > 0\n",
|
||||
"labels": {
|
||||
"quantile": "0.99",
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
|
||||
},
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum by (le, resource) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[5m]))) > 0\n",
|
||||
"labels": {
|
||||
"quantile": "0.99",
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
|
||||
},
|
||||
{
|
||||
"expr": "sum(rate(apiserver_request_duration_seconds_sum{subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod)\n/\nsum(rate(apiserver_request_duration_seconds_count{subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod)\n",
|
||||
"record": "cluster:apiserver_request_duration_seconds:mean5m"
|
||||
@ -278,6 +325,143 @@ data:
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"interval": "3m",
|
||||
"name": "kube-apiserver-availability.rules",
|
||||
"rules": [
|
||||
{
|
||||
"expr": "1 - (\n (\n # write too slow\n sum(increase(apiserver_request_duration_seconds_count{verb=~\"POST|PUT|PATCH|DELETE\"}[30d]))\n -\n sum(increase(apiserver_request_duration_seconds_bucket{verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[30d]))\n ) +\n (\n # read too slow\n sum(increase(apiserver_request_duration_seconds_count{verb=~\"LIST|GET\"}[30d]))\n -\n (\n (\n sum(increase(apiserver_request_duration_seconds_bucket{verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[30d]))\n or\n vector(0)\n )\n +\n sum(increase(apiserver_request_duration_seconds_bucket{verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[30d]))\n +\n sum(increase(apiserver_request_duration_seconds_bucket{verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[30d]))\n )\n ) +\n # errors\n sum(code:apiserver_request_total:increase30d{code=~\"5..\"} or vector(0))\n)\n/\nsum(code:apiserver_request_total:increase30d)\n",
|
||||
"labels": {
|
||||
"verb": "all"
|
||||
},
|
||||
"record": "apiserver_request:availability30d"
|
||||
},
|
||||
{
|
||||
"expr": "1 - (\n sum(increase(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[30d]))\n -\n (\n # too slow\n (\n sum(increase(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[30d]))\n or\n vector(0)\n )\n +\n sum(increase(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[30d]))\n +\n sum(increase(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[30d]))\n )\n +\n # errors\n sum(code:apiserver_request_total:increase30d{verb=\"read\",code=~\"5..\"} or vector(0))\n)\n/\nsum(code:apiserver_request_total:increase30d{verb=\"read\"})\n",
|
||||
"labels": {
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "apiserver_request:availability30d"
|
||||
},
|
||||
{
|
||||
"expr": "1 - (\n (\n # too slow\n sum(increase(apiserver_request_duration_seconds_count{verb=~\"POST|PUT|PATCH|DELETE\"}[30d]))\n -\n sum(increase(apiserver_request_duration_seconds_bucket{verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[30d]))\n )\n +\n # errors\n sum(code:apiserver_request_total:increase30d{verb=\"write\",code=~\"5..\"} or vector(0))\n)\n/\nsum(code:apiserver_request_total:increase30d{verb=\"write\"})\n",
|
||||
"labels": {
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "apiserver_request:availability30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"LIST\",code=~\"2..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"GET\",code=~\"2..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"POST\",code=~\"2..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PUT\",code=~\"2..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PATCH\",code=~\"2..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"DELETE\",code=~\"2..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"LIST\",code=~\"3..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"GET\",code=~\"3..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"POST\",code=~\"3..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PUT\",code=~\"3..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PATCH\",code=~\"3..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"DELETE\",code=~\"3..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"LIST\",code=~\"4..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"GET\",code=~\"4..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"POST\",code=~\"4..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PUT\",code=~\"4..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PATCH\",code=~\"4..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"DELETE\",code=~\"4..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"LIST\",code=~\"5..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"GET\",code=~\"5..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"POST\",code=~\"5..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PUT\",code=~\"5..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PATCH\",code=~\"5..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"DELETE\",code=~\"5..\"}[30d]))\n",
|
||||
"record": "code_verb:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code) (code_verb:apiserver_request_total:increase30d{verb=~\"LIST|GET\"})\n",
|
||||
"labels": {
|
||||
"verb": "read"
|
||||
},
|
||||
"record": "code:apiserver_request_total:increase30d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (code) (code_verb:apiserver_request_total:increase30d{verb=~\"POST|PUT|PATCH|DELETE\"})\n",
|
||||
"labels": {
|
||||
"verb": "write"
|
||||
},
|
||||
"record": "code:apiserver_request_total:increase30d"
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "k8s.rules",
|
||||
"rules": [
|
||||
@ -286,23 +470,23 @@ data:
|
||||
"record": "namespace:container_cpu_usage_seconds_total:sum_rate"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (cluster, namespace, pod, container) (\n rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])\n) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (\n 1, max by(cluster, namespace, pod, node) (kube_pod_info)\n)\n",
|
||||
"expr": "sum by (cluster, namespace, pod, container) (\n rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])\n) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (\n 1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
|
||||
"record": "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate"
|
||||
},
|
||||
{
|
||||
"expr": "container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info)\n)\n",
|
||||
"expr": "container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
|
||||
"record": "node_namespace_pod_container:container_memory_working_set_bytes"
|
||||
},
|
||||
{
|
||||
"expr": "container_memory_rss{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info)\n)\n",
|
||||
"expr": "container_memory_rss{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
|
||||
"record": "node_namespace_pod_container:container_memory_rss"
|
||||
},
|
||||
{
|
||||
"expr": "container_memory_cache{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info)\n)\n",
|
||||
"expr": "container_memory_cache{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
|
||||
"record": "node_namespace_pod_container:container_memory_cache"
|
||||
},
|
||||
{
|
||||
"expr": "container_memory_swap{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info)\n)\n",
|
||||
"expr": "container_memory_swap{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
|
||||
"record": "node_namespace_pod_container:container_memory_swap"
|
||||
},
|
||||
{
|
||||
@ -322,21 +506,21 @@ data:
|
||||
"labels": {
|
||||
"workload_type": "deployment"
|
||||
},
|
||||
"record": "mixin_pod_workload"
|
||||
"record": "namespace_workload_pod:kube_pod_owner:relabel"
|
||||
},
|
||||
{
|
||||
"expr": "max by (cluster, namespace, workload, pod) (\n label_replace(\n kube_pod_owner{job=\"kube-state-metrics\", owner_kind=\"DaemonSet\"},\n \"workload\", \"$1\", \"owner_name\", \"(.*)\"\n )\n)\n",
|
||||
"labels": {
|
||||
"workload_type": "daemonset"
|
||||
},
|
||||
"record": "mixin_pod_workload"
|
||||
"record": "namespace_workload_pod:kube_pod_owner:relabel"
|
||||
},
|
||||
{
|
||||
"expr": "max by (cluster, namespace, workload, pod) (\n label_replace(\n kube_pod_owner{job=\"kube-state-metrics\", owner_kind=\"StatefulSet\"},\n \"workload\", \"$1\", \"owner_name\", \"(.*)\"\n )\n)\n",
|
||||
"labels": {
|
||||
"workload_type": "statefulset"
|
||||
},
|
||||
"record": "mixin_pod_workload"
|
||||
"record": "namespace_workload_pod:kube_pod_owner:relabel"
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -412,11 +596,11 @@ data:
|
||||
"name": "node.rules",
|
||||
"rules": [
|
||||
{
|
||||
"expr": "sum(min(kube_pod_info) by (cluster, node))\n",
|
||||
"expr": "sum(min(kube_pod_info{node!=\"\"}) by (cluster, node))\n",
|
||||
"record": ":kube_pod_info_node_count:"
|
||||
},
|
||||
{
|
||||
"expr": "topk by(namespace, pod) (1,\n max by (node, namespace, pod) (\n label_replace(kube_pod_info{job=\"kube-state-metrics\"}, \"pod\", \"$1\", \"pod\", \"(.*)\")\n))\n",
|
||||
"expr": "topk by(namespace, pod) (1,\n max by (node, namespace, pod) (\n label_replace(kube_pod_info{job=\"kube-state-metrics\",node!=\"\"}, \"pod\", \"$1\", \"pod\", \"(.*)\")\n))\n",
|
||||
"record": "node_namespace_pod:kube_pod_info:"
|
||||
},
|
||||
{
|
||||
@ -461,104 +645,113 @@ data:
|
||||
{
|
||||
"alert": "KubePodCrashLooping",
|
||||
"annotations": {
|
||||
"message": "Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is restarting {{ printf \"%.2f\" $value }} times / 5 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping"
|
||||
"description": "Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is restarting {{ printf \"%.2f\" $value }} times / 5 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping",
|
||||
"summary": "Pod is crash looping."
|
||||
},
|
||||
"expr": "rate(kube_pod_container_status_restarts_total{job=\"kube-state-metrics\"}[15m]) * 60 * 5 > 0\n",
|
||||
"expr": "rate(kube_pod_container_status_restarts_total{job=\"kube-state-metrics\"}[5m]) * 60 * 5 > 0\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubePodNotReady",
|
||||
"annotations": {
|
||||
"message": "Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready state for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready"
|
||||
"description": "Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready state for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready",
|
||||
"summary": "Pod has been in a non-ready state for more than 15 minutes."
|
||||
},
|
||||
"expr": "sum by (namespace, pod) (max by(namespace, pod) (kube_pod_status_phase{job=\"kube-state-metrics\", phase=~\"Pending|Unknown\"}) * on(namespace, pod) group_left(owner_kind) max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!=\"Job\"})) > 0\n",
|
||||
"expr": "sum by (namespace, pod) (\n max by(namespace, pod) (\n kube_pod_status_phase{job=\"kube-state-metrics\", phase=~\"Pending|Unknown\"}\n ) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (\n 1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!=\"Job\"})\n )\n) > 0\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeDeploymentGenerationMismatch",
|
||||
"annotations": {
|
||||
"message": "Deployment generation for {{ $labels.namespace }}/{{ $labels.deployment }} does not match, this indicates that the Deployment has failed but has not been rolled back.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentgenerationmismatch"
|
||||
"description": "Deployment generation for {{ $labels.namespace }}/{{ $labels.deployment }} does not match, this indicates that the Deployment has failed but has not been rolled back.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentgenerationmismatch",
|
||||
"summary": "Deployment generation mismatch due to possible roll-back"
|
||||
},
|
||||
"expr": "kube_deployment_status_observed_generation{job=\"kube-state-metrics\"}\n !=\nkube_deployment_metadata_generation{job=\"kube-state-metrics\"}\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeDeploymentReplicasMismatch",
|
||||
"annotations": {
|
||||
"message": "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has not matched the expected number of replicas for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch"
|
||||
"description": "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has not matched the expected number of replicas for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch",
|
||||
"summary": "Deployment has not matched the expected number of replicas."
|
||||
},
|
||||
"expr": "(\n kube_deployment_spec_replicas{job=\"kube-state-metrics\"}\n !=\n kube_deployment_status_replicas_available{job=\"kube-state-metrics\"}\n) and (\n changes(kube_deployment_status_replicas_updated{job=\"kube-state-metrics\"}[5m])\n ==\n 0\n)\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeStatefulSetReplicasMismatch",
|
||||
"annotations": {
|
||||
"message": "StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }} has not matched the expected number of replicas for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetreplicasmismatch"
|
||||
"description": "StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }} has not matched the expected number of replicas for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetreplicasmismatch",
|
||||
"summary": "Deployment has not matched the expected number of replicas."
|
||||
},
|
||||
"expr": "(\n kube_statefulset_status_replicas_ready{job=\"kube-state-metrics\"}\n !=\n kube_statefulset_status_replicas{job=\"kube-state-metrics\"}\n) and (\n changes(kube_statefulset_status_replicas_updated{job=\"kube-state-metrics\"}[5m])\n ==\n 0\n)\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeStatefulSetGenerationMismatch",
|
||||
"annotations": {
|
||||
"message": "StatefulSet generation for {{ $labels.namespace }}/{{ $labels.statefulset }} does not match, this indicates that the StatefulSet has failed but has not been rolled back.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetgenerationmismatch"
|
||||
"description": "StatefulSet generation for {{ $labels.namespace }}/{{ $labels.statefulset }} does not match, this indicates that the StatefulSet has failed but has not been rolled back.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetgenerationmismatch",
|
||||
"summary": "StatefulSet generation mismatch due to possible roll-back"
|
||||
},
|
||||
"expr": "kube_statefulset_status_observed_generation{job=\"kube-state-metrics\"}\n !=\nkube_statefulset_metadata_generation{job=\"kube-state-metrics\"}\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeStatefulSetUpdateNotRolledOut",
|
||||
"annotations": {
|
||||
"message": "StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }} update has not been rolled out.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetupdatenotrolledout"
|
||||
"description": "StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }} update has not been rolled out.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetupdatenotrolledout",
|
||||
"summary": "StatefulSet update has not been rolled out."
|
||||
},
|
||||
"expr": "max without (revision) (\n kube_statefulset_status_current_revision{job=\"kube-state-metrics\"}\n unless\n kube_statefulset_status_update_revision{job=\"kube-state-metrics\"}\n)\n *\n(\n kube_statefulset_replicas{job=\"kube-state-metrics\"}\n !=\n kube_statefulset_status_replicas_updated{job=\"kube-state-metrics\"}\n)\n",
|
||||
"expr": "(\n max without (revision) (\n kube_statefulset_status_current_revision{job=\"kube-state-metrics\"}\n unless\n kube_statefulset_status_update_revision{job=\"kube-state-metrics\"}\n )\n *\n (\n kube_statefulset_replicas{job=\"kube-state-metrics\"}\n !=\n kube_statefulset_status_replicas_updated{job=\"kube-state-metrics\"}\n )\n) and (\n changes(kube_statefulset_status_replicas_updated{job=\"kube-state-metrics\"}[5m])\n ==\n 0\n)\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeDaemonSetRolloutStuck",
|
||||
"annotations": {
|
||||
"message": "Only {{ $value | humanizePercentage }} of the desired Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} are scheduled and ready.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetrolloutstuck"
|
||||
"description": "DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} has not finished or progressed for at least 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetrolloutstuck",
|
||||
"summary": "DaemonSet rollout is stuck."
|
||||
},
|
||||
"expr": "kube_daemonset_status_number_ready{job=\"kube-state-metrics\"}\n /\nkube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"} < 1.00\n",
|
||||
"expr": "(\n (\n kube_daemonset_status_current_number_scheduled{job=\"kube-state-metrics\"}\n !=\n kube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"}\n ) or (\n kube_daemonset_status_number_misscheduled{job=\"kube-state-metrics\"}\n !=\n 0\n ) or (\n kube_daemonset_updated_number_scheduled{job=\"kube-state-metrics\"}\n !=\n kube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"}\n ) or (\n kube_daemonset_status_number_available{job=\"kube-state-metrics\"}\n !=\n kube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"}\n )\n) and (\n changes(kube_daemonset_updated_number_scheduled{job=\"kube-state-metrics\"}[5m])\n ==\n 0\n)\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeContainerWaiting",
|
||||
"annotations": {
|
||||
"message": "Pod {{ $labels.namespace }}/{{ $labels.pod }} container {{ $labels.container}} has been in waiting state for longer than 1 hour.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting"
|
||||
"description": "Pod {{ $labels.namespace }}/{{ $labels.pod }} container {{ $labels.container}} has been in waiting state for longer than 1 hour.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting",
|
||||
"summary": "Pod container waiting longer than 1 hour"
|
||||
},
|
||||
"expr": "sum by (namespace, pod, container) (kube_pod_container_status_waiting_reason{job=\"kube-state-metrics\"}) > 0\n",
|
||||
"for": "1h",
|
||||
@ -569,8 +762,9 @@ data:
|
||||
{
|
||||
"alert": "KubeDaemonSetNotScheduled",
|
||||
"annotations": {
|
||||
"message": "{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} are not scheduled.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetnotscheduled"
|
||||
"description": "{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} are not scheduled.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetnotscheduled",
|
||||
"summary": "DaemonSet pods are not scheduled."
|
||||
},
|
||||
"expr": "kube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"}\n -\nkube_daemonset_status_current_number_scheduled{job=\"kube-state-metrics\"} > 0\n",
|
||||
"for": "10m",
|
||||
@ -581,23 +775,12 @@ data:
|
||||
{
|
||||
"alert": "KubeDaemonSetMisScheduled",
|
||||
"annotations": {
|
||||
"message": "{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} are running where they are not supposed to run.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetmisscheduled"
|
||||
"description": "{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} are running where they are not supposed to run.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetmisscheduled",
|
||||
"summary": "DaemonSet pods are misscheduled."
|
||||
},
|
||||
"expr": "kube_daemonset_status_number_misscheduled{job=\"kube-state-metrics\"} > 0\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeCronJobRunning",
|
||||
"annotations": {
|
||||
"message": "CronJob {{ $labels.namespace }}/{{ $labels.cronjob }} is taking more than 1h to complete.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecronjobrunning"
|
||||
},
|
||||
"expr": "time() - kube_cronjob_next_schedule_time{job=\"kube-state-metrics\"} > 3600\n",
|
||||
"for": "1h",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
@ -605,11 +788,12 @@ data:
|
||||
{
|
||||
"alert": "KubeJobCompletion",
|
||||
"annotations": {
|
||||
"message": "Job {{ $labels.namespace }}/{{ $labels.job_name }} is taking more than one hour to complete.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobcompletion"
|
||||
"description": "Job {{ $labels.namespace }}/{{ $labels.job_name }} is taking more than 12 hours to complete.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobcompletion",
|
||||
"summary": "Job did not complete in time"
|
||||
},
|
||||
"expr": "kube_job_spec_completions{job=\"kube-state-metrics\"} - kube_job_status_succeeded{job=\"kube-state-metrics\"} > 0\n",
|
||||
"for": "1h",
|
||||
"for": "12h",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
@ -617,8 +801,9 @@ data:
|
||||
{
|
||||
"alert": "KubeJobFailed",
|
||||
"annotations": {
|
||||
"message": "Job {{ $labels.namespace }}/{{ $labels.job_name }} failed to complete.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobfailed"
|
||||
"description": "Job {{ $labels.namespace }}/{{ $labels.job_name }} failed to complete.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobfailed",
|
||||
"summary": "Job failed to complete."
|
||||
},
|
||||
"expr": "kube_job_failed{job=\"kube-state-metrics\"} > 0\n",
|
||||
"for": "15m",
|
||||
@ -629,8 +814,9 @@ data:
|
||||
{
|
||||
"alert": "KubeHpaReplicasMismatch",
|
||||
"annotations": {
|
||||
"message": "HPA {{ $labels.namespace }}/{{ $labels.hpa }} has not matched the desired number of replicas for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpareplicasmismatch"
|
||||
"description": "HPA {{ $labels.namespace }}/{{ $labels.hpa }} has not matched the desired number of replicas for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpareplicasmismatch",
|
||||
"summary": "HPA has not matched descired number of replicas."
|
||||
},
|
||||
"expr": "(kube_hpa_status_desired_replicas{job=\"kube-state-metrics\"}\n !=\nkube_hpa_status_current_replicas{job=\"kube-state-metrics\"})\n and\nchanges(kube_hpa_status_current_replicas[15m]) == 0\n",
|
||||
"for": "15m",
|
||||
@ -641,8 +827,9 @@ data:
|
||||
{
|
||||
"alert": "KubeHpaMaxedOut",
|
||||
"annotations": {
|
||||
"message": "HPA {{ $labels.namespace }}/{{ $labels.hpa }} has been running at max replicas for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpamaxedout"
|
||||
"description": "HPA {{ $labels.namespace }}/{{ $labels.hpa }} has been running at max replicas for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpamaxedout",
|
||||
"summary": "HPA is running at max replicas"
|
||||
},
|
||||
"expr": "kube_hpa_status_current_replicas{job=\"kube-state-metrics\"}\n ==\nkube_hpa_spec_max_replicas{job=\"kube-state-metrics\"}\n",
|
||||
"for": "15m",
|
||||
@ -658,8 +845,9 @@ data:
|
||||
{
|
||||
"alert": "KubeCPUOvercommit",
|
||||
"annotations": {
|
||||
"message": "Cluster has overcommitted CPU resource requests for Pods and cannot tolerate node failure.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecpuovercommit"
|
||||
"description": "Cluster has overcommitted CPU resource requests for Pods and cannot tolerate node failure.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecpuovercommit",
|
||||
"summary": "Cluster has overcommitted CPU resource requests."
|
||||
},
|
||||
"expr": "sum(namespace:kube_pod_container_resource_requests_cpu_cores:sum{})\n /\nsum(kube_node_status_allocatable_cpu_cores)\n >\n(count(kube_node_status_allocatable_cpu_cores)-1) / count(kube_node_status_allocatable_cpu_cores)\n",
|
||||
"for": "5m",
|
||||
@ -668,10 +856,11 @@ data:
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeMemOvercommit",
|
||||
"alert": "KubeMemoryOvercommit",
|
||||
"annotations": {
|
||||
"message": "Cluster has overcommitted memory resource requests for Pods and cannot tolerate node failure.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememovercommit"
|
||||
"description": "Cluster has overcommitted memory resource requests for Pods and cannot tolerate node failure.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememoryovercommit",
|
||||
"summary": "Cluster has overcommitted memory resource requests."
|
||||
},
|
||||
"expr": "sum(namespace:kube_pod_container_resource_requests_memory_bytes:sum{})\n /\nsum(kube_node_status_allocatable_memory_bytes)\n >\n(count(kube_node_status_allocatable_memory_bytes)-1)\n /\ncount(kube_node_status_allocatable_memory_bytes)\n",
|
||||
"for": "5m",
|
||||
@ -680,10 +869,11 @@ data:
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeCPUOvercommit",
|
||||
"alert": "KubeCPUQuotaOvercommit",
|
||||
"annotations": {
|
||||
"message": "Cluster has overcommitted CPU resource requests for Namespaces.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecpuovercommit"
|
||||
"description": "Cluster has overcommitted CPU resource requests for Namespaces.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecpuquotaovercommit",
|
||||
"summary": "Cluster has overcommitted CPU resource requests."
|
||||
},
|
||||
"expr": "sum(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\", resource=\"cpu\"})\n /\nsum(kube_node_status_allocatable_cpu_cores)\n > 1.5\n",
|
||||
"for": "5m",
|
||||
@ -692,10 +882,11 @@ data:
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeMemOvercommit",
|
||||
"alert": "KubeMemoryQuotaOvercommit",
|
||||
"annotations": {
|
||||
"message": "Cluster has overcommitted memory resource requests for Namespaces.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememovercommit"
|
||||
"description": "Cluster has overcommitted memory resource requests for Namespaces.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememoryquotaovercommit",
|
||||
"summary": "Cluster has overcommitted memory resource requests."
|
||||
},
|
||||
"expr": "sum(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\", resource=\"memory\"})\n /\nsum(kube_node_status_allocatable_memory_bytes{job=\"node-exporter\"})\n > 1.5\n",
|
||||
"for": "5m",
|
||||
@ -703,13 +894,40 @@ data:
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeQuotaAlmostFull",
|
||||
"annotations": {
|
||||
"description": "Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage }} of its {{ $labels.resource }} quota.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubequotaalmostfull",
|
||||
"summary": "Namespace quota is going to be full."
|
||||
},
|
||||
"expr": "kube_resourcequota{job=\"kube-state-metrics\", type=\"used\"}\n / ignoring(instance, job, type)\n(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\"} > 0)\n > 0.9 < 1\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "info"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeQuotaFullyUsed",
|
||||
"annotations": {
|
||||
"description": "Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage }} of its {{ $labels.resource }} quota.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubequotafullyused",
|
||||
"summary": "Namespace quota is fully used."
|
||||
},
|
||||
"expr": "kube_resourcequota{job=\"kube-state-metrics\", type=\"used\"}\n / ignoring(instance, job, type)\n(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\"} > 0)\n == 1\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "info"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeQuotaExceeded",
|
||||
"annotations": {
|
||||
"message": "Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage }} of its {{ $labels.resource }} quota.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubequotaexceeded"
|
||||
"description": "Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage }} of its {{ $labels.resource }} quota.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubequotaexceeded",
|
||||
"summary": "Namespace quota has exceeded the limits."
|
||||
},
|
||||
"expr": "kube_resourcequota{job=\"kube-state-metrics\", type=\"used\"}\n / ignoring(instance, job, type)\n(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\"} > 0)\n > 0.90\n",
|
||||
"expr": "kube_resourcequota{job=\"kube-state-metrics\", type=\"used\"}\n / ignoring(instance, job, type)\n(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\"} > 0)\n > 1\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -718,13 +936,14 @@ data:
|
||||
{
|
||||
"alert": "CPUThrottlingHigh",
|
||||
"annotations": {
|
||||
"message": "{{ $value | humanizePercentage }} throttling of CPU in namespace {{ $labels.namespace }} for container {{ $labels.container }} in pod {{ $labels.pod }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-cputhrottlinghigh"
|
||||
"description": "{{ $value | humanizePercentage }} throttling of CPU in namespace {{ $labels.namespace }} for container {{ $labels.container }} in pod {{ $labels.pod }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-cputhrottlinghigh",
|
||||
"summary": "Processes experience elevated CPU throttling."
|
||||
},
|
||||
"expr": "sum(increase(container_cpu_cfs_throttled_periods_total{container!=\"\", }[5m])) by (container, pod, namespace)\n /\nsum(increase(container_cpu_cfs_periods_total{}[5m])) by (container, pod, namespace)\n > ( 100 / 100 )\n",
|
||||
"expr": "sum(increase(container_cpu_cfs_throttled_periods_total{container!=\"\", }[5m])) by (container, pod, namespace)\n /\nsum(increase(container_cpu_cfs_periods_total{}[5m])) by (container, pod, namespace)\n > ( 80 / 100 )\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
"severity": "info"
|
||||
}
|
||||
}
|
||||
]
|
||||
@ -733,10 +952,11 @@ data:
|
||||
"name": "kubernetes-storage",
|
||||
"rules": [
|
||||
{
|
||||
"alert": "KubePersistentVolumeUsageCritical",
|
||||
"alert": "KubePersistentVolumeFillingUp",
|
||||
"annotations": {
|
||||
"message": "The PersistentVolume claimed by {{ $labels.persistentvolumeclaim }} in Namespace {{ $labels.namespace }} is only {{ $value | humanizePercentage }} free.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumeusagecritical"
|
||||
"description": "The PersistentVolume claimed by {{ $labels.persistentvolumeclaim }} in Namespace {{ $labels.namespace }} is only {{ $value | humanizePercentage }} free.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefillingup",
|
||||
"summary": "PersistentVolume is filling up."
|
||||
},
|
||||
"expr": "kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\nkubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n < 0.03\n",
|
||||
"for": "1m",
|
||||
@ -745,22 +965,24 @@ data:
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubePersistentVolumeFullInFourDays",
|
||||
"alert": "KubePersistentVolumeFillingUp",
|
||||
"annotations": {
|
||||
"message": "Based on recent sampling, the PersistentVolume claimed by {{ $labels.persistentvolumeclaim }} in Namespace {{ $labels.namespace }} is expected to fill up within four days. Currently {{ $value | humanizePercentage }} is available.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefullinfourdays"
|
||||
"description": "Based on recent sampling, the PersistentVolume claimed by {{ $labels.persistentvolumeclaim }} in Namespace {{ $labels.namespace }} is expected to fill up within four days. Currently {{ $value | humanizePercentage }} is available.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefillingup",
|
||||
"summary": "PersistentVolume is filling up."
|
||||
},
|
||||
"expr": "(\n kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\n kubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n) < 0.15\nand\npredict_linear(kubelet_volume_stats_available_bytes{job=\"kubelet\"}[6h], 4 * 24 * 3600) < 0\n",
|
||||
"for": "1h",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubePersistentVolumeErrors",
|
||||
"annotations": {
|
||||
"message": "The persistent volume {{ $labels.persistentvolume }} has status {{ $labels.phase }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumeerrors"
|
||||
"description": "The persistent volume {{ $labels.persistentvolume }} has status {{ $labels.phase }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumeerrors",
|
||||
"summary": "PersistentVolume is having issues with provisioning."
|
||||
},
|
||||
"expr": "kube_persistentvolume_status_phase{phase=~\"Failed|Pending\",job=\"kube-state-metrics\"} > 0\n",
|
||||
"for": "5m",
|
||||
@ -776,10 +998,11 @@ data:
|
||||
{
|
||||
"alert": "KubeVersionMismatch",
|
||||
"annotations": {
|
||||
"message": "There are {{ $value }} different semantic versions of Kubernetes components running.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeversionmismatch"
|
||||
"description": "There are {{ $value }} different semantic versions of Kubernetes components running.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeversionmismatch",
|
||||
"summary": "Different semantic versions of Kubernetes components running."
|
||||
},
|
||||
"expr": "count(count by (gitVersion) (label_replace(kubernetes_build_info{job!~\"kube-dns|coredns\"},\"gitVersion\",\"$1\",\"gitVersion\",\"(v[0-9]*.[0-9]*.[0-9]*).*\"))) > 1\n",
|
||||
"expr": "count(count by (gitVersion) (label_replace(kubernetes_build_info{job!~\"kube-dns|coredns\"},\"gitVersion\",\"$1\",\"gitVersion\",\"(v[0-9]*.[0-9]*).*\"))) > 1\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -788,8 +1011,9 @@ data:
|
||||
{
|
||||
"alert": "KubeClientErrors",
|
||||
"annotations": {
|
||||
"message": "Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance }}' is experiencing {{ $value | humanizePercentage }} errors.'",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclienterrors"
|
||||
"description": "Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance }}' is experiencing {{ $value | humanizePercentage }} errors.'",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclienterrors",
|
||||
"summary": "Kubernetes API server client is experiencing errors."
|
||||
},
|
||||
"expr": "(sum(rate(rest_client_requests_total{code=~\"5..\"}[5m])) by (instance, job)\n /\nsum(rate(rest_client_requests_total[5m])) by (instance, job))\n> 0.01\n",
|
||||
"for": "15m",
|
||||
@ -800,30 +1024,66 @@ data:
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "kube-apiserver-error-alerts",
|
||||
"name": "kube-apiserver-slos",
|
||||
"rules": [
|
||||
{
|
||||
"alert": "ErrorBudgetBurn",
|
||||
"alert": "KubeAPIErrorBudgetBurn",
|
||||
"annotations": {
|
||||
"message": "High requests error budget burn for job=apiserver (current value: {{ $value }})",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-errorbudgetburn"
|
||||
"description": "The API server is burning too much error budget.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorbudgetburn",
|
||||
"summary": "The API server is burning too much error budget."
|
||||
},
|
||||
"expr": "(\n status_class_5xx:apiserver_request_total:ratio_rate1h{job=\"apiserver\"} > (14.4*0.010000)\n and\n status_class_5xx:apiserver_request_total:ratio_rate5m{job=\"apiserver\"} > (14.4*0.010000)\n)\nor\n(\n status_class_5xx:apiserver_request_total:ratio_rate6h{job=\"apiserver\"} > (6*0.010000)\n and\n status_class_5xx:apiserver_request_total:ratio_rate30m{job=\"apiserver\"} > (6*0.010000)\n)\n",
|
||||
"expr": "sum(apiserver_request:burnrate1h) > (14.40 * 0.01000)\nand\nsum(apiserver_request:burnrate5m) > (14.40 * 0.01000)\n",
|
||||
"for": "2m",
|
||||
"labels": {
|
||||
"job": "apiserver",
|
||||
"severity": "critical"
|
||||
"long": "1h",
|
||||
"severity": "critical",
|
||||
"short": "5m"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "ErrorBudgetBurn",
|
||||
"alert": "KubeAPIErrorBudgetBurn",
|
||||
"annotations": {
|
||||
"message": "High requests error budget burn for job=apiserver (current value: {{ $value }})",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-errorbudgetburn"
|
||||
"description": "The API server is burning too much error budget.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorbudgetburn",
|
||||
"summary": "The API server is burning too much error budget."
|
||||
},
|
||||
"expr": "(\n status_class_5xx:apiserver_request_total:ratio_rate1d{job=\"apiserver\"} > (3*0.010000)\n and\n status_class_5xx:apiserver_request_total:ratio_rate2h{job=\"apiserver\"} > (3*0.010000)\n)\nor\n(\n status_class_5xx:apiserver_request_total:ratio_rate3d{job=\"apiserver\"} > (0.010000)\n and\n status_class_5xx:apiserver_request_total:ratio_rate6h{job=\"apiserver\"} > (0.010000)\n)\n",
|
||||
"expr": "sum(apiserver_request:burnrate6h) > (6.00 * 0.01000)\nand\nsum(apiserver_request:burnrate30m) > (6.00 * 0.01000)\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"job": "apiserver",
|
||||
"severity": "warning"
|
||||
"long": "6h",
|
||||
"severity": "critical",
|
||||
"short": "30m"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeAPIErrorBudgetBurn",
|
||||
"annotations": {
|
||||
"description": "The API server is burning too much error budget.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorbudgetburn",
|
||||
"summary": "The API server is burning too much error budget."
|
||||
},
|
||||
"expr": "sum(apiserver_request:burnrate1d) > (3.00 * 0.01000)\nand\nsum(apiserver_request:burnrate2h) > (3.00 * 0.01000)\n",
|
||||
"for": "1h",
|
||||
"labels": {
|
||||
"long": "1d",
|
||||
"severity": "warning",
|
||||
"short": "2h"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeAPIErrorBudgetBurn",
|
||||
"annotations": {
|
||||
"description": "The API server is burning too much error budget.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorbudgetburn",
|
||||
"summary": "The API server is burning too much error budget."
|
||||
},
|
||||
"expr": "sum(apiserver_request:burnrate3d) > (1.00 * 0.01000)\nand\nsum(apiserver_request:burnrate6h) > (1.00 * 0.01000)\n",
|
||||
"for": "3h",
|
||||
"labels": {
|
||||
"long": "3d",
|
||||
"severity": "warning",
|
||||
"short": "6h"
|
||||
}
|
||||
}
|
||||
]
|
||||
@ -831,59 +1091,12 @@ data:
|
||||
{
|
||||
"name": "kubernetes-system-apiserver",
|
||||
"rules": [
|
||||
{
|
||||
"alert": "KubeAPILatencyHigh",
|
||||
"annotations": {
|
||||
"message": "The API server has an abnormal latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapilatencyhigh"
|
||||
},
|
||||
"expr": "(\n cluster:apiserver_request_duration_seconds:mean5m{job=\"apiserver\"}\n >\n on (verb) group_left()\n (\n avg by (verb) (cluster:apiserver_request_duration_seconds:mean5m{job=\"apiserver\"} >= 0)\n +\n 2*stddev by (verb) (cluster:apiserver_request_duration_seconds:mean5m{job=\"apiserver\"} >= 0)\n )\n) > on (verb) group_left()\n1.2 * avg by (verb) (cluster:apiserver_request_duration_seconds:mean5m{job=\"apiserver\"} >= 0)\nand on (verb,resource)\ncluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job=\"apiserver\",quantile=\"0.99\"}\n>\n1\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeAPILatencyHigh",
|
||||
"annotations": {
|
||||
"message": "The API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapilatencyhigh"
|
||||
},
|
||||
"expr": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job=\"apiserver\",quantile=\"0.99\"} > 4\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeAPIErrorsHigh",
|
||||
"annotations": {
|
||||
"message": "API server is returning errors for {{ $value | humanizePercentage }} of requests for {{ $labels.verb }} {{ $labels.resource }} {{ $labels.subresource }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh"
|
||||
},
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\",code=~\"5..\"}[5m])) by (resource,subresource,verb)\n /\nsum(rate(apiserver_request_total{job=\"apiserver\"}[5m])) by (resource,subresource,verb) > 0.10\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeAPIErrorsHigh",
|
||||
"annotations": {
|
||||
"message": "API server is returning errors for {{ $value | humanizePercentage }} of requests for {{ $labels.verb }} {{ $labels.resource }} {{ $labels.subresource }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh"
|
||||
},
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\",code=~\"5..\"}[5m])) by (resource,subresource,verb)\n /\nsum(rate(apiserver_request_total{job=\"apiserver\"}[5m])) by (resource,subresource,verb) > 0.05\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeClientCertificateExpiration",
|
||||
"annotations": {
|
||||
"message": "A client certificate used to authenticate to the apiserver is expiring in less than 1.0 hours.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration"
|
||||
"description": "A client certificate used to authenticate to the apiserver is expiring in less than 1.0 hours.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration",
|
||||
"summary": "Client certificate is about to expire."
|
||||
},
|
||||
"expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 3600\n",
|
||||
"labels": {
|
||||
@ -893,8 +1106,9 @@ data:
|
||||
{
|
||||
"alert": "KubeClientCertificateExpiration",
|
||||
"annotations": {
|
||||
"message": "A client certificate used to authenticate to the apiserver is expiring in less than 0.1 hours.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration"
|
||||
"description": "A client certificate used to authenticate to the apiserver is expiring in less than 0.1 hours.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration",
|
||||
"summary": "Client certificate is about to expire."
|
||||
},
|
||||
"expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 300\n",
|
||||
"labels": {
|
||||
@ -904,8 +1118,9 @@ data:
|
||||
{
|
||||
"alert": "AggregatedAPIErrors",
|
||||
"annotations": {
|
||||
"message": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. The number of errors have increased for it in the past five minutes. High values indicate that the availability of the service changes too often.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapierrors"
|
||||
"description": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. The number of errors have increased for it in the past five minutes. High values indicate that the availability of the service changes too often.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapierrors",
|
||||
"summary": "An aggregated API has reported errors."
|
||||
},
|
||||
"expr": "sum by(name, namespace)(increase(aggregator_unavailable_apiservice_count[5m])) > 2\n",
|
||||
"labels": {
|
||||
@ -915,10 +1130,11 @@ data:
|
||||
{
|
||||
"alert": "AggregatedAPIDown",
|
||||
"annotations": {
|
||||
"message": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} is down. It has not been available at least for the past five minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapidown"
|
||||
"description": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has been only {{ $value | humanize }}% available over the last 10m.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapidown",
|
||||
"summary": "An aggregated API is down."
|
||||
},
|
||||
"expr": "sum by(name, namespace)(sum_over_time(aggregator_unavailable_apiservice[5m])) > 0\n",
|
||||
"expr": "(1 - max by(name, namespace)(avg_over_time(aggregator_unavailable_apiservice[10m]))) * 100 < 85\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -927,8 +1143,9 @@ data:
|
||||
{
|
||||
"alert": "KubeAPIDown",
|
||||
"annotations": {
|
||||
"message": "KubeAPI has disappeared from Prometheus target discovery.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapidown"
|
||||
"description": "KubeAPI has disappeared from Prometheus target discovery.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapidown",
|
||||
"summary": "Target disappeared from Prometheus target discovery."
|
||||
},
|
||||
"expr": "absent(up{job=\"apiserver\"} == 1)\n",
|
||||
"for": "15m",
|
||||
@ -944,8 +1161,9 @@ data:
|
||||
{
|
||||
"alert": "KubeNodeNotReady",
|
||||
"annotations": {
|
||||
"message": "{{ $labels.node }} has been unready for more than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodenotready"
|
||||
"description": "{{ $labels.node }} has been unready for more than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodenotready",
|
||||
"summary": "Node is not ready."
|
||||
},
|
||||
"expr": "kube_node_status_condition{job=\"kube-state-metrics\",condition=\"Ready\",status=\"true\"} == 0\n",
|
||||
"for": "15m",
|
||||
@ -956,11 +1174,12 @@ data:
|
||||
{
|
||||
"alert": "KubeNodeUnreachable",
|
||||
"annotations": {
|
||||
"message": "{{ $labels.node }} is unreachable and some workloads may be rescheduled.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodeunreachable"
|
||||
"description": "{{ $labels.node }} is unreachable and some workloads may be rescheduled.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodeunreachable",
|
||||
"summary": "Node is unreachable."
|
||||
},
|
||||
"expr": "kube_node_spec_taint{job=\"kube-state-metrics\",key=\"node.kubernetes.io/unreachable\",effect=\"NoSchedule\"} == 1\n",
|
||||
"for": "2m",
|
||||
"expr": "(kube_node_spec_taint{job=\"kube-state-metrics\",key=\"node.kubernetes.io/unreachable\",effect=\"NoSchedule\"} unless ignoring(key,value) kube_node_spec_taint{job=\"kube-state-metrics\",key=~\"ToBeDeletedByClusterAutoscaler|cloud.google.com/impending-node-termination|aws-node-termination-handler/spot-itn\"}) == 1\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
@ -968,10 +1187,11 @@ data:
|
||||
{
|
||||
"alert": "KubeletTooManyPods",
|
||||
"annotations": {
|
||||
"message": "Kubelet '{{ $labels.node }}' is running at {{ $value | humanizePercentage }} of its Pod capacity.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubelettoomanypods"
|
||||
"description": "Kubelet '{{ $labels.node }}' is running at {{ $value | humanizePercentage }} of its Pod capacity.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubelettoomanypods",
|
||||
"summary": "Kubelet is running at capacity."
|
||||
},
|
||||
"expr": "max(max(kubelet_running_pod_count{job=\"kubelet\"}) by(instance) * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\"}) by(node) / max(kube_node_status_capacity_pods{job=\"kube-state-metrics\"} != 1) by(node) > 0.95\n",
|
||||
"expr": "count by(node) (\n (kube_pod_status_phase{job=\"kube-state-metrics\",phase=\"Running\"} == 1) * on(instance,pod,namespace,cluster) group_left(node) topk by(instance,pod,namespace,cluster) (1, kube_pod_info{job=\"kube-state-metrics\"})\n)\n/\nmax by(node) (\n kube_node_status_capacity_pods{job=\"kube-state-metrics\"} != 1\n) > 0.95\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -980,8 +1200,9 @@ data:
|
||||
{
|
||||
"alert": "KubeNodeReadinessFlapping",
|
||||
"annotations": {
|
||||
"message": "The readiness status of node {{ $labels.node }} has changed {{ $value }} times in the last 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodereadinessflapping"
|
||||
"description": "The readiness status of node {{ $labels.node }} has changed {{ $value }} times in the last 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodereadinessflapping",
|
||||
"summary": "Node readiness status is flapping."
|
||||
},
|
||||
"expr": "sum(changes(kube_node_status_condition{status=\"true\",condition=\"Ready\"}[15m])) by (node) > 2\n",
|
||||
"for": "15m",
|
||||
@ -992,8 +1213,9 @@ data:
|
||||
{
|
||||
"alert": "KubeletPlegDurationHigh",
|
||||
"annotations": {
|
||||
"message": "The Kubelet Pod Lifecycle Event Generator has a 99th percentile duration of {{ $value }} seconds on node {{ $labels.node }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletplegdurationhigh"
|
||||
"description": "The Kubelet Pod Lifecycle Event Generator has a 99th percentile duration of {{ $value }} seconds on node {{ $labels.node }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletplegdurationhigh",
|
||||
"summary": "Kubelet Pod Lifecycle Event Generator is taking too long to relist."
|
||||
},
|
||||
"expr": "node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile{quantile=\"0.99\"} >= 10\n",
|
||||
"for": "5m",
|
||||
@ -1004,10 +1226,85 @@ data:
|
||||
{
|
||||
"alert": "KubeletPodStartUpLatencyHigh",
|
||||
"annotations": {
|
||||
"message": "Kubelet Pod startup 99th percentile latency is {{ $value }} seconds on node {{ $labels.node }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletpodstartuplatencyhigh"
|
||||
"description": "Kubelet Pod startup 99th percentile latency is {{ $value }} seconds on node {{ $labels.node }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletpodstartuplatencyhigh",
|
||||
"summary": "Kubelet Pod startup latency is too high."
|
||||
},
|
||||
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{job=\"kubelet\"}[5m])) by (instance, le)) * on(instance) group_left(node) kubelet_node_name > 60\n",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{job=\"kubelet\"}[5m])) by (instance, le)) * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\"} > 60\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeletClientCertificateExpiration",
|
||||
"annotations": {
|
||||
"description": "Client certificate for Kubelet on node {{ $labels.node }} expires in {{ $value | humanizeDuration }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletclientcertificateexpiration",
|
||||
"summary": "Kubelet client certificate is about to expire."
|
||||
},
|
||||
"expr": "kubelet_certificate_manager_client_ttl_seconds < 3600\n",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeletClientCertificateExpiration",
|
||||
"annotations": {
|
||||
"description": "Client certificate for Kubelet on node {{ $labels.node }} expires in {{ $value | humanizeDuration }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletclientcertificateexpiration",
|
||||
"summary": "Kubelet client certificate is about to expire."
|
||||
},
|
||||
"expr": "kubelet_certificate_manager_client_ttl_seconds < 300\n",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeletServerCertificateExpiration",
|
||||
"annotations": {
|
||||
"description": "Server certificate for Kubelet on node {{ $labels.node }} expires in {{ $value | humanizeDuration }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletservercertificateexpiration",
|
||||
"summary": "Kubelet server certificate is about to expire."
|
||||
},
|
||||
"expr": "kubelet_certificate_manager_server_ttl_seconds < 3600\n",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeletServerCertificateExpiration",
|
||||
"annotations": {
|
||||
"description": "Server certificate for Kubelet on node {{ $labels.node }} expires in {{ $value | humanizeDuration }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletservercertificateexpiration",
|
||||
"summary": "Kubelet server certificate is about to expire."
|
||||
},
|
||||
"expr": "kubelet_certificate_manager_server_ttl_seconds < 300\n",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeletClientCertificateRenewalErrors",
|
||||
"annotations": {
|
||||
"description": "Kubelet on node {{ $labels.node }} has failed to renew its client certificate ({{ $value | humanize }} errors in the last 5 minutes).",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletclientcertificaterenewalerrors",
|
||||
"summary": "Kubelet has failed to renew its client certificate."
|
||||
},
|
||||
"expr": "increase(kubelet_certificate_manager_client_expiration_renew_errors[5m]) > 0\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeletServerCertificateRenewalErrors",
|
||||
"annotations": {
|
||||
"description": "Kubelet on node {{ $labels.node }} has failed to renew its server certificate ({{ $value | humanize }} errors in the last 5 minutes).",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletservercertificaterenewalerrors",
|
||||
"summary": "Kubelet has failed to renew its server certificate."
|
||||
},
|
||||
"expr": "increase(kubelet_server_expiration_renew_errors[5m]) > 0\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -1016,8 +1313,9 @@ data:
|
||||
{
|
||||
"alert": "KubeletDown",
|
||||
"annotations": {
|
||||
"message": "Kubelet has disappeared from Prometheus target discovery.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletdown"
|
||||
"description": "Kubelet has disappeared from Prometheus target discovery.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletdown",
|
||||
"summary": "Target disappeared from Prometheus target discovery."
|
||||
},
|
||||
"expr": "absent(up{job=\"kubelet\"} == 1)\n",
|
||||
"for": "15m",
|
||||
@ -1033,8 +1331,9 @@ data:
|
||||
{
|
||||
"alert": "KubeSchedulerDown",
|
||||
"annotations": {
|
||||
"message": "KubeScheduler has disappeared from Prometheus target discovery.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeschedulerdown"
|
||||
"description": "KubeScheduler has disappeared from Prometheus target discovery.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeschedulerdown",
|
||||
"summary": "Target disappeared from Prometheus target discovery."
|
||||
},
|
||||
"expr": "absent(up{job=\"kube-scheduler\"} == 1)\n",
|
||||
"for": "15m",
|
||||
@ -1050,8 +1349,9 @@ data:
|
||||
{
|
||||
"alert": "KubeControllerManagerDown",
|
||||
"annotations": {
|
||||
"message": "KubeControllerManager has disappeared from Prometheus target discovery.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontrollermanagerdown"
|
||||
"description": "KubeControllerManager has disappeared from Prometheus target discovery.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontrollermanagerdown",
|
||||
"summary": "Target disappeared from Prometheus target discovery."
|
||||
},
|
||||
"expr": "absent(up{job=\"kube-controller-manager\"} == 1)\n",
|
||||
"for": "15m",
|
||||
@ -1350,14 +1650,25 @@ data:
|
||||
{
|
||||
"alert": "NodeHighNumberConntrackEntriesUsed",
|
||||
"annotations": {
|
||||
"description": "{{ $value | humanizePercentage }} of conntrack entries are used",
|
||||
"summary": "Number of conntrack are getting close to the limit"
|
||||
"description": "{{ $value | humanizePercentage }} of conntrack entries are used.",
|
||||
"summary": "Number of conntrack are getting close to the limit."
|
||||
},
|
||||
"expr": "(node_nf_conntrack_entries / node_nf_conntrack_entries_limit) > 0.75\n",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "NodeTextFileCollectorScrapeError",
|
||||
"annotations": {
|
||||
"description": "Node Exporter text file collector failed to scrape.",
|
||||
"summary": "Node Exporter text file collector failed to scrape."
|
||||
},
|
||||
"expr": "node_textfile_scrape_error{job=\"node-exporter\"} == 1\n",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "NodeClockSkewDetected",
|
||||
"annotations": {
|
||||
@ -1381,6 +1692,29 @@ data:
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "NodeRAIDDegraded",
|
||||
"annotations": {
|
||||
"description": "RAID array '{{ $labels.device }}' on {{ $labels.instance }} is in degraded state due to one or more disks failures. Number of spare drives is insufficient to fix issue automatically.",
|
||||
"summary": "RAID Array is degraded"
|
||||
},
|
||||
"expr": "node_md_disks_required - ignoring (state) (node_md_disks{state=\"active\"}) > 0\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "NodeRAIDDiskFailure",
|
||||
"annotations": {
|
||||
"description": "At least one device in RAID array on {{ $labels.instance }} failed. Array '{{ $labels.device }}' needs attention and possibly a disk swap.",
|
||||
"summary": "Failed device in RAID array"
|
||||
},
|
||||
"expr": "node_md_disks{state=\"fail\"} > 0\n",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
@ -1515,7 +1849,7 @@ data:
|
||||
{
|
||||
"alert": "PrometheusRemoteStorageFailures",
|
||||
"annotations": {
|
||||
"description": "Prometheus {{$labels.instance}} failed to send {{ printf \"%.1f\" $value }}% of the samples to {{ if $labels.queue }}{{ $labels.queue }}{{ else }}{{ $labels.url }}{{ end }}.",
|
||||
"description": "Prometheus {{$labels.instance}} failed to send {{ printf \"%.1f\" $value }}% of the samples to {{ $labels.remote_name}}:{{ $labels.url }}",
|
||||
"summary": "Prometheus fails to send samples to remote storage."
|
||||
},
|
||||
"expr": "(\n rate(prometheus_remote_storage_failed_samples_total{job=\"prometheus\"}[5m])\n/\n (\n rate(prometheus_remote_storage_failed_samples_total{job=\"prometheus\"}[5m])\n +\n rate(prometheus_remote_storage_succeeded_samples_total{job=\"prometheus\"}[5m])\n )\n)\n* 100\n> 1\n",
|
||||
@ -1527,7 +1861,7 @@ data:
|
||||
{
|
||||
"alert": "PrometheusRemoteWriteBehind",
|
||||
"annotations": {
|
||||
"description": "Prometheus {{$labels.instance}} remote write is {{ printf \"%.1f\" $value }}s behind for {{ if $labels.queue }}{{ $labels.queue }}{{ else }}{{ $labels.url }}{{ end }}.",
|
||||
"description": "Prometheus {{$labels.instance}} remote write is {{ printf \"%.1f\" $value }}s behind for {{ $labels.remote_name}}:{{ $labels.url }}.",
|
||||
"summary": "Prometheus remote write is behind."
|
||||
},
|
||||
"expr": "# Without max_over_time, failed scrapes could create false negatives, see\n# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.\n(\n max_over_time(prometheus_remote_storage_highest_timestamp_in_seconds{job=\"prometheus\"}[5m])\n- on(job, instance) group_right\n max_over_time(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{job=\"prometheus\"}[5m])\n)\n> 120\n",
|
||||
@ -1539,7 +1873,7 @@ data:
|
||||
{
|
||||
"alert": "PrometheusRemoteWriteDesiredShards",
|
||||
"annotations": {
|
||||
"description": "Prometheus {{$labels.instance}} remote write desired shards calculation wants to run {{ $value }} shards, which is more than the max of {{ printf `prometheus_remote_storage_shards_max{instance=\"%s\",job=\"prometheus\"}` $labels.instance | query | first | value }}.",
|
||||
"description": "Prometheus {{$labels.instance}} remote write desired shards calculation wants to run {{ $value }} shards for queue {{ $labels.remote_name}}:{{ $labels.url }}, which is more than the max of {{ printf `prometheus_remote_storage_shards_max{instance=\"%s\",job=\"prometheus\"}` $labels.instance | query | first | value }}.",
|
||||
"summary": "Prometheus remote write desired shards calculation wants to run more than configured max shards."
|
||||
},
|
||||
"expr": "# Without max_over_time, failed scrapes could create false negatives, see\n# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.\n(\n max_over_time(prometheus_remote_storage_shards_desired{job=\"prometheus\"}[5m])\n>\n max_over_time(prometheus_remote_storage_shards_max{job=\"prometheus\"}[5m])\n)\n",
|
||||
@ -1571,6 +1905,18 @@ data:
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "PrometheusTargetLimitHit",
|
||||
"annotations": {
|
||||
"description": "Prometheus {{$labels.instance}} has dropped {{ printf \"%.0f\" $value }} targets because the number of targets exceeded the configured target_limit.",
|
||||
"summary": "Prometheus has dropped targets because some scrape configs have exceeded the targets limit."
|
||||
},
|
||||
"expr": "increase(prometheus_target_scrape_pool_exceeded_target_limit_total{job=\"prometheus\"}[5m]) > 0\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
|
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=5a7c963caf59740891df2aeae4b1561ccb3b9db6"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -7,7 +7,7 @@ systemd:
|
||||
- name: 40-etcd-cluster.conf
|
||||
contents: |
|
||||
[Service]
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.9"
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.12"
|
||||
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
|
||||
Environment="RKT_RUN_ARGS=--insecure-options=image"
|
||||
Environment="ETCD_NAME=${etcd_name}"
|
||||
@ -52,7 +52,7 @@ systemd:
|
||||
Description=Kubelet
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
@ -134,7 +134,7 @@ systemd:
|
||||
--volume script,kind=host,source=/opt/bootstrap/apply \
|
||||
--mount volume=script,target=/apply \
|
||||
--insecure-options=image \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.5 \
|
||||
docker://quay.io/poseidon/kubelet:v1.19.2 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/apply
|
||||
@ -142,6 +142,11 @@ systemd:
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
directories:
|
||||
- path: /var/lib/etcd
|
||||
filesystem: root
|
||||
mode: 0700
|
||||
overwrite: true
|
||||
files:
|
||||
- path: /etc/kubernetes/kubeconfig
|
||||
filesystem: root
|
||||
@ -163,6 +168,7 @@ storage:
|
||||
mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
|
||||
chown -R etcd:etcd /etc/ssl/etcd
|
||||
chmod -R 500 /etc/ssl/etcd
|
||||
chmod -R 700 /var/lib/etcd
|
||||
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
|
||||
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
|
||||
mkdir -p /etc/kubernetes/manifests
|
||||
|
@ -1,11 +1,15 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.12.6"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
aws = "~> 2.23"
|
||||
ct = "~> 0.4"
|
||||
aws = ">= 2.23, <= 4.0"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -25,7 +25,7 @@ systemd:
|
||||
Description=Kubelet
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
@ -129,7 +129,7 @@ storage:
|
||||
--volume config,kind=host,source=/etc/kubernetes \
|
||||
--mount volume=config,target=/etc/kubernetes \
|
||||
--insecure-options=image \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.5 \
|
||||
docker://quay.io/poseidon/kubelet:v1.19.2 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
||||
|
@ -1,4 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = ">= 0.12"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
aws = ">= 2.23, <= 4.0"
|
||||
template = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=5a7c963caf59740891df2aeae4b1561ccb3b9db6"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.0.0
|
||||
version: 1.1.0
|
||||
systemd:
|
||||
units:
|
||||
- name: etcd-member.service
|
||||
@ -28,7 +28,7 @@ systemd:
|
||||
--network host \
|
||||
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
|
||||
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
|
||||
quay.io/coreos/etcd:v3.4.9
|
||||
quay.io/coreos/etcd:v3.4.12
|
||||
ExecStop=/usr/bin/podman stop etcd
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
@ -55,7 +55,7 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
@ -124,11 +124,13 @@ systemd:
|
||||
--volume /opt/bootstrap/assets:/assets:ro,Z \
|
||||
--volume /opt/bootstrap/apply:/apply:ro,Z \
|
||||
--entrypoint=/apply \
|
||||
quay.io/poseidon/kubelet:v1.18.5
|
||||
quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
|
||||
ExecStartPost=-/usr/bin/podman stop bootstrap
|
||||
storage:
|
||||
directories:
|
||||
- path: /var/lib/etcd
|
||||
mode: 0700
|
||||
- path: /etc/kubernetes
|
||||
- path: /opt/bootstrap
|
||||
files:
|
||||
@ -158,6 +160,7 @@ storage:
|
||||
mv manifests /opt/bootstrap/assets/manifests
|
||||
mv manifests-networking/* /opt/bootstrap/assets/manifests/
|
||||
rm -rf assets auth static-manifests tls manifests-networking
|
||||
chcon -R -u system_u -t container_file_t /etc/kubernetes/bootstrap-secrets
|
||||
- path: /opt/bootstrap/apply
|
||||
mode: 0544
|
||||
contents:
|
||||
@ -181,6 +184,13 @@ storage:
|
||||
inline: |
|
||||
net.ipv4.conf.default.rp_filter=0
|
||||
net.ipv4.conf.*.rp_filter=0
|
||||
- path: /etc/systemd/network/50-flannel.link
|
||||
contents:
|
||||
inline: |
|
||||
[Match]
|
||||
OriginalName=flannel*
|
||||
[Link]
|
||||
MACAddressPolicy=none
|
||||
- path: /etc/systemd/system.conf.d/accounting.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -1,11 +1,15 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.12.6"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
aws = "~> 2.23"
|
||||
ct = "~> 0.4"
|
||||
aws = ">= 2.23, <= 4.0"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.0.0
|
||||
version: 1.1.0
|
||||
systemd:
|
||||
units:
|
||||
- name: docker.service
|
||||
@ -25,7 +25,7 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
@ -89,7 +89,7 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
ExecStart=/bin/true
|
||||
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.5 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
|
||||
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.19.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
@ -110,6 +110,13 @@ storage:
|
||||
inline: |
|
||||
net.ipv4.conf.default.rp_filter=0
|
||||
net.ipv4.conf.*.rp_filter=0
|
||||
- path: /etc/systemd/network/50-flannel.link
|
||||
contents:
|
||||
inline: |
|
||||
[Match]
|
||||
OriginalName=flannel*
|
||||
[Link]
|
||||
MACAddressPolicy=none
|
||||
- path: /etc/systemd/system.conf.d/accounting.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -1,4 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = ">= 0.12"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
aws = ">= 2.23, <= 4.0"
|
||||
template = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/cl/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=5a7c963caf59740891df2aeae4b1561ccb3b9db6"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -7,7 +7,7 @@ systemd:
|
||||
- name: 40-etcd-cluster.conf
|
||||
contents: |
|
||||
[Service]
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.9"
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.12"
|
||||
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
|
||||
Environment="RKT_RUN_ARGS=--insecure-options=image"
|
||||
Environment="ETCD_NAME=${etcd_name}"
|
||||
@ -52,7 +52,7 @@ systemd:
|
||||
Description=Kubelet
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
@ -134,7 +134,7 @@ systemd:
|
||||
--volume script,kind=host,source=/opt/bootstrap/apply \
|
||||
--mount volume=script,target=/apply \
|
||||
--insecure-options=image \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.5 \
|
||||
docker://quay.io/poseidon/kubelet:v1.19.2 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/apply
|
||||
@ -142,6 +142,11 @@ systemd:
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
directories:
|
||||
- path: /var/lib/etcd
|
||||
filesystem: root
|
||||
mode: 0700
|
||||
overwrite: true
|
||||
files:
|
||||
- path: /etc/kubernetes/kubeconfig
|
||||
filesystem: root
|
||||
@ -163,6 +168,7 @@ storage:
|
||||
mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
|
||||
chown -R etcd:etcd /etc/ssl/etcd
|
||||
chmod -R 500 /etc/ssl/etcd
|
||||
chmod -R 700 /var/lib/etcd
|
||||
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
|
||||
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
|
||||
mkdir -p /etc/kubernetes/manifests
|
||||
|
@ -1,12 +1,16 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.12.6"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
azurerm = "~> 2.8"
|
||||
ct = "~> 0.4"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -25,7 +25,7 @@ systemd:
|
||||
Description=Kubelet
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
@ -129,7 +129,7 @@ storage:
|
||||
--volume config,kind=host,source=/etc/kubernetes \
|
||||
--mount volume=config,target=/etc/kubernetes \
|
||||
--insecure-options=image \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.5 \
|
||||
docker://quay.io/poseidon/kubelet:v1.19.2 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname | tr '[:upper:]' '[:lower:]')
|
||||
|
@ -1,4 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = ">= 0.12"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
azurerm = "~> 2.8"
|
||||
template = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot priority](https://typhoon.psdn.io/fedora-coreos/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=5a7c963caf59740891df2aeae4b1561ccb3b9db6"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.0.0
|
||||
version: 1.1.0
|
||||
systemd:
|
||||
units:
|
||||
- name: etcd-member.service
|
||||
@ -28,7 +28,7 @@ systemd:
|
||||
--network host \
|
||||
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
|
||||
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
|
||||
quay.io/coreos/etcd:v3.4.9
|
||||
quay.io/coreos/etcd:v3.4.12
|
||||
ExecStop=/usr/bin/podman stop etcd
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
@ -54,7 +54,7 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
@ -123,11 +123,13 @@ systemd:
|
||||
--volume /opt/bootstrap/assets:/assets:ro,Z \
|
||||
--volume /opt/bootstrap/apply:/apply:ro,Z \
|
||||
--entrypoint=/apply \
|
||||
quay.io/poseidon/kubelet:v1.18.5
|
||||
quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
|
||||
ExecStartPost=-/usr/bin/podman stop bootstrap
|
||||
storage:
|
||||
directories:
|
||||
- path: /var/lib/etcd
|
||||
mode: 0700
|
||||
- path: /etc/kubernetes
|
||||
- path: /opt/bootstrap
|
||||
files:
|
||||
@ -157,6 +159,7 @@ storage:
|
||||
mv manifests /opt/bootstrap/assets/manifests
|
||||
mv manifests-networking/* /opt/bootstrap/assets/manifests/
|
||||
rm -rf assets auth static-manifests tls manifests-networking
|
||||
chcon -R -u system_u -t container_file_t /etc/kubernetes/bootstrap-secrets
|
||||
- path: /opt/bootstrap/apply
|
||||
mode: 0544
|
||||
contents:
|
||||
@ -180,6 +183,13 @@ storage:
|
||||
inline: |
|
||||
net.ipv4.conf.default.rp_filter=0
|
||||
net.ipv4.conf.*.rp_filter=0
|
||||
- path: /etc/systemd/network/50-flannel.link
|
||||
contents:
|
||||
inline: |
|
||||
[Match]
|
||||
OriginalName=flannel*
|
||||
[Link]
|
||||
MACAddressPolicy=none
|
||||
- path: /etc/systemd/system.conf.d/accounting.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -1,12 +1,16 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.12.6"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
azurerm = "~> 2.8"
|
||||
ct = "~> 0.4"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.0.0
|
||||
version: 1.1.0
|
||||
systemd:
|
||||
units:
|
||||
- name: docker.service
|
||||
@ -24,7 +24,7 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
@ -88,7 +88,7 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
ExecStart=/bin/true
|
||||
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.5 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
|
||||
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.19.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
@ -109,6 +109,13 @@ storage:
|
||||
inline: |
|
||||
net.ipv4.conf.default.rp_filter=0
|
||||
net.ipv4.conf.*.rp_filter=0
|
||||
- path: /etc/systemd/network/50-flannel.link
|
||||
contents:
|
||||
inline: |
|
||||
[Match]
|
||||
OriginalName=flannel*
|
||||
[Link]
|
||||
MACAddressPolicy=none
|
||||
- path: /etc/systemd/system.conf.d/accounting.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -1,4 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = ">= 0.12"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
azurerm = "~> 2.8"
|
||||
template = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=5a7c963caf59740891df2aeae4b1561ccb3b9db6"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [var.k8s_domain_name]
|
||||
|
@ -7,7 +7,7 @@ systemd:
|
||||
- name: 40-etcd-cluster.conf
|
||||
contents: |
|
||||
[Service]
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.9"
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.12"
|
||||
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
|
||||
Environment="RKT_RUN_ARGS=--insecure-options=image"
|
||||
Environment="ETCD_NAME=${etcd_name}"
|
||||
@ -60,7 +60,7 @@ systemd:
|
||||
Description=Kubelet
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
@ -147,7 +147,7 @@ systemd:
|
||||
--volume script,kind=host,source=/opt/bootstrap/apply \
|
||||
--mount volume=script,target=/apply \
|
||||
--insecure-options=image \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.5 \
|
||||
docker://quay.io/poseidon/kubelet:v1.19.2 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/apply
|
||||
@ -156,6 +156,10 @@ systemd:
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
directories:
|
||||
- path: /var/lib/etcd
|
||||
filesystem: root
|
||||
mode: 0700
|
||||
overwrite: true
|
||||
- path: /etc/kubernetes
|
||||
filesystem: root
|
||||
mode: 0755
|
||||
@ -180,6 +184,7 @@ storage:
|
||||
mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
|
||||
chown -R etcd:etcd /etc/ssl/etcd
|
||||
chmod -R 500 /etc/ssl/etcd
|
||||
chmod -R 700 /var/lib/etcd
|
||||
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
|
||||
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
|
||||
mkdir -p /etc/kubernetes/manifests
|
||||
|
@ -33,7 +33,7 @@ systemd:
|
||||
Description=Kubelet
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
|
@ -1,12 +1,20 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.12.6"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
matchbox = "~> 0.3.0"
|
||||
ct = "~> 0.4"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
|
||||
matchbox = {
|
||||
source = "poseidon/matchbox"
|
||||
version = "~> 0.4.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
|
||||
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=5a7c963caf59740891df2aeae4b1561ccb3b9db6"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [var.k8s_domain_name]
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.0.0
|
||||
version: 1.1.0
|
||||
systemd:
|
||||
units:
|
||||
- name: etcd-member.service
|
||||
@ -28,7 +28,7 @@ systemd:
|
||||
--network host \
|
||||
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
|
||||
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
|
||||
quay.io/coreos/etcd:v3.4.9
|
||||
quay.io/coreos/etcd:v3.4.12
|
||||
ExecStop=/usr/bin/podman stop etcd
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
@ -53,7 +53,7 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
@ -134,11 +134,13 @@ systemd:
|
||||
--volume /opt/bootstrap/assets:/assets:ro,Z \
|
||||
--volume /opt/bootstrap/apply:/apply:ro,Z \
|
||||
--entrypoint=/apply \
|
||||
quay.io/poseidon/kubelet:v1.18.5
|
||||
quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
|
||||
ExecStartPost=-/usr/bin/podman stop bootstrap
|
||||
storage:
|
||||
directories:
|
||||
- path: /var/lib/etcd
|
||||
mode: 0700
|
||||
- path: /etc/kubernetes
|
||||
- path: /opt/bootstrap
|
||||
files:
|
||||
@ -168,6 +170,7 @@ storage:
|
||||
mv manifests /opt/bootstrap/assets/manifests
|
||||
mv manifests-networking/* /opt/bootstrap/assets/manifests/
|
||||
rm -rf assets auth static-manifests tls manifests-networking
|
||||
chcon -R -u system_u -t container_file_t /etc/kubernetes/bootstrap-secrets
|
||||
- path: /opt/bootstrap/apply
|
||||
mode: 0544
|
||||
contents:
|
||||
@ -191,6 +194,13 @@ storage:
|
||||
inline: |
|
||||
net.ipv4.conf.default.rp_filter=0
|
||||
net.ipv4.conf.*.rp_filter=0
|
||||
- path: /etc/systemd/network/50-flannel.link
|
||||
contents:
|
||||
inline: |
|
||||
[Match]
|
||||
OriginalName=flannel*
|
||||
[Link]
|
||||
MACAddressPolicy=none
|
||||
- path: /etc/systemd/system.conf.d/accounting.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.0.0
|
||||
version: 1.1.0
|
||||
systemd:
|
||||
units:
|
||||
- name: docker.service
|
||||
@ -23,7 +23,7 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
@ -111,6 +111,13 @@ storage:
|
||||
inline: |
|
||||
net.ipv4.conf.default.rp_filter=0
|
||||
net.ipv4.conf.*.rp_filter=0
|
||||
- path: /etc/systemd/network/50-flannel.link
|
||||
contents:
|
||||
inline: |
|
||||
[Match]
|
||||
OriginalName=flannel*
|
||||
[Link]
|
||||
MACAddressPolicy=none
|
||||
- path: /etc/systemd/system.conf.d/accounting.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -1,11 +1,19 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.12.6"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
matchbox = "~> 0.3.0"
|
||||
ct = "~> 0.4"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
|
||||
matchbox = {
|
||||
source = "poseidon/matchbox"
|
||||
version = "~> 0.4.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, CSI, and other [addons](https://typhoon.psdn.io/addons/overview/)
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=5a7c963caf59740891df2aeae4b1561ccb3b9db6"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -7,7 +7,7 @@ systemd:
|
||||
- name: 40-etcd-cluster.conf
|
||||
contents: |
|
||||
[Service]
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.9"
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.12"
|
||||
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
|
||||
Environment="RKT_RUN_ARGS=--insecure-options=image"
|
||||
Environment="ETCD_NAME=${etcd_name}"
|
||||
@ -62,7 +62,7 @@ systemd:
|
||||
After=coreos-metadata.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
|
||||
EnvironmentFile=/run/metadata/coreos
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
@ -144,7 +144,7 @@ systemd:
|
||||
--volume script,kind=host,source=/opt/bootstrap/apply \
|
||||
--mount volume=script,target=/apply \
|
||||
--insecure-options=image \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.5 \
|
||||
docker://quay.io/poseidon/kubelet:v1.19.2 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/apply
|
||||
@ -153,6 +153,10 @@ systemd:
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
directories:
|
||||
- path: /var/lib/etcd
|
||||
filesystem: root
|
||||
mode: 0700
|
||||
overwrite: true
|
||||
- path: /etc/kubernetes
|
||||
filesystem: root
|
||||
mode: 0755
|
||||
@ -171,6 +175,7 @@ storage:
|
||||
mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
|
||||
chown -R etcd:etcd /etc/ssl/etcd
|
||||
chmod -R 500 /etc/ssl/etcd
|
||||
chmod -R 700 /var/lib/etcd
|
||||
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
|
||||
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
|
||||
mkdir -p /etc/kubernetes/manifests
|
||||
|
@ -35,7 +35,7 @@ systemd:
|
||||
After=coreos-metadata.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
|
||||
EnvironmentFile=/run/metadata/coreos
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
@ -134,7 +134,7 @@ storage:
|
||||
--volume config,kind=host,source=/etc/kubernetes \
|
||||
--mount volume=config,target=/etc/kubernetes \
|
||||
--insecure-options=image \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.5 \
|
||||
docker://quay.io/poseidon/kubelet:v1.19.2 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
||||
|
@ -1,12 +1,20 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.12.6"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
digitalocean = "~> 1.16"
|
||||
ct = "~> 0.4"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
|
||||
digitalocean = {
|
||||
source = "digitalocean/digitalocean"
|
||||
version = "~> 1.20"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
|
||||
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/) customization
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=5a7c963caf59740891df2aeae4b1561ccb3b9db6"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.0.0
|
||||
version: 1.1.0
|
||||
systemd:
|
||||
units:
|
||||
- name: etcd-member.service
|
||||
@ -28,7 +28,7 @@ systemd:
|
||||
--network host \
|
||||
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
|
||||
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
|
||||
quay.io/coreos/etcd:v3.4.9
|
||||
quay.io/coreos/etcd:v3.4.12
|
||||
ExecStop=/usr/bin/podman stop etcd
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
@ -55,7 +55,7 @@ systemd:
|
||||
After=afterburn.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
|
||||
EnvironmentFile=/run/metadata/afterburn
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
@ -135,11 +135,13 @@ systemd:
|
||||
--volume /opt/bootstrap/assets:/assets:ro,Z \
|
||||
--volume /opt/bootstrap/apply:/apply:ro,Z \
|
||||
--entrypoint=/apply \
|
||||
quay.io/poseidon/kubelet:v1.18.5
|
||||
quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
|
||||
ExecStartPost=-/usr/bin/podman stop bootstrap
|
||||
storage:
|
||||
directories:
|
||||
- path: /var/lib/etcd
|
||||
mode: 0700
|
||||
- path: /etc/kubernetes
|
||||
- path: /opt/bootstrap
|
||||
files:
|
||||
@ -164,6 +166,7 @@ storage:
|
||||
mv manifests /opt/bootstrap/assets/manifests
|
||||
mv manifests-networking/* /opt/bootstrap/assets/manifests/
|
||||
rm -rf assets auth static-manifests tls manifests-networking
|
||||
chcon -R -u system_u -t container_file_t /etc/kubernetes/bootstrap-secrets
|
||||
- path: /opt/bootstrap/apply
|
||||
mode: 0544
|
||||
contents:
|
||||
@ -187,6 +190,13 @@ storage:
|
||||
inline: |
|
||||
net.ipv4.conf.default.rp_filter=0
|
||||
net.ipv4.conf.*.rp_filter=0
|
||||
- path: /etc/systemd/network/50-flannel.link
|
||||
contents:
|
||||
inline: |
|
||||
[Match]
|
||||
OriginalName=flannel*
|
||||
[Link]
|
||||
MACAddressPolicy=none
|
||||
- path: /etc/systemd/system.conf.d/accounting.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.0.0
|
||||
version: 1.1.0
|
||||
systemd:
|
||||
units:
|
||||
- name: docker.service
|
||||
@ -26,7 +26,7 @@ systemd:
|
||||
After=afterburn.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
|
||||
EnvironmentFile=/run/metadata/afterburn
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
@ -98,7 +98,7 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
ExecStart=/bin/true
|
||||
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.5 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
|
||||
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.19.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
@ -114,6 +114,13 @@ storage:
|
||||
inline: |
|
||||
net.ipv4.conf.default.rp_filter=0
|
||||
net.ipv4.conf.*.rp_filter=0
|
||||
- path: /etc/systemd/network/50-flannel.link
|
||||
contents:
|
||||
inline: |
|
||||
[Match]
|
||||
OriginalName=flannel*
|
||||
[Link]
|
||||
MACAddressPolicy=none
|
||||
- path: /etc/systemd/system.conf.d/accounting.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -1,12 +1,20 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.12.6"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
digitalocean = "~> 1.16"
|
||||
ct = "~> 0.4"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
|
||||
digitalocean = {
|
||||
source = "digitalocean/digitalocean"
|
||||
version = "~> 1.20"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
39
docs/addons/fleetlock.md
Normal file
39
docs/addons/fleetlock.md
Normal file
@ -0,0 +1,39 @@
|
||||
## fleetlock
|
||||
|
||||
[fleetlock](https://github.com/poseidon/fleetlock) is a reboot coordinator for Fedora CoreOS nodes. It implements the [FleetLock](https://github.com/coreos/airlock/pull/1/files) protocol for use as a [Zincati](https://github.com/coreos/zincati) lock [strategy](https://github.com/coreos/zincati/blob/master/docs/usage/updates-strategy.md) backend.
|
||||
|
||||
Declare a Zincati `fleet_lock` strategy when provisioning Fedora CoreOS nodes via [snippets](/advanced/customization/#hosts).
|
||||
|
||||
```yaml
|
||||
variant: fcos
|
||||
version: 1.1.0
|
||||
storage:
|
||||
files:
|
||||
- path: /etc/zincati/config.d/55-update-strategy.toml
|
||||
contents:
|
||||
inline: |
|
||||
[updates]
|
||||
strategy = "fleet_lock"
|
||||
[updates.fleet_lock]
|
||||
base_url = "http://10.3.0.15/"
|
||||
```
|
||||
|
||||
```tf
|
||||
module "nemo" {
|
||||
...
|
||||
controller_snippets = [
|
||||
file("./snippets/zincati-strategy.yaml"),
|
||||
]
|
||||
worker_snippets = [
|
||||
file("./snippets/zincati-strategy.yaml"),
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Apply fleetlock based on the example manifests.
|
||||
|
||||
```sh
|
||||
git clone git@github.com:poseidon/fleetlock.git
|
||||
kubectl apply -f examples/k8s
|
||||
```
|
||||
|
@ -1,8 +1,9 @@
|
||||
# Addons
|
||||
|
||||
Every Typhoon cluster is verified to work well with several post-install addons.
|
||||
Typhoon clusters are verified to work well with several post-install addons.
|
||||
|
||||
* Nginx [Ingress Controller](ingress.md)
|
||||
* [Prometheus](prometheus.md)
|
||||
* [Grafana](grafana.md)
|
||||
* [fleetlock](fleetlock.md)
|
||||
|
||||
|
@ -37,7 +37,7 @@ For example, ensure an `/opt/hello` file is created with permissions 0644.
|
||||
```yaml
|
||||
# custom-files
|
||||
variant: fcos
|
||||
version: 1.0.0
|
||||
version: 1.1.0
|
||||
storage:
|
||||
files:
|
||||
- path: /opt/hello
|
||||
@ -83,7 +83,7 @@ module "mercury" {
|
||||
}
|
||||
```
|
||||
|
||||
### Container Linux
|
||||
### Flatcar Linux
|
||||
|
||||
Define a Container Linux Config (CLC) ([config](https://github.com/coreos/container-linux-config-transpiler/blob/master/doc/configuration.md), [examples](https://github.com/coreos/container-linux-config-transpiler/blob/master/doc/examples.md)) in version control near your Terraform workspace directory (e.g. perhaps in a `snippets` subdirectory). You may organize snippets into multiple files, if desired.
|
||||
|
||||
@ -125,7 +125,7 @@ systemd:
|
||||
Environment="ETCD_LOG_PACKAGE_LEVELS=etcdserver=WARNING,security=DEBUG"
|
||||
```
|
||||
|
||||
Reference the CLC contents by location (e.g. `file("./custom-units.yaml")`). On [AWS](/cl/aws/#cluster), [Azure](/cl/azure/#cluster), [DigitalOcean](/cl/digital-ocean/#cluster), or [Google Cloud](/cl/google-cloud/#cluster) extend the `controller_snippets` or `worker_snippets` list variables.
|
||||
Reference the CLC contents by location (e.g. `file("./custom-units.yaml")`). On [AWS](/flatcar-linux/aws/#cluster), [Azure](/flatcar-linux/azure/#cluster), [DigitalOcean](/flatcar-linux/digital-ocean/#cluster), or [Google Cloud](/flatcar-linux/google-cloud/#cluster) extend the `controller_snippets` or `worker_snippets` list variables.
|
||||
|
||||
```tf
|
||||
module "nemo" {
|
||||
@ -145,7 +145,7 @@ module "nemo" {
|
||||
}
|
||||
```
|
||||
|
||||
On [Bare-Metal](/cl/bare-metal/#cluster), different CLCs may be used for each node (since hardware may be heterogeneous). Extend the `snippets` map variable by mapping a controller or worker name key to a list of snippets.
|
||||
On [Bare-Metal](/flatcar-linux/bare-metal/#cluster), different CLCs may be used for each node (since hardware may be heterogeneous). Extend the `snippets` map variable by mapping a controller or worker name key to a list of snippets.
|
||||
|
||||
```tf
|
||||
module "mercury" {
|
||||
@ -183,7 +183,7 @@ To set an alternative Kubelet image, use a snippet to set a systemd dropin.
|
||||
```
|
||||
# host-image-override.yaml
|
||||
variant: fcos <- remove for Flatcar Linux
|
||||
version: 1.0.0 <- remove for Flatcar Linux
|
||||
version: 1.1.0 <- remove for Flatcar Linux
|
||||
systemd:
|
||||
units:
|
||||
- name: kubelet.service
|
||||
|
@ -15,26 +15,51 @@ Internal Terraform Modules:
|
||||
|
||||
Create a cluster following the AWS [tutorial](../flatcar-linux/aws.md#cluster). Define a worker pool using the AWS internal `workers` module.
|
||||
|
||||
```tf
|
||||
module "tempest-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.14.3"
|
||||
=== "Fedora CoreOS"
|
||||
|
||||
# AWS
|
||||
vpc_id = module.tempest.vpc_id
|
||||
subnet_ids = module.tempest.subnet_ids
|
||||
security_groups = module.tempest.worker_security_groups
|
||||
```tf
|
||||
module "tempest-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.19.2"
|
||||
|
||||
# configuration
|
||||
name = "tempest-pool"
|
||||
kubeconfig = module.tempest.kubeconfig
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
# AWS
|
||||
vpc_id = module.tempest.vpc_id
|
||||
subnet_ids = module.tempest.subnet_ids
|
||||
security_groups = module.tempest.worker_security_groups
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
instance_type = "m5.large"
|
||||
os_image = "flatcar-beta"
|
||||
}
|
||||
```
|
||||
# configuration
|
||||
name = "tempest-pool"
|
||||
kubeconfig = module.tempest.kubeconfig
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
instance_type = "m5.large"
|
||||
os_stream = "next"
|
||||
}
|
||||
```
|
||||
|
||||
=== "Flatcar Linux"
|
||||
|
||||
```tf
|
||||
module "tempest-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.19.2"
|
||||
|
||||
# AWS
|
||||
vpc_id = module.tempest.vpc_id
|
||||
subnet_ids = module.tempest.subnet_ids
|
||||
security_groups = module.tempest.worker_security_groups
|
||||
|
||||
# configuration
|
||||
name = "tempest-pool"
|
||||
kubeconfig = module.tempest.kubeconfig
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
instance_type = "m5.large"
|
||||
os_image = "flatcar-beta"
|
||||
}
|
||||
```
|
||||
|
||||
Apply the change.
|
||||
|
||||
@ -65,13 +90,13 @@ The AWS internal `workers` module supports a number of [variables](https://githu
|
||||
|:-----|:------------|:--------|:--------|
|
||||
| worker_count | Number of instances | 1 | 3 |
|
||||
| instance_type | EC2 instance type | "t3.small" | "t3.medium" |
|
||||
| os_image | AMI channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alph, flatcar-edge |
|
||||
| os_image | AMI channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge |
|
||||
| os_stream | Fedora CoreOS stream for compute instances | "stable" | "testing", "next" |
|
||||
| disk_size | Size of the EBS volume in GB | 40 | 100 |
|
||||
| disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 |
|
||||
| disk_iops | IOPS of the EBS volume | 0 (i.e. auto) | 400 |
|
||||
| spot_price | Spot price in USD for worker instances or 0 to use on-demand instances | 0 | 0.10 |
|
||||
| snippets | Container Linux Config snippets | [] | [examples](/advanced/customization/) |
|
||||
| snippets | Fedora CoreOS or Container Linux Config snippets | [] | [examples](/advanced/customization/) |
|
||||
| service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||
| node_labels | List of initial node labels | [] | ["worker-pool=foo"] |
|
||||
|
||||
@ -81,28 +106,57 @@ Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-typ
|
||||
|
||||
Create a cluster following the Azure [tutorial](../flatcar-linux/azure.md#cluster). Define a worker pool using the Azure internal `workers` module.
|
||||
|
||||
```tf
|
||||
module "ramius-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.18.5"
|
||||
=== "Fedora CoreOS"
|
||||
|
||||
# Azure
|
||||
region = module.ramius.region
|
||||
resource_group_name = module.ramius.resource_group_name
|
||||
subnet_id = module.ramius.subnet_id
|
||||
security_group_id = module.ramius.security_group_id
|
||||
backend_address_pool_id = module.ramius.backend_address_pool_id
|
||||
```tf
|
||||
module "ramius-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes/workers?ref=v1.19.2"
|
||||
|
||||
# configuration
|
||||
name = "ramius-spot"
|
||||
kubeconfig = module.ramius.kubeconfig
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
# Azure
|
||||
region = module.ramius.region
|
||||
resource_group_name = module.ramius.resource_group_name
|
||||
subnet_id = module.ramius.subnet_id
|
||||
security_group_id = module.ramius.security_group_id
|
||||
backend_address_pool_id = module.ramius.backend_address_pool_id
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
vm_type = "Standard_F4"
|
||||
priority = "Spot"
|
||||
}
|
||||
```
|
||||
# configuration
|
||||
name = "ramius-spot"
|
||||
kubeconfig = module.ramius.kubeconfig
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
vm_type = "Standard_F4"
|
||||
priority = "Spot"
|
||||
os_image = "/subscriptions/some/path/Microsoft.Compute/images/fedora-coreos-31.20200323.3.2"
|
||||
}
|
||||
```
|
||||
|
||||
=== "Flatcar Linux"
|
||||
|
||||
```tf
|
||||
module "ramius-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.19.2"
|
||||
|
||||
# Azure
|
||||
region = module.ramius.region
|
||||
resource_group_name = module.ramius.resource_group_name
|
||||
subnet_id = module.ramius.subnet_id
|
||||
security_group_id = module.ramius.security_group_id
|
||||
backend_address_pool_id = module.ramius.backend_address_pool_id
|
||||
|
||||
# configuration
|
||||
name = "ramius-spot"
|
||||
kubeconfig = module.ramius.kubeconfig
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
vm_type = "Standard_F4"
|
||||
priority = "Spot"
|
||||
os_image = "flatcar-beta"
|
||||
}
|
||||
```
|
||||
|
||||
Apply the change.
|
||||
|
||||
@ -147,27 +201,53 @@ Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricin
|
||||
|
||||
Create a cluster following the Google Cloud [tutorial](../flatcar-linux/google-cloud.md#cluster). Define a worker pool using the Google Cloud internal `workers` module.
|
||||
|
||||
```tf
|
||||
module "yavin-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.18.5"
|
||||
=== "Fedora CoreOS"
|
||||
|
||||
# Google Cloud
|
||||
region = "europe-west2"
|
||||
network = module.yavin.network_name
|
||||
cluster_name = "yavin"
|
||||
```tf
|
||||
module "yavin-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.19.2"
|
||||
|
||||
# configuration
|
||||
name = "yavin-16x"
|
||||
kubeconfig = module.yavin.kubeconfig
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
# Google Cloud
|
||||
region = "europe-west2"
|
||||
network = module.yavin.network_name
|
||||
cluster_name = "yavin"
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
machine_type = "n1-standard-16"
|
||||
os_image = "coreos-beta"
|
||||
preemptible = true
|
||||
}
|
||||
```
|
||||
# configuration
|
||||
name = "yavin-16x"
|
||||
kubeconfig = module.yavin.kubeconfig
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
machine_type = "n1-standard-16"
|
||||
os_stream = "testing"
|
||||
preemptible = true
|
||||
}
|
||||
```
|
||||
|
||||
=== "Flatcar Linux"
|
||||
|
||||
```tf
|
||||
module "yavin-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.19.2"
|
||||
|
||||
# Google Cloud
|
||||
region = "europe-west2"
|
||||
network = module.yavin.network_name
|
||||
cluster_name = "yavin"
|
||||
|
||||
# configuration
|
||||
name = "yavin-16x"
|
||||
kubeconfig = module.yavin.kubeconfig
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
machine_type = "n1-standard-16"
|
||||
os_image = "flatcar-linux-2303-4-0" # custom
|
||||
preemptible = true
|
||||
}
|
||||
```
|
||||
|
||||
Apply the change.
|
||||
|
||||
@ -180,11 +260,11 @@ Verify a managed instance group of workers joins the cluster within a few minute
|
||||
```
|
||||
$ kubectl get nodes
|
||||
NAME STATUS AGE VERSION
|
||||
yavin-controller-0.c.example-com.internal Ready 6m v1.18.5
|
||||
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.5
|
||||
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.5
|
||||
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.18.5
|
||||
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.18.5
|
||||
yavin-controller-0.c.example-com.internal Ready 6m v1.19.2
|
||||
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.19.2
|
||||
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.19.2
|
||||
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.19.2
|
||||
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.19.2
|
||||
```
|
||||
|
||||
### Variables
|
||||
|
@ -16,10 +16,10 @@ Together, they diversify Typhoon to support a range of container technologies.
|
||||
|
||||
| Property | Flatcar Linux | Fedora CoreOS |
|
||||
|-------------------|---------------------------------|---------------|
|
||||
| Kernel | ~4.19.x | ~5.5.x |
|
||||
| Kernel | ~4.19.x | ~5.7.x |
|
||||
| systemd | 241 | 243 |
|
||||
| Ignition system | Ignition v2.x spec | Ignition v3.x spec |
|
||||
| Container Engine | docker 18.06.3-ce | docker 18.09.8 |
|
||||
| Container Engine | docker 18.06.3-ce | docker 19.03.11 |
|
||||
| storage driver | overlay2 (extfs) | overlay2 (xfs) |
|
||||
| logging driver | json-file | journald |
|
||||
| cgroup driver | cgroupfs (except Flatcar edge) | systemd |
|
||||
@ -37,8 +37,8 @@ Together, they diversify Typhoon to support a range of container technologies.
|
||||
| control plane images | upstream images | upstream images |
|
||||
| on-host etcd | rkt-fly | podman |
|
||||
| on-host kubelet | rkt-fly | podman |
|
||||
| CNI plugins | calico or flannel | calico or flannel |
|
||||
| coordinated drain & OS update | [CLUO](https://github.com/coreos/container-linux-update-operator) addon | (planned) |
|
||||
| CNI plugins | calico, cilium, flannel | calico, cilium, flannel |
|
||||
| coordinated drain & OS update | [FLUO](https://github.com/kinvolk/flatcar-linux-update-operator) addon | [fleetlock](https://github.com/poseidon/fleetlock) |
|
||||
|
||||
## Directory Locations
|
||||
|
||||
|
@ -1,6 +1,6 @@
|
||||
# AWS
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.18.5 cluster on AWS with Fedora CoreOS.
|
||||
In this tutorial, we'll create a Kubernetes v1.19.2 cluster on AWS with Fedora CoreOS.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets.
|
||||
|
||||
@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se
|
||||
|
||||
* AWS Account and IAM credentials
|
||||
* AWS Route53 DNS Zone (registered Domain Name or delegated subdomain)
|
||||
* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally
|
||||
* Terraform v0.13.0+
|
||||
|
||||
## Terraform Setup
|
||||
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system.
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
Terraform v0.13.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -49,13 +41,23 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
|
||||
|
||||
```tf
|
||||
provider "aws" {
|
||||
version = "2.68.0"
|
||||
region = "eu-central-1"
|
||||
shared_credentials_file = "/home/user/.config/aws/credentials"
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
provider "ct" {}
|
||||
|
||||
terraform {
|
||||
required_providers {
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "0.6.1"
|
||||
}
|
||||
aws = {
|
||||
source = "hashicorp/aws"
|
||||
version = "3.6.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
@ -70,7 +72,7 @@ Define a Kubernetes cluster using the module `aws/fedora-coreos/kubernetes`.
|
||||
|
||||
```tf
|
||||
module "tempest" {
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.19.2"
|
||||
|
||||
# AWS
|
||||
cluster_name = "tempest"
|
||||
@ -143,9 +145,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/tempest-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
ip-10-0-3-155 Ready <none> 10m v1.18.5
|
||||
ip-10-0-26-65 Ready <none> 10m v1.18.5
|
||||
ip-10-0-41-21 Ready <none> 10m v1.18.5
|
||||
ip-10-0-3-155 Ready <none> 10m v1.19.2
|
||||
ip-10-0-26-65 Ready <none> 10m v1.19.2
|
||||
ip-10-0-41-21 Ready <none> 10m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -216,7 +218,7 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`.
|
||||
| worker_price | Spot price in USD for worker instances or 0 to use on-demand instances | 0 | 0.10 |
|
||||
| controller_snippets | Controller Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) |
|
||||
| worker_snippets | Worker Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
|
||||
| network_mtu | CNI interface MTU (calico only) | 1480 | 8981 |
|
||||
| host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
|
||||
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Azure
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.18.5 cluster on Azure with Fedora CoreOS.
|
||||
In this tutorial, we'll create a Kubernetes v1.19.2 cluster on Azure with Fedora CoreOS.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets.
|
||||
|
||||
@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se
|
||||
|
||||
* Azure account
|
||||
* Azure DNS Zone (registered Domain Name or delegated subdomain)
|
||||
* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally
|
||||
* Terraform v0.13.0+
|
||||
|
||||
## Terraform Setup
|
||||
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system.
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
Terraform v0.13.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -47,11 +39,22 @@ Configure the Azure provider in a `providers.tf` file.
|
||||
|
||||
```tf
|
||||
provider "azurerm" {
|
||||
version = "2.16.0"
|
||||
features {}
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
provider "ct" {}
|
||||
|
||||
terraform {
|
||||
required_providers {
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "0.6.1"
|
||||
}
|
||||
azurerm = {
|
||||
source = "hashicorp/azurerm"
|
||||
version = "2.27.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
@ -83,7 +86,7 @@ Define a Kubernetes cluster using the module `azure/fedora-coreos/kubernetes`.
|
||||
|
||||
```tf
|
||||
module "ramius" {
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes?ref=v1.19.2"
|
||||
|
||||
# Azure
|
||||
cluster_name = "ramius"
|
||||
@ -158,9 +161,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/ramius-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
ramius-controller-0 Ready <none> 24m v1.18.5
|
||||
ramius-worker-000001 Ready <none> 25m v1.18.5
|
||||
ramius-worker-000002 Ready <none> 24m v1.18.5
|
||||
ramius-controller-0 Ready <none> 24m v1.19.2
|
||||
ramius-worker-000001 Ready <none> 25m v1.19.2
|
||||
ramius-worker-000002 Ready <none> 24m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -242,7 +245,7 @@ Reference the DNS zone with `azurerm_dns_zone.clusters.name` and its resource gr
|
||||
| worker_priority | Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Spot |
|
||||
| controller_snippets | Controller Fedora CoreOS Config snippets | [] | [example](/advanced/customization/#usage) |
|
||||
| worker_snippets | Worker Fedora CoreOS Config snippets | [] | [example](/advanced/customization/#usage) |
|
||||
| networking | Choice of networking provider | "calico" | "flannel" or "calico" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
|
||||
| host_cidr | CIDR IPv4 range to assign to instances | "10.0.0.0/16" | "10.0.0.0/20" |
|
||||
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Bare-Metal
|
||||
|
||||
In this tutorial, we'll network boot and provision a Kubernetes v1.18.5 cluster on bare-metal with Fedora CoreOS.
|
||||
In this tutorial, we'll network boot and provision a Kubernetes v1.19.2 cluster on bare-metal with Fedora CoreOS.
|
||||
|
||||
First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Fedora CoreOS to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition.
|
||||
|
||||
@ -12,7 +12,7 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se
|
||||
* PXE-enabled [network boot](https://coreos.com/matchbox/docs/latest/network-setup.html) environment (with HTTPS support)
|
||||
* Matchbox v0.6+ deployment with API enabled
|
||||
* Matchbox credentials `client.crt`, `client.key`, `ca.crt`
|
||||
* Terraform v0.12.6+, [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox), and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally
|
||||
* Terraform v0.13.0+
|
||||
|
||||
## Machines
|
||||
|
||||
@ -107,27 +107,11 @@ Read about the [many ways](https://coreos.com/matchbox/docs/latest/network-setup
|
||||
|
||||
## Terraform Setup
|
||||
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system.
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-matchbox/releases/download/v0.3.0/terraform-provider-matchbox-v0.3.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-matchbox-v0.3.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-matchbox-v0.3.0-linux-amd64/terraform-provider-matchbox ~/.terraform.d/plugins/terraform-provider-matchbox_v0.3.0
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
Terraform v0.13.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -142,15 +126,25 @@ Configure the Matchbox provider to use your Matchbox API endpoint and client cer
|
||||
|
||||
```tf
|
||||
provider "matchbox" {
|
||||
version = "0.3.0"
|
||||
endpoint = "matchbox.example.com:8081"
|
||||
client_cert = file("~/.config/matchbox/client.crt")
|
||||
client_key = file("~/.config/matchbox/client.key")
|
||||
ca = file("~/.config/matchbox/ca.crt")
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
provider "ct" {}
|
||||
|
||||
terraform {
|
||||
required_providers {
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "0.6.1"
|
||||
}
|
||||
matchbox = {
|
||||
source = "poseidon/matchbox"
|
||||
version = "0.4.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
@ -160,7 +154,7 @@ Define a Kubernetes cluster using the module `bare-metal/fedora-coreos/kubernete
|
||||
|
||||
```tf
|
||||
module "mercury" {
|
||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/fedora-coreos/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/fedora-coreos/kubernetes?ref=v1.19.2"
|
||||
|
||||
# bare-metal
|
||||
cluster_name = "mercury"
|
||||
@ -289,9 +283,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/mercury-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
node1.example.com Ready <none> 10m v1.18.5
|
||||
node2.example.com Ready <none> 10m v1.18.5
|
||||
node3.example.com Ready <none> 10m v1.18.5
|
||||
node1.example.com Ready <none> 10m v1.19.2
|
||||
node2.example.com Ready <none> 10m v1.19.2
|
||||
node3.example.com Ready <none> 10m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -339,7 +333,7 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me
|
||||
|:-----|:------------|:--------|:--------|
|
||||
| cached_install | PXE boot and install from the Matchbox `/assets` cache. Admin MUST have downloaded Fedora CoreOS images into the cache | false | true |
|
||||
| install_disk | Disk device where Fedora CoreOS should be installed | "sda" (not "/dev/sda" like Container Linux) | "sdb" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
|
||||
| network_mtu | CNI interface MTU (calico-only) | 1480 | - |
|
||||
| snippets | Map from machine names to lists of Fedora CoreOS Config snippets | {} | [examples](/advanced/customization/) |
|
||||
| network_ip_autodetection_method | Method to detect host IPv4 address (calico-only) | "first-found" | "can-reach=10.0.0.1" |
|
||||
|
@ -1,6 +1,6 @@
|
||||
# DigitalOcean
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.18.5 cluster on DigitalOcean with Fedora CoreOS.
|
||||
In this tutorial, we'll create a Kubernetes v1.19.2 cluster on DigitalOcean with Fedora CoreOS.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets.
|
||||
|
||||
@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se
|
||||
|
||||
* Digital Ocean Account and Token
|
||||
* Digital Ocean Domain (registered Domain Name or delegated subdomain)
|
||||
* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally
|
||||
* Terraform v0.13.0+
|
||||
|
||||
## Terraform Setup
|
||||
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system.
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
Terraform v0.13.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -50,12 +42,22 @@ Configure the DigitalOcean provider to use your token in a `providers.tf` file.
|
||||
|
||||
```tf
|
||||
provider "digitalocean" {
|
||||
version = "1.20.0"
|
||||
token = "${chomp(file("~/.config/digital-ocean/token"))}"
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
provider "ct" {}
|
||||
|
||||
terraform {
|
||||
required_providers {
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "0.6.1"
|
||||
}
|
||||
digitalocean = {
|
||||
source = "digitalocean/digitalocean"
|
||||
version = "1.22.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
@ -79,7 +81,7 @@ Define a Kubernetes cluster using the module `digital-ocean/fedora-coreos/kubern
|
||||
|
||||
```tf
|
||||
module "nemo" {
|
||||
source = "git::https://github.com/poseidon/typhoon//digital-ocean/fedora-coreos/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//digital-ocean/fedora-coreos/kubernetes?ref=v1.19.2"
|
||||
|
||||
# Digital Ocean
|
||||
cluster_name = "nemo"
|
||||
@ -153,9 +155,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/nemo-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
10.132.110.130 Ready <none> 10m v1.18.5
|
||||
10.132.115.81 Ready <none> 10m v1.18.5
|
||||
10.132.124.107 Ready <none> 10m v1.18.5
|
||||
10.132.110.130 Ready <none> 10m v1.19.2
|
||||
10.132.115.81 Ready <none> 10m v1.19.2
|
||||
10.132.124.107 Ready <none> 10m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -238,7 +240,7 @@ Digital Ocean requires the SSH public key be uploaded to your account, so you ma
|
||||
| worker_type | Droplet type for workers | "s-1vcpu-2gb" | s-1vcpu-2gb, s-2vcpu-2gb, ... |
|
||||
| controller_snippets | Controller Fedora CoreOS Config snippets | [] | [example](/advanced/customization/) |
|
||||
| worker_snippets | Worker Fedora CoreOS Config snippets | [] | [example](/advanced/customization/) |
|
||||
| networking | Choice of networking provider | "calico" | "flannel" or "calico" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
|
||||
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Google Cloud
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.18.5 cluster on Google Compute Engine with Fedora CoreOS.
|
||||
In this tutorial, we'll create a Kubernetes v1.19.2 cluster on Google Compute Engine with Fedora CoreOS.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets.
|
||||
|
||||
@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se
|
||||
|
||||
* Google Cloud Account and Service Account
|
||||
* Google Cloud DNS Zone (registered Domain Name or delegated subdomain)
|
||||
* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally
|
||||
* Terraform v0.13.0+
|
||||
|
||||
## Terraform Setup
|
||||
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system.
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
Terraform v0.13.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -49,14 +41,24 @@ Configure the Google Cloud provider to use your service account key, project-id,
|
||||
|
||||
```tf
|
||||
provider "google" {
|
||||
version = "3.27.0"
|
||||
project = "project-id"
|
||||
region = "us-central1"
|
||||
credentials = file("~/.config/google-cloud/terraform.json")
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
provider "ct" {}
|
||||
|
||||
terraform {
|
||||
required_providers {
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "0.6.1"
|
||||
}
|
||||
google = {
|
||||
source = "hashicorp/google"
|
||||
version = "3.38.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
@ -145,9 +147,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
|
||||
$ kubectl get nodes
|
||||
NAME ROLES STATUS AGE VERSION
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.18.5
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.18.5
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.18.5
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.19.2
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.19.2
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -218,7 +220,7 @@ resource "google_dns_managed_zone" "zone-for-clusters" {
|
||||
| worker_preemptible | If enabled, Compute Engine will terminate workers randomly within 24 hours | false | true |
|
||||
| controller_snippets | Controller Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) |
|
||||
| worker_snippets | Worker Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
|
||||
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||
| worker_node_labels | List of initial worker node labels | [] | ["worker-pool=default"] |
|
||||
|
@ -1,6 +1,6 @@
|
||||
# AWS
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.18.5 cluster on AWS with CoreOS Container Linux or Flatcar Linux.
|
||||
In this tutorial, we'll create a Kubernetes v1.19.2 cluster on AWS with CoreOS Container Linux or Flatcar Linux.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets.
|
||||
|
||||
@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se
|
||||
|
||||
* AWS Account and IAM credentials
|
||||
* AWS Route53 DNS Zone (registered Domain Name or delegated subdomain)
|
||||
* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally
|
||||
* Terraform v0.13.0+
|
||||
|
||||
## Terraform Setup
|
||||
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system.
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
Terraform v0.13.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -49,13 +41,23 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
|
||||
|
||||
```tf
|
||||
provider "aws" {
|
||||
version = "2.68.0"
|
||||
region = "eu-central-1"
|
||||
shared_credentials_file = "/home/user/.config/aws/credentials"
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
provider "ct" {}
|
||||
|
||||
terraform {
|
||||
required_providers {
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "0.6.1"
|
||||
}
|
||||
aws = {
|
||||
source = "hashicorp/aws"
|
||||
version = "3.6.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
@ -70,7 +72,7 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`.
|
||||
|
||||
```tf
|
||||
module "tempest" {
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.19.2"
|
||||
|
||||
# AWS
|
||||
cluster_name = "tempest"
|
||||
@ -143,9 +145,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/tempest-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
ip-10-0-3-155 Ready <none> 10m v1.18.5
|
||||
ip-10-0-26-65 Ready <none> 10m v1.18.5
|
||||
ip-10-0-41-21 Ready <none> 10m v1.18.5
|
||||
ip-10-0-3-155 Ready <none> 10m v1.19.2
|
||||
ip-10-0-26-65 Ready <none> 10m v1.19.2
|
||||
ip-10-0-41-21 Ready <none> 10m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -216,7 +218,7 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`.
|
||||
| worker_price | Spot price in USD for worker instances or 0 to use on-demand instances | 0/null | 0.10 |
|
||||
| controller_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
|
||||
| worker_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
|
||||
| network_mtu | CNI interface MTU (calico only) | 1480 | 8981 |
|
||||
| host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
|
||||
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Azure
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.18.5 cluster on Azure with CoreOS Container Linux or Flatcar Linux.
|
||||
In this tutorial, we'll create a Kubernetes v1.19.2 cluster on Azure with CoreOS Container Linux or Flatcar Linux.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets.
|
||||
|
||||
@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se
|
||||
|
||||
* Azure account
|
||||
* Azure DNS Zone (registered Domain Name or delegated subdomain)
|
||||
* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally
|
||||
* Terraform v0.13.0+
|
||||
|
||||
## Terraform Setup
|
||||
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system.
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
Terraform v0.13.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -47,11 +39,22 @@ Configure the Azure provider in a `providers.tf` file.
|
||||
|
||||
```tf
|
||||
provider "azurerm" {
|
||||
version = "2.16.0"
|
||||
features {}
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
provider "ct" {}
|
||||
|
||||
terraform {
|
||||
required_providers {
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "0.6.1"
|
||||
}
|
||||
azurerm = {
|
||||
source = "hashicorp/azurerm"
|
||||
version = "2.27.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
@ -72,7 +75,7 @@ Define a Kubernetes cluster using the module `azure/container-linux/kubernetes`.
|
||||
|
||||
```tf
|
||||
module "ramius" {
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes?ref=v1.19.2"
|
||||
|
||||
# Azure
|
||||
cluster_name = "ramius"
|
||||
@ -146,9 +149,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/ramius-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
ramius-controller-0 Ready <none> 24m v1.18.5
|
||||
ramius-worker-000001 Ready <none> 25m v1.18.5
|
||||
ramius-worker-000002 Ready <none> 24m v1.18.5
|
||||
ramius-controller-0 Ready <none> 24m v1.19.2
|
||||
ramius-worker-000001 Ready <none> 25m v1.19.2
|
||||
ramius-worker-000002 Ready <none> 24m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -230,7 +233,7 @@ Reference the DNS zone with `azurerm_dns_zone.clusters.name` and its resource gr
|
||||
| worker_priority | Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Spot |
|
||||
| controller_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
|
||||
| worker_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
|
||||
| networking | Choice of networking provider | "calico" | "flannel" or "calico" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
|
||||
| host_cidr | CIDR IPv4 range to assign to instances | "10.0.0.0/16" | "10.0.0.0/20" |
|
||||
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Bare-Metal
|
||||
|
||||
In this tutorial, we'll network boot and provision a Kubernetes v1.18.5 cluster on bare-metal with CoreOS Container Linux or Flatcar Linux.
|
||||
In this tutorial, we'll network boot and provision a Kubernetes v1.19.2 cluster on bare-metal with CoreOS Container Linux or Flatcar Linux.
|
||||
|
||||
First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition.
|
||||
|
||||
@ -12,7 +12,7 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se
|
||||
* PXE-enabled [network boot](https://coreos.com/matchbox/docs/latest/network-setup.html) environment (with HTTPS support)
|
||||
* Matchbox v0.6+ deployment with API enabled
|
||||
* Matchbox credentials `client.crt`, `client.key`, `ca.crt`
|
||||
* Terraform v0.12.6+, [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox), and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally
|
||||
* Terraform v0.13.0+
|
||||
|
||||
## Machines
|
||||
|
||||
@ -107,27 +107,11 @@ Read about the [many ways](https://coreos.com/matchbox/docs/latest/network-setup
|
||||
|
||||
## Terraform Setup
|
||||
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system.
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-matchbox/releases/download/v0.3.0/terraform-provider-matchbox-v0.3.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-matchbox-v0.3.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-matchbox-v0.3.0-linux-amd64/terraform-provider-matchbox ~/.terraform.d/plugins/terraform-provider-matchbox_v0.3.0
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
Terraform v0.13.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -142,15 +126,25 @@ Configure the Matchbox provider to use your Matchbox API endpoint and client cer
|
||||
|
||||
```tf
|
||||
provider "matchbox" {
|
||||
version = "0.3.0"
|
||||
endpoint = "matchbox.example.com:8081"
|
||||
client_cert = file("~/.config/matchbox/client.crt")
|
||||
client_key = file("~/.config/matchbox/client.key")
|
||||
ca = file("~/.config/matchbox/ca.crt")
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
provider "ct" {}
|
||||
|
||||
terraform {
|
||||
required_providers {
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "0.6.1"
|
||||
}
|
||||
matchbox = {
|
||||
source = "poseidon/matchbox"
|
||||
version = "0.4.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
@ -160,7 +154,7 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne
|
||||
|
||||
```tf
|
||||
module "mercury" {
|
||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.19.2"
|
||||
|
||||
# bare-metal
|
||||
cluster_name = "mercury"
|
||||
@ -299,9 +293,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/mercury-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
node1.example.com Ready <none> 10m v1.18.5
|
||||
node2.example.com Ready <none> 10m v1.18.5
|
||||
node3.example.com Ready <none> 10m v1.18.5
|
||||
node1.example.com Ready <none> 10m v1.19.2
|
||||
node2.example.com Ready <none> 10m v1.19.2
|
||||
node3.example.com Ready <none> 10m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -350,7 +344,7 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me
|
||||
| download_protocol | Protocol iPXE uses to download the kernel and initrd. iPXE must be compiled with [crypto](https://ipxe.org/crypto) support for https. Unused if cached_install is true | "https" | "http" |
|
||||
| cached_install | PXE boot and install from the Matchbox `/assets` cache. Admin MUST have downloaded Container Linux or Flatcar images into the cache | false | true |
|
||||
| install_disk | Disk device where Container Linux should be installed | "/dev/sda" | "/dev/sdb" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
|
||||
| network_mtu | CNI interface MTU (calico-only) | 1480 | - |
|
||||
| snippets | Map from machine names to lists of Container Linux Config snippets | {} | [examples](/advanced/customization/) |
|
||||
| network_ip_autodetection_method | Method to detect host IPv4 address (calico-only) | "first-found" | "can-reach=10.0.0.1" |
|
||||
|
@ -1,6 +1,6 @@
|
||||
# DigitalOcean
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.18.5 cluster on DigitalOcean with CoreOS Container Linux or Flatcar Linux.
|
||||
In this tutorial, we'll create a Kubernetes v1.19.2 cluster on DigitalOcean with CoreOS Container Linux or Flatcar Linux.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets.
|
||||
|
||||
@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se
|
||||
|
||||
* Digital Ocean Account and Token
|
||||
* Digital Ocean Domain (registered Domain Name or delegated subdomain)
|
||||
* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally
|
||||
* Terraform v0.13.0+
|
||||
|
||||
## Terraform Setup
|
||||
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system.
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
Terraform v0.13.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -50,12 +42,22 @@ Configure the DigitalOcean provider to use your token in a `providers.tf` file.
|
||||
|
||||
```tf
|
||||
provider "digitalocean" {
|
||||
version = "1.20.0"
|
||||
token = "${chomp(file("~/.config/digital-ocean/token"))}"
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
provider "ct" {}
|
||||
|
||||
terraform {
|
||||
required_providers {
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "0.6.1"
|
||||
}
|
||||
digitalocean = {
|
||||
source = "digitalocean/digitalocean"
|
||||
version = "1.22.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
@ -79,7 +81,7 @@ Define a Kubernetes cluster using the module `digital-ocean/container-linux/kube
|
||||
|
||||
```tf
|
||||
module "nemo" {
|
||||
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.19.2"
|
||||
|
||||
# Digital Ocean
|
||||
cluster_name = "nemo"
|
||||
@ -153,9 +155,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/nemo-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
10.132.110.130 Ready <none> 10m v1.18.5
|
||||
10.132.115.81 Ready <none> 10m v1.18.5
|
||||
10.132.124.107 Ready <none> 10m v1.18.5
|
||||
10.132.110.130 Ready <none> 10m v1.19.2
|
||||
10.132.115.81 Ready <none> 10m v1.19.2
|
||||
10.132.124.107 Ready <none> 10m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -238,7 +240,7 @@ Digital Ocean requires the SSH public key be uploaded to your account, so you ma
|
||||
| worker_type | Droplet type for workers | "s-1vcpu-2gb" | s-1vcpu-2gb, s-2vcpu-2gb, ... |
|
||||
| controller_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
|
||||
| worker_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) |
|
||||
| networking | Choice of networking provider | "calico" | "flannel" or "calico" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
|
||||
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Google Cloud
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.18.5 cluster on Google Compute Engine with CoreOS Container Linux or Flatcar Linux.
|
||||
In this tutorial, we'll create a Kubernetes v1.19.2 cluster on Google Compute Engine with CoreOS Container Linux or Flatcar Linux.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets.
|
||||
|
||||
@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se
|
||||
|
||||
* Google Cloud Account and Service Account
|
||||
* Google Cloud DNS Zone (registered Domain Name or delegated subdomain)
|
||||
* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally
|
||||
* Terraform v0.13.0+
|
||||
|
||||
## Terraform Setup
|
||||
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system.
|
||||
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
Terraform v0.13.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -49,14 +41,24 @@ Configure the Google Cloud provider to use your service account key, project-id,
|
||||
|
||||
```tf
|
||||
provider "google" {
|
||||
version = "3.27.0"
|
||||
project = "project-id"
|
||||
region = "us-central1"
|
||||
credentials = file("~/.config/google-cloud/terraform.json")
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
provider "ct" {}
|
||||
|
||||
terraform {
|
||||
required_providers {
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "0.6.1"
|
||||
}
|
||||
google = {
|
||||
source = "hashicorp/google"
|
||||
version = "3.38.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
@ -90,7 +92,7 @@ Define a Kubernetes cluster using the module `google-cloud/container-linux/kuber
|
||||
|
||||
```tf
|
||||
module "yavin" {
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.19.2"
|
||||
|
||||
# Google Cloud
|
||||
cluster_name = "yavin"
|
||||
@ -165,9 +167,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
|
||||
$ kubectl get nodes
|
||||
NAME ROLES STATUS AGE VERSION
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.18.5
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.18.5
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.18.5
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.19.2
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.19.2
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -238,7 +240,7 @@ resource "google_dns_managed_zone" "zone-for-clusters" {
|
||||
| worker_preemptible | If enabled, Compute Engine will terminate workers randomly within 24 hours | false | true |
|
||||
| controller_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
|
||||
| worker_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
|
||||
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||
| worker_node_labels | List of initial worker node labels | [] | ["worker-pool=default"] |
|
||||
|
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
|
||||
* Advanced features like [worker pools](advanced/worker-pools/), [preemptible](fedora-coreos/google-cloud/#preemption) workers, and [snippets](advanced/customization/#container-linux) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, CSI, or other [addons](addons/overview/)
|
||||
@ -29,7 +29,7 @@ Typhoon is available for [Fedora CoreOS](https://getfedora.org/coreos/).
|
||||
| Azure | Fedora CoreOS | [azure/fedora-coreos/kubernetes](fedora-coreos/azure.md) | alpha |
|
||||
| Bare-Metal | Fedora CoreOS | [bare-metal/fedora-coreos/kubernetes](fedora-coreos/bare-metal.md) | beta |
|
||||
| DigitalOcean | Fedora CoreOS | [digital-ocean/fedora-coreos/kubernetes](fedora-coreos/digitalocean.md) | beta |
|
||||
| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](google-cloud/fedora-coreos/kubernetes) | stable |
|
||||
| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](fedora-coreos/google-cloud/kubernetes) | stable |
|
||||
|
||||
Typhoon is available for [Flatcar Linux](https://www.flatcar-linux.org/releases/).
|
||||
|
||||
@ -53,7 +53,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
|
||||
|
||||
```tf
|
||||
module "yavin" {
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.19.2"
|
||||
|
||||
# Google Cloud
|
||||
cluster_name = "yavin"
|
||||
@ -91,9 +91,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
|
||||
$ kubectl get nodes
|
||||
NAME ROLES STATUS AGE VERSION
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.18.5
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.18.5
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.18.5
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.19.2
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.19.2
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.19.2
|
||||
```
|
||||
|
||||
List the pods.
|
||||
|
@ -12,7 +12,7 @@ Ask questions on the IRC #typhoon channel on [freenode.net](http://freenode.net/
|
||||
|
||||
## Security Issues
|
||||
|
||||
If you find security issues, please see [security disclosures](/topics/security.md#disclosures).
|
||||
If you find security issues, please see [security disclosures](/topics/security/#disclosures).
|
||||
|
||||
## Maintainers
|
||||
|
||||
|
@ -183,7 +183,7 @@ show ip route bgp
|
||||
|
||||
### Port Forwarding
|
||||
|
||||
Expose the [Ingress Controller](/addons/ingress.md#bare-metal) by adding `port-forward` rules that DNAT a port on the router's WAN interface to an internal IP and port. By convention, a public Ingress controller is assigned a fixed service IP (e.g. 10.3.0.12).
|
||||
Expose the [Ingress Controller](/addons/ingress/#bare-metal) by adding `port-forward` rules that DNAT a port on the router's WAN interface to an internal IP and port. By convention, a public Ingress controller is assigned a fixed service IP (e.g. 10.3.0.12).
|
||||
|
||||
```
|
||||
configure
|
||||
|
@ -13,12 +13,12 @@ Typhoon provides tagged releases to allow clusters to be versioned using ordinar
|
||||
|
||||
```
|
||||
module "yavin" {
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.19.2"
|
||||
...
|
||||
}
|
||||
|
||||
module "mercury" {
|
||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.18.5"
|
||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.19.2"
|
||||
...
|
||||
}
|
||||
```
|
||||
@ -134,9 +134,9 @@ The [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) p
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.6.1-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.6.1-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.6.1-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.6.1
|
||||
```
|
||||
|
||||
Binary names are versioned. This enables the ability to upgrade different plugins and have clusters pin different versions.
|
||||
@ -147,17 +147,16 @@ $ tree ~/.terraform.d/
|
||||
└── plugins
|
||||
├── terraform-provider-ct_v0.2.1
|
||||
├── terraform-provider-ct_v0.3.0
|
||||
├── terraform-provider-ct_v0.5.0
|
||||
└── terraform-provider-matchbox_v0.3.0
|
||||
├── terraform-provider-ct_v0.6.1
|
||||
└── terraform-provider-matchbox_v0.4.1
|
||||
```
|
||||
|
||||
|
||||
Update the version of the `ct` plugin in each Terraform working directory. Typhoon clusters managed in the working directory **must** be v1.12.2 or higher.
|
||||
|
||||
```
|
||||
# providers.tf
|
||||
```tf
|
||||
provider "ct" {
|
||||
version = "0.5.0"
|
||||
version = "0.6.1"
|
||||
}
|
||||
```
|
||||
|
||||
@ -193,7 +192,7 @@ terraform apply
|
||||
|
||||
# add kubeconfig to new workers
|
||||
terraform state list | grep null_resource
|
||||
terraform taint -module digital-ocean-nemo null_resource.copy-worker-secrets[N]
|
||||
terraform taint module.nemo.null_resource.copy-worker-secrets[N]
|
||||
terraform apply
|
||||
```
|
||||
|
||||
@ -203,17 +202,91 @@ Expect downtime.
|
||||
|
||||
Google Cloud creates a new worker template and edits the worker instance group instantly. Manually terminate workers and replacement workers will use the user-data.
|
||||
|
||||
## Terraform v0.12.x
|
||||
## Terraform Versions
|
||||
|
||||
Terraform [v0.12](https://www.hashicorp.com/blog/announcing-terraform-0-12) introduced major changes to the provider plugin protocol and HCL language (first-class expressions, formal list and map types, nullable variables, variable constraints, and short-circuiting ternary operators).
|
||||
Terraform [v0.13](https://www.hashicorp.com/blog/announcing-hashicorp-terraform-0-13) introduced major changes to the provider plugin system. Terraform `init` can automatically install both `hashicorp` and `poseidon` provider plugins, eliminating the need to manually install plugin binaries.
|
||||
|
||||
Typhoon modules have been adapted for Terraform v0.12. Provider plugins requirements now enforce v0.12 compatibility. However, some HCL language changes were breaking (list [type hint](https://www.terraform.io/upgrade-guides/0-12.html#referring-to-list-variables) workarounds in v0.11 now have new meaning). Typhoon cannot offer both v0.11 and v0.12 compatibility in the same release. Upcoming releases require upgrading Terraform to v0.12.
|
||||
Typhoon modules have been updated for v0.13.x, but retain compatibility with v0.12.26+ to ease migration. Poseidon publishes [providers](/topics/security/#terraform-providers) to the Terraform Provider Registry for usage with v0.13+.
|
||||
|
||||
| Typhoon Release | Terraform version |
|
||||
|-------------------|---------------------|
|
||||
| v1.15.0 - ? | v0.12.x |
|
||||
| v1.18.8 - ? | v0.12.26+, v0.13.x |
|
||||
| v1.15.0 - v1.18.8 | v0.12.x |
|
||||
| v1.10.3 - v1.15.0 | v0.11.x |
|
||||
| v1.9.2 - v1.10.2 | v0.10.4+ or v0.11.x |
|
||||
| v1.7.3 - v1.9.1 | v0.10.x |
|
||||
| v1.6.4 - v1.7.2 | v0.9.x |
|
||||
|
||||
### New Workspace
|
||||
|
||||
With a new Terraform workspace, use Terraform v0.13.x and the updated Typhoon [tutorials](/fedora-coreos/aws/#provider).
|
||||
|
||||
### Existing Workspace
|
||||
|
||||
An existing Terraform workspace may already manage earlier Typhoon clusters created with Terraform v0.12.x.
|
||||
|
||||
First, upgrade `terraform-provider-ct` to v0.6.1 following the [guide](#upgrade-terraform-provider-ct) above. As usual, read about how `apply` affects existing cluster nodes when `ct` is upgraded. But `terraform-provider-ct` v0.6.1 is compatible with both Terraform v0.12 and v0.13, so do this first.
|
||||
|
||||
```tf
|
||||
provider "ct" {
|
||||
version = "0.6.1"
|
||||
}
|
||||
```
|
||||
|
||||
Next, create Typhoon clusters using the `ref` that introduced Terraform v0.13 forward compatibility (`v1.18.8`) or later. You will see a compatibility warning. Use blue/green cluster replacement to shift to these new clusters, then eliminate older clusters.
|
||||
|
||||
```
|
||||
module "nemo" {
|
||||
source = "git::https://github.com/poseidon/typhoon//digital-ocean/fedora-coreos/kubernetes?ref=v1.18.8"
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
Install Terraform v0.13. Once all clusters in a workspace are on `v1.18.8` or above, you are ready to start using Terraform v0.13.
|
||||
|
||||
```
|
||||
terraform version
|
||||
v0.13.0
|
||||
```
|
||||
|
||||
Update `providers.tf` to match the Typhoon [tutorials](/fedora-coreos/aws/#provider) and use new `required_providers` block.
|
||||
|
||||
```
|
||||
terraform init
|
||||
terraform 0.13upgrade # sometimes helpful
|
||||
```
|
||||
|
||||
!!! note
|
||||
You will see `Could not retrieve the list of available versions for provider -/ct: provider`
|
||||
|
||||
In state files, existing clusters use Terraform v0.12 providers (e.g. `-/aws`). Pivot to Terraform v0.13 providers (e.g. `hashicorp/aws`) with the following commands, as applicable. Repeat until `terraform init` no longer shows old-style providers.
|
||||
|
||||
```
|
||||
terraform state replace-provider -- -/aws hashicorp/aws
|
||||
terraform state replace-provider -- -/azurerm hashicorp/azurerm
|
||||
terraform state replace-provider -- -/google hashicorp/google
|
||||
|
||||
terraform state replace-provider -- -/digitalocean digitalocean/digitalocean
|
||||
terraform state replace-provider -- -/ct poseidon/ct
|
||||
terraform state replace-provider -- -/matchbox poseidon/matchbox
|
||||
|
||||
terraform state replace-provider -- -/local hashicorp/local
|
||||
terraform state replace-provider -- -/null hashicorp/null
|
||||
terraform state replace-provider -- -/random hashicorp/random
|
||||
terraform state replace-provider -- -/template hashicorp/template
|
||||
terraform state replace-provider -- -/tls hashicorp/tls
|
||||
```
|
||||
|
||||
Finally, verify Terraform v0.13 plan shows no diff.
|
||||
|
||||
```
|
||||
terraform plan
|
||||
No changes. Infrastructure is up-to-date.
|
||||
```
|
||||
|
||||
### v0.12.x
|
||||
|
||||
Terraform [v0.12](https://www.hashicorp.com/blog/announcing-terraform-0-12) introduced major changes to the provider plugin protocol and HCL language (first-class expressions, formal list and map types, nullable variables, variable constraints, and short-circuiting ternary operators).
|
||||
|
||||
Typhoon modules have been adapted for Terraform v0.12. Provider plugins requirements now enforce v0.12 compatibility. However, some HCL language changes were breaking (list [type hint](https://www.terraform.io/upgrade-guides/0-12.html#referring-to-list-variables) workarounds in v0.11 now have new meaning). Typhoon cannot offer both v0.11 and v0.12 compatibility in the same release. Upcoming releases require upgrading Terraform to v0.12.
|
||||
|
||||
|
@ -38,7 +38,7 @@ Network performance varies based on the platform and CNI plugin. `iperf` was use
|
||||
|
||||
Notes:
|
||||
|
||||
* Calico and Flannel have comparable performance. Platform and configuration differences dominate.
|
||||
* Calico, Cilium, and Flannel have comparable performance. Platform and configuration differences dominate.
|
||||
* Azure and DigitalOcean network performance can be quite variable or depend on machine type
|
||||
* Only [certain AWS EC2 instance types](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/network_mtu.html#jumbo_frame_instances) allow jumbo frames. This is why the default MTU on AWS must be 1480.
|
||||
|
||||
|
@ -66,6 +66,21 @@ Two tag styles indicate the build strategy used.
|
||||
|
||||
The Typhoon-built Kubelet image is used as the official image. Automated builds provide an alternative image for those preferring to trust images built by Quay/Dockerhub (albeit lacking multi-arch). To use the fallback registry or an alternative tag, see [customization](/advanced/customization/#kubelet).
|
||||
|
||||
### flannel-cni
|
||||
|
||||
Typhoon packages the [flannel-cni](https://github.com/poseidon/flannel-cni) container image to provide security patches.
|
||||
|
||||
* [quay.io/poseidon/flannel-cni](https://quay.io/repository/poseidon/flannel-cni) (official)
|
||||
|
||||
## Terraform Providers
|
||||
|
||||
Typhoon publishes Terraform providers to the Terraform Registry, GPG signed by 0x8F515AD1602065C8.
|
||||
|
||||
| Name | Source | Registry |
|
||||
|----------|--------|----------|
|
||||
| ct | [github](https://github.com/poseidon/terraform-provider-ct) | [poseidon/ct](https://registry.terraform.io/providers/poseidon/ct/latest) |
|
||||
| matchbox | [github](https://github.com/poseidon/terraform-provider-matchbox) | [poseidon/matchbox](https://registry.terraform.io/providers/poseidon/matchbox/latest) |
|
||||
|
||||
## Disclosures
|
||||
|
||||
If you find security issues, please email `security@psdn.io`. If the issue lies in upstream Kubernetes, please inform upstream Kubernetes as well.
|
||||
|
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=5a7c963caf59740891df2aeae4b1561ccb3b9db6"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -7,7 +7,7 @@ systemd:
|
||||
- name: 40-etcd-cluster.conf
|
||||
contents: |
|
||||
[Service]
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.9"
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.12"
|
||||
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
|
||||
Environment="RKT_RUN_ARGS=--insecure-options=image"
|
||||
Environment="ETCD_NAME=${etcd_name}"
|
||||
@ -52,7 +52,7 @@ systemd:
|
||||
Description=Kubelet
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
@ -132,7 +132,7 @@ systemd:
|
||||
--volume script,kind=host,source=/opt/bootstrap/apply \
|
||||
--mount volume=script,target=/apply \
|
||||
--insecure-options=image \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.5 \
|
||||
docker://quay.io/poseidon/kubelet:v1.19.2 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/apply
|
||||
@ -140,6 +140,11 @@ systemd:
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
directories:
|
||||
- path: /var/lib/etcd
|
||||
filesystem: root
|
||||
mode: 0700
|
||||
overwrite: true
|
||||
files:
|
||||
- path: /etc/kubernetes/kubeconfig
|
||||
filesystem: root
|
||||
@ -161,6 +166,7 @@ storage:
|
||||
mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
|
||||
chown -R etcd:etcd /etc/ssl/etcd
|
||||
chmod -R 500 /etc/ssl/etcd
|
||||
chmod -R 700 /var/lib/etcd
|
||||
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
|
||||
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
|
||||
mkdir -p /etc/kubernetes/manifests
|
||||
|
@ -1,11 +1,15 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.12.6"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
google = ">= 2.19, < 4.0"
|
||||
ct = "~> 0.4"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -25,7 +25,7 @@ systemd:
|
||||
Description=Kubelet
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.5
|
||||
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
@ -127,7 +127,7 @@ storage:
|
||||
--volume config,kind=host,source=/etc/kubernetes \
|
||||
--mount volume=config,target=/etc/kubernetes \
|
||||
--insecure-options=image \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.5 \
|
||||
docker://quay.io/poseidon/kubelet:v1.19.2 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
||||
|
@ -1,4 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = ">= 0.12"
|
||||
required_version = ">= 0.12.26, < 0.14.0"
|
||||
required_providers {
|
||||
google = ">= 2.19, < 4.0"
|
||||
template = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.18.5 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* Kubernetes v1.19.2 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/fedora-coreos/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/) customization
|
||||
* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=5a7c963caf59740891df2aeae4b1561ccb3b9db6"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user