Compare commits

...

239 Commits

Author SHA1 Message Date
31c7f0ba0e Update nginx-ingress addon from v1.2.0 to v1.2.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.2.1
2022-05-31 16:37:57 +01:00
b8549a1e32 Update Cilium from v1.11.4 to v1.11.5
* https://github.com/poseidon/terraform-render-bootstrap/pull/309
2022-05-31 15:23:07 +01:00
8e8bf305c3 Update Prometheus and Grafana addons 2022-05-31 14:29:55 +01:00
a447494ccd Bump mkdocs-material from 8.2.15 to 8.2.16
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.2.15 to 8.2.16.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.2.15...8.2.16)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-31 10:30:34 +01:00
c5573199db Update Kubernetes from v1.24.0 to v1.24.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1241
2022-05-28 09:39:14 +01:00
0be171cde7 Bump mkdocs-material from 8.2.14 to 8.2.15
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.2.14 to 8.2.15.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.2.14...8.2.15)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-27 10:02:31 +01:00
e3b1e6c52e Bump mkdocs-material from 8.2.13 to 8.2.14
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.2.13 to 8.2.14.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.2.13...8.2.14)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-09 18:48:45 -07:00
b0e0b132e4 Update Kubernetes from v1.23.6 to v1.24.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1240
2022-05-04 08:27:14 -07:00
4fba09e8f8 Bump mkdocs-material from 8.2.11 to 8.2.13
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.2.11 to 8.2.13.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.2.11...8.2.13)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-03 07:42:29 -07:00
02f78fbd1a Update Grafana from v8.4.5 to v8.5.1 2022-05-02 08:19:41 -07:00
a122867748 Update nginx-ingress, Prometheus, and Grafana addons
* Sync addons with versions used in Poseidon
2022-04-27 21:02:32 -07:00
91b38bf3fd Update etcd from v3.5.2 to v3.5.4
* https://github.com/etcd-io/etcd/releases/tag/v3.5.4
2022-04-27 20:57:02 -07:00
9a4887d028 Add bind mounts for selinux to fcos kubelets
fixes #1123

Enables the use of CSI drivers with a StorageClass that lacks an explicit context mount option. In cases where the kubelet lacks mounts for `/etc/selinux` and `/sys/fs/selinux`, it is unable to set the `:Z` option for the CRI volume definition automatically. See [KEP 1710](https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1710-selinux-relabeling/README.md#volume-mounting) for more information on how SELinux is passed to the CRI by Kubelet.

Prior to this change, a not-explicitly-labelled mount would have an `unlabeled_t` SELinux type on the host. Following this change, the Kubelet and CRI work together to dynamically relabel mounts that lack an explicit context specification every time it is rebound to a pod with SELinux type `container_file_t` and appropriate context labels to match the specifics for the pod it is bound to. This enables applications running in containers to consume dynamically provisioned storage on SELinux enforcing systems without explicitly setting the context on the StorageClass or PersistentVolume.
2022-04-26 21:33:26 -07:00
35bca6df90 Bump mkdocs-material from 8.2.9 to 8.2.11
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.2.9 to 8.2.11.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.2.9...8.2.11)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-25 19:02:15 -07:00
d7f55c4e46 Remove use of deprecated key_algorithm field in TLS assets
* Fixes warning about use of deprecated field `key_algorithm` in
the `hashicorp/tls` provider. The key algorithm can now be inferred
directly from the private key so resources don't have to output
and pass around the algorithm
2022-04-20 19:52:03 -07:00
80c6e2e7e6 Update Kubernetes from v1.23.5 to v1.23.6
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1236
2022-04-20 19:39:05 -07:00
fddd8ac69d Fix Flatcar Linux nodes on Google Cloud not ignoring image changes
* Add `boot_disk[0].initialize_params` to the ignored fields for the
controller nodes
* Nodes will auto-update, Terraform should not attempt to delete and
recreate nodes (especially controllers!). Lack of this ignore causes
Terraform to propose deleting controller nodes when Flatcar Linux
releases new images
* Matches the configuration on Typhoon Fedora CoreOS (which does not
have the issue)
2022-04-20 18:53:00 -07:00
2f7d2a92e0 Update Cilium and Calico CNI providers
* Update Cilium from v1.11.3 to v1.11.4
* Update Calico from v3.22.1 to v3.22.2
2022-04-19 08:28:52 -07:00
6cd6bb38de Bump mkdocs-material from 8.2.8 to 8.2.9
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.2.8 to 8.2.9.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.2.8...8.2.9)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-12 07:53:43 -07:00
d91408258b Update nginx-ingress, Prometheus, and Grafana addons 2022-04-04 08:53:29 -07:00
2df1873b7f Update Cilium from v1.11.2 to v1.11.3
* https://github.com/cilium/cilium/releases/tag/v1.11.3
2022-04-01 16:44:30 -07:00
93ebfc7dd0 Allow upgrading Azure Terraform Provider to v3.x
* Change subnet references to source and destinations prefixes
(plural)
* Remove references to a resource group in some load balancing
components, which no longer require it (inferred)
* Rename `worker_address_prefix` output to `worker_address_prefixes`
2022-04-01 16:36:53 -07:00
5365ce8204 Mount /etc/machine-id from host into Kubelet
* Kubelet node's System UUID can be detected from the sysfs
filesystem without a host mount, but if you need to distinguish
between the host's machine-id and SystemUUID
* On cloud platforms, MachineID and SystemUUID are identical,
but on bare-metal the two differ
2022-04-01 16:32:06 -07:00
2ad33cebaf Bump mkdocs-material from 8.2.5 to 8.2.8
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.2.5 to 8.2.8.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.2.5...8.2.8)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-28 10:20:10 -07:00
a26abcf5b1 Bump mkdocs from 1.2.3 to 1.3.0
Bumps [mkdocs](https://github.com/mkdocs/mkdocs) from 1.2.3 to 1.3.0.
- [Release notes](https://github.com/mkdocs/mkdocs/releases)
- [Commits](https://github.com/mkdocs/mkdocs/compare/1.2.3...1.3.0)

---
updated-dependencies:
- dependency-name: mkdocs
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-28 10:07:34 -07:00
b8c4629548 Bump pymdown-extensions from 9.2 to 9.3
Bumps [pymdown-extensions](https://github.com/facelessuser/pymdown-extensions) from 9.2 to 9.3.
- [Release notes](https://github.com/facelessuser/pymdown-extensions/releases)
- [Commits](https://github.com/facelessuser/pymdown-extensions/compare/9.2...9.3)

---
updated-dependencies:
- dependency-name: pymdown-extensions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-21 10:35:37 -07:00
c5814308ab Refresh Terraform providers shown in docs
* Update a few OS component details
2022-03-19 19:30:43 -07:00
b47edca6be Refresh Prometheus rules and Grafana dashboards
* Update Prometheus rules and Grafana dashboards
* Add new networking dashboards
2022-03-19 17:08:00 -07:00
e61d4b92da Update Kubernetes from v1.23.4 to v1.23.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1235
2022-03-16 21:01:41 -07:00
dca745fa4a Update monitoring addon components
* Update Prometheus, kube-state-metrics, and Grafana
2022-03-11 11:50:16 -08:00
661347fa71 Update nginx-ingress from v1.1.1 to v1.1.2
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.1.2
2022-03-11 11:42:33 -08:00
69770b4827 Update Calico from v3.21.2 to v3.22.1
* https://github.com/projectcalico/calico/releases/tag/v3.22.1
* Fix https://github.com/projectcalico/calico/issues/5011
2022-03-11 11:22:29 -08:00
f797f97675 Update Cilium from v1.11.1 to v1.11.2
* https://github.com/cilium/cilium/releases/tag/v1.11.2
2022-03-11 10:08:24 -08:00
9fe0f2fa6c Bump mkdocs-material from 8.2.3 to 8.2.5
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.2.3 to 8.2.5.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.2.3...8.2.5)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-11 09:57:31 -08:00
268648c146 Bump mkdocs-material from 8.2.1 to 8.2.3
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.2.1 to 8.2.3.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.2.1...8.2.3)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-28 09:36:48 -08:00
6cf40722de Revert kube-state-metrics upgrade
* kube-state-metrics:v2.4.0 isn't published, skip it
2022-02-21 19:57:47 -08:00
c230cdec46 Update Grafana and kube-state-metrics addons 2022-02-21 19:36:16 -08:00
cabf5b2c34 Update recommended Terraform provider versions
* Update poseidon/ct version from v0.9.1 to v0.10.0
* Update aws provider to v4.x series
2022-02-21 19:27:54 -08:00
ba8a951863 Bump mkdocs-material from 8.1.11 to 8.2.1
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.1.11 to 8.2.1.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.1.11...8.2.1)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-21 09:53:27 -08:00
9aa99f1996 Allow upgrading AWS Terraform provider to v4.x
* https://github.com/hashicorp/terraform-provider-aws/releases/tag/v4.0.0
2022-02-17 09:35:15 -08:00
fc38ba45b1 Update Kubernetes from v1.23.3 to v1.23.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1234
2022-02-17 09:00:31 -08:00
28a42238c4 Update nginx-ingress, Prometheus, and Grafana addons
* Align `nginx-ingress` `--controller-class` with `IngressClass`
to provide a better example (e.g. if extended to multiple ingress
controllers)
2022-02-17 08:58:29 -08:00
de9b30a587 Bump mkdocs-material from 8.1.10 to 8.1.11
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.1.10 to 8.1.11.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.1.10...8.1.11)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-14 11:11:06 -08:00
affb40d59b Bump pymdown-extensions from 9.1 to 9.2
Bumps [pymdown-extensions](https://github.com/facelessuser/pymdown-extensions) from 9.1 to 9.2.
- [Release notes](https://github.com/facelessuser/pymdown-extensions/releases)
- [Commits](https://github.com/facelessuser/pymdown-extensions/compare/9.1...9.2)

---
updated-dependencies:
- dependency-name: pymdown-extensions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-14 11:10:56 -08:00
15ac49b34d Bump mkdocs-material from 8.1.9 to 8.1.10
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.1.9 to 8.1.10.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.1.9...8.1.10)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-07 09:49:24 -08:00
6c70d06937 Update etcd from v3.5.1 to v3.5.2
* https://github.com/etcd-io/etcd/releases/tag/v3.5.2
2022-02-07 08:10:17 -08:00
cf4beeba34 Change default CNI provider from Calico to Cilium
* Cilium (v1.8) was added to Typhoon in v1.18.5 in June 2020
and its become more impressive since then. Its currently the
leading CNI provider choice.
* Calico has grown complex, has lots of CRDs, masks its
management complexity with an operator (which we won't use),
doesn't provide multi-arch images, and hasn't been compatible
with Kubernetes v1.23 (with ipvs) for several releases.
* Both have CNCF conformance quirks (flannel used for conformance),
but that's not the main factor in choosing the default
2022-02-07 08:07:00 -08:00
10b4ba14b6 Bump mkdocs-material from 8.1.8 to 8.1.9
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.1.8 to 8.1.9.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.1.8...8.1.9)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-01 10:26:39 -08:00
e06ee042ee Switch to using Flatcar Linux images on Google Cloud
* Use the official Kinvolk Flatcar Linux image on Google Cloud
* Change `os_image` from a custom image name to `flatcar-stable`
(default), `flatcar-beta`, or `flatcar-alpha` (**action required**)
* Change `os_image` from a required to an optional variable
* Promote Typhoon on Flatcar Linux / Google Cloud to stable
* Remove docs about needing to upload a Flatcar Linux image
manually on Google Cloud and drop support for custom images
2022-01-28 21:04:10 -08:00
a527f73f5a Update Kubernetes from v1.23.2 to v1.23.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1233
2022-01-27 09:23:37 -08:00
c21a0479c0 Bump mkdocs-material from 8.1.7 to 8.1.8
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.1.7 to 8.1.8.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.1.7...8.1.8)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-27 09:02:30 -08:00
f614c538cf Update Terraform provider recommendations in docs 2022-01-19 21:16:37 -08:00
3da8c1575c Update nginx-ingress and Grafana addons 2022-01-19 21:09:21 -08:00
dedd17d085 Upgrade to DigitalOcean Terraform provider v2.x
* Remove deprecated `private_networking` parameter
2022-01-19 18:32:17 -08:00
e274a451ff Update Kubernetes from v1.23.1 to v1.23.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1232
2022-01-19 17:59:49 -08:00
b2e36947ab Bump mkdocs-material from 8.1.5 to 8.1.7
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.1.5 to 8.1.7.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.1.5...8.1.7)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-19 16:42:21 -08:00
5af0a5c5b9 Add Flatcar Linux ARM64 examples
* Fix content tabs format for switching between example
code blocks
2022-01-14 12:52:45 -08:00
2265ab5375 Remove Kubelet --network-plugin=cni flag
* Now that `docker-shim` is no longer used, the Kubelet flag
is no longer needed and will be removed in v1.24
2022-01-14 10:43:07 -08:00
08ea9776f3 Mask docker.service to prevent socket activation
* Kubelet now uses `containerd` as the container runtime, but
`docker.service` still starts when `docker.sock` is probed bc
the service is socket activated. Prevent this by masking the
`docker.service` unit
2022-01-14 10:31:47 -08:00
2e8bc99164 Remove template provider usage from terraform-render-bootstrap 2022-01-14 10:27:24 -08:00
b18b0a9f3d Remove unused ETCD_UNSUPPORTED_ARCH variable
* etcd used to require a special variable to use the arm64
container image, but this is no longer required
2022-01-14 10:25:45 -08:00
beb9f1477a Add experimental Flatcar Linux arm64 support on AWS
* Add `arch` variable to Flatcar Linux AWS `kubernetes` and
`workers` modules. Accept `amd64` (default) or `arm64` to support
native arm64/aarch64 clusters or mixed/hybrid clusters with arm64
workers
* Requires `flannel` or `cilium` CNI

Similar to https://github.com/poseidon/typhoon/pull/875
2022-01-14 10:24:48 -08:00
f544a9c71f Switch Fedora CoreOS from docker-shim to containerd
* Migrate from `docker-shim` to `containerd` in preparation
for Kubernetes v1.24.0 dropping `docker-shim` support
* Much consideration was given to the container runtime
choice. https://github.com/poseidon/typhoon/issues/899
provides relevant rationales
2022-01-13 09:17:29 -08:00
415b7fa19a Bump pygments from 2.11.1 to 2.11.2
Bumps [pygments](https://github.com/pygments/pygments) from 2.11.1 to 2.11.2.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](https://github.com/pygments/pygments/compare/2.11.1...2.11.2)

---
updated-dependencies:
- dependency-name: pygments
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-13 09:03:25 -08:00
d0c29099ba Bump mkdocs-material from 8.1.4 to 8.1.5
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.1.4 to 8.1.5.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.1.4...8.1.5)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-11 20:42:31 -08:00
30e4070474 Bump mkdocs-material from 8.1.3 to 8.1.4
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.1.3 to 8.1.4.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.1.3...8.1.4)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-03 10:53:23 -08:00
43f6a19060 Bump pygments from 2.10.0 to 2.11.1
Bumps [pygments](https://github.com/pygments/pygments) from 2.10.0 to 2.11.1.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](https://github.com/pygments/pygments/compare/2.10.0...2.11.1)

---
updated-dependencies:
- dependency-name: pygments
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-03 10:48:25 -08:00
50215e373b Add Prometheus config for monitoring Kubernetes Ingress
* Allow Kubernetes Ingress resources to be probed via Blackbox
Exporter (if present) if annotated `prometheus.io/probe: "true"`
* Fix probes of Services via Blackbox Exporter. Require Blackbox
Exporter to be deployed in the same `monitoring` namespace, be
named `blackbox-exporter`, and use port 8080
2021-12-29 11:57:50 -08:00
a9f9c59b91 Configure Prometheus to allow a custom scrape query param
* Set `prometheus.io/param` on a Kubernetes Service to scrape
the service endpoints and pass a custom query parameter
* For example, scrape Consul with `?format=prometheus`

```yaml
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '8500'
    prometheus.io/path: /v1/agent/metrics
    prometheus.io/param: format=prometheus
```
2021-12-29 11:47:10 -08:00
6ed048eb65 Workaround Terraform v1.1 file provisioner regression
* Terraform v1.1 changed the behavior of provisioners and
`remote-exec` in a way that breaks support for expansions
in commands (including file provisioner, where `destination`
is part of an `scp` command)
* Terraform will likely revert the change eventually, but I
suspect it will take a while
* Instead, we can stop relying on Terraform's expansion
behavior. `/home/core` is a suitable choice for `$HOME` on
both Flatcar Linux and Fedora CoreOS (harldink `/var/home/core`)

Rel: https://github.com/hashicorp/terraform/issues/30243
2021-12-28 13:25:23 -08:00
ce7b2fa21f Bump mkdocs-material from 8.1.1 to 8.1.3
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.1.1 to 8.1.3.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.1.1...8.1.3)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-12-23 14:33:26 -08:00
9e3807798f Update Kubernetes from v1.23.0 to v1.23.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1231
2021-12-20 08:36:19 -08:00
ef9c6aa423 Switch Flatcar Linux to using containerd CRI
* Use containerd as the Kubernetes Container Runtime
2021-12-15 08:42:13 -08:00
bb5e5811ec Update Prometheus and Grafana addons 2021-12-15 08:16:46 -08:00
16aa997604 Fix Azure backend_address_pool_id deprecation warning
* Change to `backend_address_pool_ids` list
2021-12-14 10:26:08 -08:00
fb6650b06b Bump mkdocs-material from 8.0.4 to 8.1.1
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.0.4 to 8.1.1.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.0.4...8.1.1)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-12-13 17:44:02 -08:00
43c6558aaf Update nginx-ingress and monitoring addons 2021-12-10 11:29:49 -08:00
125008fbb3 Update Cilium from v1.10.5 to v1.11.0
* https://github.com/cilium/cilium/releases/tag/v1.11.0
2021-12-10 11:26:05 -08:00
136107b448 Set Kubelet resolver config to /run/systemd/resolve/resolv.conf
* Both Flatcar Linux and Fedora CoreOS use systemd-resolved,
but they setup /etc/resolv.conf symlinks differently
* Prefer using /run/systemd/resolve/resolv.conf directly, which
also updates to reflect runtime changes (e.g. resolvectl)
2021-12-10 08:22:30 -08:00
e97c1cc9e5 Enable Kubernetes aggregation by default
* Change `enable_aggregation` default from false to true
* These days, Kubernetes control plane components emit annoying
messages related to assumptions baked into the Kubernetes API
Aggregation Layer if you don't enable it. Further the conformance
tests force you to remember to enable it if you care about passing
those
* This change is motivated by eliminating annoyances, rather than
any enthusiasm for Kubernetes' aggregation features

Rel: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/
2021-12-09 17:30:35 -08:00
39da5b53f5 Update operating system notes in architecture docs 2021-12-09 17:21:24 -08:00
41f739891b Normalize CA certs mounts in static Pods and kube-proxy
* Mount both /etc/ssl/certs and /etc/pki into control plane static
pods and kube-proxy, rather than choosing one based a variable
(set based on Flatcar Linux or Fedora CoreOS)
* Remove deprecated `--port` from `kube-scheduler` static Pod
2021-12-09 09:56:37 -08:00
861021ee98 Update Kubernetes from v1.22.4 to v1.23.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1230
* With Calico, add missing caliconodestatuses CRD added in v3.21.0
https://github.com/poseidon/terraform-render-bootstrap/pull/289
2021-12-09 09:28:41 -08:00
9d583ab377 Fix null provider version constraint on Google Cloud
* Part of https://github.com/poseidon/typhoon/pull/1074
2021-12-08 14:06:38 -08:00
c1d28e6f61 Change default disk_iops on Flatcar Linux
* Same as #1073, but for Flatcar Linux on AWS as well
2021-12-07 16:52:55 -08:00
a8fd21d250 Update minimum Terraform provider versions
* Update `null` provider to allow use of v3.1.x releases,
instead of being stuck on v2.1.2
* Update min versions in terraform-render-boostrap
https://github.com/poseidon/terraform-render-bootstrap/pull/287
* Document the recommended versions of Terraform cloud providers
2021-12-07 16:26:34 -08:00
9c626c9dbd Change default disk_iops from unset to 3000
* Since v1.21.3 switched controllers default disk type from
`gp2` to `gp3`, an iops diff has been shown (harmless, but
annoying)
* Controller nodes default to a 30GB `gp3` disk. `gp3` disks
do respect `iops` and the corresponding default is 3000
2021-12-07 15:44:09 -08:00
85252dec6e Switch FCOS workers to official Fedora CoreOS AMIs
* Fix worker nodes to use official Fedora CoreOS AMIs,
instead of the older Poseidon built AMIs (now removed).
This should have been part of #1038, but was missed in
code review
* Poseidon build AMIs have been deleted (so I don't have
to keep paying to host them for people)
2021-12-07 15:31:47 -08:00
298ea65d3e Bump mkdocs-material from 8.0.3 to 8.0.4
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 8.0.3 to 8.0.4.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/8.0.3...8.0.4)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-12-07 15:29:00 -08:00
c0ab15ba22 Bump mkdocs-material from 7.3.6 to 8.0.3
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.3.6 to 8.0.3.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Upgrade guide](https://github.com/squidfunk/mkdocs-material/blob/master/docs/upgrade.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.3.6...8.0.3)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-12-02 15:25:40 -08:00
5d7b6f611e Update nginx-ingess and Prometheus exporter addons 2021-11-21 09:28:17 -08:00
93594292eb Update Kubernetes from v1.22.3 to v1.22.4
* Update flannel from v0.15.0 to v0.15.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1224
2021-11-17 19:53:32 -08:00
0546608e77 Bump pymdown-extensions from 9.0 to 9.1
Bumps [pymdown-extensions](https://github.com/facelessuser/pymdown-extensions) from 9.0 to 9.1.
- [Release notes](https://github.com/facelessuser/pymdown-extensions/releases)
- [Commits](https://github.com/facelessuser/pymdown-extensions/compare/9.0...9.1)

---
updated-dependencies:
- dependency-name: pymdown-extensions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-17 18:41:22 -08:00
94b2793e40 Update CoreDNS from v1.8.4 to v1.8.6
* https://coredns.io/2021/10/07/coredns-1.8.6-release/
2021-11-12 21:09:04 -08:00
4fd43b39ad Fix Flatcar Linux docker driver and add cgroups v2
* Remove `/sys/fs/cgroup/systemd` mount since Flatcar Linux
uses cgroups v2
* Flatcar Linux's `docker` switched from the `cgroupfs` to
`systemd` driver without notice
2021-11-12 21:07:20 -08:00
65083aca7d Update Calico and Flannel CNI providers
* Update Calico from v3.20.2 to v3.21.0
* Update Flannel from v0.14.0 to v0.15.0
2021-11-12 11:03:39 -08:00
07db4c1143 Allow use of google Terraform provider v4.0+
* https://github.com/hashicorp/terraform-provider-google/releases/tag/v4.0.0
2021-11-11 10:17:58 -08:00
e5d0ce5fd7 Bump mkdocs-material from 7.3.4 to 7.3.6
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.3.4 to 7.3.6.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.3.4...7.3.6)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-07 17:01:41 -08:00
b934a13605 Update Prometheus and Grafana addons 2021-11-07 17:00:40 -08:00
cd005a0b27 Prepare for v1.22.3 release 2021-10-28 11:58:55 -07:00
dd4a5a4e7e Update Kubernetes from v1.22.2 to v1.22.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1223
2021-10-28 10:11:06 -07:00
af835f976f Update flannel from v0.13.0 to v0.14.0
* https://github.com/flannel-io/flannel/releases/tag/v0.14.0
2021-10-28 10:09:06 -07:00
9e4a369f76 Bump mkdocs-material from 7.3.3 to 7.3.4
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.3.3 to 7.3.4.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.3.3...7.3.4)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-23 10:45:49 -07:00
831d897533 Bump mkdocs from 1.2.2 to 1.2.3
Bumps [mkdocs](https://github.com/mkdocs/mkdocs) from 1.2.2 to 1.2.3.
- [Release notes](https://github.com/mkdocs/mkdocs/releases)
- [Commits](https://github.com/mkdocs/mkdocs/compare/1.2.2...1.2.3)

---
updated-dependencies:
- dependency-name: mkdocs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-23 10:31:55 -07:00
17dce49982 Update etcd from v3.5.0 to v3.5.1
* https://github.com/etcd-io/etcd/releases/tag/v3.5.1
2021-10-17 11:28:27 -07:00
5744e10329 Update Cilium from v1.0.4 to v1.0.5
* https://github.com/cilium/cilium/releases/tag/v1.10.5
2021-10-17 11:26:59 -07:00
20748536df Update nginx-ingress from v1.0.2 to v1.0.4
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.4
2021-10-17 11:17:43 -07:00
f2e6256dd9 Update Prometheus, kube-state-metrics, and Grafana
* Update monitoring addons
2021-10-17 11:15:39 -07:00
443bd5a26b Add file to hold nodes on iptables-legacy
* Add `/etc/fedora-coreos/iptables-legacy.stamp` to declare
that `iptables-legacy` should be used instead of `iptables-nft`
(until support is added in future releases)
* https://github.com/coreos/fedora-coreos-tracker/issues/676
2021-10-11 20:30:49 -07:00
f8162b9be3 Update Calico from v3.20.1 to v3.20.2
* Use Calico's iptables legacy vs nft auto-detection
2021-10-11 20:28:48 -07:00
20ffbba4bf Bump mkdocs-material from 7.3.1 to 7.3.3
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.3.1 to 7.3.3.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.3.1...7.3.3)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-11 19:31:10 -07:00
15117fb95b Update Prometheus and nginx-ingress 2021-10-05 19:15:58 -07:00
10af8b4120 Bump mkdocs-material from 7.3.0 to 7.3.1
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.3.0 to 7.3.1.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.3.0...7.3.1)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-04 20:39:01 -07:00
e51b2903c1 Bump pymdown-extensions from 8.2 to 9.0
Bumps [pymdown-extensions](https://github.com/facelessuser/pymdown-extensions) from 8.2 to 9.0.
- [Release notes](https://github.com/facelessuser/pymdown-extensions/releases)
- [Commits](https://github.com/facelessuser/pymdown-extensions/compare/8.2...9.0)

---
updated-dependencies:
- dependency-name: pymdown-extensions
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-04 20:38:46 -07:00
cb72b261c7 Update Terraform provider poseidon/matchbox to v0.5+
* Relax version constraint to allow future minor version
releases to be used without a corresponding Typhoon change
2021-09-29 23:41:44 -07:00
209efd2f5b Update Prometheus, Grafana, and kube-state-metrics 2021-09-29 23:39:10 -07:00
388b1238bc Bump mkdocs-material from 7.2.8 to 7.3.0
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.2.8 to 7.3.0.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.2.8...7.3.0)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-27 20:40:44 -07:00
5a1e455220 Update nginx-ingress from v1.0.0 to v1.0.1 2021-09-24 09:38:18 -07:00
69f37c8b17 Update Prometheus from v2.29.2 to v2.30.0 2021-09-24 09:34:00 -07:00
b30de949b8 Update Calico and Cilium CNI
* Update Calico from v3.20.0 to v3.20.1
* Update Cilium from v1.10.3 to v1.10.4
2021-09-22 22:18:16 -07:00
4973178750 Bump mkdocs-material from 7.2.6 to 7.2.8
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.2.6 to 7.2.8.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.2.6...7.2.8)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-21 08:59:11 -07:00
bb7f31822e Update Kubernetes from v1.22.1 to v1.22.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1222
2021-09-15 19:56:24 -07:00
c6923b9ef3 Switch Fedora CoreOS to new ARM64 AMIs (#1038)
* Fedora CoreOS now publishes ARM64 AMIs
2021-09-12 11:49:13 -07:00
dae79d5916 Remove mention of freenode IRC
See #995
2021-09-12 10:10:49 -07:00
f4d5ac0ca7 Bump mkdocs-material from 7.2.5 to 7.2.6
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.2.5 to 7.2.6.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.2.5...7.2.6)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-11 12:49:06 -07:00
7e1b2cdba1 Discontinue Docker automated build publishing
* Poseidon infra publishes official multi-arch container
images for Kubelet to both Quay and Dockerhub (fallback).
There is no change here
* Automated builds by Quay and Dockerhub added separately
tagged images for those not able to trust our images and
preferring to trust Quay/Dockerhub. Going forward, we're
ending the use of Dockerhub automated builds. Docker has
moved automated builds to paid plans, even for open source
projects (we're not petitioning for a special exemption
given these are our unofficial images). Those still needing
Kubelet images built externally (i.e. not Poseidon Labs)
would still be able to use the Quay images tagged `build-SHA`
2021-09-01 11:52:57 -07:00
3bb20ce083 Bump mkdocs-material from 7.2.4 to 7.2.5
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.2.4 to 7.2.5.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.2.4...7.2.5)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-31 17:34:24 -07:00
eb29fb639b Update nginx-ingress, Prometheus, and Grafana addons 2021-08-24 22:14:57 -07:00
fcbdb50d93 Update Kubernetes from v1.22.0 to v1.22.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1221
2021-08-19 21:12:02 -07:00
efac611e9c Bump mkdocs-material from 7.2.2 to 7.2.4
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.2.2 to 7.2.4.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.2.2...7.2.4)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-16 11:51:08 -07:00
87ff431b80 Bump pygments from 2.9.0 to 2.10.0
Bumps [pygments](https://github.com/pygments/pygments) from 2.9.0 to 2.10.0.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](https://github.com/pygments/pygments/compare/2.9.0...2.10.0)

---
updated-dependencies:
- dependency-name: pygments
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-16 11:40:09 -07:00
0d8ceae1d9 Add etcd v3.5.0 note to CHANGES 2021-08-11 09:24:43 -07:00
c5cf803634 Update Grafana and kube-state-metrics addons 2021-08-10 22:17:16 -07:00
61ee01f462 Show SSH keys with ssh-ed25519 instead of sha-rsa in docs
* For Fedora CoreOS, users should not be using sha-rsa public
keys anymore, so make sure the docs examples reflect this
* https://github.com/poseidon/typhoon/issues/915
2021-08-10 21:48:18 -07:00
cbef202eec Update Prometheus discovery of kube components
* Kubernetes v1.22.0 disabled kube-controller-manager insecure
port, which was used internally for Prometheus metrics scraping
* Configure Prometheus to discover and scrape endpoints for
kube-scheduler and kube-controller-manager via the authenticated
https ports, via bearer token
* Change firewall ports to allow Prometheus (on worker nodes)
to scrape kube-scheduler and kube-controller-manager targets
that run on controller(s) with hostNetwork
* Disable the insecure port on kube-scheduler
2021-08-10 21:25:19 -07:00
0c99b909a9 Update nginx-ingress from v0.47.0 to v1.0.0-beta.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.0-beta.1
2021-08-07 12:46:00 -07:00
739db3b35f Update Grafana and node-exporter addons
* https://github.com/grafana/grafana/releases/tag/v8.1.0
* https://github.com/prometheus/node_exporter/releases/tag/v1.2.1
2021-08-05 23:24:57 -07:00
c68b035a63 Update Flatcar Linux and Fedora CoreOS notes 2021-08-05 23:22:45 -07:00
1a5949824c Update etcd from v3.4.16 to v3.5.0
* Use multi-arch container image instead of a special
"-arm64" suffix on arm64
* https://github.com/etcd-io/etcd/releases/tag/v3.5.0
2021-08-04 22:10:07 -07:00
9bac641511 Update Kubernetes from v1.21.3 to v1.22.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1220
2021-08-04 22:09:19 -07:00
37ff3c28eb Bump mkdocs-material from 7.1.11 to 7.2.2
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.11 to 7.2.2.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.11...7.2.2)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-02 19:22:58 -07:00
f03045f0dc Update Cilium for cgroups v2 support
* On Fedora CoreOS, Cilium cross-node service IP load balancing
stopped working for a time (first observable as CoreDNS pods
located on worker nodes not being able to reach the kubernetes
API service 10.3.0.1). This turned out to have two parts:
* Fedora CoreOS switched to cgroups v2 by default. In our early
testing with cgroups v2, Calico (default) was used. With the
cgroups v2 change, SELinux policy denied some eBPF operations.
Since fixed in all Fedora CoreOS channels
* Cilium requires new mounts to support cgroups v2, which are
added here

* https://github.com/coreos/fedora-coreos-tracker/issues/292
* https://github.com/coreos/fedora-coreos-tracker/issues/881
* https://github.com/cilium/cilium/pull/16259
2021-07-24 10:36:47 -07:00
b603bbde3d Update Butane Config from v1.2.0 to v1.4.0
* Rename Fedora CoreOS Config (FCC) to Butane Config
* Require any snippets customizations use version v1.4.0

* https://typhoon.psdn.io/advanced/customization/#hosts
2021-07-19 23:53:51 -07:00
810236f6df Bump mkdocs-material from 7.1.10 to 7.1.11
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.10 to 7.1.11.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.10...7.1.11)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-07-19 10:38:59 -07:00
3c3d3a2473 Bump mkdocs from 1.2.1 to 1.2.2
Bumps [mkdocs](https://github.com/mkdocs/mkdocs) from 1.2.1 to 1.2.2.
- [Release notes](https://github.com/mkdocs/mkdocs/releases)
- [Commits](https://github.com/mkdocs/mkdocs/compare/1.2.1...1.2.2)

---
updated-dependencies:
- dependency-name: mkdocs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-07-19 10:06:13 -07:00
1af9fd8094 Remove outdated Terraform migration docs
* Terraform v0.12.x and v0.13.x are now quite outdated,
remove the migration docs
2021-07-19 08:36:59 -07:00
c734fa7b84 Update node-exporter from v1.1.2 to v1.2.0
* https://github.com/prometheus/node_exporter/releases/tag/v1.2.0
2021-07-18 15:26:44 -07:00
fdade5b40c Update poseidon/ct provider from v0.8.0 to v0.9.0
* Continue targeting Ignition v3.2.0 for some time
2021-07-18 09:05:02 -07:00
171fd2c998 Update Kubernetes from v1.21.2 to v1.21.3
* https://github.com/kubernetes/kubernetes/releases/tag/v1.21.3
2021-07-17 18:22:24 -07:00
545bd79624 Update Grafana from v8.0.4 to v8.0.6
* https://github.com/grafana/grafana/releases/tag/v8.0.6
2021-07-16 12:02:36 -07:00
12b825c78f Bump mkdocs-material from 7.1.9 to 7.1.10
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.9 to 7.1.10.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.9...7.1.10)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-07-12 19:10:52 -07:00
66e7354c8a Change AWS default disk type from gp2 to gp3
* https://aws.amazon.com/about-aws/whats-new/2020/12/introducing-new-amazon-ebs-general-purpose-volumes-gp3/
2021-07-04 10:43:05 -07:00
3a71b2ccb1 Update Cilium from v1.10.1 to v1.10.2
* https://github.com/cilium/cilium/releases/tag/v1.10.2
2021-07-04 10:11:21 -07:00
c7e327417b Update Prometheus and Grafana addons 2021-07-04 10:02:44 -07:00
e313e733ab Bump mkdocs-material from 7.1.8 to 7.1.9
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.8 to 7.1.9.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.8...7.1.9)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-06-29 22:23:36 -07:00
d0e73b8174 Bump terraform-render-bootstrap 2021-06-27 18:11:43 -07:00
65ddd2419c Add Known Issues with FCOS to CHANGES 2021-06-27 16:51:59 -07:00
b0e9b1fa60 Update Prometheus and Grafana addons
* https://github.com/prometheus/prometheus/releases/tag/v2.28.0
* https://github.com/grafana/grafana/releases/tag/v8.0.3
2021-06-27 14:46:43 -07:00
485feb82c4 Update CoreDNS from v1.8.0 to v1.8.4
* https://coredns.io/2021/01/20/coredns-1.8.1-release/
* https://coredns.io/2021/02/23/coredns-1.8.2-release/
* https://coredns.io/2021/02/24/coredns-1.8.3-release/
* https://coredns.io/2021/05/28/coredns-1.8.4-release/
2021-06-23 23:31:25 -07:00
0b276b6b7e Update Kubernetes from v1.21.1 to v1.21.2
* https://github.com/kubernetes/kubernetes/releases/tag/v1.21.2
2021-06-17 16:15:20 -07:00
e8513e58bb Add support for Terraform v1.0.0
* https://github.com/hashicorp/terraform/releases/tag/v1.0.0
2021-06-17 13:32:56 -07:00
d77343be3a Workaround systemd 248 path units not working reliably
* On FCOS 34 / systemd 248, `kubelet.path` won't activate (stuck
waiting) when `/etc/kubernetes/kubeconfig` exists, even with
manual prodding of the file. The root cause isn't known, but
a workaround is to delay `/etc/kubernetes` directory creation
or to touch the directory later
* Fix DigitalOcean worker node kubelet.service being enabled
immediately. On bare-metal and DigitalOcean, the kubeconfig
should activate the Kubelet, so it doesn't crashloop needlessly
(nice to have, not required)
2021-06-16 10:19:39 -07:00
f2b01e1d75 Bump mkdocs-material from 7.1.7 to 7.1.8
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.7 to 7.1.8.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.7...7.1.8)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-06-14 15:06:18 -07:00
60c2107d7f Bump mkdocs from 1.1.2 to 1.2.1
Bumps [mkdocs](https://github.com/mkdocs/mkdocs) from 1.1.2 to 1.2.1.
- [Release notes](https://github.com/mkdocs/mkdocs/releases)
- [Commits](https://github.com/mkdocs/mkdocs/compare/1.1.2...1.2.1)

---
updated-dependencies:
- dependency-name: mkdocs
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-06-14 15:01:52 -07:00
30cfeec6c1 Update nginx-ingress from v0.46.0 to v0.47.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.47.0
2021-06-07 10:11:07 -07:00
ba8774ee0d Bump mkdocs-material from 7.1.6 to 7.1.7
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.6 to 7.1.7.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.6...7.1.7)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-06-07 09:43:23 -07:00
24e63bd134 Update Prometheus, Grafana, kube-state-metrics addons 2021-06-07 09:40:06 -07:00
996bdd9112 Update Calico from v3.19.0 to v3.19.1
* https://docs.projectcalico.org/archive/v3.19/release-notes/
2021-06-02 14:51:15 -07:00
a34d78f55d Bump mkdocs-material from 7.1.5 to 7.1.6
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.5 to 7.1.6.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.5...7.1.6)

Signed-off-by: dependabot[bot] <support@github.com>
2021-05-31 14:39:01 -07:00
04b2e149ba Remove freenode IRC from help section
* Due to the takeover of freenode.net IRC, the channel
there should no longer be used
2021-05-26 11:31:25 -07:00
9f0126a410 Fix typo in CHANGES.md 2021-05-25 21:16:53 -07:00
a1bab9c96e Bump mkdocs-material from 7.1.4 to 7.1.5
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.4 to 7.1.5.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.4...7.1.5)

Signed-off-by: dependabot[bot] <support@github.com>
2021-05-24 11:39:13 -07:00
966fd280b0 Update Cilium from v0.10.0-rc1 to v0.10.0
* https://github.com/cilium/cilium/releases/tag/v1.10.0
2021-05-24 11:16:51 -07:00
e4e074c894 Update Cilium from v1.9.6 to v1.10.0-rc1
* Add multi-arch container images and arm64 support
* https://github.com/cilium/cilium/releases/tag/v1.10.0-rc1
2021-05-14 14:24:52 -07:00
d51da49925 Update docs for Kubernetes v1.21.1 and Terraform v0.15.x 2021-05-13 11:34:01 -07:00
2076a779a3 Update Kubernetes from v1.21.0 to v1.21.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1211
2021-05-13 11:23:26 -07:00
048094b256 Update etcd from v3.4.15 to v3.4.16
* https://github.com/etcd-io/etcd/blob/main/CHANGELOG-3.4.md
2021-05-13 10:53:04 -07:00
75b063c586 Update Prometheus from v2.25.2 to v2.27.0
* Update Grafana from v7.5.4 to v7.5.6
* https://github.com/prometheus/prometheus/releases/tag/v2.27.0
* https://github.com/grafana/grafana/releases/tag/v7.5.6
2021-05-12 11:47:07 -07:00
1620d1e456 Bump mkdocs-material from 7.1.3 to 7.1.4
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.3 to 7.1.4.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.3...7.1.4)

Signed-off-by: dependabot[bot] <support@github.com>
2021-05-10 14:53:17 -07:00
939bffbf98 Bump pymdown-extensions from 8.1.1 to 8.2
Bumps [pymdown-extensions](https://github.com/facelessuser/pymdown-extensions) from 8.1.1 to 8.2.
- [Release notes](https://github.com/facelessuser/pymdown-extensions/releases)
- [Commits](https://github.com/facelessuser/pymdown-extensions/compare/8.1.1...8.2)

Signed-off-by: dependabot[bot] <support@github.com>
2021-05-10 14:52:58 -07:00
bc96443710 Update nginx-ingress from v0.45.0 to v0.46.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.46.0
2021-05-05 12:06:20 -07:00
82a7422b3d Change Dependabot pip watcher to check weekly 2021-05-05 11:34:57 -07:00
132ab395a5 Bump pygments from 2.8.1 to 2.9.0
Bumps [pygments](https://github.com/pygments/pygments) from 2.8.1 to 2.9.0.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](https://github.com/pygments/pygments/compare/2.8.1...2.9.0)

Signed-off-by: dependabot[bot] <support@github.com>
2021-05-05 11:32:02 -07:00
5f87eb3ec9 Update Fedora CoreOS Kubelet for cgroups v2
* Fedora CoreOS is beginning to switch from cgroups v1 to
cgroups v2 by default, which changes the sysfs hierarchy
* This will be needed when using a Fedora Coreos OS image
that enables cgroups v2 (`next` stream as of this writing)

Rel: https://github.com/coreos/fedora-coreos-tracker/issues/292
2021-04-26 11:48:58 -07:00
b152b9f973 Reduce the default disk_size from 40GB to 30GB
* We're typically reducing the `disk_size` in real clusters
since the space is under used. The default should be lower.
2021-04-26 11:43:26 -07:00
9c842395a8 Update Cilium from v1.9.5 to v1.9.6
* https://github.com/cilium/cilium/releases/tag/v1.9.6
2021-04-26 10:55:23 -07:00
6cb9c0341b Bump mkdocs-material from 7.1.2 to 7.1.3
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.2 to 7.1.3.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.2...7.1.3)

Signed-off-by: dependabot[bot] <support@github.com>
2021-04-26 10:35:00 -07:00
d4fd6d4adb Bump mkdocs-material from 7.1.1 to 7.1.2
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.1 to 7.1.2.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.1...7.1.2)

Signed-off-by: dependabot[bot] <support@github.com>
2021-04-23 14:26:27 -07:00
3664dfafc2 Update docs with video meetings and referral links
* Use our DigitalOcean referral code for new DigitalOcean
users. This gives new accounts free cloud credits and
provides a smaller cloud credit back to the project
* Link to the new video meeting via one-time Github Sponsor
feature that we're trying out
* List Fedora CoreOS ARM64 as a supported platform (alpha).
Before this was only mentioned in docs and on the blog.
2021-04-17 19:15:51 -07:00
e535ddd15a Update Grafana from v7.5.3 to v7.5.4
* https://github.com/grafana/grafana/releases/tag/v7.5.4
2021-04-17 11:38:14 -07:00
5752a8f041 Update kube-state-metrics from v2.0.0-rc.1 to v2.0.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0
2021-04-17 11:34:52 -07:00
68abbf7b0d Fix docs link on index page (#975)
* Fix Fedora CoreOS Google Cloud tutorial link
2021-04-17 10:52:59 -07:00
67047ead08 Update Terraform version to allow v0.15.0
* Require Terraform version v0.13 <= x < v0.16
2021-04-16 09:46:01 -07:00
c11e23fc50 Fix minor docs issues and missing changelog links 2021-04-13 09:35:11 -07:00
b647ad8806 Bump mkdocs-material from 7.1.0 to 7.1.1
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.0 to 7.1.1.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.0...7.1.1)

Signed-off-by: dependabot[bot] <support@github.com>
2021-04-12 20:29:01 -07:00
2eb1ac1b4d Update nginx-ingress from v0.44.0 to v0.45.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.45.0
2021-04-12 00:18:47 -07:00
cb2721ef7d Update Grafana from v7.5.2 to v7.5.3
* https://github.com/grafana/grafana/releases/tag/v7.5.3
2021-04-12 00:17:22 -07:00
fc06d28e13 Remove deprecated field on azurerm_lb_backend_address_pool
* Remove the deprecated `resource_group_name` field from Azure
`azurerm_lb_backend_address_pool` resources
2021-04-11 23:59:17 -07:00
a9078cb52b Add sponsorship badge to Github repo 2021-04-11 16:00:16 -07:00
ebd9570ede Update Fedora CoreOS Config version from v1.1.0 to v1.2.0
* Require [poseidon/ct](https://github.com/poseidon/terraform-provider-ct)
Terraform provider v0.8+
* Require any [snippets](https://typhoon.psdn.io/advanced/customization/#hosts)
customizations to update to v1.2.0

See upgrade [notes](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct)
2021-04-11 15:26:54 -07:00
34e8db7aae Update static Pod manifests for Kubernetes v1.21.0
* https://github.com/poseidon/terraform-render-bootstrap/pull/257
2021-04-11 15:05:46 -07:00
084e8bea49 Allow custom initial node taints on worker pool nodes
* Add `node_taints` variable to worker modules to set custom
initial node taints on cloud platforms that support auto-scaling
worker pools of heterogeneous nodes (i.e. AWS, Azure, GCP)
* Worker pools could use custom `node_labels` to allowed workloads
to select among differentiated nodes, while custom `node_taints`
allows a worker pool's nodes to be tainted as special to prevent
scheduling, except by workloads that explicitly tolerate the
taint
* Expose `daemonset_tolerations` in AWS, Azure, and GCP kubernetes
cluster modules, to determine whether `kube-system` components
should tolerate the custom taint (advanced use covered in docs)

Rel: #550, #663
Closes #429
2021-04-11 15:00:11 -07:00
d73621c838 Update Kubernetes from v1.20.5 to v1.21.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1210
2021-04-08 21:44:31 -07:00
1a6481df04 Update Grafana from v7.5.1 to v7.5.2
* https://github.com/grafana/grafana/releases/tag/v7.5.2
2021-04-04 18:20:02 -07:00
798ec9a92f Change CNI config directory to /etc/cni/net.d
* Change CNI config directory from `/etc/kubernetes/cni/net.d`
to `/etc/cni/net.d` (Kubelet default)
* https://github.com/poseidon/terraform-render-bootstrap/pull/255
2021-04-02 00:03:48 -07:00
96aed4c3c3 Bump mkdocs-material from 7.0.6 to 7.1.0
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.0.6 to 7.1.0.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.0.6...7.1.0)

Signed-off-by: dependabot[bot] <support@github.com>
2021-04-02 00:01:44 -07:00
7372d33af8 Update kube-state-metrics and Grafana
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.1
* https://github.com/grafana/grafana/releases/tag/v7.5.1
2021-03-28 10:53:52 -07:00
451ec771a8 Update Terraform providers and CHANGES for release 2021-03-23 08:45:57 -07:00
4d9846b83e Add DigitalOcean as a OSS sponsorship partner
* Include DigitalOcean logo and link on repo and site
2021-03-21 11:34:36 -07:00
597ca4acce Update CoreDNS from v1.7.0 to v1.8.0
* https://github.com/poseidon/terraform-render-bootstrap/pull/254
2021-03-20 16:47:25 -07:00
507c646e8b Add Kubelet provider-id on AWS
* Set the Kubelet `--provider-id` on AWS based on metadata from
Fedora CoreOS afterburn or Flatcar Linux coreos-metadata
* Based on https://github.com/poseidon/typhoon/pull/951
2021-03-19 12:43:37 -07:00
d8f7da6873 Add dependabot update watcher for docs pypi packages
* Update requirements.txt packages for mkdocs
2021-03-19 11:55:54 -07:00
048f1f514e Update Grafana from v7.4.3 to v7.4.5
* https://github.com/grafana/grafana/releases/tag/v7.4.5
2021-03-19 11:51:52 -07:00
b825cd9afe Update Prometheus from v2.25.1 to v2.25.2
* https://github.com/prometheus/prometheus/releases/tag/v2.25.2
2021-03-19 11:49:38 -07:00
796149d122 Update Kubernetes from v1.20.4 to v1.20.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1205
2021-03-19 11:27:31 -07:00
a66bccd590 Update Cilium from v1.9.4 to v1.9.5
* https://github.com/cilium/cilium/releases/tag/v1.9.5
2021-03-14 11:48:22 -07:00
30b1edfcc6 Mark bootstrap token as sensitive in plan/apply
* Mark the bootstrap token as sensitive, which is useful when
Terraform is run in automated CI/CD systems to avoid showing
the token
* https://github.com/poseidon/terraform-render-bootstrap/pull/251
2021-03-14 11:32:35 -07:00
a4afe06b64 Update Calico from v3.17.3 to v3.18.1
* https://docs.projectcalico.org/archive/v3.18/release-notes/
2021-03-14 10:35:24 -07:00
4d58be0816 Update Prometheus from v2.25.0 to v2.25.1
* https://github.com/prometheus/prometheus/releases/tag/v2.25.1
2021-03-14 09:43:15 -07:00
170b768ad8 Add KUBELET_IMAGE to Fedora CoreOS bootstrap.service (#945)
* Align with Flatcar Linux `bootstrap.service`
2021-03-14 09:35:42 -07:00
5bc1cd28c3 Switch kube-state-metrics image from quay to k8s.gcr.io
* kube-state-metrics is continuing publishing container images
to `k8s.gcr.io` instead of `quay.io`

Rel: https://github.com/kubernetes/kube-state-metrics/issues/1409
2021-03-11 10:56:18 -08:00
13fbac6c79 Update Grafana from v7.4.2 to v7.4.3
* https://github.com/grafana/grafana/releases/tag/v7.4.3
2021-03-05 17:19:54 -08:00
a8fa4a9a06 Update node-exporter and kube-state-metrics
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.0
* https://github.com/prometheus/node_exporter/releases/tag/v1.1.2
2021-03-05 17:13:45 -08:00
a5c1a96df1 Update etcd from v3.4.14 to v3.4.15
* https://github.com/etcd-io/etcd/releases/tag/v3.4.15
2021-03-05 17:02:57 -08:00
6a091e245e Remove Flatcar Linux Edge os_image option
* Flatcar Linux has not published an Edge channel image since
April 2020 and recently removed mention of the channel from
their documentation https://github.com/kinvolk/Flatcar/pull/345
* Users of Flatcar Linux Edge should move to the stable, beta, or
alpha channel, barring any alternate advice from upstream Flatcar
Linux
2021-02-20 16:09:54 -08:00
590796ee62 Update recommended Terraform provider versions
* Sync Terraform provider plugins with those used internally
2021-02-19 00:24:07 -08:00
ec389295fe Update Grafana from v7.4.0 to v7.4.2
* https://github.com/grafana/grafana/releases/tag/v7.4.2
2021-02-19 00:18:39 -08:00
3c807f3478 Update Prometheus from v2.24.1 to v2.25.0
* https://github.com/prometheus/prometheus/releases/tag/v2.25.0
2021-02-19 00:16:35 -08:00
e76fe80b45 Update Kubernetes from v1.20.3 to v1.20.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1204
2021-02-19 00:02:07 -08:00
32853aaa7b Update Kubernetes from v1.20.2 to v1.20.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1203
2021-02-17 22:29:33 -08:00
c32a54db40 Update node-exporter from v1.0.1 to v1.1.1
* https://github.com/prometheus/node_exporter/releases/tag/v1.1.1
2021-02-14 14:30:28 -08:00
9671b1c734 Update flannel-cni from v0.4.1 to v0.4.2
* https://github.com/poseidon/flannel-cni/releases/tag/v0.4.2
2021-02-14 12:04:59 -08:00
3b933e1ab3 Update Grafana from v7.3.7 to v7.4.0
* https://github.com/grafana/grafana/releases/tag/v7.4.0
2021-02-07 21:42:18 -08:00
58d8f6f505 Update Prometheus from v2.24.0 to v2.24.1
* https://github.com/prometheus/prometheus/releases/tag/v2.24.1
2021-02-04 22:28:32 -08:00
56853fe222 Update nginx-ingress from v0.43.0 to v0.44.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.44.0
2021-02-04 22:19:58 -08:00
18165d8076 Update Calico from v3.17.1 to v3.17.2
* https://github.com/projectcalico/calico/releases/tag/v3.17.2
2021-02-04 22:03:51 -08:00
50acf28ce5 Update Cilium from v1.9.3 to v1.9.4
* https://github.com/cilium/cilium/releases/tag/v1.9.4
2021-02-03 23:08:22 -08:00
ab793eb842 Update Cilium from v1.9.2 to v1.9.3
* https://github.com/cilium/cilium/releases/tag/v1.9.3
2021-01-26 17:13:52 -08:00
b74c958524 Update Cilium from v1.9.1 to v1.9.2
* https://github.com/cilium/cilium/releases/tag/v1.9.2
2021-01-20 22:06:45 -08:00
173 changed files with 15347 additions and 5789 deletions

1
.github/FUNDING.yml vendored Normal file
View File

@ -0,0 +1 @@
github: [poseidon]

9
.github/dependabot.yaml vendored Normal file
View File

@ -0,0 +1,9 @@
version: 2
updates:
- package-ecosystem: pip
directory: "/"
schedule:
interval: weekly
pull-request-branch-name:
separator: "-"
open-pull-requests-limit: 3

View File

@ -4,6 +4,445 @@ Notable changes between versions.
## Latest
## v1.24.1
* Kubernetes [v1.24.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1241)
* Update Cilium from v1.11.4 to [v1.11.5](https://github.com/cilium/cilium/releases/tag/v1.11.5)
### Addons
* Update Prometheus from v2.35.0 to [v2.36.0](https://github.com/prometheus/prometheus/releases/tag/v2.36.0)
* Update Grafana from v8.5.1 to [v8.5.3](https://github.com/grafana/grafana/releases/tag/v8.5.3)
* Update nginx-ingress from v1.2.1 to [v1.2.1](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.2.1)
## v1.24.0
* Kubernetes [v1.24.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#v1240)
* Update etcd from v3.5.2 to [v3.5.4](https://github.com/etcd-io/etcd/releases/tag/v3.5.4)
* Add Kubelet mounts to enable relabeling workload volumes ([#1152](https://github.com/poseidon/typhoon/pull/1152))
* StorageClass no longer require explicit SELinux mount contexts
### Addons
* Update nginx-ingress from v1.1.3 to [v1.2.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.2.0)
* Update Prometheus from v2.34.0 to [v2.35.0](https://github.com/prometheus/prometheus/releases/tag/v2.35.0)
* Update Grafana from v8.4.5 to [v8.5.1](https://github.com/grafana/grafana/releases/tag/v8.5.1)
## v1.23.6
* Kubernetes [v1.23.6](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1236)
* Update Cilium from v1.11.2 to [v1.11.4](https://github.com/cilium/cilium/releases/tag/v1.11.4)
* Rename Cilium DaemonSet from `cilium-agent` to `cilium` to match Cilium CLI tools ([#303](https://github.com/poseidon/terraform-render-bootstrap/pull/303))
* Update Calico from v3.22.1 to [v3.22.2](https://github.com/projectcalico/calico/releases/tag/v3.22.2)
* Mount /etc/machine-id from host into Kubelet ([#1143](https://github.com/poseidon/typhoon/pull/1143))
* Remove deprecated use of `key_algorithm` in `hashicorp/tls` resources
### Azure
* Allow upgrading Azure Terraform provider to v3.x ([#1144](https://github.com/poseidon/typhoon/pull/1144))
* Rename `worker_address_prefix` output to `worker_address_prefixes`
### Google Cloud
* Fix issue on Flatcar Linux with controller nodes not ignoring os image changes ([#1149](https://github.com/poseidon/typhoon/pull/1149))
* Nodes will auto-update, Terraform should not attempt to delete/recreate them
### Addons
* Update nginx-ingress from v1.1.2 to [v1.1.3](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.1.3)
* Update Prometheus from v2.33.5 to [v2.34.0](https://github.com/prometheus/prometheus/releases/tag/v2.34.0)
* Update Grafana from v8.4.4 to [v8.4.5](https://github.com/grafana/grafana/releases/tag/v8.4.5)
## v1.23.5
* Kubernetes [v1.23.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1235)
* Update Cilium from v1.11.1 to [v1.11.2](https://github.com/cilium/cilium/releases/tag/v1.11.2)
* Update Calico from v3.21.2 to [v3.22.1](https://github.com/projectcalico/calico/releases/tag/v3.22.1)
* Fix [calico#5011](https://github.com/projectcalico/calico/issues/5011), broken since v1.23.0
### Addons
* Refresh Prometheus rules and Grafana dashboards ([#1136](https://github.com/poseidon/typhoon/pull/1136))
* Update nginx-ingress from v1.1.1 to [v1.1.2](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.1.2)
* Update Prometheus from v2.33.3 to [v2.33.5](https://github.com/prometheus/prometheus/releases/tag/v2.33.5)
* Update Grafana from v8.4.1 to [v8.4.3](https://github.com/grafana/grafana/releases/tag/v8.4.3)
* Update kube-state-metrics from v2.3.0 to [v2.4.2](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.4.2)
## v1.23.4
* Kubernetes [v1.23.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1234)
* Update etcd from v3.5.1 to [v3.5.2](https://github.com/etcd-io/etcd/releases/tag/v3.5.2)
* Change default CNI `networking` provider from `calico` to `cilium` ([#1114](https://github.com/poseidon/typhoon/pull/1114))
### AWS
* Allow upgrading AWS Terraform Provider to v4.x
### Addons
* Align nginx-ingress `--controller-class` with `IngressClass`
* Watch only `public` IngressClass objects, better [example](https://kubernetes.github.io/ingress-nginx/user-guide/multiple-ingress/)
* Update Prometheus from v2.32.1 to [v2.33.3](https://github.com/prometheus/prometheus/releases/tag/v2.33.3)
* Update Grafana from v8.3.6 to [v8.4.1](https://github.com/grafana/grafana/releases/tag/v8.4.1)
## V1.23.3
* Kubernetes [v1.23.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1233)
### Flatcar Linux
#### Google Cloud
* Switch to using official Kinvolk Flatcar Linux images
* Promote Typhoon on Flatcar Linux / Google Cloud to stable
* Change `os_image` to `flatcar-stable`, `flatcar-beta`, or `flatcar-alpha` (**action required**)
## v1.23.2
* Kubernetes [v1.23.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1232)
* Update Cilium from v1.11.0 to [v1.11.1](https://github.com/cilium/cilium/releases/tag/v1.11.1)
* Remove Kubelet flag `--network-plugin`. Unused since `docker-shim` isn't used ([#1106](https://github.com/poseidon/typhoon/pull/1106))
### Fedora CoreOS
* Switch Kubernetes Container Runtime from `docker` to `containerd` ([#1101](https://github.com/poseidon/typhoon/pull/1101))
* Mask `docker.service` to prevent it from being socket activated ([#1105](https://github.com/poseidon/typhoon/pull/1105))
### Flatcar Linux
#### AWS
* Add experimental Flatcar Linux ARM64 support ([docs](https://typhoon.psdn.io/advanced/arm64/), [#1102](https://github.com/poseidon/typhoon/pull/1102))
* Add `arch` variable to AWS `kubernetes` and `workers` modules
* Allow arm64 full-cluster or mixed/hybrid cluster with arm64 workers
* Requires `flannel` or `cilium` CNI provider
### DigitalOcean
* Upgrade DigitalOcean Terraform provider to [v2.x](https://registry.terraform.io/providers/digitalocean/digitalocean/latest/docs) ([#1109](https://github.com/poseidon/typhoon/pull/1109))
### Addons
* Update nginx-ingress from v1.1.0 to [v1.1.1](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.1.1)
* Update Grafana from v8.3.3 to [v8.3.4](https://github.com/grafana/grafana/releases/tag/v8.3.4)
## v1.23.1
* Kubernetes [v1.23.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1231)
* Workaround Terraform v1.1 regression in `file` provisioner ([#1093](https://github.com/poseidon/typhoon/pull/1093))
### Flatcar Linux
* Switch Kubernetes Container Runtime from `docker` to `containerd` ([#1087](https://github.com/poseidon/typhoon/pull/1087))
### Addons
* Configure Prometheus to allow a custom scrape query parameter ([#1095](https://github.com/poseidon/typhoon/pull/1095))
* Configure Prometheus to probe Kubernetes Ingress via `blackbox-exporter` ([#1096](https://github.com/poseidon/typhoon/pull/1096))
* Fix Prometheus Service probes to use `blackbox-exporter`, not `blackbox` ([#1096](https://github.com/poseidon/typhoon/pull/1096))
## v1.23.0
* Kubernetes [v1.23.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#v1230)
* Normalize CA cert mounts in static Pods and kube-proxy ([#1078](https://github.com/poseidon/typhoon/pull/1078))
* Set Kubelet resolver config to `/run/systemd/resolve/resolv.conf` ([#1082](https://github.com/poseidon/typhoon/pull/1082))
* Update Cilium from v1.10.5 to [v1.11.0](https://github.com/cilium/cilium/releases/tag/v1.11.0) ([#1083](https://github.com/poseidon/typhoon/pull/1083))
* With Calico, add missing `caliconodestatuses` CRD ([#289](https://github.com/poseidon/terraform-render-bootstrap/pull/289))
* Change `enable_aggregation` default to true ([#279](https://github.com/poseidon/terraform-render-bootstrap/pull/279))
* Remove deprecated `--port` from `kube-scheduler` ([#1078](https://github.com/poseidon/typhoon/pull/1078))
### AWS
* Change controller node default `disk_iops` to 3000 ([#1073](https://github.com/poseidon/typhoon/pull/1073))
### Azure
* Fix warning about deprecated `backend_address_pool_id` ([#1086](https://github.com/poseidon/typhoon/pull/1086))
### Fedora CoreOS
* Fix Fedora ARM64 workers to official Fedora CoreOS AMIs ([#1072](https://github.com/poseidon/typhoon/pull/1072))
* Should have been changed alongside controller AMIs in ([#1038](https://github.com/poseidon/typhoon/pull/1038))
* Old Poseidon built ARM64 AMIs have been deleted
### Addons
* Update nginx-ingress from v1.0.5 to [v1.1.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.1.0)
* Update Prometheus from v2.31.1 to [v2.32.0](https://github.com/prometheus/prometheus/releases/tag/v2.32.0)
* Update kube-state-metrics from v2.2.4 to [v2.3.0](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.3.0)
* Update node-exporter from v1.3.0 to [v1.3.1](https://github.com/prometheus/node_exporter/releases/tag/v1.3.1)
* Update Grafana from v8.2.4 to [v8.3.3](https://github.com/grafana/grafana/releases/tag/v8.3.3)
### Known Issues
* Calico does not yet support Kubernetes v1.23.0, use `flannel` or `cilium` ([calico#5011](https://github.com/projectcalico/calico/issues/5011))
## v1.22.4
* Kubernetes [v1.22.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1224)
* Update CoreDNS from v1.8.4 to [v1.8.6](https://github.com/poseidon/terraform-render-bootstrap/pull/284)
* Update Calico from v3.20.2 to [v3.21.0](https://github.com/projectcalico/calico/releases/tag/v3.21.0)
* Update flannel from v0.14.0 to [v0.15.1](https://github.com/flannel-io/flannel/releases/tag/v0.15.1)
### Google
* Allow use of Terraform provider `google` [v4.0+](https://github.com/hashicorp/terraform-provider-google/releases/tag/v4.0.0)
### Flatcar Linux
* Change Kubelet mounts for cgroups v2 ([#1064](https://github.com/poseidon/typhoon/pull/1064))
* Update cgroup driver from cgroupfs to systemd (Flatcar Linux changed default) ([#1064](https://github.com/poseidon/typhoon/pull/1064))
### Addons
* Update Prometheus from v2.30.3 to [v2.31.1](https://github.com/prometheus/prometheus/releases/tag/v2.31.1)
* Update node-exporter from v1.2.2 to [v1.3.0](https://github.com/prometheus/node_exporter/releases/tag/v1.3.0)
* Update kube-state-metrics from v2.2.3 to [v2.2.4](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.2.4)
* Update Grafana from v8.2.1 to [v8.2.4](https://github.com/grafana/grafana/releases/tag/v8.2.4)
* Update nginx-ingress from v1.0.4 to [v1.0.5](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.5)
## v1.23.3
* Kubernetes [v1.22.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1223)
* Update etcd from v3.5.0 to [v3.5.1](https://github.com/etcd-io/etcd/releases/tag/v3.5.1)
* Update Cilium from v1.10.4 to [v1.10.5](https://github.com/cilium/cilium/releases/tag/v1.10.5)
* Update Calico from v3.20.1 to [v3.20.2](https://github.com/projectcalico/calico/releases/tag/v3.20.2)
* Use Calico's iptables legacy vs nft auto-detection
* Update flannel from v0.13.0 to v0.14.0
### Bare-Metal
* Require Terraform provider `poseidon/matchbox` v0.5+ ([#1048](https://github.com/poseidon/typhoon/pull/1048))
### Addons
* Update nginx-ingress from v1.0.0 to [v1.0.4](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.4)
* Update Prometheus from v2.29.2 to [v2.30.3](https://github.com/prometheus/prometheus/releases/tag/v2.30.3)
* Update kube-state-metrics from v2.2.0 to [v2.2.3](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.2.3)
* Update Grafana from v8.1.2 to [v8.2.1](https://github.com/grafana/grafana/releases/tag/v8.2.1)
## v1.22.2
* Kubernetes [v1.22.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1222)
* Update Cilium from v1.10.3 to [v1.10.4](https://github.com/cilium/cilium/releases/tag/v1.10.4)
* Update Calico from v3.20.0 to [v3.20.1](https://github.com/projectcalico/calico/releases/tag/v3.20.1)
* Fix access to ClusterIP services with Cilium ([#276](https://github.com/poseidon/terraform-render-bootstrap/pull/276))
### Fedora CoreOS
* Use Fedora CoreOS ARM64 AMIs ([#1038](https://github.com/poseidon/typhoon/pull/1038))
### Addons
* Update Prometheus from v2.29.1 to [v2.29.2](https://github.com/prometheus/prometheus/releases/tag/v2.29.2)
* Update kube-state-metrics from v2.1.1 to [v2.2.0](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.2.0)
## v1.22.1
* Kubernetes [v1.22.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1221)
* Update Calico from v3.19.1 to [v3.20.0](https://github.com/projectcalico/calico/releases/tag/v3.20.0)
### Addons
* Update nginx-ingress from v1.0.0-beta.1 to [v1.0.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.0)
* Update Prometheus from v2.28.1 to [v2.29.1](https://github.com/prometheus/prometheus/releases/tag/v2.29.1)
* Update Grafana from v8.1.1 to [v8.1.2](https://github.com/grafana/grafana/releases/tag/v8.1.2)
## v1.22.0
* Kubernetes [v1.22.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1220)
* Update etcd from v3.4.16 to [v3.5.0](https://github.com/etcd-io/etcd/releases/tag/v3.5.0)
* Switch `kube-controller-manager` and `kube-scheduler` to use secure port only
* Update Prometheus config to discover endpoints and use a bearer token to scrape
### Fedora CoreOS
* Add Cilium cgroups v2 support on Fedora CoreOS
* Update Butane Config version from v1.2.0 to v1.4.0
* Rename Fedora CoreOS Config to Butane Config
* Require any [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customizations to update to v1.4.0
### Addons
* Update nginx-ingress from v0.47.0 to [v1.0.0-beta.1](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.0-beta.1)
* Update node-exporter from v1.2.0 to [v1.2.2](https://github.com/prometheus/node_exporter/releases/tag/v1.2.2)
* Update kube-state-metrics from v2.1.0 to [v2.1.1](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.1.1)
* Update Grafana from v8.0.6 to [v8.1.1](https://github.com/grafana/grafana/releases/tag/v8.1.1)
## v1.21.3
* Kubernetes [v1.21.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1213)
* Update Cilium from v1.10.1 to [v1.10.3](https://github.com/cilium/cilium/releases/tag/v1.10.3)
* Require [poseidon/ct](https://github.com/poseidon/terraform-provider-ct) Terraform provider v0.9+ ([notes](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct))
### AWS
* Change default disk type from `gp2` to `gp3` ([#1012](https://github.com/poseidon/typhoon/pull/1012))
### Addons
* Update Prometheus from v2.28.0 to [v2.28.1](https://github.com/prometheus/prometheus/releases/tag/v2.28.1)
* Update node-exporter from v1.1.2 to [v1.2.0](https://github.com/prometheus/node_exporter/releases/tag/v1.2.0)
* Update Grafana from v8.0.3 to [v8.0.6](https://github.com/grafana/grafana/releases/tag/v8.0.6)
### Known Issues
* Cilium with recent Fedora CoreOS will have networking issues ([fedora-coreos#881](https://github.com/coreos/fedora-coreos-tracker/issues/881)) (fixed in v1.21.4)
## v1.21.2
* Kubernetes [v1.21.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1212)
* Add Terraform v1.0.x support ([#974](https://github.com/poseidon/typhoon/pull/974))
* Continue to support Terraform v0.13.x, v0.14.4+, and v0.15.x
* Update CoreDNS from v1.8.0 to [v1.8.4]([#1006](https://github.com/poseidon/typhoon/pull/1006))
* Update Cilium from v1.9.6 to [v1.10.1](https://github.com/cilium/cilium/releases/tag/v1.10.1)
* Update Calico from v3.19.0 to [v3.19.1](https://github.com/projectcalico/calico/releases/tag/v3.19.1)
### Addons
* Update kube-state-metrics from v2.0.0 to [v2.1.0](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.1.0)
* Update Prometheus from v2.27.0 to [v2.28.0](https://github.com/prometheus/prometheus/releases/tag/v2.28.0)
* Update Grafana from v7.5.6 to [v8.0.3](https://github.com/grafana/grafana/releases/tag/v8.0.3)
* Update nginx-ingress from v0.46.0 to [v0.47.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.47.0)
### Fedora CoreOS
#### AWS
* Extend experimental Fedora CoreOS arm64 support with Cilium
* CNI provider may now be `flannel` or `cilium` (new)
#### Bare-Metal
* Workaround systemd path unit issue [fedora-coreos-tracker/#861](https://github.com/coreos/fedora-coreos-tracker/issues/861)
#### DigitalOcean
* Workaround systemd path unit issue [fedora-coreos-tracker/#861](https://github.com/coreos/fedora-coreos-tracker/issues/861)
### Known Issues
* Cilium with recent Fedora CoreOS will have networking issues ([fedora-coreos#881](https://github.com/coreos/fedora-coreos-tracker/issues/881)) (fixed in v1.21.4)
## v1.21.1
* Kubernetes [v1.21.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1211)
* Add Terraform v0.15.x support ([#974](https://github.com/poseidon/typhoon/pull/974))
* Continue to support Terraform v0.13.x and v0.14.4+
* Update etcd from v3.4.15 to [v3.4.16](https://github.com/etcd-io/etcd/releases/tag/v3.4.16)
* Update Cilium from v1.9.5 to [v1.9.6](https://github.com/cilium/cilium/releases/tag/v1.9.6)
* Update Calico from v3.18.1 to [v3.19.0](https://github.com/projectcalico/calico/releases/tag/v3.19.0)
### AWS
* Reduce the default `disk_size` from 40GB to 30GB ([#983](https://github.com/poseidon/typhoon/pull/983))
### Azure
* Reduce the default `disk_size` from 40GB to 30GB ([#983](https://github.com/poseidon/typhoon/pull/983))
### Google Cloud
* Reduce the default `disk_size` from 40GB to 30GB ([#983](https://github.com/poseidon/typhoon/pull/983))
### Fedora CoreOS
* Update Kubelet mounts for cgroups v2 ([#978](https://github.com/poseidon/typhoon/pull/978))
### Addons
* Update kube-state-metrics from v2.0.0-rc.1 to [v2.0.0](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0)
* Update Prometheus from v2.25.2 to [v2.27.0](https://github.com/prometheus/prometheus/releases/tag/v2.27.0)
* Update Grafana from v7.5.3 to [v7.5.6](https://github.com/grafana/grafana/releases/tag/v7.5.6)
* Update nginx-ingress from v0.45.0 to [v0.46.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.46.0)
## v1.21.0
* Kubernetes [v1.21.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1210)
* Enable `tokencleaner` controller ([#969](https://github.com/poseidon/typhoon/pull/969))
* Enable `kube-scheduler` and `kube-controller-manager` separate authn/z kubeconfig
* Change CNI config location from /etc/kubernetes/cni/net.d to /etc/cni/net.d ([#965](https://github.com/poseidon/typhoon/pull/965))
* Change `kube-controller-manager` to mount `/var/lib/kubelet/volumeplugins` directly
* Remove unused `cloud-provider` flags
* Update Fedora CoreOS Config version from v1.1.0 to v1.2.0 ([#970](https://github.com/poseidon/typhoon/pull/970))
* Require [poseidon/ct](https://github.com/poseidon/terraform-provider-ct) Terraform provider v0.8+ ([notes](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct))
* Require any [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customizations to update to v1.2.0
### AWS
* Allow setting custom initial node taints on worker pools ([#968](https://github.com/poseidon/typhoon/pull/968))
* Add `node_taints` variable to internal `workers` pool module to set initial node taints
* Add `daemonset_tolerations` so `kube-system` DaemonSets can tolerate custom taints
### Azure
* Allow setting custom initial node taints on worker pools ([#968](https://github.com/poseidon/typhoon/pull/968))
* Add `node_taints` variable to internal `workers` pool module to set initial node taints
* Add `daemonset_tolerations` so `kube-system` DaemonSets can tolerate custom taints
* Remove deprecated `azurerm_lb_backend_address_pool` field `resource_group_name` ([#972](https://github.com/poseidon/typhoon/pull/972))
### Google Cloud
* Allow setting custom initial node taints on worker pools ([#968](https://github.com/poseidon/typhoon/pull/968))
* Add `node_taints` variable to internal `workers` pool module to set initial node taints
* Add `daemonset_tolerations` so `kube-system` DaemonSets can tolerate custom taints
### Addons
* Update nginx-ingress from v0.44.0 to [v0.45.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.45.0)
* Update kube-state-metrics from v2.0.0-rc.0 to [v2.0.0-rc.1](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.1)
* Update Grafana from v7.4.5 to [v7.5.3](https://github.com/grafana/grafana/releases/tag/v7.5.3)
## v1.20.5
* Kubernetes [v1.20.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1205)
* Update etcd from v3.4.14 to [v3.4.15](https://github.com/etcd-io/etcd/releases/tag/v3.4.15)
* Update Cilium from v1.9.4 to [v1.9.5](https://github.com/cilium/cilium/releases/tag/v1.9.5)
* Update Calico from v3.17.3 to [v3.18.1](https://github.com/projectcalico/calico/releases/tag/v3.18.1)
* Update CoreDNS from v1.7.0 to [v1.8.0](https://coredns.io/2020/10/22/coredns-1.8.0-release/)
* Mark bootstrap token as sensitive in Terraform plans ([#949](https://github.com/poseidon/typhoon/pull/949))
### Fedora CoreOS
* Set Kubelet `provider-id` ([#951](https://github.com/poseidon/typhoon/pull/951))
### Flatcar Linux
#### AWS
* Set Kubelet `provider-id` ([#951](https://github.com/poseidon/typhoon/pull/951))
* Remove `os_image` option `flatcar-edge` ([#943](https://github.com/poseidon/typhoon/pull/943))
#### Azure
* Remove `os_image` option `flatcar-edge` ([#943](https://github.com/poseidon/typhoon/pull/943))
#### Bare-Metal
* Remove `os_channel` option `flatcar-edge` ([#943](https://github.com/poseidon/typhoon/pull/943))
### Addons
* Update Prometheus from v2.25.0 to [v2.25.2](https://github.com/prometheus/prometheus/releases/tag/v2.25.2)
* Update kube-state-metrics from v2.0.0-alpha.3 to [v2.0.0-rc.0](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.0)
* Switch image from `quay.io` to `k8s.gcr.io` ([#946](https://github.com/poseidon/typhoon/pull/946))
* Update node-exporter from v1.1.1 to [v1.1.2](https://github.com/prometheus/node_exporter/releases/tag/v1.1.2)
* Update Grafana from v7.4.2 to [v7.4.5](https://github.com/grafana/grafana/releases/tag/v7.4.5)
## v1.20.4
* Kubernetes [v1.20.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1204)
* Update Cilium from v1.9.1 to [v1.9.4](https://github.com/cilium/cilium/releases/tag/v1.9.4)
* Update Calico from v3.17.1 to [v3.17.3](https://github.com/projectcalico/calico/releases/tag/v3.17.3)
* Update flannel-cni from v0.4.1 to [v0.4.2](https://github.com/poseidon/flannel-cni/releases/tag/v0.4.2)
### Addons
* Update nginx-ingress from v0.43.0 to [v0.44.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.44.0)
* Update Prometheus from v2.24.0 to [v2.25.0](https://github.com/prometheus/prometheus/releases/tag/v2.25.0)
* Update node-exporter from v1.0.1 to [v1.1.1](https://github.com/prometheus/node_exporter/releases/tag/v1.1.1)
* Update Grafana from v7.3.7 to [v7.4.2](https://github.com/grafana/grafana/releases/tag/v7.4.2)
## v1.20.2
* Kubernetes [v1.20.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1202)

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.20.2 (upstream)
* Kubernetes v1.24.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/flatcar-linux/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
@ -31,6 +31,10 @@ Typhoon is available for [Fedora CoreOS](https://getfedora.org/coreos/).
| DigitalOcean | Fedora CoreOS | [digital-ocean/fedora-coreos/kubernetes](digital-ocean/fedora-coreos/kubernetes) | beta |
| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](google-cloud/fedora-coreos/kubernetes) | stable |
| Platform | Operating System | Terraform Module | Status |
|---------------|------------------|------------------|--------|
| AWS | Fedora CoreOS (ARM64) | [aws/fedora-coreos/kubernetes](aws/fedora-coreos/kubernetes) | alpha |
Typhoon is available for [Flatcar Linux](https://www.flatcar-linux.org/releases/).
| Platform | Operating System | Terraform Module | Status |
@ -39,7 +43,11 @@ Typhoon is available for [Flatcar Linux](https://www.flatcar-linux.org/releases/
| Azure | Flatcar Linux | [azure/flatcar-linux/kubernetes](azure/flatcar-linux/kubernetes) | alpha |
| Bare-Metal | Flatcar Linux | [bare-metal/flatcar-linux/kubernetes](bare-metal/flatcar-linux/kubernetes) | stable |
| DigitalOcean | Flatcar Linux | [digital-ocean/flatcar-linux/kubernetes](digital-ocean/flatcar-linux/kubernetes) | beta |
| Google Cloud | Flatcar Linux | [google-cloud/flatcar-linux/kubernetes](google-cloud/flatcar-linux/kubernetes) | beta |
| Google Cloud | Flatcar Linux | [google-cloud/flatcar-linux/kubernetes](google-cloud/flatcar-linux/kubernetes) | stable |
| Platform | Operating System | Terraform Module | Status |
|---------------|------------------|------------------|--------|
| AWS | Flatcar Linux (ARM64) | [aws/flatcar-linux/kubernetes](aws/flatcar-linux/kubernetes) | alpha |
## Documentation
@ -54,7 +62,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.20.2"
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.24.1"
# Google Cloud
cluster_name = "yavin"
@ -63,7 +71,7 @@ module "yavin" {
dns_zone_name = "example-zone"
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
ssh_authorized_key = "ssh-ed25519 AAAAB3Nz..."
# optional
worker_count = 2
@ -93,9 +101,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
$ kubectl get nodes
NAME ROLES STATUS AGE VERSION
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.20.2
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.20.2
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.20.2
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.24.1
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.24.1
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.24.1
```
List the pods.
@ -126,7 +134,7 @@ Typhoon is strict about minimalism, maturity, and scope. These are not in scope:
## Help
Ask questions on the IRC #typhoon channel on [freenode.net](http://freenode.net/).
Schedule a meeting via [Github Sponsors](https://github.com/sponsors/poseidon?frequency=one-time) to discuss your use case.
## Motivation
@ -142,6 +150,11 @@ Typhoon clusters will contain only [free](https://www.debian.org/intro/free) com
## Sponsors
Poseidon's Github [Sponsors](https://github.com/sponsors/poseidon) support the infrastructure and operational costs of providing Typhoon. If you'd like your company here, please contact dghubble at psdn.io.
Poseidon's Github [Sponsors](https://github.com/sponsors/poseidon) support the infrastructure and operational costs of providing Typhoon.
* [DigitalOcean](https://www.digitalocean.com/) kindly provides us cloud credits
<a href="https://www.digitalocean.com/">
<img src="https://opensource.nyc3.cdn.digitaloceanspaces.com/attribution/assets/SVG/DO_Logo_horizontal_blue.svg" width="201px">
</a>
<br>
If you'd like your company here, please contact dghubble at psdn.io.

View File

@ -140,7 +140,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(grpc_server_started_total{job=\"$cluster\",grpc_type=\"unary\"}[5m]))",
"expr": "sum(rate(grpc_server_started_total{job=\"$cluster\",grpc_type=\"unary\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "RPC Rate",
@ -149,7 +149,7 @@ data:
"step": 2
},
{
"expr": "sum(rate(grpc_server_handled_total{job=\"$cluster\",grpc_type=\"unary\",grpc_code!=\"OK\"}[5m]))",
"expr": "sum(rate(grpc_server_handled_total{job=\"$cluster\",grpc_type=\"unary\",grpc_code=~\"Unknown|FailedPrecondition|ResourceExhausted|Internal|Unavailable|DataLoss|DeadlineExceeded\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "RPC Failed Rate",
@ -430,7 +430,7 @@ data:
"steppedLine": true,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket{job=\"$cluster\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket{job=\"$cluster\"}[$__rate_interval])) by (instance, le))",
"hide": false,
"intervalFactor": 2,
"legendFormat": "{{instance}} WAL fsync",
@ -439,7 +439,7 @@ data:
"step": 4
},
{
"expr": "histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket{job=\"$cluster\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket{job=\"$cluster\"}[$__rate_interval])) by (instance, le))",
"intervalFactor": 2,
"legendFormat": "{{instance}} DB fsync",
"metric": "etcd_disk_backend_commit_duration_seconds_bucket",
@ -617,7 +617,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "rate(etcd_network_client_grpc_received_bytes_total{job=\"$cluster\"}[5m])",
"expr": "rate(etcd_network_client_grpc_received_bytes_total{job=\"$cluster\"}[$__rate_interval])",
"intervalFactor": 2,
"legendFormat": "{{instance}} Client Traffic In",
"metric": "etcd_network_client_grpc_received_bytes_total",
@ -703,7 +703,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "rate(etcd_network_client_grpc_sent_bytes_total{job=\"$cluster\"}[5m])",
"expr": "rate(etcd_network_client_grpc_sent_bytes_total{job=\"$cluster\"}[$__rate_interval])",
"intervalFactor": 2,
"legendFormat": "{{instance}} Client Traffic Out",
"metric": "etcd_network_client_grpc_sent_bytes_total",
@ -789,7 +789,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(etcd_network_peer_received_bytes_total{job=\"$cluster\"}[5m])) by (instance)",
"expr": "sum(rate(etcd_network_peer_received_bytes_total{job=\"$cluster\"}[$__rate_interval])) by (instance)",
"intervalFactor": 2,
"legendFormat": "{{instance}} Peer Traffic In",
"metric": "etcd_network_peer_received_bytes_total",
@ -878,7 +878,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(etcd_network_peer_sent_bytes_total{job=\"$cluster\"}[5m])) by (instance)",
"expr": "sum(rate(etcd_network_peer_sent_bytes_total{job=\"$cluster\"}[$__rate_interval])) by (instance)",
"hide": false,
"interval": "",
"intervalFactor": 2,
@ -972,7 +972,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(etcd_server_proposals_failed_total{job=\"$cluster\"}[5m]))",
"expr": "sum(rate(etcd_server_proposals_failed_total{job=\"$cluster\"}[$__rate_interval]))",
"intervalFactor": 2,
"legendFormat": "Proposal Failure Rate",
"metric": "etcd_server_proposals_failed_total",
@ -988,7 +988,7 @@ data:
"step": 2
},
{
"expr": "sum(rate(etcd_server_proposals_committed_total{job=\"$cluster\"}[5m]))",
"expr": "sum(rate(etcd_server_proposals_committed_total{job=\"$cluster\"}[$__rate_interval]))",
"intervalFactor": 2,
"legendFormat": "Proposal Commit Rate",
"metric": "etcd_server_proposals_committed_total",
@ -996,7 +996,7 @@ data:
"step": 2
},
{
"expr": "sum(rate(etcd_server_proposals_applied_total{job=\"$cluster\"}[5m]))",
"expr": "sum(rate(etcd_server_proposals_applied_total{job=\"$cluster\"}[$__rate_interval]))",
"intervalFactor": 2,
"legendFormat": "Proposal Apply Rate",
"refId": "D",
@ -1131,6 +1131,131 @@ data:
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"decimals": 0,
"editable": true,
"error": false,
"fieldConfig": {
"defaults": {
"custom": {
}
},
"overrides": [
]
},
"fill": 0,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 28
},
"hiddenSeries": false,
"id": 42,
"isNew": true,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": false,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"nullPointMode": "connected",
"options": {
"alertThreshold": true
},
"percentage": false,
"pluginVersion": "7.4.3",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum by (instance, le) (rate(etcd_network_peer_round_trip_time_seconds_bucket{job=\"$cluster\"}[$__rate_interval])))",
"interval": "",
"intervalFactor": 2,
"legendFormat": "{{instance}} Peer round trip time",
"metric": "etcd_network_peer_round_trip_time_seconds_bucket",
"refId": "A",
"step": 2
}
],
"thresholds": [
],
"timeFrom": null,
"timeRegions": [
],
"timeShift": null,
"title": "Peer round trip time",
"tooltip": {
"msResolution": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"$$hashKey": "object:925",
"decimals": null,
"format": "s",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"$$hashKey": "object:926",
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
}
],
"title": "New row"
@ -1140,7 +1265,7 @@ data:
"sharedCrosshair": false,
"style": "dark",
"tags": [
"etcd-mixin"
],
"templating": {
"list": [
@ -1150,7 +1275,7 @@ data:
"value": "Prometheus"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -1176,7 +1301,7 @@ data:
],
"query": "label_values(etcd_server_has_leader, job)",
"refresh": 1,
"refresh": 2,
"regex": "",
"sort": 2,
"tagValuesQuery": "",

View File

@ -0,0 +1,7644 @@
apiVersion: v1
data:
cluster-total.json: |-
{
"__inputs": [
],
"__requires": [
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"id": null,
"links": [
],
"panels": [
{
"collapse": false,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 0
},
"id": 2,
"panels": [
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Current Bandwidth",
"titleSize": "h6",
"type": "row"
},
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 1
},
"id": 3,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{namespace}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Rate of Bytes Received",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 1
},
"id": 4,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{namespace}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Rate of Bytes Transmitted",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"columns": [
{
"text": "Time",
"value": "Time"
},
{
"text": "Value #A",
"value": "Value #A"
},
{
"text": "Value #B",
"value": "Value #B"
},
{
"text": "Value #C",
"value": "Value #C"
},
{
"text": "Value #D",
"value": "Value #D"
},
{
"text": "Value #E",
"value": "Value #E"
},
{
"text": "Value #F",
"value": "Value #F"
},
{
"text": "Value #G",
"value": "Value #G"
},
{
"text": "Value #H",
"value": "Value #H"
},
{
"text": "namespace",
"value": "namespace"
}
],
"datasource": "$datasource",
"fill": 1,
"fontSize": "90%",
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 10
},
"id": 5,
"lines": true,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null as zero",
"renderer": "flot",
"scroll": true,
"showHeader": true,
"sort": {
"col": 0,
"desc": false
},
"spaceLength": 10,
"span": 24,
"styles": [
{
"alias": "Time",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Time",
"thresholds": [
],
"type": "hidden",
"unit": "short"
},
{
"alias": "Current Bandwidth Received",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Current Bandwidth Transmitted",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Average Bandwidth Received",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Average Bandwidth Transmitted",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Rate of Received Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Received Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #G",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #H",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Namespace",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down",
"linkUrl": "d/8b7a8b326d7a6f1f04244066368c67af/kubernetes-networking-namespace-pods?orgId=1&refresh=30s&var-namespace=$__cell",
"pattern": "namespace",
"thresholds": [
],
"type": "number",
"unit": "short"
}
],
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sort_desc(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sort_desc(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sort_desc(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
},
{
"expr": "sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "G",
"step": 10
},
{
"expr": "sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "H",
"step": 10
}
],
"timeFrom": null,
"timeShift": null,
"title": "Current Status",
"type": "table"
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 10
},
"id": 6,
"panels": [
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 11
},
"id": 7,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{namespace}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Rate of Bytes Received",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 11
},
"id": 8,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{namespace}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Rate of Bytes Transmitted",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Average Bandwidth",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 11
},
"id": 9,
"panels": [
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Bandwidth History",
"titleSize": "h6",
"type": "row"
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 12
},
"id": 10,
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": true,
"min": true,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
"values": true
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 24,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{namespace}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Receive Bandwidth",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 21
},
"id": 11,
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": true,
"min": true,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
"values": true
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 24,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{namespace}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Transmit Bandwidth",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 30
},
"id": 12,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 31
},
"id": 13,
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": true,
"min": true,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
"values": true
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 24,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{namespace}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 40
},
"id": 14,
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": true,
"min": true,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
"values": true
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 24,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{namespace}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Packets",
"titleSize": "h6",
"type": "row"
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 31
},
"id": 15,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 50
},
"id": 16,
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": true,
"min": true,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
"values": true
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 24,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{namespace}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 59
},
"id": 17,
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": true,
"min": true,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
"values": true
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 24,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\",namespace=~\".+\"}[$interval:$resolution])) by (namespace))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{namespace}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 59
},
"id": 18,
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": true,
"min": true,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
"values": true
},
"lines": true,
"linewidth": 2,
"links": [
{
"targetBlank": true,
"title": "What is TCP Retransmit?",
"url": "https://accedian.com/enterprises/blog/network-packet-loss-retransmissions-and-duplicate-acknowledgements/"
}
],
"minSpan": 24,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(rate(node_netstat_Tcp_RetransSegs{cluster=\"$cluster\"}[$interval:$resolution]) / rate(node_netstat_Tcp_OutSegs{cluster=\"$cluster\"}[$interval:$resolution])) by (instance))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{instance}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of TCP Retransmits out of all sent segments",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 59
},
"id": 19,
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": true,
"min": true,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
"values": true
},
"lines": true,
"linewidth": 2,
"links": [
{
"targetBlank": true,
"title": "Why monitor SYN retransmits?",
"url": "https://github.com/prometheus/node_exporter/issues/1023#issuecomment-408128365"
}
],
"minSpan": 24,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(rate(node_netstat_TcpExt_TCPSynRetrans{cluster=\"$cluster\"}[$interval:$resolution]) / rate(node_netstat_Tcp_RetransSegs{cluster=\"$cluster\"}[$interval:$resolution])) by (instance))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{instance}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of TCP SYN Retransmits out of all retransmits",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Errors",
"titleSize": "h6",
"type": "row"
}
],
"refresh": "10s",
"rows": [
],
"schemaVersion": 18,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "resolution",
"options": [
{
"selected": false,
"text": "30s",
"value": "30s"
},
{
"selected": true,
"text": "5m",
"value": "5m"
},
{
"selected": false,
"text": "1h",
"value": "1h"
}
],
"query": "30s,5m,1h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "interval",
"options": [
{
"selected": true,
"text": "4h",
"value": "4h"
}
],
"query": "4h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
},
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(up{job=\"kubernetes-cadvisor\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "UTC",
"title": "Kubernetes / Networking / Cluster",
"uid": "ff635a025bcfea7bc3dd4f508990a3e9",
"version": 0
}
namespace-by-pod.json: |-
{
"__inputs": [
],
"__requires": [
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"id": null,
"links": [
],
"panels": [
{
"collapse": false,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 0
},
"id": 2,
"panels": [
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Current Bandwidth",
"titleSize": "h6",
"type": "row"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"decimals": 0,
"format": "time_series",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 1
},
"height": 9,
"id": 3,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"minSpan": 12,
"nullPointMode": "connected",
"nullText": null,
"options": {
"fieldOptions": {
"calcs": [
"last"
],
"defaults": {
"max": 10000000000,
"min": 0,
"title": "$namespace",
"unit": "Bps"
},
"mappings": [
],
"override": {
},
"thresholds": [
{
"color": "dark-green",
"index": 0,
"value": null
},
{
"color": "dark-yellow",
"index": 1,
"value": 5000000000
},
{
"color": "dark-red",
"index": 2,
"value": 7000000000
}
],
"values": false
}
},
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 12,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution]))",
"format": "time_series",
"instant": null,
"intervalFactor": 1,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"timeFrom": null,
"timeShift": null,
"title": "Current Rate of Bytes Received",
"type": "gauge",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"decimals": 0,
"format": "time_series",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 1
},
"height": 9,
"id": 4,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"minSpan": 12,
"nullPointMode": "connected",
"nullText": null,
"options": {
"fieldOptions": {
"calcs": [
"last"
],
"defaults": {
"max": 10000000000,
"min": 0,
"title": "$namespace",
"unit": "Bps"
},
"mappings": [
],
"override": {
},
"thresholds": [
{
"color": "dark-green",
"index": 0,
"value": null
},
{
"color": "dark-yellow",
"index": 1,
"value": 5000000000
},
{
"color": "dark-red",
"index": 2,
"value": 7000000000
}
],
"values": false
}
},
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 12,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution]))",
"format": "time_series",
"instant": null,
"intervalFactor": 1,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"timeFrom": null,
"timeShift": null,
"title": "Current Rate of Bytes Transmitted",
"type": "gauge",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
},
{
"columns": [
{
"text": "Time",
"value": "Time"
},
{
"text": "Value #A",
"value": "Value #A"
},
{
"text": "Value #B",
"value": "Value #B"
},
{
"text": "Value #C",
"value": "Value #C"
},
{
"text": "Value #D",
"value": "Value #D"
},
{
"text": "Value #E",
"value": "Value #E"
},
{
"text": "Value #F",
"value": "Value #F"
},
{
"text": "pod",
"value": "pod"
}
],
"datasource": "$datasource",
"fill": 1,
"fontSize": "100%",
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 10
},
"id": 5,
"lines": true,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null as zero",
"renderer": "flot",
"scroll": true,
"showHeader": true,
"sort": {
"col": 0,
"desc": false
},
"spaceLength": 10,
"span": 24,
"styles": [
{
"alias": "Time",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Time",
"thresholds": [
],
"type": "hidden",
"unit": "short"
},
{
"alias": "Bandwidth Received",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Bandwidth Transmitted",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Rate of Received Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Received Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down",
"linkUrl": "d/7a18067ce943a40ae25454675c19ff5c/kubernetes-networking-pod?orgId=1&refresh=30s&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
}
],
"timeFrom": null,
"timeShift": null,
"title": "Current Status",
"type": "table"
},
{
"collapse": false,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 19
},
"id": 6,
"panels": [
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Bandwidth",
"titleSize": "h6",
"type": "row"
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 20
},
"id": 7,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Receive Bandwidth",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 20
},
"id": 8,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Transmit Bandwidth",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 29
},
"id": 9,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 30
},
"id": 10,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 30
},
"id": 11,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Packets",
"titleSize": "h6",
"type": "row"
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 30
},
"id": 12,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 40
},
"id": 13,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 40
},
"id": 14,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Errors",
"titleSize": "h6",
"type": "row"
}
],
"refresh": "10s",
"rows": [
],
"schemaVersion": 18,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(up{job=\"kubernetes-cadvisor\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": ".+",
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "kube-system",
"value": "kube-system"
},
"datasource": "$datasource",
"definition": "label_values(container_network_receive_packets_total{cluster=\"$cluster\"}, namespace)",
"hide": 0,
"includeAll": true,
"label": null,
"multi": false,
"name": "namespace",
"options": [
],
"query": "label_values(container_network_receive_packets_total{cluster=\"$cluster\"}, namespace)",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "resolution",
"options": [
{
"selected": false,
"text": "30s",
"value": "30s"
},
{
"selected": true,
"text": "5m",
"value": "5m"
},
{
"selected": false,
"text": "1h",
"value": "1h"
}
],
"query": "30s,5m,1h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "interval",
"options": [
{
"selected": true,
"text": "4h",
"value": "4h"
}
],
"query": "4h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "UTC",
"title": "Kubernetes / Networking / Namespace (Pods)",
"uid": "8b7a8b326d7a6f1f04244066368c67af",
"version": 0
}
namespace-by-workload.json: |-
{
"__inputs": [
],
"__requires": [
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"id": null,
"links": [
],
"panels": [
{
"collapse": false,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 0
},
"id": 2,
"panels": [
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Current Bandwidth",
"titleSize": "h6",
"type": "row"
},
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 1
},
"id": 3,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ workload }}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Rate of Bytes Received",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 1
},
"id": 4,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ workload }}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Rate of Bytes Transmitted",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"columns": [
{
"text": "Time",
"value": "Time"
},
{
"text": "Value #A",
"value": "Value #A"
},
{
"text": "Value #B",
"value": "Value #B"
},
{
"text": "Value #C",
"value": "Value #C"
},
{
"text": "Value #D",
"value": "Value #D"
},
{
"text": "Value #E",
"value": "Value #E"
},
{
"text": "Value #F",
"value": "Value #F"
},
{
"text": "Value #G",
"value": "Value #G"
},
{
"text": "Value #H",
"value": "Value #H"
},
{
"text": "workload",
"value": "workload"
}
],
"datasource": "$datasource",
"fill": 1,
"fontSize": "90%",
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 10
},
"id": 5,
"lines": true,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null as zero",
"renderer": "flot",
"scroll": true,
"showHeader": true,
"sort": {
"col": 0,
"desc": false
},
"spaceLength": 10,
"span": 24,
"styles": [
{
"alias": "Time",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Time",
"thresholds": [
],
"type": "hidden",
"unit": "short"
},
{
"alias": "Current Bandwidth Received",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Current Bandwidth Transmitted",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Average Bandwidth Received",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Average Bandwidth Transmitted",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Rate of Received Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Received Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #G",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #H",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Workload",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down",
"linkUrl": "d/728bf77cc1166d2f3133bf25846876cc/kubernetes-networking-workload?orgId=1&refresh=30s&var-namespace=$namespace&var-type=$type&var-workload=$__cell",
"pattern": "workload",
"thresholds": [
],
"type": "number",
"unit": "short"
}
],
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sort_desc(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sort_desc(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sort_desc(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
},
{
"expr": "sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "G",
"step": 10
},
{
"expr": "sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "H",
"step": 10
}
],
"timeFrom": null,
"timeShift": null,
"title": "Current Status",
"type": "table"
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 19
},
"id": 6,
"panels": [
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 20
},
"id": 7,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ workload }}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Rate of Bytes Received",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 20
},
"id": 8,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ workload }}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Rate of Bytes Transmitted",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Average Bandwidth",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 29
},
"id": 9,
"panels": [
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Bandwidth HIstory",
"titleSize": "h6",
"type": "row"
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 38
},
"id": 10,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{workload}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Receive Bandwidth",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 38
},
"id": 11,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{workload}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Transmit Bandwidth",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 39
},
"id": 12,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 40
},
"id": 13,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{workload}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 40
},
"id": 14,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{workload}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Packets",
"titleSize": "h6",
"type": "row"
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 40
},
"id": 15,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 41
},
"id": 16,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{workload}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 41
},
"id": 17,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\",namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{workload}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Errors",
"titleSize": "h6",
"type": "row"
}
],
"refresh": "10s",
"rows": [
],
"schemaVersion": 18,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(up{job=\"kubernetes-cadvisor\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "kube-system",
"value": "kube-system"
},
"datasource": "$datasource",
"definition": "label_values(container_network_receive_packets_total{cluster=\"$cluster\"}, namespace)",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "namespace",
"options": [
],
"query": "label_values(container_network_receive_packets_total{cluster=\"$cluster\"}, namespace)",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "deployment",
"value": "deployment"
},
"datasource": "$datasource",
"definition": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\"}, workload_type)",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "type",
"options": [
],
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=\"$namespace\", workload=~\".+\"}, workload_type)",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "resolution",
"options": [
{
"selected": false,
"text": "30s",
"value": "30s"
},
{
"selected": true,
"text": "5m",
"value": "5m"
},
{
"selected": false,
"text": "1h",
"value": "1h"
}
],
"query": "30s,5m,1h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "interval",
"options": [
{
"selected": true,
"text": "4h",
"value": "4h"
}
],
"query": "4h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "UTC",
"title": "Kubernetes / Networking / Namespace (Workload)",
"uid": "bbb2a765a623ae38130206c7d94a160f",
"version": 0
}
pod-total.json: |-
{
"__inputs": [
],
"__requires": [
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"id": null,
"links": [
],
"panels": [
{
"collapse": false,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 0
},
"id": 2,
"panels": [
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Current Bandwidth",
"titleSize": "h6",
"type": "row"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"decimals": 0,
"format": "time_series",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 1
},
"height": 9,
"id": 3,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"minSpan": 12,
"nullPointMode": "connected",
"nullText": null,
"options": {
"fieldOptions": {
"calcs": [
"last"
],
"defaults": {
"max": 10000000000,
"min": 0,
"title": "$namespace: $pod",
"unit": "Bps"
},
"mappings": [
],
"override": {
},
"thresholds": [
{
"color": "dark-green",
"index": 0,
"value": null
},
{
"color": "dark-yellow",
"index": 1,
"value": 5000000000
},
{
"color": "dark-red",
"index": 2,
"value": 7000000000
}
],
"values": false
}
},
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 12,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution]))",
"format": "time_series",
"instant": null,
"intervalFactor": 1,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"timeFrom": null,
"timeShift": null,
"title": "Current Rate of Bytes Received",
"type": "gauge",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"decimals": 0,
"format": "time_series",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 1
},
"height": 9,
"id": 4,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"minSpan": 12,
"nullPointMode": "connected",
"nullText": null,
"options": {
"fieldOptions": {
"calcs": [
"last"
],
"defaults": {
"max": 10000000000,
"min": 0,
"title": "$namespace: $pod",
"unit": "Bps"
},
"mappings": [
],
"override": {
},
"thresholds": [
{
"color": "dark-green",
"index": 0,
"value": null
},
{
"color": "dark-yellow",
"index": 1,
"value": 5000000000
},
{
"color": "dark-red",
"index": 2,
"value": 7000000000
}
],
"values": false
}
},
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 12,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution]))",
"format": "time_series",
"instant": null,
"intervalFactor": 1,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"timeFrom": null,
"timeShift": null,
"title": "Current Rate of Bytes Transmitted",
"type": "gauge",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
},
{
"collapse": false,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 10
},
"id": 5,
"panels": [
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Bandwidth",
"titleSize": "h6",
"type": "row"
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 11
},
"id": 6,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\",namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Receive Bandwidth",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 11
},
"id": 7,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\",namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Transmit Bandwidth",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 20
},
"id": 8,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 21
},
"id": 9,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\",namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 21
},
"id": 10,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\",namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Packets",
"titleSize": "h6",
"type": "row"
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 21
},
"id": 11,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 32
},
"id": 12,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\",namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 32
},
"id": 13,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\",namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Errors",
"titleSize": "h6",
"type": "row"
}
],
"refresh": "10s",
"rows": [
],
"schemaVersion": 18,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(up{job=\"kubernetes-cadvisor\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": ".+",
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "kube-system",
"value": "kube-system"
},
"datasource": "$datasource",
"definition": "label_values(container_network_receive_packets_total{cluster=\"$cluster\"}, namespace)",
"hide": 0,
"includeAll": true,
"label": null,
"multi": false,
"name": "namespace",
"options": [
],
"query": "label_values(container_network_receive_packets_total{cluster=\"$cluster\"}, namespace)",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": ".+",
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "",
"value": ""
},
"datasource": "$datasource",
"definition": "label_values(container_network_receive_packets_total{cluster=\"$cluster\",namespace=~\"$namespace\"}, pod)",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "pod",
"options": [
],
"query": "label_values(container_network_receive_packets_total{cluster=\"$cluster\",namespace=~\"$namespace\"}, pod)",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "resolution",
"options": [
{
"selected": false,
"text": "30s",
"value": "30s"
},
{
"selected": true,
"text": "5m",
"value": "5m"
},
{
"selected": false,
"text": "1h",
"value": "1h"
}
],
"query": "30s,5m,1h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "interval",
"options": [
{
"selected": true,
"text": "4h",
"value": "4h"
}
],
"query": "4h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "UTC",
"title": "Kubernetes / Networking / Pod",
"uid": "7a18067ce943a40ae25454675c19ff5c",
"version": 0
}
workload-total.json: |-
{
"__inputs": [
],
"__requires": [
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"id": null,
"links": [
],
"panels": [
{
"collapse": false,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 0
},
"id": 2,
"panels": [
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Current Bandwidth",
"titleSize": "h6",
"type": "row"
},
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 1
},
"id": 3,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Rate of Bytes Received",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 1
},
"id": 4,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Rate of Bytes Transmitted",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 10
},
"id": 5,
"panels": [
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 11
},
"id": 6,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(avg(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Rate of Bytes Received",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 11
},
"id": 7,
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"sort": "current",
"sortDesc": true,
"total": false,
"values": true
},
"lines": false,
"linewidth": 1,
"links": [
],
"minSpan": 24,
"nullPointMode": "null",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 24,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(avg(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Rate of Bytes Transmitted",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": false,
"values": [
"current"
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Average Bandwidth",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 11
},
"id": 8,
"panels": [
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Bandwidth HIstory",
"titleSize": "h6",
"type": "row"
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 12
},
"id": 9,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Receive Bandwidth",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 12
},
"id": 10,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Transmit Bandwidth",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 21
},
"id": 11,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 22
},
"id": 12,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 22
},
"id": 13,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Packets",
"titleSize": "h6",
"type": "row"
},
{
"collapse": true,
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 22
},
"id": 14,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 23
},
"id": 15,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_receive_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 2,
"fillGradient": 0,
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 23
},
"id": 16,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"hideEmpty": true,
"hideZero": true,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 2,
"links": [
],
"minSpan": 12,
"nullPointMode": "connected",
"paceLength": 10,
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sort_desc(sum(irate(container_network_transmit_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\",namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{pod}}",
"refId": "A",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Errors",
"titleSize": "h6",
"type": "row"
}
],
"refresh": "10s",
"rows": [
],
"schemaVersion": 18,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info{job=\"kube-state-metrics\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": ".+",
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "kube-system",
"value": "kube-system"
},
"datasource": "$datasource",
"definition": "label_values(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\"}, namespace)",
"hide": 0,
"includeAll": true,
"label": null,
"multi": false,
"name": "namespace",
"options": [
],
"query": "label_values(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\"}, namespace)",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "",
"value": ""
},
"datasource": "$datasource",
"definition": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\"}, workload)",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "workload",
"options": [
],
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\"}, workload)",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "deployment",
"value": "deployment"
},
"datasource": "$datasource",
"definition": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\"}, workload_type)",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "type",
"options": [
],
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\",namespace=~\"$namespace\", workload=~\"$workload\"}, workload_type)",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "resolution",
"options": [
{
"selected": false,
"text": "30s",
"value": "30s"
},
{
"selected": true,
"text": "5m",
"value": "5m"
},
{
"selected": false,
"text": "1h",
"value": "1h"
}
],
"query": "30s,5m,1h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "interval",
"options": [
{
"selected": true,
"text": "4h",
"value": "4h"
}
],
"query": "4h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "UTC",
"title": "Kubernetes / Networking / Workload",
"uid": "728bf77cc1166d2f3133bf25846876cc",
"version": 0
}
kind: ConfigMap
metadata:
name: grafana-dashboards-k8s-network
namespace: monitoring

View File

@ -21,155 +21,108 @@ data:
"links": [
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"collapsed": false,
"panels": [
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 2,
"interval": null,
"fieldConfig": {
"defaults": {
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
"mappings": [
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 2,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
"thresholds": {
"mode": "absolute",
"steps": [
]
},
"tableColumn": "",
"unit": "none"
}
},
"gridPos": {
"h": 7,
"w": 4,
"x": 0,
"y": 0
},
"id": 2,
"links": [
],
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "7",
"targets": [
{
"expr": "sum(up{cluster=\"$cluster\", job=\"kubelet\"})",
"expr": "sum(kubelet_node_name{cluster=\"$cluster\", job=\"kubelet\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"title": "Up",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "min"
"title": "Running Kubelets",
"transparent": false,
"type": "stat"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 3,
"interval": null,
"fieldConfig": {
"defaults": {
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
"mappings": [
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 2,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
"thresholds": {
"mode": "absolute",
"steps": [
]
},
"tableColumn": "",
"unit": "none"
}
},
"gridPos": {
"h": 7,
"w": 4,
"x": 4,
"y": 0
},
"id": 3,
"links": [
],
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "7",
"targets": [
{
"expr": "sum(kubelet_running_pods{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}) OR sum(kubelet_running_pod_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
@ -179,81 +132,54 @@ data:
"refId": "A"
}
],
"thresholds": "",
"title": "Running Pods",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "min"
"transparent": false,
"type": "stat"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 4,
"interval": null,
"fieldConfig": {
"defaults": {
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
"mappings": [
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 2,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
"thresholds": {
"mode": "absolute",
"steps": [
]
},
"tableColumn": "",
"unit": "none"
}
},
"gridPos": {
"h": 7,
"w": 4,
"x": 8,
"y": 0
},
"id": 4,
"links": [
],
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "7",
"targets": [
{
"expr": "sum(kubelet_running_containers{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}) OR sum(kubelet_running_container_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
@ -263,81 +189,54 @@ data:
"refId": "A"
}
],
"thresholds": "",
"title": "Running Container",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "min"
"title": "Running Containers",
"transparent": false,
"type": "stat"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 5,
"interval": null,
"fieldConfig": {
"defaults": {
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
"mappings": [
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 2,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
"thresholds": {
"mode": "absolute",
"steps": [
]
},
"tableColumn": "",
"unit": "none"
}
},
"gridPos": {
"h": 7,
"w": 4,
"x": 12,
"y": 0
},
"id": 5,
"links": [
],
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "7",
"targets": [
{
"expr": "sum(volume_manager_total_volumes{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\", state=\"actual_state_of_world\"})",
@ -347,81 +246,54 @@ data:
"refId": "A"
}
],
"thresholds": "",
"title": "Actual Volume Count",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "min"
"transparent": false,
"type": "stat"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 6,
"interval": null,
"fieldConfig": {
"defaults": {
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
"mappings": [
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 2,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
"thresholds": {
"mode": "absolute",
"steps": [
]
},
"tableColumn": "",
"unit": "none"
}
},
"gridPos": {
"h": 7,
"w": 4,
"x": 16,
"y": 0
},
"id": 6,
"links": [
],
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "7",
"targets": [
{
"expr": "sum(volume_manager_total_volumes{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\",state=\"desired_state_of_world\"})",
@ -431,119 +303,67 @@ data:
"refId": "A"
}
],
"thresholds": "",
"title": "Desired Volume Count",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "min"
"transparent": false,
"type": "stat"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 7,
"interval": null,
"fieldConfig": {
"defaults": {
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
"mappings": [
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 2,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
"thresholds": {
"mode": "absolute",
"steps": [
]
},
"tableColumn": "",
"unit": "none"
}
},
"gridPos": {
"h": 7,
"w": 4,
"x": 20,
"y": 0
},
"id": 7,
"links": [
],
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "7",
"targets": [
{
"expr": "sum(rate(kubelet_node_config_error{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[5m]))",
"expr": "sum(rate(kubelet_node_config_error{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"thresholds": "",
"title": "Config Error Count",
"tooltip": {
"shared": false
"transparent": false,
"type": "stat"
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "min"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -555,7 +375,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 7
},
"id": 8,
"legend": {
@ -585,12 +408,11 @@ data:
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(kubelet_runtime_operations_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (operation_type, instance)",
"expr": "sum(rate(kubelet_runtime_operations_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (operation_type, instance)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_type}}",
@ -604,7 +426,7 @@ data:
"timeShift": null,
"title": "Operation Rate",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -648,7 +470,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 7
},
"id": 9,
"legend": {
@ -678,12 +503,11 @@ data:
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(kubelet_runtime_operations_errors_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_type)",
"expr": "sum(rate(kubelet_runtime_operations_errors_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (instance, operation_type)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_type}}",
@ -697,7 +521,7 @@ data:
"timeShift": null,
"title": "Operation Error Rate",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -717,7 +541,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -725,24 +549,11 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -754,7 +565,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 24,
"x": 0,
"y": 14
},
"id": 10,
"legend": {
@ -784,12 +598,11 @@ data:
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(kubelet_runtime_operations_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_type, le))",
"expr": "histogram_quantile(0.99, sum(rate(kubelet_runtime_operations_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (instance, operation_type, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_type}}",
@ -803,7 +616,7 @@ data:
"timeShift": null,
"title": "Operation duration 99th quantile",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -835,20 +648,7 @@ data:
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -860,7 +660,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 21
},
"id": 11,
"legend": {
@ -890,19 +693,18 @@ data:
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(kubelet_pod_start_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance)",
"expr": "sum(rate(kubelet_pod_start_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (instance)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} pod",
"refId": "A"
},
{
"expr": "sum(rate(kubelet_pod_worker_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance)",
"expr": "sum(rate(kubelet_pod_worker_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (instance)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} worker",
@ -916,7 +718,7 @@ data:
"timeShift": null,
"title": "Pod Start Rate",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -936,7 +738,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -944,7 +746,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
@ -960,7 +762,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 21
},
"id": 12,
"legend": {
@ -990,19 +795,18 @@ data:
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_start_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_start_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (instance, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} pod",
"refId": "A"
},
{
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (instance, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} worker",
@ -1016,7 +820,7 @@ data:
"timeShift": null,
"title": "Pod Start Duration",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -1036,7 +840,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -1044,24 +848,11 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -1073,7 +864,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 28
},
"id": 13,
"legend": {
@ -1105,12 +899,11 @@ data:
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(storage_operation_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_name, volume_plugin)",
"expr": "sum(rate(storage_operation_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (instance, operation_name, volume_plugin)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_name}} {{volume_plugin}}",
@ -1124,7 +917,7 @@ data:
"timeShift": null,
"title": "Storage Operation Rate",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -1144,7 +937,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -1152,7 +945,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
@ -1168,7 +961,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 28
},
"id": 14,
"legend": {
@ -1200,12 +996,11 @@ data:
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(storage_operation_errors_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_name, volume_plugin)",
"expr": "sum(rate(storage_operation_errors_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (instance, operation_name, volume_plugin)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_name}} {{volume_plugin}}",
@ -1219,7 +1014,7 @@ data:
"timeShift": null,
"title": "Storage Operation Error Rate",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -1239,7 +1034,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -1247,24 +1042,11 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -1276,7 +1058,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 24,
"x": 0,
"y": 35
},
"id": 15,
"legend": {
@ -1308,12 +1093,11 @@ data:
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(storage_operation_duration_seconds_bucket{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, operation_name, volume_plugin, le))",
"expr": "histogram_quantile(0.99, sum(rate(storage_operation_duration_seconds_bucket{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[$__rate_interval])) by (instance, operation_name, volume_plugin, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_name}} {{volume_plugin}}",
@ -1327,7 +1111,7 @@ data:
"timeShift": null,
"title": "Storage Operation Duration 99th quantile",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -1347,7 +1131,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -1355,24 +1139,11 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -1384,7 +1155,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 42
},
"id": 16,
"legend": {
@ -1414,12 +1188,11 @@ data:
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(kubelet_cgroup_manager_duration_seconds_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, operation_type)",
"expr": "sum(rate(kubelet_cgroup_manager_duration_seconds_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[$__rate_interval])) by (instance, operation_type)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{operation_type}}",
@ -1433,7 +1206,7 @@ data:
"timeShift": null,
"title": "Cgroup manager operation rate",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -1453,7 +1226,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -1461,7 +1234,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
@ -1477,7 +1250,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 42
},
"id": 17,
"legend": {
@ -1507,12 +1283,11 @@ data:
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(kubelet_cgroup_manager_duration_seconds_bucket{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, operation_type, le))",
"expr": "histogram_quantile(0.99, sum(rate(kubelet_cgroup_manager_duration_seconds_bucket{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[$__rate_interval])) by (instance, operation_type, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_type}}",
@ -1526,7 +1301,7 @@ data:
"timeShift": null,
"title": "Cgroup manager 99th quantile",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -1546,7 +1321,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -1554,24 +1329,11 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -1584,7 +1346,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 49
},
"id": 18,
"legend": {
@ -1614,12 +1379,11 @@ data:
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(kubelet_pleg_relist_duration_seconds_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance)",
"expr": "sum(rate(kubelet_pleg_relist_duration_seconds_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[$__rate_interval])) by (instance)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -1633,7 +1397,7 @@ data:
"timeShift": null,
"title": "PLEG relist rate",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -1653,7 +1417,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -1661,7 +1425,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
@ -1677,7 +1441,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 49
},
"id": 19,
"legend": {
@ -1707,12 +1474,11 @@ data:
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_interval_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_interval_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (instance, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -1726,7 +1492,7 @@ data:
"timeShift": null,
"title": "PLEG relist interval",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -1746,7 +1512,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -1754,24 +1520,11 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -1783,7 +1536,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 24,
"x": 0,
"y": 56
},
"id": 20,
"legend": {
@ -1813,12 +1569,11 @@ data:
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])) by (instance, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -1832,7 +1587,7 @@ data:
"timeShift": null,
"title": "PLEG relist duration",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -1852,7 +1607,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -1860,24 +1615,11 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -1889,7 +1631,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 24,
"x": 0,
"y": 63
},
"id": 21,
"legend": {
@ -1919,33 +1664,32 @@ data:
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"2..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"2..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "2xx",
"refId": "A"
},
{
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"3..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"3..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "3xx",
"refId": "B"
},
{
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"4..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"4..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "4xx",
"refId": "C"
},
{
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"5..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"5..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "5xx",
@ -1959,7 +1703,7 @@ data:
"timeShift": null,
"title": "RPC Rate",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -1979,7 +1723,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -1987,24 +1731,11 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -2016,7 +1747,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 24,
"x": 0,
"y": 70
},
"id": 22,
"legend": {
@ -2046,12 +1780,11 @@ data:
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, verb, url, le))",
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\"}[$__rate_interval])) by (instance, verb, url, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{verb}} {{url}}",
@ -2065,7 +1798,7 @@ data:
"timeShift": null,
"title": "Request duration 99th quantile",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -2085,7 +1818,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -2093,24 +1826,11 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
@ -2122,7 +1842,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 8,
"x": 0,
"y": 77
},
"id": 23,
"legend": {
@ -2152,7 +1875,6 @@ data:
],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
@ -2171,7 +1893,7 @@ data:
"timeShift": null,
"title": "Memory",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -2215,7 +1937,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 8,
"x": 8,
"y": 77
},
"id": 24,
"legend": {
@ -2245,12 +1970,11 @@ data:
],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "rate(process_cpu_seconds_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])",
"expr": "rate(process_cpu_seconds_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[$__rate_interval])",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -2264,7 +1988,7 @@ data:
"timeShift": null,
"title": "CPU usage",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -2284,7 +2008,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
},
{
@ -2292,7 +2016,7 @@ data:
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"min": null,
"show": true
}
]
@ -2308,7 +2032,10 @@ data:
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 8,
"x": 16,
"y": 77
},
"id": 25,
"legend": {
@ -2338,7 +2065,6 @@ data:
],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
@ -2357,7 +2083,7 @@ data:
"timeShift": null,
"title": "Goroutines",
"tooltip": {
"shared": false,
"shared": true,
"sort": 0,
"value_type": "individual"
},
@ -2391,14 +2117,9 @@ data:
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
}
"refresh": "10s",
"rows": [
],
"schemaVersion": 14,
"style": "dark",
@ -2413,7 +2134,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -2437,7 +2158,7 @@ data:
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"query": "label_values(up{job=\"kubelet\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
@ -2457,13 +2178,13 @@ data:
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": null,
"label": "instance",
"multi": false,
"name": "instance",
"options": [
],
"query": "label_values(kubelet_runtime_operations_total{cluster=\"$cluster\", job=\"kubelet\"}, instance)",
"query": "label_values(up{job=\"kubelet\",cluster=\"$cluster\"}, instance)",
"refresh": 2,
"regex": "",
"sort": 1,
@ -2560,7 +2281,11 @@ data:
},
"id": 2,
"interval": null,
"interval": "1m",
"legend": {
"alignAsTable": true,
"rightSide": true
},
"links": [
],
@ -2599,7 +2324,7 @@ data:
"tableColumn": "",
"targets": [
{
"expr": "sum(up{job=\"kube-proxy\"})",
"expr": "sum(up{cluster=\"$cluster\", job=\"kube-proxy\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
@ -2636,13 +2361,14 @@ data:
},
"id": 3,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -2668,7 +2394,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(kubeproxy_sync_proxy_rules_duration_seconds_count{job=\"kube-proxy\", instance=~\"$instance\"}[5m]))",
"expr": "sum(rate(kubeproxy_sync_proxy_rules_duration_seconds_count{cluster=\"$cluster\", job=\"kube-proxy\", instance=~\"$instance\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "rate",
@ -2729,6 +2455,7 @@ data:
},
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -2761,7 +2488,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99,rate(kubeproxy_sync_proxy_rules_duration_seconds_bucket{job=\"kube-proxy\", instance=~\"$instance\"}[5m]))",
"expr": "histogram_quantile(0.99,rate(kubeproxy_sync_proxy_rules_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-proxy\", instance=~\"$instance\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -2835,13 +2562,14 @@ data:
},
"id": 5,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -2867,7 +2595,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(kubeproxy_network_programming_duration_seconds_count{job=\"kube-proxy\", instance=~\"$instance\"}[5m]))",
"expr": "sum(rate(kubeproxy_network_programming_duration_seconds_count{cluster=\"$cluster\", job=\"kube-proxy\", instance=~\"$instance\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "rate",
@ -2928,6 +2656,7 @@ data:
},
"id": 6,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -2960,7 +2689,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(kubeproxy_network_programming_duration_seconds_bucket{job=\"kube-proxy\", instance=~\"$instance\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(kubeproxy_network_programming_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-proxy\", instance=~\"$instance\"}[$__rate_interval])) by (instance, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -3034,13 +2763,14 @@ data:
},
"id": 7,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -3066,28 +2796,28 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-proxy\", instance=~\"$instance\",code=~\"2..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\", job=\"kube-proxy\", instance=~\"$instance\",code=~\"2..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "2xx",
"refId": "A"
},
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-proxy\", instance=~\"$instance\",code=~\"3..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\", job=\"kube-proxy\", instance=~\"$instance\",code=~\"3..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "3xx",
"refId": "B"
},
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-proxy\", instance=~\"$instance\",code=~\"4..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\", job=\"kube-proxy\", instance=~\"$instance\",code=~\"4..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "4xx",
"refId": "C"
},
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-proxy\", instance=~\"$instance\",code=~\"5..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\", job=\"kube-proxy\", instance=~\"$instance\",code=~\"5..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "5xx",
@ -3148,13 +2878,14 @@ data:
},
"id": 8,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -3180,7 +2911,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-proxy\",instance=~\"$instance\",verb=\"POST\"}[5m])) by (verb, url, le))",
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-proxy\",instance=~\"$instance\",verb=\"POST\"}[$__rate_interval])) by (verb, url, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{verb}} {{url}}",
@ -3254,6 +2985,7 @@ data:
},
"id": 9,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -3286,7 +3018,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-proxy\", instance=~\"$instance\", verb=\"GET\"}[5m])) by (verb, url, le))",
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-proxy\", instance=~\"$instance\", verb=\"GET\"}[$__rate_interval])) by (verb, url, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{verb}} {{url}}",
@ -3360,13 +3092,14 @@ data:
},
"id": 10,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -3392,7 +3125,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "process_resident_memory_bytes{job=\"kube-proxy\",instance=~\"$instance\"}",
"expr": "process_resident_memory_bytes{cluster=\"$cluster\", job=\"kube-proxy\",instance=~\"$instance\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -3453,13 +3186,14 @@ data:
},
"id": 11,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -3485,7 +3219,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "rate(process_cpu_seconds_total{job=\"kube-proxy\",instance=~\"$instance\"}[5m])",
"expr": "rate(process_cpu_seconds_total{cluster=\"$cluster\", job=\"kube-proxy\",instance=~\"$instance\"}[$__rate_interval])",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -3546,13 +3280,14 @@ data:
},
"id": 12,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -3578,7 +3313,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "go_goroutines{job=\"kube-proxy\",instance=~\"$instance\"}",
"expr": "go_goroutines{cluster=\"$cluster\", job=\"kube-proxy\",instance=~\"$instance\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -3648,7 +3383,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -3662,6 +3397,32 @@ data:
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(up{job=\"kube-proxy\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 0,
@ -3672,7 +3433,7 @@ data:
"options": [
],
"query": "label_values(kubeproxy_network_programming_duration_seconds_bucket{job=\"kube-proxy\"}, instance)",
"query": "label_values(up{job=\"kube-proxy\", cluster=\"$cluster\", job=\"kube-proxy\"}, instance)",
"refresh": 2,
"regex": "",
"sort": 1,

View File

@ -33,10 +33,12 @@ data:
"id": 1,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -60,7 +62,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "1 - avg(rate(node_cpu_seconds_total{mode=\"idle\", cluster=\"$cluster\"}[$__rate_interval]))",
"expr": "cluster:node_cpu:ratio_rate5m{cluster=\"$cluster\"}",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -73,7 +75,7 @@ data:
"title": "CPU Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "singlestat",
@ -116,11 +118,14 @@ data:
"fill": 1,
"format": "percentunit",
"id": 2,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -144,7 +149,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable_cpu_cores{cluster=\"$cluster\"})",
"expr": "sum(namespace_cpu:kube_pod_container_resource_requests:sum{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable{job=\"kube-state-metrics\",resource=\"cpu\",cluster=\"$cluster\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -157,7 +162,7 @@ data:
"title": "CPU Requests Commitment",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "singlestat",
@ -200,11 +205,14 @@ data:
"fill": 1,
"format": "percentunit",
"id": 3,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -228,7 +236,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable_cpu_cores{cluster=\"$cluster\"})",
"expr": "sum(namespace_cpu:kube_pod_container_resource_limits:sum{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable{job=\"kube-state-metrics\",resource=\"cpu\",cluster=\"$cluster\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -241,7 +249,7 @@ data:
"title": "CPU Limits Commitment",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "singlestat",
@ -284,11 +292,14 @@ data:
"fill": 1,
"format": "percentunit",
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -312,7 +323,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "1 - sum(:node_memory_MemAvailable_bytes:sum{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable_memory_bytes{cluster=\"$cluster\"})",
"expr": "1 - sum(:node_memory_MemAvailable_bytes:sum{cluster=\"$cluster\"}) / sum(node_memory_MemTotal_bytes{job=\"node-exporter\",cluster=\"$cluster\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -325,7 +336,7 @@ data:
"title": "Memory Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "singlestat",
@ -368,11 +379,14 @@ data:
"fill": 1,
"format": "percentunit",
"id": 5,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -396,7 +410,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable_memory_bytes{cluster=\"$cluster\"})",
"expr": "sum(namespace_memory:kube_pod_container_resource_requests:sum{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable{job=\"kube-state-metrics\",resource=\"memory\",cluster=\"$cluster\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -409,7 +423,7 @@ data:
"title": "Memory Requests Commitment",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "singlestat",
@ -452,11 +466,14 @@ data:
"fill": 1,
"format": "percentunit",
"id": 6,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -480,7 +497,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable_memory_bytes{cluster=\"$cluster\"})",
"expr": "sum(namespace_memory:kube_pod_container_resource_limits:sum{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable{job=\"kube-state-metrics\",resource=\"memory\",cluster=\"$cluster\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -493,7 +510,7 @@ data:
"title": "Memory Limits Commitment",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "singlestat",
@ -547,11 +564,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 7,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -575,7 +595,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\"}) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -591,7 +611,7 @@ data:
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -645,11 +665,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 8,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -689,7 +712,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #A",
"thresholds": [
@ -708,7 +731,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to workloads",
"linkUrl": "./d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"linkUrl": "/d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #B",
"thresholds": [
@ -822,7 +845,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"pattern": "namespace",
"thresholds": [
@ -848,7 +871,7 @@ data:
],
"targets": [
{
"expr": "sum(kube_pod_owner{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(kube_pod_owner{job=\"kube-state-metrics\", cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -866,7 +889,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -875,7 +898,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(namespace_cpu:kube_pod_container_resource_requests:sum{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -884,7 +907,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\"}) by (namespace) / sum(namespace_cpu:kube_pod_container_resource_requests:sum{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -893,7 +916,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(namespace_cpu:kube_pod_container_resource_limits:sum{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -902,7 +925,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\"}) by (namespace) / sum(namespace_cpu:kube_pod_container_resource_limits:sum{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -919,7 +942,7 @@ data:
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -974,11 +997,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 9,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1002,7 +1028,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", container!=\"\"}) by (namespace)",
"expr": "sum(container_memory_rss{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", container!=\"\"}) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -1018,7 +1044,7 @@ data:
"title": "Memory Usage (w/o cache)",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1072,11 +1098,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 10,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1116,7 +1145,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #A",
"thresholds": [
@ -1135,7 +1164,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to workloads",
"linkUrl": "./d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"linkUrl": "/d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #B",
"thresholds": [
@ -1249,7 +1278,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"pattern": "namespace",
"thresholds": [
@ -1275,7 +1304,7 @@ data:
],
"targets": [
{
"expr": "sum(kube_pod_owner{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(kube_pod_owner{job=\"kube-state-metrics\", cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1293,7 +1322,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", container!=\"\"}) by (namespace)",
"expr": "sum(container_memory_rss{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", container!=\"\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1302,7 +1331,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(namespace_memory:kube_pod_container_resource_requests:sum{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1311,7 +1340,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", container!=\"\"}) by (namespace) / sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(container_memory_rss{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", container!=\"\"}) by (namespace) / sum(namespace_memory:kube_pod_container_resource_requests:sum{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1320,7 +1349,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(namespace_memory:kube_pod_container_resource_limits:sum{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1329,7 +1358,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", container!=\"\"}) by (namespace) / sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\"}) by (namespace)",
"expr": "sum(container_memory_rss{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", container!=\"\"}) by (namespace) / sum(namespace_memory:kube_pod_container_resource_limits:sum{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1346,7 +1375,7 @@ data:
"title": "Requests by Namespace",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -1403,10 +1432,12 @@ data:
"id": 11,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1560,7 +1591,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"pattern": "namespace",
"thresholds": [
@ -1586,7 +1617,7 @@ data:
],
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1595,7 +1626,7 @@ data:
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1604,7 +1635,7 @@ data:
"step": 10
},
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1613,7 +1644,7 @@ data:
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_transmit_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1622,7 +1653,7 @@ data:
"step": 10
},
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_receive_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1631,7 +1662,7 @@ data:
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_transmit_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -1648,7 +1679,7 @@ data:
"title": "Current Network Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -1686,7 +1717,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Current Network Usage",
"titleSize": "h6"
},
{
@ -1703,11 +1734,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 12,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1726,12 +1760,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -1747,7 +1781,7 @@ data:
"title": "Receive Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1778,19 +1812,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -1801,11 +1823,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 13,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1824,12 +1849,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -1845,7 +1870,7 @@ data:
"title": "Transmit Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1882,7 +1907,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Bandwidth",
"titleSize": "h6"
},
{
@ -1899,11 +1924,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 14,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1922,12 +1950,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "avg(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -1943,7 +1971,7 @@ data:
"title": "Average Container Bandwidth by Namespace: Received",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1974,19 +2002,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -1997,11 +2013,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 15,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2020,12 +2039,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "avg(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -2041,7 +2060,7 @@ data:
"title": "Average Container Bandwidth by Namespace: Transmitted",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2078,7 +2097,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Average Container Bandwidth by Namespace",
"titleSize": "h6"
},
{
@ -2095,11 +2114,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 16,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2118,12 +2140,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -2139,7 +2161,7 @@ data:
"title": "Rate of Received Packets",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2154,7 +2176,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -2170,19 +2192,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -2193,11 +2203,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 17,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2216,12 +2229,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_transmit_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -2237,7 +2250,7 @@ data:
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2252,7 +2265,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -2274,7 +2287,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Rate of Packets",
"titleSize": "h6"
},
{
@ -2291,11 +2304,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 18,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2314,12 +2330,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_receive_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -2335,7 +2351,7 @@ data:
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2350,7 +2366,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -2366,19 +2382,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -2389,11 +2393,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 19,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2412,12 +2419,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"expr": "sum(irate(container_network_transmit_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -2433,7 +2440,198 @@ data:
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Rate of Packets Dropped",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"decimals": -1,
"fill": 10,
"id": 20,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "ceil(sum by(namespace) (rate(container_fs_reads_total{job=\"kubernetes-cadvisor\", container!=\"\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]) + rate(container_fs_writes_total{job=\"kubernetes-cadvisor\", container!=\"\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "IOPS(Reads+Writes)",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 21,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(namespace) (rate(container_fs_reads_bytes_total{job=\"kubernetes-cadvisor\", container!=\"\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]) + rate(container_fs_writes_bytes_total{job=\"kubernetes-cadvisor\", container!=\"\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "ThroughPut(Read+Write)",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2470,7 +2668,315 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Storage IO",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 22,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"sort": {
"col": 4,
"desc": true
},
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "IOPS(Reads)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": -1,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "IOPS(Writes)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": -1,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "IOPS(Reads + Writes)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": -1,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "Throughput(Read)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Throughput(Write)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Throughput(Read + Write)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Namespace",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to pods",
"linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"pattern": "namespace",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum by(namespace) (rate(container_fs_reads_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum by(namespace) (rate(container_fs_writes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum by(namespace) (rate(container_fs_reads_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]) + rate(container_fs_writes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum by(namespace) (rate(container_fs_reads_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum by(namespace) (rate(container_fs_writes_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum by(namespace) (rate(container_fs_reads_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]) + rate(container_fs_writes_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace!=\"\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Storage IO",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Storage IO - Distribution",
"titleSize": "h6"
}
],
@ -2487,7 +2993,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -2512,7 +3018,7 @@ data:
"options": [
],
"query": "label_values(node_cpu_seconds_total, cluster)",
"query": "label_values(up{job=\"kubernetes-cadvisor\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
@ -2591,11 +3097,14 @@ data:
"fill": 1,
"format": "percentunit",
"id": 1,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2619,7 +3128,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"})",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}) / sum(kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -2632,7 +3141,7 @@ data:
"title": "CPU Utilisation (from requests)",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "singlestat",
@ -2675,11 +3184,14 @@ data:
"fill": 1,
"format": "percentunit",
"id": 2,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2703,7 +3215,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"})",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}) / sum(kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -2716,7 +3228,7 @@ data:
"title": "CPU Utilisation (from limits)",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "singlestat",
@ -2759,11 +3271,14 @@ data:
"fill": 1,
"format": "percentunit",
"id": 3,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2787,7 +3302,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"})",
"expr": "sum(container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) / sum(kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -2797,10 +3312,10 @@ data:
"thresholds": "70,80",
"timeFrom": null,
"timeShift": null,
"title": "Memory Utilization (from requests)",
"title": "Memory Utilisation (from requests)",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "singlestat",
@ -2843,11 +3358,14 @@ data:
"fill": 1,
"format": "percentunit",
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2871,7 +3389,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"})",
"expr": "sum(container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) / sum(kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -2884,7 +3402,7 @@ data:
"title": "Memory Utilisation (from limits)",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "singlestat",
@ -2938,11 +3456,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 5,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2963,8 +3484,9 @@ data:
"color": "#F2495C",
"dashes": true,
"fill": 0,
"hiddenSeries": true,
"hideTooltip": true,
"legend": false,
"legend": true,
"linewidth": 2,
"stack": false
},
@ -2973,8 +3495,9 @@ data:
"color": "#FF9830",
"dashes": true,
"fill": 0,
"hiddenSeries": true,
"hideTooltip": true,
"legend": false,
"legend": true,
"linewidth": 2,
"stack": false
}
@ -2985,7 +3508,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -3017,7 +3540,7 @@ data:
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -3071,11 +3594,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 6,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3210,7 +3736,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
@ -3236,7 +3762,7 @@ data:
],
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3245,7 +3771,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3254,7 +3780,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3263,7 +3789,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3272,7 +3798,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3289,7 +3815,7 @@ data:
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -3344,11 +3870,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 7,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3369,8 +3898,9 @@ data:
"color": "#F2495C",
"dashes": true,
"fill": 0,
"hiddenSeries": true,
"hideTooltip": true,
"legend": false,
"legend": true,
"linewidth": 2,
"stack": false
},
@ -3379,8 +3909,9 @@ data:
"color": "#FF9830",
"dashes": true,
"fill": 0,
"hiddenSeries": true,
"hideTooltip": true,
"legend": false,
"legend": true,
"linewidth": 2,
"stack": false
}
@ -3391,7 +3922,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}) by (pod)",
"expr": "sum(container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -3423,7 +3954,7 @@ data:
"title": "Memory Usage (w/o cache)",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -3477,11 +4008,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 8,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3673,7 +4207,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
@ -3699,7 +4233,7 @@ data:
],
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod)",
"expr": "sum(container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3708,7 +4242,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3717,7 +4251,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"}) by (pod)",
"expr": "sum(container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3726,7 +4260,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3735,7 +4269,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"}) by (pod)",
"expr": "sum(container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3744,7 +4278,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"expr": "sum(container_memory_rss{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3753,7 +4287,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_cache{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"expr": "sum(container_memory_cache{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3762,7 +4296,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_swap{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"expr": "sum(container_memory_swap{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -3779,7 +4313,7 @@ data:
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -3836,10 +4370,12 @@ data:
"id": 9,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3993,7 +4529,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
@ -4019,7 +4555,7 @@ data:
],
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4028,7 +4564,7 @@ data:
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4037,7 +4573,7 @@ data:
"step": 10
},
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4046,7 +4582,7 @@ data:
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_transmit_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4055,7 +4591,7 @@ data:
"step": 10
},
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_receive_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4064,7 +4600,7 @@ data:
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_transmit_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4081,7 +4617,7 @@ data:
"title": "Current Network Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -4119,7 +4655,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Current Network Usage",
"titleSize": "h6"
},
{
@ -4136,11 +4672,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 10,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4159,12 +4698,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -4180,7 +4719,7 @@ data:
"title": "Receive Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -4211,19 +4750,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -4234,11 +4761,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 11,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4257,12 +4787,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -4278,7 +4808,7 @@ data:
"title": "Transmit Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -4315,7 +4845,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Bandwidth",
"titleSize": "h6"
},
{
@ -4332,11 +4862,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 12,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4355,12 +4888,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -4376,7 +4909,7 @@ data:
"title": "Rate of Received Packets",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -4391,7 +4924,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -4407,19 +4940,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -4430,11 +4951,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 13,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4453,12 +4977,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -4474,7 +4998,7 @@ data:
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -4489,7 +5013,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -4511,7 +5035,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Rate of Packets",
"titleSize": "h6"
},
{
@ -4528,11 +5052,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 14,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4551,12 +5078,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -4572,7 +5099,7 @@ data:
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -4587,7 +5114,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -4603,19 +5130,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -4626,11 +5141,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 15,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4649,12 +5167,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -4670,7 +5188,198 @@ data:
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Rate of Packets Dropped",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"decimals": -1,
"fill": 10,
"id": 16,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "ceil(sum by(pod) (rate(container_fs_reads_total{container!=\"\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]) + rate(container_fs_writes_total{container!=\"\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "IOPS(Reads+Writes)",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 17,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(pod) (rate(container_fs_reads_bytes_total{container!=\"\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]) + rate(container_fs_writes_bytes_total{container!=\"\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "ThroughPut(Read+Write)",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -4707,7 +5416,315 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Storage IO",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 18,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"sort": {
"col": 4,
"desc": true
},
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "IOPS(Reads)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": -1,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "IOPS(Writes)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": -1,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "IOPS(Reads + Writes)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": -1,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "Throughput(Read)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Throughput(Write)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Throughput(Read + Write)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to pods",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum by(pod) (rate(container_fs_reads_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum by(pod) (rate(container_fs_writes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum by(pod) (rate(container_fs_reads_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]) + rate(container_fs_writes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum by(pod) (rate(container_fs_reads_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum by(pod) (rate(container_fs_writes_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum by(pod) (rate(container_fs_reads_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]) + rate(container_fs_writes_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Storage IO",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Storage IO - Distribution",
"titleSize": "h6"
}
],
@ -4724,7 +5741,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -4749,8 +5766,8 @@ data:
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"query": "label_values(up{job=\"kube-state-metrics\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
@ -4776,8 +5793,8 @@ data:
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
"refresh": 1,
"query": "label_values(kube_namespace_status_phase{job=\"kube-state-metrics\", cluster=\"$cluster\"}, namespace)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
@ -4854,11 +5871,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 1,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4874,7 +5894,17 @@ data:
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "max capacity",
"color": "#F2495C",
"dashes": true,
"fill": 0,
"hiddenSeries": true,
"hideTooltip": true,
"legend": true,
"linewidth": 2,
"stack": false
}
],
"spaceLength": 10,
"span": 12,
@ -4882,7 +5912,15 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"expr": "sum(kube_node_status_capacity{cluster=\"$cluster\", node=~\"$node\", resource=\"cpu\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "max capacity",
"legendLink": null,
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -4898,7 +5936,7 @@ data:
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -4952,11 +5990,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 2,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5117,7 +6158,7 @@ data:
],
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -5126,7 +6167,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -5135,7 +6176,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=~\"$node\"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", node=~\"$node\"}) by (pod) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -5144,7 +6185,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -5153,7 +6194,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=~\"$node\"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", node=~\"$node\"}) by (pod) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -5170,7 +6211,7 @@ data:
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -5225,11 +6266,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 3,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5245,13 +6289,31 @@ data:
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "max capacity",
"color": "#F2495C",
"dashes": true,
"fill": 0,
"hiddenSeries": true,
"hideTooltip": true,
"legend": true,
"linewidth": 2,
"stack": false
}
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(kube_node_status_capacity{cluster=\"$cluster\", node=~\"$node\", resource=\"memory\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "max capacity",
"legendLink": null,
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=~\"$node\", container!=\"\"}) by (pod)",
"format": "time_series",
@ -5269,7 +6331,7 @@ data:
"title": "Memory Usage (w/o cache)",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -5323,11 +6385,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5554,7 +6619,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -5563,7 +6628,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=~\"$node\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{node=~\"$node\"}) by (pod)",
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=~\"$node\",container!=\"\"}) by (pod) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -5572,7 +6637,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -5581,7 +6646,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=~\"$node\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{node=~\"$node\"}) by (pod)",
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=~\"$node\",container!=\"\"}) by (pod) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{cluster=\"$cluster\", node=~\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -5625,7 +6690,7 @@ data:
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -5680,7 +6745,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -5705,8 +6770,8 @@ data:
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"query": "label_values(up{job=\"kube-state-metrics\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
@ -5732,8 +6797,8 @@ data:
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, node)",
"refresh": 1,
"query": "label_values(kube_node_info{cluster=\"$cluster\"}, node)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",

View File

@ -30,11 +30,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 1,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -75,7 +78,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", cluster=\"$cluster\"}) by (container)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\", pod=\"$pod\", cluster=\"$cluster\"}) by (container)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{container}}",
@ -83,7 +86,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
"expr": "sum(\n kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", resource=\"cpu\"}\n)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "requests",
@ -91,7 +94,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
"expr": "sum(\n kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", resource=\"cpu\"}\n)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "limits",
@ -107,7 +110,7 @@ data:
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -161,11 +164,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 2,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": true,
"max": true,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -189,7 +195,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(increase(container_cpu_cfs_throttled_periods_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\", cluster=\"$cluster\"}[5m])) by (container) /sum(increase(container_cpu_cfs_periods_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\", cluster=\"$cluster\"}[5m])) by (container)",
"expr": "sum(increase(container_cpu_cfs_throttled_periods_total{job=\"kubernetes-cadvisor\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\", cluster=\"$cluster\"}[$__rate_interval])) by (container) /sum(increase(container_cpu_cfs_periods_total{job=\"kubernetes-cadvisor\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\", cluster=\"$cluster\"}[$__rate_interval])) by (container)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{container}}",
@ -212,7 +218,7 @@ data:
"title": "CPU Throttling",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -266,11 +272,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 3,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -431,7 +440,7 @@ data:
],
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}) by (container)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -440,7 +449,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -449,7 +458,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -458,7 +467,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -467,7 +476,7 @@ data:
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -484,7 +493,7 @@ data:
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -539,11 +548,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -565,7 +577,7 @@ data:
"dashes": true,
"fill": 0,
"hideTooltip": true,
"legend": false,
"legend": true,
"linewidth": 2,
"stack": false
},
@ -575,7 +587,7 @@ data:
"dashes": true,
"fill": 0,
"hideTooltip": true,
"legend": false,
"legend": true,
"linewidth": 2,
"stack": false
}
@ -586,7 +598,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\", image!=\"\"}) by (container)",
"expr": "sum(container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\", image!=\"\"}) by (container)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{container}}",
@ -594,7 +606,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
"expr": "sum(\n kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", resource=\"memory\"}\n)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "requests",
@ -602,7 +614,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
"expr": "sum(\n kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", resource=\"memory\"}\n)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "limits",
@ -615,10 +627,10 @@ data:
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Usage",
"title": "Memory Usage (WSS)",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -672,11 +684,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 5,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -706,7 +721,7 @@ data:
"type": "hidden"
},
{
"alias": "Memory Usage",
"alias": "Memory Usage (WSS)",
"colorMode": null,
"colors": [
@ -894,7 +909,7 @@ data:
],
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\", image!=\"\"}) by (container)",
"expr": "sum(container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\", image!=\"\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -903,7 +918,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -912,7 +927,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", image!=\"\"}) by (container) / sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"expr": "sum(container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", image!=\"\"}) by (container) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -921,7 +936,7 @@ data:
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}) by (container)",
"expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -930,7 +945,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\", image!=\"\"}) by (container) / sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"expr": "sum(container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\", image!=\"\"}) by (container) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -939,7 +954,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)",
"expr": "sum(container_memory_rss{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -948,7 +963,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_cache{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)",
"expr": "sum(container_memory_cache{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -957,7 +972,7 @@ data:
"step": 10
},
{
"expr": "sum(container_memory_swap{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)",
"expr": "sum(container_memory_swap{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -974,7 +989,7 @@ data:
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -1031,10 +1046,12 @@ data:
"id": 6,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1053,12 +1070,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -1074,7 +1091,7 @@ data:
"title": "Receive Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1105,19 +1122,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -1130,10 +1135,12 @@ data:
"id": 7,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1152,12 +1159,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -1173,7 +1180,7 @@ data:
"title": "Transmit Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1210,7 +1217,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Bandwidth",
"titleSize": "h6"
},
{
@ -1229,10 +1236,12 @@ data:
"id": 8,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1251,12 +1260,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -1272,7 +1281,7 @@ data:
"title": "Rate of Received Packets",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1287,7 +1296,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -1303,19 +1312,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -1328,10 +1325,12 @@ data:
"id": 9,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1350,12 +1349,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_transmit_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -1371,7 +1370,7 @@ data:
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1386,7 +1385,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -1408,7 +1407,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Rate of Packets",
"titleSize": "h6"
},
{
@ -1427,10 +1426,12 @@ data:
"id": 10,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1449,12 +1450,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_receive_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -1470,7 +1471,7 @@ data:
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1485,7 +1486,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -1501,19 +1502,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -1526,10 +1515,12 @@ data:
"id": 11,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1548,12 +1539,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"expr": "sum(irate(container_network_transmit_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -1569,7 +1560,214 @@ data:
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Rate of Packets Dropped",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"decimals": -1,
"fill": 10,
"id": 12,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "ceil(sum by(pod) (rate(container_fs_reads_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Reads",
"legendLink": null,
"step": 10
},
{
"expr": "ceil(sum by(pod) (rate(container_fs_writes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\",namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Writes",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "IOPS",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 13,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(pod) (rate(container_fs_reads_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Reads",
"legendLink": null,
"step": 10
},
{
"expr": "sum by(pod) (rate(container_fs_writes_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Writes",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "ThroughPut",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1606,7 +1804,506 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Storage IO - Distribution(Pod - Read & Writes)",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"decimals": -1,
"fill": 10,
"id": 14,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "ceil(sum by(container) (rate(container_fs_reads_total{job=\"kubernetes-cadvisor\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]) + rate(container_fs_writes_total{job=\"kubernetes-cadvisor\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{container}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "IOPS(Reads+Writes)",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 15,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(container) (rate(container_fs_reads_bytes_total{job=\"kubernetes-cadvisor\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]) + rate(container_fs_writes_bytes_total{job=\"kubernetes-cadvisor\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{container}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "ThroughPut(Read+Write)",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Storage IO - Distribution(Containers)",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 16,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"sort": {
"col": 4,
"desc": true
},
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "IOPS(Reads)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": -1,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "IOPS(Writes)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": -1,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "IOPS(Reads + Writes)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": -1,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "Throughput(Read)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Throughput(Write)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Throughput(Read + Write)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Container",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "container",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum by(container) (rate(container_fs_reads_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum by(container) (rate(container_fs_writes_total{job=\"kubernetes-cadvisor\",device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum by(container) (rate(container_fs_reads_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]) + rate(container_fs_writes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum by(container) (rate(container_fs_reads_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum by(container) (rate(container_fs_writes_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum by(container) (rate(container_fs_reads_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]) + rate(container_fs_writes_bytes_total{job=\"kubernetes-cadvisor\", device=~\"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+\", container!=\"\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Storage IO",
"tooltip": {
"shared": false,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Storage IO - Distribution",
"titleSize": "h6"
}
],
@ -1623,7 +2320,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -1648,8 +2345,8 @@ data:
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"query": "label_values(up{job=\"kube-state-metrics\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
@ -1675,8 +2372,8 @@ data:
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
"refresh": 1,
"query": "label_values(kube_namespace_status_phase{job=\"kube-state-metrics\", cluster=\"$cluster\"}, namespace)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
@ -1702,7 +2399,7 @@ data:
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\", namespace=\"$namespace\"}, pod)",
"query": "label_values(kube_pod_info{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\"}, pod)",
"refresh": 2,
"regex": "",
"sort": 1,
@ -1780,11 +2477,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 1,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -1808,7 +2508,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -1824,7 +2524,7 @@ data:
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -1878,11 +2578,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 2,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2017,7 +2720,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
@ -2043,7 +2746,7 @@ data:
],
"targets": [
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2052,7 +2755,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"expr": "sum(\n kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2061,7 +2764,7 @@ data:
"step": 10
},
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2070,7 +2773,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"expr": "sum(\n kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2079,7 +2782,7 @@ data:
"step": 10
},
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2096,7 +2799,7 @@ data:
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -2151,11 +2854,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 3,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2195,7 +2901,7 @@ data:
"title": "Memory Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2249,11 +2955,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2388,7 +3097,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
@ -2423,7 +3132,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"expr": "sum(\n kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2432,7 +3141,7 @@ data:
"step": 10
},
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2441,7 +3150,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"expr": "sum(\n kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2450,7 +3159,7 @@ data:
"step": 10
},
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2467,7 +3176,7 @@ data:
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -2524,10 +3233,12 @@ data:
"id": 5,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2681,7 +3392,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
@ -2707,7 +3418,7 @@ data:
],
"targets": [
{
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2716,7 +3427,7 @@ data:
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2725,7 +3436,7 @@ data:
"step": 10
},
{
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2734,7 +3445,7 @@ data:
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_transmit_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2743,7 +3454,7 @@ data:
"step": 10
},
{
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_receive_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2752,7 +3463,7 @@ data:
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2769,7 +3480,7 @@ data:
"title": "Current Network Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -2807,7 +3518,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Current Network Usage",
"titleSize": "h6"
},
{
@ -2824,11 +3535,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 6,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2847,12 +3561,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -2868,7 +3582,7 @@ data:
"title": "Receive Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2899,19 +3613,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -2922,11 +3624,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 7,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -2945,12 +3650,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -2966,7 +3671,7 @@ data:
"title": "Transmit Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -3003,7 +3708,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Bandwidth",
"titleSize": "h6"
},
{
@ -3020,11 +3725,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 8,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3043,12 +3751,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(avg(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -3064,7 +3772,7 @@ data:
"title": "Average Container Bandwidth by Pod: Received",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -3095,19 +3803,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -3118,11 +3814,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 9,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3141,12 +3840,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(avg(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -3162,7 +3861,7 @@ data:
"title": "Average Container Bandwidth by Pod: Transmitted",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -3199,7 +3898,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Average Container Bandwidth by Pod",
"titleSize": "h6"
},
{
@ -3216,11 +3915,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 10,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3239,12 +3941,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -3260,7 +3962,7 @@ data:
"title": "Rate of Received Packets",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -3275,7 +3977,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -3291,19 +3993,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -3314,11 +4004,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 11,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3337,12 +4030,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_transmit_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -3358,7 +4051,7 @@ data:
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -3373,7 +4066,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -3395,7 +4088,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Rate of Packets",
"titleSize": "h6"
},
{
@ -3412,11 +4105,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 12,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3435,12 +4131,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_receive_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -3456,7 +4152,7 @@ data:
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -3471,7 +4167,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -3487,19 +4183,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -3510,11 +4194,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 13,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3533,12 +4220,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -3554,7 +4241,7 @@ data:
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -3569,7 +4256,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -3591,7 +4278,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Rate of Packets Dropped",
"titleSize": "h6"
}
],
@ -3608,7 +4295,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -3633,8 +4320,8 @@ data:
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"query": "label_values(up{job=\"kube-state-metrics\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
@ -3660,35 +4347,8 @@ data:
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "",
"value": ""
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "workload",
"options": [
],
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\"}, workload)",
"refresh": 1,
"query": "label_values(kube_namespace_status_phase{job=\"kube-state-metrics\", cluster=\"$cluster\"}, namespace)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
@ -3714,8 +4374,35 @@ data:
"options": [
],
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\"}, workload_type)",
"refresh": 1,
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\"}, workload_type)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "",
"value": ""
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "workload",
"options": [
],
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}, workload)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
@ -3792,11 +4479,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 1,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -3817,8 +4507,9 @@ data:
"color": "#F2495C",
"dashes": true,
"fill": 0,
"hiddenSeries": true,
"hideTooltip": true,
"legend": false,
"legend": true,
"linewidth": 2,
"stack": false
},
@ -3827,8 +4518,9 @@ data:
"color": "#FF9830",
"dashes": true,
"fill": 0,
"hiddenSeries": true,
"hideTooltip": true,
"legend": false,
"legend": true,
"linewidth": 2,
"stack": false
}
@ -3839,7 +4531,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}} - {{workload_type}}",
@ -3871,7 +4563,7 @@ data:
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -3925,11 +4617,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 2,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4083,7 +4778,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2",
"linkUrl": "/d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2",
"pattern": "workload",
"thresholds": [
@ -4137,7 +4832,7 @@ data:
"step": 10
},
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4146,7 +4841,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4155,7 +4850,7 @@ data:
"step": 10
},
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4164,7 +4859,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4173,7 +4868,7 @@ data:
"step": 10
},
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4190,7 +4885,7 @@ data:
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -4245,11 +4940,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 3,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4270,8 +4968,9 @@ data:
"color": "#F2495C",
"dashes": true,
"fill": 0,
"hiddenSeries": true,
"hideTooltip": true,
"legend": false,
"legend": true,
"linewidth": 2,
"stack": false
},
@ -4280,8 +4979,9 @@ data:
"color": "#FF9830",
"dashes": true,
"fill": 0,
"hiddenSeries": true,
"hideTooltip": true,
"legend": false,
"legend": true,
"linewidth": 2,
"stack": false
}
@ -4292,7 +4992,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}} - {{workload_type}}",
@ -4324,7 +5024,7 @@ data:
"title": "Memory Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -4378,11 +5078,14 @@ data:
"datasource": "$datasource",
"fill": 1,
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4536,7 +5239,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2",
"linkUrl": "/d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2",
"pattern": "workload",
"thresholds": [
@ -4590,7 +5293,7 @@ data:
"step": 10
},
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4599,7 +5302,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4608,7 +5311,7 @@ data:
"step": 10
},
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4617,7 +5320,7 @@ data:
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4626,7 +5329,7 @@ data:
"step": 10
},
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"expr": "sum(\n container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4643,7 +5346,7 @@ data:
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -4700,10 +5403,12 @@ data:
"id": 5,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -4857,7 +5562,7 @@ data:
"link": true,
"linkTargetBlank": false,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$type",
"linkUrl": "/d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$type",
"pattern": "workload",
"thresholds": [
@ -4902,7 +5607,7 @@ data:
],
"targets": [
{
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4911,7 +5616,7 @@ data:
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4920,7 +5625,7 @@ data:
"step": 10
},
{
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4929,7 +5634,7 @@ data:
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_transmit_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4938,7 +5643,7 @@ data:
"step": 10
},
{
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_receive_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4947,7 +5652,7 @@ data:
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -4964,7 +5669,7 @@ data:
"title": "Current Network Usage",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -5002,7 +5707,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Current Network Usage",
"titleSize": "h6"
},
{
@ -5019,11 +5724,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 6,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5042,12 +5750,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
@ -5063,7 +5771,7 @@ data:
"title": "Receive Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -5094,19 +5802,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -5117,11 +5813,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 7,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5140,12 +5839,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
@ -5161,7 +5860,7 @@ data:
"title": "Transmit Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -5198,7 +5897,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Bandwidth",
"titleSize": "h6"
},
{
@ -5215,11 +5914,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 8,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5238,12 +5940,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(avg(irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
@ -5259,7 +5961,7 @@ data:
"title": "Average Container Bandwidth by Workload: Received",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -5290,19 +5992,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -5313,11 +6003,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 9,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5336,12 +6029,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(avg(irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
@ -5357,7 +6050,7 @@ data:
"title": "Average Container Bandwidth by Workload: Transmitted",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -5394,7 +6087,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Average Container Bandwidth by Workload",
"titleSize": "h6"
},
{
@ -5411,11 +6104,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 10,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5434,12 +6130,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_receive_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
@ -5455,7 +6151,7 @@ data:
"title": "Rate of Received Packets",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -5470,7 +6166,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -5486,19 +6182,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -5509,11 +6193,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 11,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5532,12 +6219,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_transmit_packets_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
@ -5553,7 +6240,7 @@ data:
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -5568,7 +6255,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -5590,7 +6277,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Rate of Packets",
"titleSize": "h6"
},
{
@ -5607,11 +6294,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 12,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5630,12 +6320,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_receive_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
@ -5651,7 +6341,7 @@ data:
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -5666,7 +6356,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -5682,19 +6372,7 @@ data:
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
@ -5705,11 +6383,14 @@ data:
"datasource": "$datasource",
"fill": 10,
"id": 13,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"total": false,
"values": false
@ -5728,12 +6409,12 @@ data:
],
"spaceLength": 10,
"span": 12,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
@ -5749,7 +6430,7 @@ data:
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -5764,7 +6445,7 @@ data:
},
"yaxes": [
{
"format": "Bps",
"format": "pps",
"label": null,
"logBase": 1,
"max": null,
@ -5786,7 +6467,7 @@ data:
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"title": "Rate of Packets Dropped",
"titleSize": "h6"
}
],
@ -5803,7 +6484,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -5813,38 +6494,6 @@ data:
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "deployment",
"value": "deployment"
},
"datasource": "$datasource",
"definition": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\".+\"}, workload_type)",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "type",
"options": [
],
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\".+\"}, workload_type)",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
@ -5860,8 +6509,8 @@ data:
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"query": "label_values(up{job=\"kube-state-metrics\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
@ -5887,13 +6536,45 @@ data:
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
"refresh": 1,
"query": "label_values(kube_pod_info{job=\"kube-state-metrics\", cluster=\"$cluster\"}, namespace)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "deployment",
"value": "deployment"
},
"datasource": "$datasource",
"definition": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\".+\"}, workload_type)",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "type",
"options": [
],
"query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=\"$namespace\", workload=~\".+\"}, workload_type)",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",

View File

@ -69,7 +69,11 @@ data:
},
"id": 3,
"interval": null,
"interval": "1m",
"legend": {
"alignAsTable": true,
"rightSide": true
},
"links": [
],
@ -147,13 +151,14 @@ data:
},
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -265,7 +270,11 @@ data:
},
"id": 5,
"interval": null,
"interval": "1m",
"legend": {
"alignAsTable": true,
"rightSide": true
},
"links": [
],
@ -342,13 +351,14 @@ data:
},
"id": 6,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -451,13 +461,14 @@ data:
},
"id": 7,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -545,13 +556,14 @@ data:
},
"id": 8,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -661,7 +673,11 @@ data:
},
"id": 9,
"interval": null,
"interval": "1m",
"legend": {
"alignAsTable": true,
"rightSide": true
},
"links": [
],
@ -738,13 +754,14 @@ data:
},
"id": 10,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -847,13 +864,14 @@ data:
},
"id": 11,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -941,13 +959,14 @@ data:
},
"id": 12,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -1047,13 +1066,14 @@ data:
},
"id": 13,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": false,
"sideWidth": null,
"total": false,
@ -1079,7 +1099,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(workqueue_adds_total{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance, name)",
"expr": "sum(rate(workqueue_adds_total{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}[$__rate_interval])) by (instance, name)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{name}}",
@ -1140,13 +1160,14 @@ data:
},
"id": 14,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": false,
"sideWidth": null,
"total": false,
@ -1172,7 +1193,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(workqueue_depth{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance, name)",
"expr": "sum(rate(workqueue_depth{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}[$__rate_interval])) by (instance, name)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{name}}",
@ -1233,6 +1254,7 @@ data:
},
"id": 15,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -1265,7 +1287,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(workqueue_queue_duration_seconds_bucket{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance, name, le))",
"expr": "histogram_quantile(0.99, sum(rate(workqueue_queue_duration_seconds_bucket{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}[$__rate_interval])) by (instance, name, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{name}}",
@ -1339,13 +1361,14 @@ data:
},
"id": 16,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -1432,13 +1455,14 @@ data:
},
"id": 17,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -1464,7 +1488,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "rate(process_cpu_seconds_total{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[5m])",
"expr": "rate(process_cpu_seconds_total{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[$__rate_interval])",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -1525,13 +1549,14 @@ data:
},
"id": 18,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -1627,7 +1652,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -1651,7 +1676,7 @@ data:
"options": [
],
"query": "label_values(apiserver_request_total, cluster)",
"query": "label_values(up{job=\"apiserver\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
@ -1677,7 +1702,7 @@ data:
"options": [
],
"query": "label_values(apiserver_request_total{job=\"apiserver\", cluster=\"$cluster\"}, instance)",
"query": "label_values(up{job=\"apiserver\", cluster=\"$cluster\"}, instance)",
"refresh": 2,
"regex": "",
"sort": 1,
@ -1774,7 +1799,11 @@ data:
},
"id": 2,
"interval": null,
"interval": "1m",
"legend": {
"alignAsTable": true,
"rightSide": true
},
"links": [
],
@ -1813,7 +1842,7 @@ data:
"tableColumn": "",
"targets": [
{
"expr": "sum(up{job=\"kube-controller-manager\"})",
"expr": "sum(up{cluster=\"$cluster\", job=\"kube-controller-manager\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
@ -1850,6 +1879,7 @@ data:
},
"id": 3,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -1882,10 +1912,10 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(workqueue_adds_total{job=\"kube-controller-manager\", instance=~\"$instance\"}[5m])) by (instance, name)",
"expr": "sum(rate(workqueue_adds_total{cluster=\"$cluster\", job=\"kube-controller-manager\", instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance, name)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{name}}",
"legendFormat": "{{cluster}} {{instance}} {{name}}",
"refId": "A"
}
],
@ -1956,6 +1986,7 @@ data:
},
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -1988,10 +2019,10 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(workqueue_depth{job=\"kube-controller-manager\", instance=~\"$instance\"}[5m])) by (instance, name)",
"expr": "sum(rate(workqueue_depth{cluster=\"$cluster\", job=\"kube-controller-manager\", instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance, name)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{name}}",
"legendFormat": "{{cluster}} {{instance}} {{name}}",
"refId": "A"
}
],
@ -2062,6 +2093,7 @@ data:
},
"id": 5,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -2094,10 +2126,10 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(workqueue_queue_duration_seconds_bucket{job=\"kube-controller-manager\", instance=~\"$instance\"}[5m])) by (instance, name, le))",
"expr": "histogram_quantile(0.99, sum(rate(workqueue_queue_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-controller-manager\", instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance, name, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} {{name}}",
"legendFormat": "{{cluster}} {{instance}} {{name}}",
"refId": "A"
}
],
@ -2168,13 +2200,14 @@ data:
},
"id": 6,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -2200,28 +2233,28 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-manager\", instance=~\"$instance\",code=~\"2..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-manager\", instance=~\"$instance\",code=~\"2..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "2xx",
"refId": "A"
},
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-manager\", instance=~\"$instance\",code=~\"3..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-manager\", instance=~\"$instance\",code=~\"3..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "3xx",
"refId": "B"
},
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-manager\", instance=~\"$instance\",code=~\"4..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-manager\", instance=~\"$instance\",code=~\"4..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "4xx",
"refId": "C"
},
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-manager\", instance=~\"$instance\",code=~\"5..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-manager\", instance=~\"$instance\",code=~\"5..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "5xx",
@ -2282,13 +2315,14 @@ data:
},
"id": 7,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -2314,7 +2348,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-controller-manager\", instance=~\"$instance\", verb=\"POST\"}[5m])) by (verb, url, le))",
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-controller-manager\", instance=~\"$instance\", verb=\"POST\"}[$__rate_interval])) by (verb, url, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{verb}} {{url}}",
@ -2388,6 +2422,7 @@ data:
},
"id": 8,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -2420,7 +2455,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-controller-manager\", instance=~\"$instance\", verb=\"GET\"}[5m])) by (verb, url, le))",
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-controller-manager\", instance=~\"$instance\", verb=\"GET\"}[$__rate_interval])) by (verb, url, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{verb}} {{url}}",
@ -2494,13 +2529,14 @@ data:
},
"id": 9,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -2526,7 +2562,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "process_resident_memory_bytes{job=\"kube-controller-manager\",instance=~\"$instance\"}",
"expr": "process_resident_memory_bytes{cluster=\"$cluster\", job=\"kube-controller-manager\",instance=~\"$instance\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -2587,13 +2623,14 @@ data:
},
"id": 10,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -2619,7 +2656,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "rate(process_cpu_seconds_total{job=\"kube-controller-manager\",instance=~\"$instance\"}[5m])",
"expr": "rate(process_cpu_seconds_total{cluster=\"$cluster\", job=\"kube-controller-manager\",instance=~\"$instance\"}[$__rate_interval])",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -2680,13 +2717,14 @@ data:
},
"id": 11,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -2712,7 +2750,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "go_goroutines{job=\"kube-controller-manager\",instance=~\"$instance\"}",
"expr": "go_goroutines{cluster=\"$cluster\", job=\"kube-controller-manager\",instance=~\"$instance\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -2782,7 +2820,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -2796,6 +2834,32 @@ data:
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(up{job=\"kube-controller-manager\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 0,
@ -2806,7 +2870,7 @@ data:
"options": [
],
"query": "label_values(process_cpu_seconds_total{job=\"kube-controller-manager\"}, instance)",
"query": "label_values(up{cluster=\"$cluster\", job=\"kube-controller-manager\"}, instance)",
"refresh": 2,
"regex": "",
"sort": 1,
@ -2895,13 +2959,14 @@ data:
},
"id": 2,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"max": true,
"min": true,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -2927,14 +2992,14 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "(\n sum without(instance, node) (kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n -\n sum without(instance, node) (kubelet_volume_stats_available_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n)\n",
"expr": "(\n sum without(instance, node) (topk(1, (kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n -\n sum without(instance, node) (topk(1, (kubelet_volume_stats_available_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n)\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Used Space",
"refId": "A"
},
{
"expr": "sum without(instance, node) (kubelet_volume_stats_available_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n",
"expr": "sum without(instance, node) (topk(1, (kubelet_volume_stats_available_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Free Space",
@ -3003,7 +3068,11 @@ data:
},
"id": 3,
"interval": null,
"interval": "1m",
"legend": {
"alignAsTable": true,
"rightSide": true
},
"links": [
],
@ -3042,7 +3111,7 @@ data:
"tableColumn": "",
"targets": [
{
"expr": "max without(instance,node) (\n(\n kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n -\n kubelet_volume_stats_available_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n)\n/\nkubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n* 100)\n",
"expr": "max without(instance,node) (\n(\n topk(1, kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n -\n topk(1, kubelet_volume_stats_available_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n)\n/\ntopk(1, kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n* 100)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
@ -3092,13 +3161,14 @@ data:
},
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"max": true,
"min": true,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -3124,14 +3194,14 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum without(instance, node) (kubelet_volume_stats_inodes_used{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n",
"expr": "sum without(instance, node) (topk(1, (kubelet_volume_stats_inodes_used{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Used inodes",
"refId": "A"
},
{
"expr": "(\n sum without(instance, node) (kubelet_volume_stats_inodes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n -\n sum without(instance, node) (kubelet_volume_stats_inodes_used{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n)\n",
"expr": "(\n sum without(instance, node) (topk(1, (kubelet_volume_stats_inodes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n -\n sum without(instance, node) (topk(1, (kubelet_volume_stats_inodes_used{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n)\n",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": " Free inodes",
@ -3200,7 +3270,11 @@ data:
},
"id": 5,
"interval": null,
"interval": "1m",
"legend": {
"alignAsTable": true,
"rightSide": true
},
"links": [
],
@ -3239,7 +3313,7 @@ data:
"tableColumn": "",
"targets": [
{
"expr": "max without(instance,node) (\nkubelet_volume_stats_inodes_used{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n/\nkubelet_volume_stats_inodes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n* 100)\n",
"expr": "max without(instance,node) (\ntopk(1, kubelet_volume_stats_inodes_used{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n/\ntopk(1, kubelet_volume_stats_inodes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n* 100)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
@ -3285,7 +3359,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -3309,7 +3383,7 @@ data:
"options": [
],
"query": "label_values(kubelet_volume_stats_capacity_bytes, cluster)",
"query": "label_values(kubelet_volume_stats_capacity_bytes{job=\"kubelet\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
@ -3458,7 +3532,11 @@ data:
},
"id": 2,
"interval": null,
"interval": "1m",
"legend": {
"alignAsTable": true,
"rightSide": true
},
"links": [
],
@ -3497,7 +3575,7 @@ data:
"tableColumn": "",
"targets": [
{
"expr": "sum(up{job=\"kube-scheduler\"})",
"expr": "sum(up{cluster=\"$cluster\", job=\"kube-scheduler\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
@ -3534,6 +3612,7 @@ data:
},
"id": 3,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -3566,31 +3645,31 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(scheduler_e2e_scheduling_duration_seconds_count{job=\"kube-scheduler\", instance=~\"$instance\"}[5m])) by (instance)",
"expr": "sum(rate(scheduler_e2e_scheduling_duration_seconds_count{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} e2e",
"legendFormat": "{{cluster}} {{instance}} e2e",
"refId": "A"
},
{
"expr": "sum(rate(scheduler_binding_duration_seconds_count{job=\"kube-scheduler\", instance=~\"$instance\"}[5m])) by (instance)",
"expr": "sum(rate(scheduler_binding_duration_seconds_count{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} binding",
"legendFormat": "{{cluster}} {{instance}} binding",
"refId": "B"
},
{
"expr": "sum(rate(scheduler_scheduling_algorithm_duration_seconds_count{job=\"kube-scheduler\", instance=~\"$instance\"}[5m])) by (instance)",
"expr": "sum(rate(scheduler_scheduling_algorithm_duration_seconds_count{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} scheduling algorithm",
"legendFormat": "{{cluster}} {{instance}} scheduling algorithm",
"refId": "C"
},
{
"expr": "sum(rate(scheduler_volume_scheduling_duration_seconds_count{job=\"kube-scheduler\", instance=~\"$instance\"}[5m])) by (instance)",
"expr": "sum(rate(scheduler_volume_scheduling_duration_seconds_count{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} volume",
"legendFormat": "{{cluster}} {{instance}} volume",
"refId": "D"
}
],
@ -3648,6 +3727,7 @@ data:
},
"id": 4,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -3680,31 +3760,31 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(scheduler_e2e_scheduling_duration_seconds_bucket{job=\"kube-scheduler\",instance=~\"$instance\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(scheduler_e2e_scheduling_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-scheduler\",instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} e2e",
"legendFormat": "{{cluster}} {{instance}} e2e",
"refId": "A"
},
{
"expr": "histogram_quantile(0.99, sum(rate(scheduler_binding_duration_seconds_bucket{job=\"kube-scheduler\",instance=~\"$instance\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(scheduler_binding_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-scheduler\",instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} binding",
"legendFormat": "{{cluster}} {{instance}} binding",
"refId": "B"
},
{
"expr": "histogram_quantile(0.99, sum(rate(scheduler_scheduling_algorithm_duration_seconds_bucket{job=\"kube-scheduler\",instance=~\"$instance\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(scheduler_scheduling_algorithm_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-scheduler\",instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} scheduling algorithm",
"legendFormat": "{{cluster}} {{instance}} scheduling algorithm",
"refId": "C"
},
{
"expr": "histogram_quantile(0.99, sum(rate(scheduler_volume_scheduling_duration_seconds_bucket{job=\"kube-scheduler\",instance=~\"$instance\"}[5m])) by (instance, le))",
"expr": "histogram_quantile(0.99, sum(rate(scheduler_volume_scheduling_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-scheduler\",instance=~\"$instance\"}[$__rate_interval])) by (cluster, instance, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}} volume",
"legendFormat": "{{cluster}} {{instance}} volume",
"refId": "D"
}
],
@ -3775,13 +3855,14 @@ data:
},
"id": 5,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -3807,28 +3888,28 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-scheduler\", instance=~\"$instance\",code=~\"2..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\",code=~\"2..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "2xx",
"refId": "A"
},
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-scheduler\", instance=~\"$instance\",code=~\"3..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\",code=~\"3..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "3xx",
"refId": "B"
},
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-scheduler\", instance=~\"$instance\",code=~\"4..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\",code=~\"4..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "4xx",
"refId": "C"
},
{
"expr": "sum(rate(rest_client_requests_total{job=\"kube-scheduler\", instance=~\"$instance\",code=~\"5..\"}[5m]))",
"expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\",code=~\"5..\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "5xx",
@ -3889,13 +3970,14 @@ data:
},
"id": 6,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -3921,7 +4003,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-scheduler\", instance=~\"$instance\", verb=\"POST\"}[5m])) by (verb, url, le))",
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\", verb=\"POST\"}[$__rate_interval])) by (verb, url, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{verb}} {{url}}",
@ -3995,6 +4077,7 @@ data:
},
"id": 7,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": false,
@ -4027,7 +4110,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-scheduler\", instance=~\"$instance\", verb=\"GET\"}[5m])) by (verb, url, le))",
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\", verb=\"GET\"}[$__rate_interval])) by (verb, url, le))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{verb}} {{url}}",
@ -4101,13 +4184,14 @@ data:
},
"id": 8,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -4133,7 +4217,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "process_resident_memory_bytes{job=\"kube-scheduler\", instance=~\"$instance\"}",
"expr": "process_resident_memory_bytes{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -4194,13 +4278,14 @@ data:
},
"id": 9,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -4226,7 +4311,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "rate(process_cpu_seconds_total{job=\"kube-scheduler\", instance=~\"$instance\"}[5m])",
"expr": "rate(process_cpu_seconds_total{cluster=\"$cluster\", job=\"kube-scheduler\", instance=~\"$instance\"}[$__rate_interval])",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -4287,13 +4372,14 @@ data:
},
"id": 10,
"interval": "1m",
"legend": {
"alignAsTable": false,
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"rightSide": true,
"show": true,
"sideWidth": null,
"total": false,
@ -4319,7 +4405,7 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "go_goroutines{job=\"kube-scheduler\",instance=~\"$instance\"}",
"expr": "go_goroutines{cluster=\"$cluster\", job=\"kube-scheduler\",instance=~\"$instance\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -4389,7 +4475,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -4403,6 +4489,32 @@ data:
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(up{job=\"kube-scheduler\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 0,
@ -4413,7 +4525,7 @@ data:
"options": [
],
"query": "label_values(process_cpu_seconds_total{job=\"kube-scheduler\"}, instance)",
"query": "label_values(up{job=\"kube-scheduler\", cluster=\"$cluster\"}, instance)",
"refresh": 2,
"regex": "",
"sort": 1,
@ -4461,912 +4573,6 @@ data:
"uid": "2e6b6a3b4bddf1427b3a55aa1311c656",
"version": 0
}
statefulset.json: |-
{
"__inputs": [
],
"__requires": [
],
"annotations": {
"list": [
]
},
"editable": false,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"id": null,
"links": [
],
"refresh": "",
"rows": [
{
"collapse": false,
"collapsed": false,
"panels": [
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 2,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "cores",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 4,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"lineColor": "rgb(31, 120, 193)",
"show": true
},
"tableColumn": "",
"targets": [
{
"expr": "sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", container!=\"\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"title": "CPU",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "0",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 3,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "GB",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 4,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"lineColor": "rgb(31, 120, 193)",
"show": true
},
"tableColumn": "",
"targets": [
{
"expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", container!=\"\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}) / 1024^3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"title": "Memory",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "0",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 4,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "Bps",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 4,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"lineColor": "rgb(31, 120, 193)",
"show": true
},
"tableColumn": "",
"targets": [
{
"expr": "sum(rate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",pod=~\"$statefulset.*\"}[3m]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"title": "Network",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "0",
"value": "null"
}
],
"valueName": "current"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"height": "100px",
"panels": [
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 5,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "max(kube_statefulset_replicas{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", statefulset=\"$statefulset\"}) without (instance, pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"title": "Desired Replicas",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "0",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 6,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "min(kube_statefulset_status_replicas_current{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", statefulset=\"$statefulset\"}) without (instance, pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"title": "Replicas of current version",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "0",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 7,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "max(kube_statefulset_status_observed_generation{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", statefulset=\"$statefulset\"}) without (instance, pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"title": "Observed Generation",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "0",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "$datasource",
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 8,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "max(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"title": "Metadata Generation",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "0",
"value": "null"
}
],
"valueName": "current"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"fillGradient": 0,
"gridPos": {
},
"id": 9,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"sideWidth": null,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max(kube_statefulset_replicas{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "replicas specified",
"refId": "A"
},
{
"expr": "max(kube_statefulset_status_replicas{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "replicas created",
"refId": "B"
},
{
"expr": "min(kube_statefulset_status_replicas_ready{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "ready",
"refId": "C"
},
{
"expr": "min(kube_statefulset_status_replicas_current{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "replicas of current version",
"refId": "D"
},
{
"expr": "min(kube_statefulset_status_replicas_updated{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "updated",
"refId": "E"
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Replicas",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_statefulset_metadata_generation, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "Namespace",
"multi": false,
"name": "namespace",
"options": [
],
"query": "label_values(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", cluster=\"$cluster\"}, namespace)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "Name",
"multi": false,
"name": "statefulset",
"options": [
],
"query": "label_values(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\"}, statefulset)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "UTC",
"title": "Kubernetes / StatefulSets",
"uid": "a31c1f46e6f727cb37c0d731a7245005",
"version": 0
}
kind: ConfigMap
metadata:
name: grafana-dashboards-k8s

View File

@ -15,13 +15,13 @@ data:
},
"editable": false,
"gnetId": null,
"graphTooltip": 0,
"graphTooltip": 1,
"hideControls": false,
"id": null,
"links": [
],
"refresh": "",
"refresh": "30s",
"rows": [
{
"collapse": false,
@ -73,9 +73,8 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "(\n (1 - rate(node_cpu_seconds_total{job=\"node-exporter\", mode=\"idle\", instance=\"$instance\"}[$__interval]))\n/ ignoring(cpu) group_left\n count without (cpu)( node_cpu_seconds_total{job=\"node-exporter\", mode=\"idle\", instance=\"$instance\"})\n)\n",
"expr": "(\n (1 - sum without (mode) (rate(node_cpu_seconds_total{job=\"node-exporter\", mode=~\"idle|iowait|steal\", instance=\"$instance\"}[$__rate_interval])))\n/ ignoring(cpu) group_left\n count without (cpu, mode) (node_cpu_seconds_total{job=\"node-exporter\", mode=\"idle\", instance=\"$instance\"})\n)\n",
"format": "time_series",
"interval": "1m",
"intervalFactor": 5,
"legendFormat": "{{cpu}}",
"refId": "A"
@ -509,25 +508,22 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "rate(node_disk_read_bytes_total{job=\"node-exporter\", instance=\"$instance\", device!~\"dm.*\"}[$__interval])",
"expr": "rate(node_disk_read_bytes_total{job=\"node-exporter\", instance=\"$instance\", device!~\"dm.*\"}[$__rate_interval])",
"format": "time_series",
"interval": "1m",
"intervalFactor": 2,
"legendFormat": "{{device}} read",
"refId": "A"
},
{
"expr": "rate(node_disk_written_bytes_total{job=\"node-exporter\", instance=\"$instance\", device!~\"dm.*\"}[$__interval])",
"expr": "rate(node_disk_written_bytes_total{job=\"node-exporter\", instance=\"$instance\", device!~\"dm.*\"}[$__rate_interval])",
"format": "time_series",
"interval": "1m",
"intervalFactor": 2,
"legendFormat": "{{device}} written",
"refId": "B"
},
{
"expr": "rate(node_disk_io_time_seconds_total{job=\"node-exporter\", instance=\"$instance\", device!~\"dm.*\"}[$__interval])",
"expr": "rate(node_disk_io_time_seconds_total{job=\"node-exporter\", instance=\"$instance\", device!~\"dm.*\"}[$__rate_interval])",
"format": "time_series",
"interval": "1m",
"intervalFactor": 2,
"legendFormat": "{{device}} io time",
"refId": "C"
@ -739,9 +735,8 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "rate(node_network_receive_bytes_total{job=\"node-exporter\", instance=\"$instance\", device!=\"lo\"}[$__interval])",
"expr": "rate(node_network_receive_bytes_total{job=\"node-exporter\", instance=\"$instance\", device!=\"lo\"}[$__rate_interval])",
"format": "time_series",
"interval": "1m",
"intervalFactor": 2,
"legendFormat": "{{device}}",
"refId": "A"
@ -833,9 +828,8 @@ data:
"steppedLine": false,
"targets": [
{
"expr": "rate(node_network_transmit_bytes_total{job=\"node-exporter\", instance=\"$instance\", device!=\"lo\"}[$__interval])",
"expr": "rate(node_network_transmit_bytes_total{job=\"node-exporter\", instance=\"$instance\", device!=\"lo\"}[$__rate_interval])",
"format": "time_series",
"interval": "1m",
"intervalFactor": 2,
"legendFormat": "{{device}}",
"refId": "A"
@ -894,17 +888,17 @@ data:
"schemaVersion": 14,
"style": "dark",
"tags": [
"node-exporter-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
"text": "default",
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -972,7 +966,7 @@ data:
]
},
"timezone": "",
"title": "Nodes",
"title": "Node Exporter / Nodes",
"uid": "fa49a4706d07a042595b664c87fb33ea",
"version": 0
}

View File

@ -1536,11 +1536,11 @@ data:
"includeAll": true,
"label": null,
"multi": false,
"name": "instance",
"name": "cluster",
"options": [
],
"query": "label_values(prometheus_build_info, instance)",
"query": "label_values(kube_pod_container_info{image=~\".*prometheus.*\"}, cluster)",
"refresh": 2,
"regex": "",
"sort": 0,
@ -1571,11 +1571,11 @@ data:
"includeAll": true,
"label": null,
"multi": false,
"name": "cluster",
"name": "instance",
"options": [
],
"query": "label_values(kube_pod_container_info{image=~\".*prometheus.*\"}, cluster)",
"query": "label_values(prometheus_build_info{cluster=~\"$cluster\"}, instance)",
"refresh": 2,
"regex": "",
"sort": 0,
@ -1850,7 +1850,7 @@ data:
"title": "Prometheus Stats",
"tooltip": {
"shared": true,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"transform": "table",
@ -1949,7 +1949,7 @@ data:
"title": "Target Sync",
"tooltip": {
"shared": true,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2035,7 +2035,7 @@ data:
"title": "Targets",
"tooltip": {
"shared": true,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2133,7 +2133,7 @@ data:
"title": "Average Scrape Interval Duration",
"tooltip": {
"shared": true,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2202,6 +2202,14 @@ data:
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by (job) (rate(prometheus_target_scrapes_exceeded_body_size_limit_total[1m]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "exceeded body size limit: {{job}}",
"legendLink": null,
"step": 10
},
{
"expr": "sum by (job) (rate(prometheus_target_scrapes_exceeded_sample_limit_total[1m]))",
"format": "time_series",
@ -2243,7 +2251,7 @@ data:
"title": "Scrape failures",
"tooltip": {
"shared": true,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2329,7 +2337,7 @@ data:
"title": "Appended Samples",
"tooltip": {
"shared": true,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2427,7 +2435,7 @@ data:
"title": "Head Series",
"tooltip": {
"shared": true,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2513,7 +2521,7 @@ data:
"title": "Head Chunks",
"tooltip": {
"shared": true,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2611,7 +2619,7 @@ data:
"title": "Query Rate",
"tooltip": {
"shared": true,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2697,7 +2705,7 @@ data:
"title": "Stage Duration",
"tooltip": {
"shared": true,
"sort": 0,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
@ -2751,7 +2759,7 @@ data:
"value": "default"
},
"hide": 0,
"label": null,
"label": "Data Source",
"name": "datasource",
"options": [
@ -2762,7 +2770,7 @@ data:
"type": "datasource"
},
{
"allValue": null,
"allValue": ".+",
"current": {
"selected": true,
"text": "All",
@ -2777,7 +2785,7 @@ data:
"options": [
],
"query": "label_values(prometheus_build_info, job)",
"query": "label_values(prometheus_build_info{job=\"prometheus\"}, job)",
"refresh": 1,
"regex": "",
"sort": 2,
@ -2790,7 +2798,7 @@ data:
"useTags": false
},
{
"allValue": null,
"allValue": ".+",
"current": {
"selected": true,
"text": "All",
@ -2805,7 +2813,7 @@ data:
"options": [
],
"query": "label_values(prometheus_build_info, instance)",
"query": "label_values(prometheus_build_info{job=~\"$job\"}, instance)",
"refresh": 1,
"regex": "",
"sort": 2,

View File

@ -24,7 +24,7 @@ spec:
type: RuntimeDefault
containers:
- name: grafana
image: docker.io/grafana/grafana:7.3.7
image: docker.io/grafana/grafana:8.5.3
env:
- name: GF_PATHS_CONFIG
value: "/etc/grafana/custom.ini"
@ -69,6 +69,8 @@ spec:
mountPath: /etc/grafana/dashboards/k8s-resources-1
- name: dashboards-k8s-resources-2
mountPath: /etc/grafana/dashboards/k8s-resources-2
- name: dashboards-k8s-network
mountPath: /etc/grafana/dashboards/k8s-network
- name: dashboards-coredns
mountPath: /etc/grafana/dashboards/coredns
- name: dashboards-nginx-ingress
@ -101,6 +103,9 @@ spec:
- name: dashboards-k8s-resources-1
configMap:
name: grafana-dashboards-k8s-resources-1
- name: dashboards-k8s-network
configMap:
name: grafana-dashboards-k8s-network
- name: dashboards-k8s-resources-2
configMap:
name: grafana-dashboards-k8s-resources-2

View File

@ -3,4 +3,4 @@ kind: IngressClass
metadata:
name: public
spec:
controller: k8s.io/ingress-nginx
controller: k8s.io/public

View File

@ -23,9 +23,10 @@ spec:
type: RuntimeDefault
containers:
- name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.43.0
image: k8s.gcr.io/ingress-nginx/controller:v1.2.1
args:
- /nginx-ingress-controller
- --controller-class=k8s.io/public
- --ingress-class=public
# use downward API
env:

View File

@ -3,4 +3,4 @@ kind: IngressClass
metadata:
name: public
spec:
controller: k8s.io/ingress-nginx
controller: k8s.io/public

View File

@ -23,9 +23,10 @@ spec:
type: RuntimeDefault
containers:
- name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.43.0
image: k8s.gcr.io/ingress-nginx/controller:v1.2.1
args:
- /nginx-ingress-controller
- --controller-class=k8s.io/public
- --ingress-class=public
# use downward API
env:

View File

@ -3,4 +3,4 @@ kind: IngressClass
metadata:
name: public
spec:
controller: k8s.io/ingress-nginx
controller: k8s.io/public

View File

@ -23,9 +23,10 @@ spec:
type: RuntimeDefault
containers:
- name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.43.0
image: k8s.gcr.io/ingress-nginx/controller:v1.2.1
args:
- /nginx-ingress-controller
- --controller-class=k8s.io/public
- --ingress-class=public
# use downward API
env:

View File

@ -3,4 +3,4 @@ kind: IngressClass
metadata:
name: public
spec:
controller: k8s.io/ingress-nginx
controller: k8s.io/public

View File

@ -23,9 +23,10 @@ spec:
type: RuntimeDefault
containers:
- name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.43.0
image: k8s.gcr.io/ingress-nginx/controller:v1.2.1
args:
- /nginx-ingress-controller
- --controller-class=k8s.io/public
- --ingress-class=public
# use downward API
env:

View File

@ -3,4 +3,4 @@ kind: IngressClass
metadata:
name: public
spec:
controller: k8s.io/ingress-nginx
controller: k8s.io/public

View File

@ -23,9 +23,10 @@ spec:
type: RuntimeDefault
containers:
- name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.43.0
image: k8s.gcr.io/ingress-nginx/controller:v1.2.1
args:
- /nginx-ingress-controller
- --controller-class=k8s.io/public
- --ingress-class=public
# use downward API
env:

View File

@ -72,6 +72,48 @@ data:
regex: apiserver_request_duration_seconds_count;.+
action: drop
# Scrape config for kube-controller-manager endpoints.
#
# kube-controller-manager service endpoints can be discovered by using the
# `endpoints` role and relabelling to only keep only endpoints associated with
# kube-system/kube-controller-manager and the `https` port.
- job_name: 'kube-controller-manager'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: kube-system;kube-controller-manager;metrics
- replacement: kube-controller-manager
action: replace
target_label: job
# Scrape config for kube-scheduler endpoints.
#
# kube-scheduler service endpoints can be discovered by using the `endpoints`
# role and relabelling to only keep only endpoints associated with
# kube-system/kube-scheduler and the `https` port.
- job_name: 'kube-scheduler'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: kube-system;kube-scheduler;metrics
- replacement: kube-scheduler
action: replace
target_label: job
# Scrape config for node (i.e. kubelet) /metrics (e.g. 'kubelet_'). Explore
# metrics from a node by scraping kubelet (127.0.0.1:10250/metrics).
- job_name: 'kubelet'
@ -133,6 +175,7 @@ data:
# * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
# * `prometheus.io/port`: If the metrics are exposed on a different port to the
# service then set this appropriately.
# * `prometheus.io/param`: Custom metrics query parameter, like "format=prometheus".
- job_name: 'kubernetes-service-endpoints'
kubernetes_sd_configs:
- role: endpoints
@ -155,6 +198,11 @@ data:
target_label: __address__
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_param]
action: replace
target_label: __param_$1
regex: ([^=]+)=(.*)
replacement: $2
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
@ -172,38 +220,6 @@ data:
action: drop
regex: etcd_(debugging|disk|request|server).*
# Example scrape config for probing services via the Blackbox Exporter.
#
# The relabeling allows the actual service scrape endpoint to be configured
# via the following annotations:
#
# * `prometheus.io/probe`: Only probe services that have a value of `true`
- job_name: 'kubernetes-services'
metrics_path: /probe
params:
module: [http_2xx]
kubernetes_sd_configs:
- role: service
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: blackbox
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: job
# Example scrape config for pods
#
# The relabeling allows the actual pod scrape endpoint to be configured via the
@ -240,6 +256,67 @@ data:
action: replace
target_label: kubernetes_pod_name
# Example scrape config for probing Services via the Blackbox Exporter.
#
# Relabeling allows service scraping to be configured via annotations:
# * `prometheus.io/probe`: Only probe services that have a value of `true`
- job_name: 'kubernetes-services'
metrics_path: /probe
params:
module: [http_2xx]
kubernetes_sd_configs:
- role: service
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: blackbox-exporter:8080
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: job
# Example scrape config for probing Ingresses via a Blackbox Exporter.
#
# Relabeling allows service scraping to be configured via annotations:
# * `prometheus.io/probe`: Only probe ingresses that have a value of `true`
- job_name: 'kubernetes-ingresses'
metrics_path: /probe
params:
module: [http_2xx]
kubernetes_sd_configs:
- role: ingress
relabel_configs:
- source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__meta_kubernetes_ingress_scheme, __address__, __meta_kubernetes_ingress_path]
regex: (.+);(.+);(.+)
replacement: ${1}://${2}${3}
target_label: __param_target
- target_label: __address__
replacement: blackbox-exporter:8080
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_ingress_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: job
# Rule files
rule_files:
- "/etc/prometheus/rules/*.rules"

View File

@ -21,7 +21,7 @@ spec:
serviceAccountName: prometheus
containers:
- name: prometheus
image: quay.io/prometheus/prometheus:v2.24.0
image: quay.io/prometheus/prometheus:v2.36.0
args:
- --web.listen-address=0.0.0.0:9090
- --config.file=/etc/prometheus/prometheus.yaml

View File

@ -1,11 +1,9 @@
# Allow Prometheus to scrape service endpoints
# Allow Prometheus to discover service endpoints
apiVersion: v1
kind: Service
metadata:
name: kube-controller-manager
namespace: kube-system
annotations:
prometheus.io/scrape: 'true'
spec:
type: ClusterIP
clusterIP: None
@ -14,5 +12,5 @@ spec:
ports:
- name: metrics
protocol: TCP
port: 10252
targetPort: 10252
port: 10257
targetPort: 10257

View File

@ -1,11 +1,9 @@
# Allow Prometheus to scrape service endpoints
# Allow Prometheus to discover service endpoints
apiVersion: v1
kind: Service
metadata:
name: kube-scheduler
namespace: kube-system
annotations:
prometheus.io/scrape: 'true'
spec:
type: ClusterIP
clusterIP: None
@ -14,5 +12,5 @@ spec:
ports:
- name: metrics
protocol: TCP
port: 10251
targetPort: 10251
port: 10259
targetPort: 10259

View File

@ -25,7 +25,7 @@ spec:
serviceAccountName: kube-state-metrics
containers:
- name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v2.0.0-alpha.3
image: k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.4.2
ports:
- name: metrics
containerPort: 8080

View File

@ -28,13 +28,13 @@ spec:
hostPID: true
containers:
- name: node-exporter
image: quay.io/prometheus/node-exporter:v1.0.1
image: quay.io/prometheus/node-exporter:v1.3.1
args:
- --path.procfs=/host/proc
- --path.sysfs=/host/sys
- --path.rootfs=/host/root
- --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
- --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
- --collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+)($|/)
- --collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
ports:
- name: metrics
containerPort: 9100

View File

@ -10,6 +10,17 @@ rules:
- services
- endpoints
- pods
verbs: ["get", "list", "watch"]
verbs:
- get
- list
- watch
- nonResourceURLs: ["/metrics"]
verbs: ["get"]
- apiGroups:
- networking.k8s.io
resources:
- ingresses
verbs:
- get
- list
- watch

View File

@ -57,10 +57,10 @@ data:
{
"alert": "etcdGRPCRequestsSlow",
"annotations": {
"description": "etcd cluster \"{{ $labels.job }}\": gRPC requests to {{ $labels.grpc_method }} are taking {{ $value }}s on etcd instance {{ $labels.instance }}.",
"description": "etcd cluster \"{{ $labels.job }}\": 99th percentile of gRPC requests is {{ $value }}s on etcd instance {{ $labels.instance }} for {{ $labels.grpc_method }} method.",
"summary": "etcd grpc requests are slow"
},
"expr": "histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job=~\".*etcd.*\", grpc_type=\"unary\"}[5m])) without(grpc_type))\n> 0.15\n",
"expr": "histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job=~\".*etcd.*\", grpc_method!=\"Defragment\", grpc_type=\"unary\"}[5m])) without(grpc_type))\n> 0.15\n",
"for": "10m",
"labels": {
"severity": "critical"
@ -105,7 +105,8 @@ data:
{
"alert": "etcdHighFsyncDurations",
"annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": 99th percentile fsync durations are {{ $value }}s on etcd instance {{ $labels.instance }}."
"description": "etcd cluster \"{{ $labels.job }}\": 99th percentile fsync durations are {{ $value }}s on etcd instance {{ $labels.instance }}.",
"summary": "etcd cluster 99th percentile fsync durations are too high."
},
"expr": "histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 1\n",
"for": "10m",
@ -125,46 +126,11 @@ data:
"severity": "warning"
}
},
{
"alert": "etcdHighNumberOfFailedHTTPRequests",
"annotations": {
"description": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}",
"summary": "etcd has high number of failed HTTP requests."
},
"expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) without (code) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nwithout (code) > 0.01\n",
"for": "10m",
"labels": {
"severity": "warning"
}
},
{
"alert": "etcdHighNumberOfFailedHTTPRequests",
"annotations": {
"description": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}.",
"summary": "etcd has high number of failed HTTP requests."
},
"expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) without (code) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nwithout (code) > 0.05\n",
"for": "10m",
"labels": {
"severity": "critical"
}
},
{
"alert": "etcdHTTPRequestsSlow",
"annotations": {
"description": "etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method }} are slow.",
"summary": "etcd instance HTTP requests are slow."
},
"expr": "histogram_quantile(0.99, rate(etcd_http_successful_duration_seconds_bucket[5m]))\n> 0.15\n",
"for": "10m",
"labels": {
"severity": "warning"
}
},
{
"alert": "etcdBackendQuotaLowSpace",
"annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": database size exceeds the defined quota on etcd instance {{ $labels.instance }}, please defrag or increase the quota as the writes to etcd will be disabled when it is full."
"description": "etcd cluster \"{{ $labels.job }}\": database size exceeds the defined quota on etcd instance {{ $labels.instance }}, please defrag or increase the quota as the writes to etcd will be disabled when it is full.",
"summary": "etcd cluster database is running full."
},
"expr": "(etcd_mvcc_db_total_size_in_bytes/etcd_server_quota_backend_bytes)*100 > 95\n",
"for": "10m",
@ -175,7 +141,8 @@ data:
{
"alert": "etcdExcessiveDatabaseGrowth",
"annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": Observed surge in etcd writes leading to 50% increase in database size over the past four hours on etcd instance {{ $labels.instance }}, please check as it might be disruptive."
"description": "etcd cluster \"{{ $labels.job }}\": Observed surge in etcd writes leading to 50% increase in database size over the past four hours on etcd instance {{ $labels.instance }}, please check as it might be disruptive.",
"summary": "etcd cluster database growing very fast."
},
"expr": "increase(((etcd_mvcc_db_total_size_in_bytes/etcd_server_quota_backend_bytes)*100)[240m:1m]) > 50\n",
"for": "10m",
@ -191,122 +158,113 @@ data:
{
"groups": [
{
"name": "kube-apiserver.rules",
"name": "kube-apiserver-burnrate.rules",
"rules": [
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[1d]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[1d]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[1d]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[1d]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[1d]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[1d]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\"}[1d]))\n -\n (\n (\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=~\"resource|\",le=\"1\"}[1d]))\n or\n vector(0)\n )\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"namespace\",le=\"5\"}[1d]))\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"cluster\",le=\"30\"}[1d]))\n )\n )\n +\n # errors\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[1d]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[1d]))\n",
"labels": {
"verb": "read"
},
"record": "apiserver_request:burnrate1d"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[1h]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[1h]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[1h]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[1h]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[1h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[1h]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\"}[1h]))\n -\n (\n (\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=~\"resource|\",le=\"1\"}[1h]))\n or\n vector(0)\n )\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"namespace\",le=\"5\"}[1h]))\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"cluster\",le=\"30\"}[1h]))\n )\n )\n +\n # errors\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[1h]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[1h]))\n",
"labels": {
"verb": "read"
},
"record": "apiserver_request:burnrate1h"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[2h]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[2h]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[2h]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[2h]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[2h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[2h]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\"}[2h]))\n -\n (\n (\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=~\"resource|\",le=\"1\"}[2h]))\n or\n vector(0)\n )\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"namespace\",le=\"5\"}[2h]))\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"cluster\",le=\"30\"}[2h]))\n )\n )\n +\n # errors\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[2h]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[2h]))\n",
"labels": {
"verb": "read"
},
"record": "apiserver_request:burnrate2h"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[30m]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[30m]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[30m]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[30m]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[30m]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[30m]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\"}[30m]))\n -\n (\n (\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=~\"resource|\",le=\"1\"}[30m]))\n or\n vector(0)\n )\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"namespace\",le=\"5\"}[30m]))\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"cluster\",le=\"30\"}[30m]))\n )\n )\n +\n # errors\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[30m]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[30m]))\n",
"labels": {
"verb": "read"
},
"record": "apiserver_request:burnrate30m"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[3d]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[3d]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[3d]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[3d]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[3d]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[3d]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\"}[3d]))\n -\n (\n (\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=~\"resource|\",le=\"1\"}[3d]))\n or\n vector(0)\n )\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"namespace\",le=\"5\"}[3d]))\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"cluster\",le=\"30\"}[3d]))\n )\n )\n +\n # errors\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[3d]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[3d]))\n",
"labels": {
"verb": "read"
},
"record": "apiserver_request:burnrate3d"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[5m]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[5m]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[5m]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[5m]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[5m]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[5m]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\"}[5m]))\n -\n (\n (\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=~\"resource|\",le=\"1\"}[5m]))\n or\n vector(0)\n )\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"namespace\",le=\"5\"}[5m]))\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"cluster\",le=\"30\"}[5m]))\n )\n )\n +\n # errors\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[5m]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[5m]))\n",
"labels": {
"verb": "read"
},
"record": "apiserver_request:burnrate5m"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[6h]))\n -\n (\n (\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[6h]))\n or\n vector(0)\n )\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[6h]))\n +\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[6h]))\n )\n )\n +\n # errors\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[6h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[6h]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\"}[6h]))\n -\n (\n (\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=~\"resource|\",le=\"1\"}[6h]))\n or\n vector(0)\n )\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"namespace\",le=\"5\"}[6h]))\n +\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\",scope=\"cluster\",le=\"30\"}[6h]))\n )\n )\n +\n # errors\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\",code=~\"5..\"}[6h]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[6h]))\n",
"labels": {
"verb": "read"
},
"record": "apiserver_request:burnrate6h"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[1d]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[1d]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[1d]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[1d]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\"}[1d]))\n -\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\",le=\"1\"}[1d]))\n )\n +\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[1d]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[1d]))\n",
"labels": {
"verb": "write"
},
"record": "apiserver_request:burnrate1d"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[1h]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[1h]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[1h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[1h]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\"}[1h]))\n -\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\",le=\"1\"}[1h]))\n )\n +\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[1h]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[1h]))\n",
"labels": {
"verb": "write"
},
"record": "apiserver_request:burnrate1h"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[2h]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[2h]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[2h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[2h]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\"}[2h]))\n -\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\",le=\"1\"}[2h]))\n )\n +\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[2h]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[2h]))\n",
"labels": {
"verb": "write"
},
"record": "apiserver_request:burnrate2h"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[30m]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[30m]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[30m]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[30m]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\"}[30m]))\n -\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\",le=\"1\"}[30m]))\n )\n +\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[30m]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[30m]))\n",
"labels": {
"verb": "write"
},
"record": "apiserver_request:burnrate30m"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[3d]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[3d]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[3d]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[3d]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\"}[3d]))\n -\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\",le=\"1\"}[3d]))\n )\n +\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[3d]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[3d]))\n",
"labels": {
"verb": "write"
},
"record": "apiserver_request:burnrate3d"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[5m]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[5m]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[5m]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[5m]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\"}[5m]))\n -\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\",le=\"1\"}[5m]))\n )\n +\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[5m]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[5m]))\n",
"labels": {
"verb": "write"
},
"record": "apiserver_request:burnrate5m"
},
{
"expr": "(\n (\n # too slow\n sum(rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[6h]))\n -\n sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[6h]))\n )\n +\n sum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[6h]))\n)\n/\nsum(rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[6h]))\n",
"expr": "(\n (\n # too slow\n sum by (cluster) (rate(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\"}[6h]))\n -\n sum by (cluster) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\",le=\"1\"}[6h]))\n )\n +\n sum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",code=~\"5..\"}[6h]))\n)\n/\nsum by (cluster) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[6h]))\n",
"labels": {
"verb": "write"
},
"record": "apiserver_request:burnrate6h"
}
]
},
{
"expr": "sum by (code,resource) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[5m]))\n",
"labels": {
"verb": "read"
},
"record": "code_resource:apiserver_request_total:rate5m"
},
"name": "kube-apiserver-histogram.rules",
"rules": [
{
"expr": "sum by (code,resource) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[5m]))\n",
"labels": {
"verb": "write"
},
"record": "code_resource:apiserver_request_total:rate5m"
},
{
"expr": "histogram_quantile(0.99, sum by (le, resource) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\"}[5m]))) > 0\n",
"expr": "histogram_quantile(0.99, sum by (cluster, le, resource) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",subresource!~\"proxy|attach|log|exec|portforward\"}[5m]))) > 0\n",
"labels": {
"quantile": "0.99",
"verb": "read"
@ -314,33 +272,12 @@ data:
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
},
{
"expr": "histogram_quantile(0.99, sum by (le, resource) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[5m]))) > 0\n",
"expr": "histogram_quantile(0.99, sum by (cluster, le, resource) (rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\",subresource!~\"proxy|attach|log|exec|portforward\"}[5m]))) > 0\n",
"labels": {
"quantile": "0.99",
"verb": "write"
},
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
},
{
"expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod))\n",
"labels": {
"quantile": "0.99"
},
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
},
{
"expr": "histogram_quantile(0.9, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod))\n",
"labels": {
"quantile": "0.9"
},
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
},
{
"expr": "histogram_quantile(0.5, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod))\n",
"labels": {
"quantile": "0.5"
},
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
}
]
},
@ -349,135 +286,89 @@ data:
"name": "kube-apiserver-availability.rules",
"rules": [
{
"expr": "1 - (\n (\n # write too slow\n sum(increase(apiserver_request_duration_seconds_count{verb=~\"POST|PUT|PATCH|DELETE\"}[30d]))\n -\n sum(increase(apiserver_request_duration_seconds_bucket{verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[30d]))\n ) +\n (\n # read too slow\n sum(increase(apiserver_request_duration_seconds_count{verb=~\"LIST|GET\"}[30d]))\n -\n (\n (\n sum(increase(apiserver_request_duration_seconds_bucket{verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[30d]))\n or\n vector(0)\n )\n +\n sum(increase(apiserver_request_duration_seconds_bucket{verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[30d]))\n +\n sum(increase(apiserver_request_duration_seconds_bucket{verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[30d]))\n )\n ) +\n # errors\n sum(code:apiserver_request_total:increase30d{code=~\"5..\"} or vector(0))\n)\n/\nsum(code:apiserver_request_total:increase30d)\n",
"expr": "avg_over_time(code_verb:apiserver_request_total:increase1h[30d]) * 24 * 30\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (cluster, code) (code_verb:apiserver_request_total:increase30d{verb=~\"LIST|GET\"})\n",
"labels": {
"verb": "read"
},
"record": "code:apiserver_request_total:increase30d"
},
{
"expr": "sum by (cluster, code) (code_verb:apiserver_request_total:increase30d{verb=~\"POST|PUT|PATCH|DELETE\"})\n",
"labels": {
"verb": "write"
},
"record": "code:apiserver_request_total:increase30d"
},
{
"expr": "sum by (cluster, verb, scope) (increase(apiserver_request_duration_seconds_count[1h]))\n",
"record": "cluster_verb_scope:apiserver_request_duration_seconds_count:increase1h"
},
{
"expr": "sum by (cluster, verb, scope) (avg_over_time(cluster_verb_scope:apiserver_request_duration_seconds_count:increase1h[30d]) * 24 * 30)\n",
"record": "cluster_verb_scope:apiserver_request_duration_seconds_count:increase30d"
},
{
"expr": "sum by (cluster, verb, scope, le) (increase(apiserver_request_duration_seconds_bucket[1h]))\n",
"record": "cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase1h"
},
{
"expr": "sum by (cluster, verb, scope, le) (avg_over_time(cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase1h[30d]) * 24 * 30)\n",
"record": "cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase30d"
},
{
"expr": "1 - (\n (\n # write too slow\n sum by (cluster) (cluster_verb_scope:apiserver_request_duration_seconds_count:increase30d{verb=~\"POST|PUT|PATCH|DELETE\"})\n -\n sum by (cluster) (cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase30d{verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"})\n ) +\n (\n # read too slow\n sum by (cluster) (cluster_verb_scope:apiserver_request_duration_seconds_count:increase30d{verb=~\"LIST|GET\"})\n -\n (\n (\n sum by (cluster) (cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase30d{verb=~\"LIST|GET\",scope=~\"resource|\",le=\"1\"})\n or\n vector(0)\n )\n +\n sum by (cluster) (cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase30d{verb=~\"LIST|GET\",scope=\"namespace\",le=\"5\"})\n +\n sum by (cluster) (cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase30d{verb=~\"LIST|GET\",scope=\"cluster\",le=\"30\"})\n )\n ) +\n # errors\n sum by (cluster) (code:apiserver_request_total:increase30d{code=~\"5..\"} or vector(0))\n)\n/\nsum by (cluster) (code:apiserver_request_total:increase30d)\n",
"labels": {
"verb": "all"
},
"record": "apiserver_request:availability30d"
},
{
"expr": "1 - (\n sum(increase(apiserver_request_duration_seconds_count{job=\"apiserver\",verb=~\"LIST|GET\"}[30d]))\n -\n (\n # too slow\n (\n sum(increase(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=~\"resource|\",le=\"0.1\"}[30d]))\n or\n vector(0)\n )\n +\n sum(increase(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"namespace\",le=\"0.5\"}[30d]))\n +\n sum(increase(apiserver_request_duration_seconds_bucket{job=\"apiserver\",verb=~\"LIST|GET\",scope=\"cluster\",le=\"5\"}[30d]))\n )\n +\n # errors\n sum(code:apiserver_request_total:increase30d{verb=\"read\",code=~\"5..\"} or vector(0))\n)\n/\nsum(code:apiserver_request_total:increase30d{verb=\"read\"})\n",
"expr": "1 - (\n sum by (cluster) (cluster_verb_scope:apiserver_request_duration_seconds_count:increase30d{verb=~\"LIST|GET\"})\n -\n (\n # too slow\n (\n sum by (cluster) (cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase30d{verb=~\"LIST|GET\",scope=~\"resource|\",le=\"1\"})\n or\n vector(0)\n )\n +\n sum by (cluster) (cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase30d{verb=~\"LIST|GET\",scope=\"namespace\",le=\"5\"})\n +\n sum by (cluster) (cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase30d{verb=~\"LIST|GET\",scope=\"cluster\",le=\"30\"})\n )\n +\n # errors\n sum by (cluster) (code:apiserver_request_total:increase30d{verb=\"read\",code=~\"5..\"} or vector(0))\n)\n/\nsum by (cluster) (code:apiserver_request_total:increase30d{verb=\"read\"})\n",
"labels": {
"verb": "read"
},
"record": "apiserver_request:availability30d"
},
{
"expr": "1 - (\n (\n # too slow\n sum(increase(apiserver_request_duration_seconds_count{verb=~\"POST|PUT|PATCH|DELETE\"}[30d]))\n -\n sum(increase(apiserver_request_duration_seconds_bucket{verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"}[30d]))\n )\n +\n # errors\n sum(code:apiserver_request_total:increase30d{verb=\"write\",code=~\"5..\"} or vector(0))\n)\n/\nsum(code:apiserver_request_total:increase30d{verb=\"write\"})\n",
"expr": "1 - (\n (\n # too slow\n sum by (cluster) (cluster_verb_scope:apiserver_request_duration_seconds_count:increase30d{verb=~\"POST|PUT|PATCH|DELETE\"})\n -\n sum by (cluster) (cluster_verb_scope_le:apiserver_request_duration_seconds_bucket:increase30d{verb=~\"POST|PUT|PATCH|DELETE\",le=\"1\"})\n )\n +\n # errors\n sum by (cluster) (code:apiserver_request_total:increase30d{verb=\"write\",code=~\"5..\"} or vector(0))\n)\n/\nsum by (cluster) (code:apiserver_request_total:increase30d{verb=\"write\"})\n",
"labels": {
"verb": "write"
},
"record": "apiserver_request:availability30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"LIST\",code=~\"2..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"GET\",code=~\"2..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"POST\",code=~\"2..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PUT\",code=~\"2..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PATCH\",code=~\"2..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"DELETE\",code=~\"2..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"LIST\",code=~\"3..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"GET\",code=~\"3..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"POST\",code=~\"3..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PUT\",code=~\"3..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PATCH\",code=~\"3..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"DELETE\",code=~\"3..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"LIST\",code=~\"4..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"GET\",code=~\"4..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"POST\",code=~\"4..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PUT\",code=~\"4..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PATCH\",code=~\"4..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"DELETE\",code=~\"4..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"LIST\",code=~\"5..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"GET\",code=~\"5..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"POST\",code=~\"5..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PUT\",code=~\"5..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"PATCH\",code=~\"5..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=\"DELETE\",code=~\"5..\"}[30d]))\n",
"record": "code_verb:apiserver_request_total:increase30d"
},
{
"expr": "sum by (code) (code_verb:apiserver_request_total:increase30d{verb=~\"LIST|GET\"})\n",
"expr": "sum by (cluster,code,resource) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET\"}[5m]))\n",
"labels": {
"verb": "read"
},
"record": "code:apiserver_request_total:increase30d"
"record": "code_resource:apiserver_request_total:rate5m"
},
{
"expr": "sum by (code) (code_verb:apiserver_request_total:increase30d{verb=~\"POST|PUT|PATCH|DELETE\"})\n",
"expr": "sum by (cluster,code,resource) (rate(apiserver_request_total{job=\"apiserver\",verb=~\"POST|PUT|PATCH|DELETE\"}[5m]))\n",
"labels": {
"verb": "write"
},
"record": "code:apiserver_request_total:increase30d"
"record": "code_resource:apiserver_request_total:rate5m"
},
{
"expr": "sum by (cluster, code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET|POST|PUT|PATCH|DELETE\",code=~\"2..\"}[1h]))\n",
"record": "code_verb:apiserver_request_total:increase1h"
},
{
"expr": "sum by (cluster, code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET|POST|PUT|PATCH|DELETE\",code=~\"3..\"}[1h]))\n",
"record": "code_verb:apiserver_request_total:increase1h"
},
{
"expr": "sum by (cluster, code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET|POST|PUT|PATCH|DELETE\",code=~\"4..\"}[1h]))\n",
"record": "code_verb:apiserver_request_total:increase1h"
},
{
"expr": "sum by (cluster, code, verb) (increase(apiserver_request_total{job=\"apiserver\",verb=~\"LIST|GET|POST|PUT|PATCH|DELETE\",code=~\"5..\"}[1h]))\n",
"record": "code_verb:apiserver_request_total:increase1h"
}
]
},
@ -485,8 +376,8 @@ data:
"name": "k8s.rules",
"rules": [
{
"expr": "sum by (cluster, namespace, pod, container) (\n rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])\n) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (\n 1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
"record": "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate"
"expr": "sum by (cluster, namespace, pod, container) (\n irate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\"}[5m])\n) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (\n 1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
"record": "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate"
},
{
"expr": "container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
@ -505,12 +396,36 @@ data:
"record": "node_namespace_pod_container:container_memory_swap"
},
{
"expr": "sum by (namespace) (\n sum by (namespace, pod) (\n max by (namespace, pod, container) (\n kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"}\n ) * on(namespace, pod) group_left() max by (namespace, pod) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n",
"record": "namespace:kube_pod_container_resource_requests_memory_bytes:sum"
"expr": "kube_pod_container_resource_requests{resource=\"memory\",job=\"kube-state-metrics\"} * on (namespace, pod, cluster)\ngroup_left() max by (namespace, pod, cluster) (\n (kube_pod_status_phase{phase=~\"Pending|Running\"} == 1)\n)\n",
"record": "cluster:namespace:pod_memory:active:kube_pod_container_resource_requests"
},
{
"expr": "sum by (namespace) (\n sum by (namespace, pod) (\n max by (namespace, pod, container) (\n kube_pod_container_resource_requests_cpu_cores{job=\"kube-state-metrics\"}\n ) * on(namespace, pod) group_left() max by (namespace, pod) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n",
"record": "namespace:kube_pod_container_resource_requests_cpu_cores:sum"
"expr": "sum by (namespace, cluster) (\n sum by (namespace, pod, cluster) (\n max by (namespace, pod, container, cluster) (\n kube_pod_container_resource_requests{resource=\"memory\",job=\"kube-state-metrics\"}\n ) * on(namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n",
"record": "namespace_memory:kube_pod_container_resource_requests:sum"
},
{
"expr": "kube_pod_container_resource_requests{resource=\"cpu\",job=\"kube-state-metrics\"} * on (namespace, pod, cluster)\ngroup_left() max by (namespace, pod, cluster) (\n (kube_pod_status_phase{phase=~\"Pending|Running\"} == 1)\n)\n",
"record": "cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests"
},
{
"expr": "sum by (namespace, cluster) (\n sum by (namespace, pod, cluster) (\n max by (namespace, pod, container, cluster) (\n kube_pod_container_resource_requests{resource=\"cpu\",job=\"kube-state-metrics\"}\n ) * on(namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n",
"record": "namespace_cpu:kube_pod_container_resource_requests:sum"
},
{
"expr": "kube_pod_container_resource_limits{resource=\"memory\",job=\"kube-state-metrics\"} * on (namespace, pod, cluster)\ngroup_left() max by (namespace, pod, cluster) (\n (kube_pod_status_phase{phase=~\"Pending|Running\"} == 1)\n)\n",
"record": "cluster:namespace:pod_memory:active:kube_pod_container_resource_limits"
},
{
"expr": "sum by (namespace, cluster) (\n sum by (namespace, pod, cluster) (\n max by (namespace, pod, container, cluster) (\n kube_pod_container_resource_limits{resource=\"memory\",job=\"kube-state-metrics\"}\n ) * on(namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n",
"record": "namespace_memory:kube_pod_container_resource_limits:sum"
},
{
"expr": "kube_pod_container_resource_limits{resource=\"cpu\",job=\"kube-state-metrics\"} * on (namespace, pod, cluster)\ngroup_left() max by (namespace, pod, cluster) (\n (kube_pod_status_phase{phase=~\"Pending|Running\"} == 1)\n )\n",
"record": "cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits"
},
{
"expr": "sum by (namespace, cluster) (\n sum by (namespace, pod, cluster) (\n max by (namespace, pod, container, cluster) (\n kube_pod_container_resource_limits{resource=\"cpu\",job=\"kube-state-metrics\"}\n ) * on(namespace, pod, cluster) group_left() max by (namespace, pod, cluster) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n",
"record": "namespace_cpu:kube_pod_container_resource_limits:sum"
},
{
"expr": "max by (cluster, namespace, workload, pod) (\n label_replace(\n label_replace(\n kube_pod_owner{job=\"kube-state-metrics\", owner_kind=\"ReplicaSet\"},\n \"replicaset\", \"$1\", \"owner_name\", \"(.*)\"\n ) * on(replicaset, namespace) group_left(owner_name) topk by(replicaset, namespace) (\n 1, max by (replicaset, namespace, owner_name) (\n kube_replicaset_owner{job=\"kube-state-metrics\"}\n )\n ),\n \"workload\", \"$1\", \"owner_name\", \"(.*)\"\n )\n)\n",
@ -532,6 +447,13 @@ data:
"workload_type": "statefulset"
},
"record": "namespace_workload_pod:kube_pod_owner:relabel"
},
{
"expr": "max by (cluster, namespace, workload, pod) (\n label_replace(\n kube_pod_owner{job=\"kube-state-metrics\", owner_kind=\"Job\"},\n \"workload\", \"$1\", \"owner_name\", \"(.*)\"\n )\n)\n",
"labels": {
"workload_type": "job"
},
"record": "namespace_workload_pod:kube_pod_owner:relabel"
}
]
},
@ -611,12 +533,16 @@ data:
"record": "node_namespace_pod:kube_pod_info:"
},
{
"expr": "count by (cluster, node) (sum by (node, cpu) (\n node_cpu_seconds_total{job=\"node-exporter\"}\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n))\n",
"expr": "count by (cluster, node) (sum by (node, cpu) (\n node_cpu_seconds_total{job=\"node-exporter\"}\n* on (namespace, pod) group_left(node)\n topk by(namespace, pod) (1, node_namespace_pod:kube_pod_info:)\n))\n",
"record": "node:node_num_cpu:sum"
},
{
"expr": "sum(\n node_memory_MemAvailable_bytes{job=\"node-exporter\"} or\n (\n node_memory_Buffers_bytes{job=\"node-exporter\"} +\n node_memory_Cached_bytes{job=\"node-exporter\"} +\n node_memory_MemFree_bytes{job=\"node-exporter\"} +\n node_memory_Slab_bytes{job=\"node-exporter\"}\n )\n) by (cluster)\n",
"record": ":node_memory_MemAvailable_bytes:sum"
},
{
"expr": "sum(rate(node_cpu_seconds_total{job=\"node-exporter\",mode!=\"idle\",mode!=\"iowait\",mode!=\"steal\"}[5m])) /\ncount(sum(node_cpu_seconds_total{job=\"node-exporter\"}) by (cluster, instance, cpu))\n",
"record": "cluster:node_cpu:ratio_rate5m"
}
]
},
@ -624,21 +550,21 @@ data:
"name": "kubelet.rules",
"rules": [
{
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_duration_seconds_bucket[5m])) by (instance, le) * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\"})\n",
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_duration_seconds_bucket[5m])) by (cluster, instance, le) * on(cluster, instance) group_left(node) kubelet_node_name{job=\"kubelet\"})\n",
"labels": {
"quantile": "0.99"
},
"record": "node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile"
},
{
"expr": "histogram_quantile(0.9, sum(rate(kubelet_pleg_relist_duration_seconds_bucket[5m])) by (instance, le) * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\"})\n",
"expr": "histogram_quantile(0.9, sum(rate(kubelet_pleg_relist_duration_seconds_bucket[5m])) by (cluster, instance, le) * on(cluster, instance) group_left(node) kubelet_node_name{job=\"kubelet\"})\n",
"labels": {
"quantile": "0.9"
},
"record": "node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile"
},
{
"expr": "histogram_quantile(0.5, sum(rate(kubelet_pleg_relist_duration_seconds_bucket[5m])) by (instance, le) * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\"})\n",
"expr": "histogram_quantile(0.5, sum(rate(kubelet_pleg_relist_duration_seconds_bucket[5m])) by (cluster, instance, le) * on(cluster, instance) group_left(node) kubelet_node_name{job=\"kubelet\"})\n",
"labels": {
"quantile": "0.5"
},
@ -652,11 +578,11 @@ data:
{
"alert": "KubePodCrashLooping",
"annotations": {
"description": "Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is restarting {{ printf \"%.2f\" $value }} times / 5 minutes.",
"description": "Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is in waiting state (reason: \"CrashLoopBackOff\").",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping",
"summary": "Pod is crash looping."
},
"expr": "rate(kube_pod_container_status_restarts_total{job=\"kube-state-metrics\"}[5m]) * 60 * 5 > 0\n",
"expr": "max_over_time(kube_pod_container_status_waiting_reason{reason=\"CrashLoopBackOff\", job=\"kube-state-metrics\"}[5m]) >= 1\n",
"for": "15m",
"labels": {
"severity": "warning"
@ -695,7 +621,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch",
"summary": "Deployment has not matched the expected number of replicas."
},
"expr": "(\n kube_deployment_spec_replicas{job=\"kube-state-metrics\"}\n !=\n kube_deployment_status_replicas_available{job=\"kube-state-metrics\"}\n) and (\n changes(kube_deployment_status_replicas_updated{job=\"kube-state-metrics\"}[5m])\n ==\n 0\n)\n",
"expr": "(\n kube_deployment_spec_replicas{job=\"kube-state-metrics\"}\n >\n kube_deployment_status_replicas_available{job=\"kube-state-metrics\"}\n) and (\n changes(kube_deployment_status_replicas_updated{job=\"kube-state-metrics\"}[10m])\n ==\n 0\n)\n",
"for": "15m",
"labels": {
"severity": "warning"
@ -708,7 +634,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetreplicasmismatch",
"summary": "Deployment has not matched the expected number of replicas."
},
"expr": "(\n kube_statefulset_status_replicas_ready{job=\"kube-state-metrics\"}\n !=\n kube_statefulset_status_replicas{job=\"kube-state-metrics\"}\n) and (\n changes(kube_statefulset_status_replicas_updated{job=\"kube-state-metrics\"}[5m])\n ==\n 0\n)\n",
"expr": "(\n kube_statefulset_status_replicas_ready{job=\"kube-state-metrics\"}\n !=\n kube_statefulset_status_replicas{job=\"kube-state-metrics\"}\n) and (\n changes(kube_statefulset_status_replicas_updated{job=\"kube-state-metrics\"}[10m])\n ==\n 0\n)\n",
"for": "15m",
"labels": {
"severity": "warning"
@ -747,7 +673,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetrolloutstuck",
"summary": "DaemonSet rollout is stuck."
},
"expr": "(\n (\n kube_daemonset_status_current_number_scheduled{job=\"kube-state-metrics\"}\n !=\n kube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"}\n ) or (\n kube_daemonset_status_number_misscheduled{job=\"kube-state-metrics\"}\n !=\n 0\n ) or (\n kube_daemonset_updated_number_scheduled{job=\"kube-state-metrics\"}\n !=\n kube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"}\n ) or (\n kube_daemonset_status_number_available{job=\"kube-state-metrics\"}\n !=\n kube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"}\n )\n) and (\n changes(kube_daemonset_updated_number_scheduled{job=\"kube-state-metrics\"}[5m])\n ==\n 0\n)\n",
"expr": "(\n (\n kube_daemonset_status_current_number_scheduled{job=\"kube-state-metrics\"}\n !=\n kube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"}\n ) or (\n kube_daemonset_status_number_misscheduled{job=\"kube-state-metrics\"}\n !=\n 0\n ) or (\n kube_daemonset_status_updated_number_scheduled{job=\"kube-state-metrics\"}\n !=\n kube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"}\n ) or (\n kube_daemonset_status_number_available{job=\"kube-state-metrics\"}\n !=\n kube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"}\n )\n) and (\n changes(kube_daemonset_status_updated_number_scheduled{job=\"kube-state-metrics\"}[5m])\n ==\n 0\n)\n",
"for": "15m",
"labels": {
"severity": "warning"
@ -756,7 +682,7 @@ data:
{
"alert": "KubeContainerWaiting",
"annotations": {
"description": "Pod {{ $labels.namespace }}/{{ $labels.pod }} container {{ $labels.container}} has been in waiting state for longer than 1 hour.",
"description": "pod/{{ $labels.pod }} in namespace {{ $labels.namespace }} on container {{ $labels.container}} has been in waiting state for longer than 1 hour.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting",
"summary": "Pod container waiting longer than 1 hour"
},
@ -821,11 +747,11 @@ data:
{
"alert": "KubeHpaReplicasMismatch",
"annotations": {
"description": "HPA {{ $labels.namespace }}/{{ $labels.hpa }} has not matched the desired number of replicas for longer than 15 minutes.",
"description": "HPA {{ $labels.namespace }}/{{ $labels.horizontalpodautoscaler }} has not matched the desired number of replicas for longer than 15 minutes.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpareplicasmismatch",
"summary": "HPA has not matched descired number of replicas."
},
"expr": "(kube_hpa_status_desired_replicas{job=\"kube-state-metrics\"}\n !=\nkube_hpa_status_current_replicas{job=\"kube-state-metrics\"})\n and\n(kube_hpa_status_current_replicas{job=\"kube-state-metrics\"}\n >\nkube_hpa_spec_min_replicas{job=\"kube-state-metrics\"})\n and\n(kube_hpa_status_current_replicas{job=\"kube-state-metrics\"}\n <\nkube_hpa_spec_max_replicas{job=\"kube-state-metrics\"})\n and\nchanges(kube_hpa_status_current_replicas[15m]) == 0\n",
"expr": "(kube_horizontalpodautoscaler_status_desired_replicas{job=\"kube-state-metrics\"}\n !=\nkube_horizontalpodautoscaler_status_current_replicas{job=\"kube-state-metrics\"})\n and\n(kube_horizontalpodautoscaler_status_current_replicas{job=\"kube-state-metrics\"}\n >\nkube_horizontalpodautoscaler_spec_min_replicas{job=\"kube-state-metrics\"})\n and\n(kube_horizontalpodautoscaler_status_current_replicas{job=\"kube-state-metrics\"}\n <\nkube_horizontalpodautoscaler_spec_max_replicas{job=\"kube-state-metrics\"})\n and\nchanges(kube_horizontalpodautoscaler_status_current_replicas{job=\"kube-state-metrics\"}[15m]) == 0\n",
"for": "15m",
"labels": {
"severity": "warning"
@ -834,11 +760,11 @@ data:
{
"alert": "KubeHpaMaxedOut",
"annotations": {
"description": "HPA {{ $labels.namespace }}/{{ $labels.hpa }} has been running at max replicas for longer than 15 minutes.",
"description": "HPA {{ $labels.namespace }}/{{ $labels.horizontalpodautoscaler }} has been running at max replicas for longer than 15 minutes.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpamaxedout",
"summary": "HPA is running at max replicas"
},
"expr": "kube_hpa_status_current_replicas{job=\"kube-state-metrics\"}\n ==\nkube_hpa_spec_max_replicas{job=\"kube-state-metrics\"}\n",
"expr": "kube_horizontalpodautoscaler_status_current_replicas{job=\"kube-state-metrics\"}\n ==\nkube_horizontalpodautoscaler_spec_max_replicas{job=\"kube-state-metrics\"}\n",
"for": "15m",
"labels": {
"severity": "warning"
@ -852,12 +778,12 @@ data:
{
"alert": "KubeCPUOvercommit",
"annotations": {
"description": "Cluster has overcommitted CPU resource requests for Pods and cannot tolerate node failure.",
"description": "Cluster has overcommitted CPU resource requests for Pods by {{ $value }} CPU shares and cannot tolerate node failure.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecpuovercommit",
"summary": "Cluster has overcommitted CPU resource requests."
},
"expr": "sum(namespace:kube_pod_container_resource_requests_cpu_cores:sum{})\n /\nsum(kube_node_status_allocatable_cpu_cores)\n >\n(count(kube_node_status_allocatable_cpu_cores)-1) / count(kube_node_status_allocatable_cpu_cores)\n",
"for": "5m",
"expr": "sum(namespace_cpu:kube_pod_container_resource_requests:sum{}) - (sum(kube_node_status_allocatable{resource=\"cpu\"}) - max(kube_node_status_allocatable{resource=\"cpu\"})) > 0\nand\n(sum(kube_node_status_allocatable{resource=\"cpu\"}) - max(kube_node_status_allocatable{resource=\"cpu\"})) > 0\n",
"for": "10m",
"labels": {
"severity": "warning"
}
@ -865,12 +791,12 @@ data:
{
"alert": "KubeMemoryOvercommit",
"annotations": {
"description": "Cluster has overcommitted memory resource requests for Pods and cannot tolerate node failure.",
"description": "Cluster has overcommitted memory resource requests for Pods by {{ $value | humanize }} bytes and cannot tolerate node failure.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememoryovercommit",
"summary": "Cluster has overcommitted memory resource requests."
},
"expr": "sum(namespace:kube_pod_container_resource_requests_memory_bytes:sum{})\n /\nsum(kube_node_status_allocatable_memory_bytes)\n >\n(count(kube_node_status_allocatable_memory_bytes)-1)\n /\ncount(kube_node_status_allocatable_memory_bytes)\n",
"for": "5m",
"expr": "sum(namespace_memory:kube_pod_container_resource_requests:sum{}) - (sum(kube_node_status_allocatable{resource=\"memory\"}) - max(kube_node_status_allocatable{resource=\"memory\"})) > 0\nand\n(sum(kube_node_status_allocatable{resource=\"memory\"}) - max(kube_node_status_allocatable{resource=\"memory\"})) > 0\n",
"for": "10m",
"labels": {
"severity": "warning"
}
@ -882,7 +808,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecpuquotaovercommit",
"summary": "Cluster has overcommitted CPU resource requests."
},
"expr": "sum(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\", resource=\"cpu\"})\n /\nsum(kube_node_status_allocatable_cpu_cores)\n > 1.5\n",
"expr": "sum(min without(resource) (kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\", resource=~\"(cpu|requests.cpu)\"}))\n /\nsum(kube_node_status_allocatable{resource=\"cpu\", job=\"kube-state-metrics\"})\n > 1.5\n",
"for": "5m",
"labels": {
"severity": "warning"
@ -895,7 +821,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememoryquotaovercommit",
"summary": "Cluster has overcommitted memory resource requests."
},
"expr": "sum(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\", resource=\"memory\"})\n /\nsum(kube_node_status_allocatable_memory_bytes{job=\"kube-state-metrics\"})\n > 1.5\n",
"expr": "sum(min without(resource) (kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\", resource=~\"(memory|requests.memory)\"}))\n /\nsum(kube_node_status_allocatable{resource=\"memory\", job=\"kube-state-metrics\"})\n > 1.5\n",
"for": "5m",
"labels": {
"severity": "warning"
@ -965,7 +891,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefillingup",
"summary": "PersistentVolume is filling up."
},
"expr": "kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\nkubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n < 0.03\n",
"expr": "(\n kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\n kubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n) < 0.03\nand\nkubelet_volume_stats_used_bytes{job=\"kubelet\"} > 0\nunless on(namespace, persistentvolumeclaim)\nkube_persistentvolumeclaim_access_mode{ access_mode=\"ReadOnlyMany\"} == 1\nunless on(namespace, persistentvolumeclaim)\nkube_persistentvolumeclaim_labels{label_excluded_from_alerts=\"true\"} == 1\n",
"for": "1m",
"labels": {
"severity": "critical"
@ -978,7 +904,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefillingup",
"summary": "PersistentVolume is filling up."
},
"expr": "(\n kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\n kubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n) < 0.15\nand\npredict_linear(kubelet_volume_stats_available_bytes{job=\"kubelet\"}[6h], 4 * 24 * 3600) < 0\n",
"expr": "(\n kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\n kubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n) < 0.15\nand\nkubelet_volume_stats_used_bytes{job=\"kubelet\"} > 0\nand\npredict_linear(kubelet_volume_stats_available_bytes{job=\"kubelet\"}[6h], 4 * 24 * 3600) < 0\nunless on(namespace, persistentvolumeclaim)\nkube_persistentvolumeclaim_access_mode{ access_mode=\"ReadOnlyMany\"} == 1\nunless on(namespace, persistentvolumeclaim)\nkube_persistentvolumeclaim_labels{label_excluded_from_alerts=\"true\"} == 1\n",
"for": "1h",
"labels": {
"severity": "warning"
@ -1009,7 +935,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeversionmismatch",
"summary": "Different semantic versions of Kubernetes components running."
},
"expr": "count(count by (gitVersion) (label_replace(kubernetes_build_info{job!~\"kube-dns|coredns\"},\"gitVersion\",\"$1\",\"gitVersion\",\"(v[0-9]*.[0-9]*).*\"))) > 1\n",
"expr": "count(count by (git_version) (label_replace(kubernetes_build_info{job!~\"kube-dns|coredns\"},\"git_version\",\"$1\",\"git_version\",\"(v[0-9]*.[0-9]*).*\"))) > 1\n",
"for": "15m",
"labels": {
"severity": "warning"
@ -1022,7 +948,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclienterrors",
"summary": "Kubernetes API server client is experiencing errors."
},
"expr": "(sum(rate(rest_client_requests_total{code=~\"5..\"}[5m])) by (instance, job)\n /\nsum(rate(rest_client_requests_total[5m])) by (instance, job))\n> 0.01\n",
"expr": "(sum(rate(rest_client_requests_total{code=~\"5..\"}[5m])) by (instance, job, namespace)\n /\nsum(rate(rest_client_requests_total[5m])) by (instance, job, namespace))\n> 0.01\n",
"for": "15m",
"labels": {
"severity": "warning"
@ -1101,7 +1027,7 @@ data:
{
"alert": "KubeClientCertificateExpiration",
"annotations": {
"description": "A client certificate used to authenticate to the apiserver is expiring in less than 1.0 hours.",
"description": "A client certificate used to authenticate to kubernetes apiserver is expiring in less than 1.0 hours.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration",
"summary": "Client certificate is about to expire."
},
@ -1113,7 +1039,7 @@ data:
{
"alert": "KubeClientCertificateExpiration",
"annotations": {
"description": "A client certificate used to authenticate to the apiserver is expiring in less than 0.1 hours.",
"description": "A client certificate used to authenticate to kubernetes apiserver is expiring in less than 0.1 hours.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration",
"summary": "Client certificate is about to expire."
},
@ -1123,23 +1049,23 @@ data:
}
},
{
"alert": "AggregatedAPIErrors",
"alert": "KubeAggregatedAPIErrors",
"annotations": {
"description": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. It has appeared unavailable {{ $value | humanize }} times averaged over the past 10m.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapierrors",
"summary": "An aggregated API has reported errors."
"description": "Kubernetes aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. It has appeared unavailable {{ $value | humanize }} times averaged over the past 10m.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeaggregatedapierrors",
"summary": "Kubernetes aggregated API has reported errors."
},
"expr": "sum by(name, namespace)(increase(aggregator_unavailable_apiservice_count[10m])) > 4\n",
"expr": "sum by(name, namespace)(increase(aggregator_unavailable_apiservice_total[10m])) > 4\n",
"labels": {
"severity": "warning"
}
},
{
"alert": "AggregatedAPIDown",
"alert": "KubeAggregatedAPIDown",
"annotations": {
"description": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has been only {{ $value | humanize }}% available over the last 10m.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapidown",
"summary": "An aggregated API is down."
"description": "Kubernetes aggregated API {{ $labels.name }}/{{ $labels.namespace }} has been only {{ $value | humanize }}% available over the last 10m.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeaggregatedapidown",
"summary": "Kubernetes aggregated API is down."
},
"expr": "(1 - max by(name, namespace)(avg_over_time(aggregator_unavailable_apiservice[10m]))) * 100 < 85\n",
"for": "5m",
@ -1159,6 +1085,19 @@ data:
"labels": {
"severity": "critical"
}
},
{
"alert": "KubeAPITerminatedRequests",
"annotations": {
"description": "The kubernetes apiserver has terminated {{ $value | humanizePercentage }} of its incoming requests.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapiterminatedrequests",
"summary": "The kubernetes apiserver has terminated {{ $value | humanizePercentage }} of its incoming requests."
},
"expr": "sum(rate(apiserver_request_terminations_total{job=\"apiserver\"}[10m])) / ( sum(rate(apiserver_request_total{job=\"apiserver\"}[10m])) + sum(rate(apiserver_request_terminations_total{job=\"apiserver\"}[10m])) ) > 0.20\n",
"for": "5m",
"labels": {
"severity": "warning"
}
}
]
},
@ -1198,10 +1137,10 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubelettoomanypods",
"summary": "Kubelet is running at capacity."
},
"expr": "count by(node) (\n (kube_pod_status_phase{job=\"kube-state-metrics\",phase=\"Running\"} == 1) * on(instance,pod,namespace,cluster) group_left(node) topk by(instance,pod,namespace,cluster) (1, kube_pod_info{job=\"kube-state-metrics\"})\n)\n/\nmax by(node) (\n kube_node_status_capacity_pods{job=\"kube-state-metrics\"} != 1\n) > 0.95\n",
"expr": "count by(node) (\n (kube_pod_status_phase{job=\"kube-state-metrics\",phase=\"Running\"} == 1) * on(instance,pod,namespace,cluster) group_left(node) topk by(instance,pod,namespace,cluster) (1, kube_pod_info{job=\"kube-state-metrics\"})\n)\n/\nmax by(node) (\n kube_node_status_capacity{job=\"kube-state-metrics\",resource=\"pods\"} != 1\n) > 0.95\n",
"for": "15m",
"labels": {
"severity": "warning"
"severity": "info"
}
},
{
@ -1367,6 +1306,24 @@ data:
}
}
]
},
{
"name": "kubernetes-system-kube-proxy",
"rules": [
{
"alert": "KubeProxyDown",
"annotations": {
"description": "KubeProxy has disappeared from Prometheus target discovery.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeproxydown",
"summary": "Target disappeared from Prometheus target discovery."
},
"expr": "absent(up{job=\"kube-proxy\"} == 1)\n",
"for": "15m",
"labels": {
"severity": "critical"
}
}
]
}
]
}
@ -1377,48 +1334,48 @@ data:
"name": "node-exporter.rules",
"rules": [
{
"expr": "count without (cpu) (\n count without (mode) (\n node_cpu_seconds_total{job=\"node-exporter\"}\n )\n)\n",
"expr": "count without (cpu, mode) (\n node_cpu_seconds_total{job=\"node-exporter\",mode=\"idle\"}\n)\n",
"record": "instance:node_num_cpu:sum"
},
{
"expr": "1 - avg without (cpu, mode) (\n rate(node_cpu_seconds_total{job=\"node-exporter\", mode=\"idle\"}[1m])\n)\n",
"record": "instance:node_cpu_utilisation:rate1m"
"expr": "1 - avg without (cpu) (\n sum without (mode) (rate(node_cpu_seconds_total{job=\"node-exporter\", mode=~\"idle|iowait|steal\"}[5m]))\n)\n",
"record": "instance:node_cpu_utilisation:rate5m"
},
{
"expr": "(\n node_load1{job=\"node-exporter\"}\n/\n instance:node_num_cpu:sum{job=\"node-exporter\"}\n)\n",
"record": "instance:node_load1_per_cpu:ratio"
},
{
"expr": "1 - (\n node_memory_MemAvailable_bytes{job=\"node-exporter\"}\n/\n node_memory_MemTotal_bytes{job=\"node-exporter\"}\n)\n",
"expr": "1 - (\n (\n node_memory_MemAvailable_bytes{job=\"node-exporter\"}\n or\n (\n node_memory_Buffers_bytes{job=\"node-exporter\"}\n +\n node_memory_Cached_bytes{job=\"node-exporter\"}\n +\n node_memory_MemFree_bytes{job=\"node-exporter\"}\n +\n node_memory_Slab_bytes{job=\"node-exporter\"}\n )\n )\n/\n node_memory_MemTotal_bytes{job=\"node-exporter\"}\n)\n",
"record": "instance:node_memory_utilisation:ratio"
},
{
"expr": "rate(node_vmstat_pgmajfault{job=\"node-exporter\"}[1m])\n",
"record": "instance:node_vmstat_pgmajfault:rate1m"
"expr": "rate(node_vmstat_pgmajfault{job=\"node-exporter\"}[5m])\n",
"record": "instance:node_vmstat_pgmajfault:rate5m"
},
{
"expr": "rate(node_disk_io_time_seconds_total{job=\"node-exporter\", device!~\"dm.*\"}[1m])\n",
"record": "instance_device:node_disk_io_time_seconds:rate1m"
"expr": "rate(node_disk_io_time_seconds_total{job=\"node-exporter\", device!~\"dm.*\"}[5m])\n",
"record": "instance_device:node_disk_io_time_seconds:rate5m"
},
{
"expr": "rate(node_disk_io_time_weighted_seconds_total{job=\"node-exporter\", device!~\"dm.*\"}[1m])\n",
"record": "instance_device:node_disk_io_time_weighted_seconds:rate1m"
"expr": "rate(node_disk_io_time_weighted_seconds_total{job=\"node-exporter\", device!~\"dm.*\"}[5m])\n",
"record": "instance_device:node_disk_io_time_weighted_seconds:rate5m"
},
{
"expr": "sum without (device) (\n rate(node_network_receive_bytes_total{job=\"node-exporter\", device!=\"lo\"}[1m])\n)\n",
"record": "instance:node_network_receive_bytes_excluding_lo:rate1m"
"expr": "sum without (device) (\n rate(node_network_receive_bytes_total{job=\"node-exporter\", device!=\"lo\"}[5m])\n)\n",
"record": "instance:node_network_receive_bytes_excluding_lo:rate5m"
},
{
"expr": "sum without (device) (\n rate(node_network_transmit_bytes_total{job=\"node-exporter\", device!=\"lo\"}[1m])\n)\n",
"record": "instance:node_network_transmit_bytes_excluding_lo:rate1m"
"expr": "sum without (device) (\n rate(node_network_transmit_bytes_total{job=\"node-exporter\", device!=\"lo\"}[5m])\n)\n",
"record": "instance:node_network_transmit_bytes_excluding_lo:rate5m"
},
{
"expr": "sum without (device) (\n rate(node_network_receive_drop_total{job=\"node-exporter\", device!=\"lo\"}[1m])\n)\n",
"record": "instance:node_network_receive_drop_excluding_lo:rate1m"
"expr": "sum without (device) (\n rate(node_network_receive_drop_total{job=\"node-exporter\", device!=\"lo\"}[5m])\n)\n",
"record": "instance:node_network_receive_drop_excluding_lo:rate5m"
},
{
"expr": "sum without (device) (\n rate(node_network_transmit_drop_total{job=\"node-exporter\", device!=\"lo\"}[1m])\n)\n",
"record": "instance:node_network_transmit_drop_excluding_lo:rate1m"
"expr": "sum without (device) (\n rate(node_network_transmit_drop_total{job=\"node-exporter\", device!=\"lo\"}[5m])\n)\n",
"record": "instance:node_network_transmit_drop_excluding_lo:rate5m"
}
]
},
@ -1456,7 +1413,7 @@ data:
"summary": "Filesystem has less than 5% space left."
},
"expr": "(\n node_filesystem_avail_bytes{job=\"node-exporter\",fstype!~\"tmpfs|nsfs|vfat\"} / node_filesystem_size_bytes{job=\"node-exporter\",fstype!~\"tmpfs|nsfs|vfat\"} * 100 < 5\nand\n node_filesystem_readonly{job=\"node-exporter\",fstype!~\"tmpfs|nsfs|vfat\"} == 0\n)\n",
"for": "1h",
"for": "30m",
"labels": {
"severity": "warning"
}
@ -1468,7 +1425,7 @@ data:
"summary": "Filesystem has less than 3% space left."
},
"expr": "(\n node_filesystem_avail_bytes{job=\"node-exporter\",fstype!~\"tmpfs|nsfs|vfat\"} / node_filesystem_size_bytes{job=\"node-exporter\",fstype!~\"tmpfs|nsfs|vfat\"} * 100 < 3\nand\n node_filesystem_readonly{job=\"node-exporter\",fstype!~\"tmpfs|nsfs|vfat\"} == 0\n)\n",
"for": "1h",
"for": "30m",
"labels": {
"severity": "critical"
}
@ -1570,7 +1527,7 @@ data:
{
"alert": "NodeClockSkewDetected",
"annotations": {
"message": "Clock on {{ $labels.instance }} is out of sync by more than 300s. Ensure NTP is configured correctly on this host.",
"description": "Clock on {{ $labels.instance }} is out of sync by more than 300s. Ensure NTP is configured correctly on this host.",
"summary": "Clock skew detected."
},
"expr": "(\n node_timex_offset_seconds > 0.05\nand\n deriv(node_timex_offset_seconds[5m]) >= 0\n)\nor\n(\n node_timex_offset_seconds < -0.05\nand\n deriv(node_timex_offset_seconds[5m]) <= 0\n)\n",
@ -1582,7 +1539,7 @@ data:
{
"alert": "NodeClockNotSynchronising",
"annotations": {
"message": "Clock on {{ $labels.instance }} is not synchronising. Ensure NTP is configured on this host.",
"description": "Clock on {{ $labels.instance }} is not synchronising. Ensure NTP is configured on this host.",
"summary": "Clock not synchronising."
},
"expr": "min_over_time(node_timex_sync_status[5m]) == 0\nand\nnode_timex_maxerror_seconds >= 16\n",
@ -1609,10 +1566,34 @@ data:
"description": "At least one device in RAID array on {{ $labels.instance }} failed. Array '{{ $labels.device }}' needs attention and possibly a disk swap.",
"summary": "Failed device in RAID array"
},
"expr": "node_md_disks{state=\"fail\"} > 0\n",
"expr": "node_md_disks{state=\"failed\"} > 0\n",
"labels": {
"severity": "warning"
}
},
{
"alert": "NodeFileDescriptorLimit",
"annotations": {
"description": "File descriptors limit at {{ $labels.instance }} is currently at {{ printf \"%.2f\" $value }}%.",
"summary": "Kernel is predicted to exhaust file descriptors limit soon."
},
"expr": "(\n node_filefd_allocated{job=\"node-exporter\"} * 100 / node_filefd_maximum{job=\"node-exporter\"} > 70\n)\n",
"for": "15m",
"labels": {
"severity": "warning"
}
},
{
"alert": "NodeFileDescriptorLimit",
"annotations": {
"description": "File descriptors limit at {{ $labels.instance }} is currently at {{ printf \"%.2f\" $value }}%.",
"summary": "Kernel is predicted to exhaust file descriptors limit soon."
},
"expr": "(\n node_filefd_allocated{job=\"node-exporter\"} * 100 / node_filefd_maximum{job=\"node-exporter\"} > 90\n)\n",
"for": "15m",
"labels": {
"severity": "critical"
}
}
]
}
@ -1738,7 +1719,7 @@ data:
"description": "Prometheus {{$labels.instance}} failed to send {{ printf \"%.1f\" $value }}% of the samples to {{ $labels.remote_name}}:{{ $labels.url }}",
"summary": "Prometheus fails to send samples to remote storage."
},
"expr": "(\n rate(prometheus_remote_storage_failed_samples_total{job=\"prometheus\"}[5m])\n/\n (\n rate(prometheus_remote_storage_failed_samples_total{job=\"prometheus\"}[5m])\n +\n rate(prometheus_remote_storage_succeeded_samples_total{job=\"prometheus\"}[5m])\n )\n)\n* 100\n> 1\n",
"expr": "(\n (rate(prometheus_remote_storage_failed_samples_total{job=\"prometheus\"}[5m]) or rate(prometheus_remote_storage_samples_failed_total{job=\"prometheus\"}[5m]))\n/\n (\n (rate(prometheus_remote_storage_failed_samples_total{job=\"prometheus\"}[5m]) or rate(prometheus_remote_storage_samples_failed_total{job=\"prometheus\"}[5m]))\n +\n (rate(prometheus_remote_storage_succeeded_samples_total{job=\"prometheus\"}[5m]) or rate(prometheus_remote_storage_samples_total{job=\"prometheus\"}[5m]))\n )\n)\n* 100\n> 1\n",
"for": "15m",
"labels": {
"severity": "critical"
@ -1804,6 +1785,30 @@ data:
"severity": "warning"
}
},
{
"alert": "PrometheusLabelLimitHit",
"annotations": {
"description": "Prometheus {{$labels.instance}} has dropped {{ printf \"%.0f\" $value }} targets because some samples exceeded the configured label_limit, label_name_length_limit or label_value_length_limit.",
"summary": "Prometheus has dropped targets because some scrape configs have exceeded the labels limit."
},
"expr": "increase(prometheus_target_scrape_pool_exceeded_label_limits_total{job=\"prometheus\"}[5m]) > 0\n",
"for": "15m",
"labels": {
"severity": "warning"
}
},
{
"alert": "PrometheusTargetSyncFailure",
"annotations": {
"description": "{{ printf \"%.0f\" $value }} targets in Prometheus {{$labels.instance}} have failed to sync because invalid configuration was supplied.",
"summary": "Prometheus has failed to sync targets."
},
"expr": "increase(prometheus_target_sync_failed_total{job=\"prometheus\"}[30m]) > 0\n",
"for": "5m",
"labels": {
"severity": "critical"
}
},
{
"alert": "PrometheusErrorSendingAlertsToAnyAlertmanager",
"annotations": {

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.20.2 (upstream)
* Kubernetes v1.24.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/fedora-coreos/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,4 +1,3 @@
data "aws_ami" "fedora-coreos" {
most_recent = true
owners = ["125523088429"]
@ -19,14 +18,11 @@ data "aws_ami" "fedora-coreos" {
}
}
# Experimental Fedora CoreOS arm64 / aarch64 AMIs from Poseidon
# WARNING: These AMIs will be removed when Fedora CoreOS publishes arm64 AMIs
# and may be removed for any reason before then as well. Do not use.
data "aws_ami" "fedora-coreos-arm" {
count = var.arch == "arm64" ? 1 : 0
most_recent = true
owners = ["099663496933"]
owners = ["125523088429"]
filter {
name = "architecture"
@ -39,8 +35,7 @@ data "aws_ami" "fedora-coreos-arm" {
}
filter {
name = "name"
values = ["fedora-coreos-*"]
name = "description"
values = ["Fedora CoreOS ${var.os_stream} *"]
}
}

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=b3bf2ecbbe0addab2ed473148589c3454d5b366f"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=f325be50417e27dde6599a293409cb7d068cda0c"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -13,7 +13,5 @@ module "bootstrap" {
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
daemonset_tolerations = var.daemonset_tolerations
trusted_certs_dir = "/etc/pki/tls/certs"
}

View File

@ -62,7 +62,6 @@ data "template_file" "controller-configs" {
vars = {
# Cannot use cyclic dependencies on controllers or their DNS records
etcd_arch = var.arch == "arm64" ? "-arm64" : ""
etcd_name = "etcd${count.index}"
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...

View File

@ -1,6 +1,6 @@
---
variant: fcos
version: 1.1.0
version: 1.4.0
systemd:
units:
- name: etcd-member.service
@ -12,7 +12,7 @@ systemd:
Wants=network-online.target network.target
After=network-online.target
[Service]
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14${etcd_arch}
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.4
Type=exec
ExecStartPre=/bin/mkdir -p /var/lib/etcd
ExecStartPre=-/usr/bin/podman rm etcd
@ -29,8 +29,10 @@ systemd:
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
- name: docker.service
- name: containerd.service
enabled: true
- name: docker.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
@ -50,10 +52,13 @@ systemd:
contents: |
[Unit]
Description=Kubelet (System Container)
Requires=afterburn.service
After=afterburn.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
EnvironmentFile=/run/metadata/afterburn
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -64,14 +69,17 @@ systemd:
--privileged \
--pid host \
--network host \
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
--volume /etc/machine-id:/etc/machine-id:ro \
--volume /usr/lib/os-release:/etc/os-release:ro \
--volume /lib/modules:/lib/modules:ro \
--volume /run:/run \
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
--volume /sys/fs/cgroup:/sys/fs/cgroup \
--volume /etc/selinux:/etc/selinux \
--volume /sys/fs/selinux:/sys/fs/selinux \
--volume /var/lib/calico:/var/lib/calico:ro \
--volume /var/lib/docker:/var/lib/docker \
--volume /var/lib/containerd:/var/lib/containerd \
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
@ -83,17 +91,19 @@ systemd:
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--enforce-node-allocatable=pods \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--provider-id=aws:///$${AFTERBURN_AWS_AVAILABILITY_ZONE}/$${AFTERBURN_AWS_INSTANCE_ID} \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -119,7 +129,7 @@ systemd:
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
quay.io/poseidon/kubelet:v1.20.2
quay.io/poseidon/kubelet:v1.24.1
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -214,7 +224,26 @@ storage:
ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt
ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key
ETCD_PEER_CLIENT_CERT_AUTH=true
ETCD_UNSUPPORTED_ARCH=arm64
- path: /etc/fedora-coreos/iptables-legacy.stamp
- path: /etc/containerd/config.toml
overwrite: true
contents:
inline: |
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
subreaper = true
oom_score = -999
[grpc]
address = "/run/containerd/containerd.sock"
uid = 0
gid = 0
[plugins."io.containerd.grpc.v1.cri"]
enable_selinux = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
passwd:
users:
- name: core

View File

@ -201,8 +201,8 @@ resource "aws_security_group_rule" "controller-scheduler-metrics" {
type = "ingress"
protocol = "tcp"
from_port = 10251
to_port = 10251
from_port = 10259
to_port = 10259
source_security_group_id = aws_security_group.worker.id
}
@ -212,8 +212,8 @@ resource "aws_security_group_rule" "controller-manager-metrics" {
type = "ingress"
protocol = "tcp"
from_port = 10252
to_port = 10252
from_port = 10257
to_port = 10257
source_security_group_id = aws_security_group.worker.id
}

View File

@ -24,7 +24,7 @@ resource "null_resource" "copy-controller-secrets" {
provisioner "file" {
content = join("\n", local.assets_bundle)
destination = "$HOME/assets"
destination = "/home/core/assets"
}
provisioner "remote-exec" {

View File

@ -55,19 +55,19 @@ variable "os_stream" {
variable "disk_size" {
type = number
description = "Size of the EBS volume in GB"
default = 40
default = 30
}
variable "disk_type" {
type = string
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
default = "gp3"
}
variable "disk_iops" {
type = number
description = "IOPS of the EBS volume (e.g. 100)"
default = 0
description = "IOPS of the EBS volume (e.g. 3000)"
default = 3000
}
variable "worker_price" {
@ -84,13 +84,13 @@ variable "worker_target_groups" {
variable "controller_snippets" {
type = list(string)
description = "Controller Fedora CoreOS Config snippets"
description = "Controller Butane snippets"
default = []
}
variable "worker_snippets" {
type = list(string)
description = "Worker Fedora CoreOS Config snippets"
description = "Worker Butane snippets"
default = []
}
@ -103,8 +103,8 @@ variable "ssh_authorized_key" {
variable "networking" {
type = string
description = "Choice of networking provider (calico or flannel)"
default = "calico"
description = "Choice of networking provider (flannel, calico, or cilium)"
default = "cilium"
}
variable "network_mtu" {
@ -142,8 +142,8 @@ variable "enable_reporting" {
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
default = false
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
variable "worker_node_labels" {
@ -176,4 +176,3 @@ variable "daemonset_tolerations" {
description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
default = []
}

View File

@ -1,15 +1,15 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.13.0, < 0.15.0"
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
aws = ">= 2.23, <= 4.0"
template = "~> 2.1"
null = "~> 2.1"
aws = ">= 2.23, <= 5.0"
template = "~> 2.2"
null = ">= 2.1"
ct = {
source = "poseidon/ct"
version = "~> 0.6"
version = "~> 0.9"
}
}
}

View File

@ -1,4 +1,3 @@
data "aws_ami" "fedora-coreos" {
most_recent = true
owners = ["125523088429"]
@ -19,14 +18,11 @@ data "aws_ami" "fedora-coreos" {
}
}
# Experimental Fedora CoreOS arm64 / aarch64 AMIs from Poseidon
# WARNING: These AMIs will be removed when Fedora CoreOS publishes arm64 AMIs
# and may be removed for any reason before then as well. Do not use.
data "aws_ami" "fedora-coreos-arm" {
count = var.arch == "arm64" ? 1 : 0
most_recent = true
owners = ["099663496933"]
owners = ["125523088429"]
filter {
name = "architecture"
@ -39,8 +35,7 @@ data "aws_ami" "fedora-coreos-arm" {
}
filter {
name = "name"
values = ["fedora-coreos-*"]
name = "description"
values = ["Fedora CoreOS ${var.os_stream} *"]
}
}

View File

@ -1,10 +1,12 @@
---
variant: fcos
version: 1.1.0
version: 1.4.0
systemd:
units:
- name: docker.service
- name: containerd.service
enabled: true
- name: docker.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
@ -23,10 +25,13 @@ systemd:
contents: |
[Unit]
Description=Kubelet (System Container)
Requires=afterburn.service
After=afterburn.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
EnvironmentFile=/run/metadata/afterburn
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -37,14 +42,17 @@ systemd:
--privileged \
--pid host \
--network host \
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
--volume /etc/machine-id:/etc/machine-id:ro \
--volume /usr/lib/os-release:/etc/os-release:ro \
--volume /lib/modules:/lib/modules:ro \
--volume /run:/run \
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
--volume /sys/fs/cgroup:/sys/fs/cgroup \
--volume /etc/selinux:/etc/selinux \
--volume /sys/fs/selinux:/sys/fs/selinux \
--volume /var/lib/calico:/var/lib/calico:ro \
--volume /var/lib/docker:/var/lib/docker \
--volume /var/lib/containerd:/var/lib/containerd \
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
@ -56,14 +64,14 @@ systemd:
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--enforce-node-allocatable=pods \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
@ -72,7 +80,9 @@ systemd:
--register-with-taints=${taint} \
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--provider-id=aws:///$${AFTERBURN_AWS_AVAILABILITY_ZONE}/$${AFTERBURN_AWS_INSTANCE_ID} \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
@ -87,7 +97,7 @@ systemd:
[Unit]
Description=Delete Kubernetes node on shutdown
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
@ -126,9 +136,28 @@ storage:
DefaultCPUAccounting=yes
DefaultMemoryAccounting=yes
DefaultBlockIOAccounting=yes
- path: /etc/fedora-coreos/iptables-legacy.stamp
- path: /etc/containerd/config.toml
overwrite: true
contents:
inline: |
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
subreaper = true
oom_score = -999
[grpc]
address = "/run/containerd/containerd.sock"
uid = 0
gid = 0
[plugins."io.containerd.grpc.v1.cri"]
enable_selinux = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
passwd:
users:
- name: core
ssh_authorized_keys:
- ${ssh_authorized_key}

View File

@ -48,13 +48,13 @@ variable "os_stream" {
variable "disk_size" {
type = number
description = "Size of the EBS volume in GB"
default = 40
default = 30
}
variable "disk_type" {
type = string
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
default = "gp3"
}
variable "disk_iops" {
@ -77,7 +77,7 @@ variable "target_groups" {
variable "snippets" {
type = list(string)
description = "Fedora CoreOS Config snippets"
description = "Butane snippets"
default = []
}

View File

@ -1,14 +1,14 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.13.0, < 0.15.0"
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
aws = ">= 2.23, <= 4.0"
template = "~> 2.1"
aws = ">= 2.23, <= 5.0"
template = "~> 2.2"
ct = {
source = "poseidon/ct"
version = "~> 0.6"
version = "~> 0.9"
}
}
}

View File

@ -33,13 +33,11 @@ resource "aws_autoscaling_group" "workers" {
# used. Disable wait to avoid issues and align with other clouds.
wait_for_capacity_timeout = "0"
tags = [
{
tag {
key = "Name"
value = "${var.name}-worker"
propagate_at_launch = true
},
]
}
}
# Worker template

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.20.2 (upstream)
* Kubernetes v1.24.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/flatcar-linux/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,7 +1,7 @@
locals {
# Pick a Flatcar Linux AMI
# flatcar-stable -> Flatcar Linux AMI
ami_id = data.aws_ami.flatcar.image_id
ami_id = var.arch == "arm64" ? data.aws_ami.flatcar-arm64[0].image_id : data.aws_ami.flatcar.image_id
channel = split("-", var.os_image)[1]
}
@ -25,3 +25,25 @@ data "aws_ami" "flatcar" {
}
}
data "aws_ami" "flatcar-arm64" {
count = var.arch == "arm64" ? 1 : 0
most_recent = true
owners = ["075585003325"]
filter {
name = "architecture"
values = ["arm64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["Flatcar-${local.channel}-*"]
}
}

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=b3bf2ecbbe0addab2ed473148589c3454d5b366f"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=f325be50417e27dde6599a293409cb7d068cda0c"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -12,5 +12,6 @@ module "bootstrap" {
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
daemonset_tolerations = var.daemonset_tolerations
}

View File

@ -10,7 +10,7 @@ systemd:
Requires=docker.service
After=docker.service
[Service]
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.4
ExecStartPre=/usr/bin/docker run -d \
--name etcd \
--network host \
@ -53,11 +53,13 @@ systemd:
Description=Kubelet (System Container)
Requires=docker.service
After=docker.service
Requires=coreos-metadata.service
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -68,15 +70,15 @@ systemd:
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
-v /sys/fs/cgroup:/sys/fs/cgroup \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/docker:/var/lib/docker \
-v /var/lib/containerd:/var/lib/containerd \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
@ -85,17 +87,19 @@ systemd:
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--cgroup-driver=systemd \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--provider-id=aws:///$${COREOS_EC2_AVAILABILITY_ZONE}/$${COREOS_EC2_INSTANCE_ID} \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -117,7 +121,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStart=/usr/bin/docker run \
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
-v /opt/bootstrap/assets:/assets:ro \

View File

@ -67,7 +67,6 @@ data "template_file" "controller-configs" {
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = join(",", data.template_file.etcds.*.rendered)
cgroup_driver = local.channel == "edge" ? "systemd" : "cgroupfs"
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)

View File

@ -201,8 +201,8 @@ resource "aws_security_group_rule" "controller-scheduler-metrics" {
type = "ingress"
protocol = "tcp"
from_port = 10251
to_port = 10251
from_port = 10259
to_port = 10259
source_security_group_id = aws_security_group.worker.id
}
@ -212,8 +212,8 @@ resource "aws_security_group_rule" "controller-manager-metrics" {
type = "ingress"
protocol = "tcp"
from_port = 10252
to_port = 10252
from_port = 10257
to_port = 10257
source_security_group_id = aws_security_group.worker.id
}

View File

@ -24,7 +24,7 @@ resource "null_resource" "copy-controller-secrets" {
provisioner "file" {
content = join("\n", local.assets_bundle)
destination = "$HOME/assets"
destination = "/home/core/assets"
}
provisioner "remote-exec" {

View File

@ -43,31 +43,31 @@ variable "worker_type" {
variable "os_image" {
type = string
description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
default = "flatcar-stable"
validation {
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha", "flatcar-edge"], var.os_image)
error_message = "The os_image must be flatcar-stable, flatcar-beta, flatcar-alpha, or flatcar-edge."
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
}
}
variable "disk_size" {
type = number
description = "Size of the EBS volume in GB"
default = 40
default = 30
}
variable "disk_type" {
type = string
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
default = "gp3"
}
variable "disk_iops" {
type = number
description = "IOPS of the EBS volume (e.g. 100)"
default = 0
description = "IOPS of the EBS volume (e.g. 3000)"
default = 3000
}
variable "worker_price" {
@ -103,8 +103,8 @@ variable "ssh_authorized_key" {
variable "networking" {
type = string
description = "Choice of networking provider (calico or flannel)"
default = "calico"
description = "Choice of networking provider (flannel, calico, or cilium)"
default = "cilium"
}
variable "network_mtu" {
@ -142,8 +142,8 @@ variable "enable_reporting" {
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
default = false
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
variable "worker_node_labels" {
@ -160,3 +160,19 @@ variable "cluster_domain_suffix" {
default = "cluster.local"
}
variable "arch" {
type = string
description = "Container architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = var.arch == "amd64" || var.arch == "arm64"
error_message = "The arch must be amd64 or arm64."
}
}
variable "daemonset_tolerations" {
type = list(string)
description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
default = []
}

View File

@ -1,15 +1,15 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.13.0, < 0.15.0"
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
aws = ">= 2.23, <= 4.0"
template = "~> 2.1"
null = "~> 2.1"
aws = ">= 2.23, <= 5.0"
template = "~> 2.2"
null = ">= 2.1"
ct = {
source = "poseidon/ct"
version = "~> 0.6"
version = "~> 0.9"
}
}
}

View File

@ -9,6 +9,7 @@ module "workers" {
worker_count = var.worker_count
instance_type = var.worker_type
os_image = var.os_image
arch = var.arch
disk_size = var.disk_size
spot_price = var.worker_price
target_groups = var.worker_target_groups

View File

@ -1,7 +1,7 @@
locals {
# Pick a Flatcar Linux AMI
# flatcar-stable -> Flatcar Linux AMI
ami_id = data.aws_ami.flatcar.image_id
ami_id = var.arch == "arm64" ? data.aws_ami.flatcar-arm64[0].image_id : data.aws_ami.flatcar.image_id
channel = split("-", var.os_image)[1]
}
@ -25,3 +25,24 @@ data "aws_ami" "flatcar" {
}
}
data "aws_ami" "flatcar-arm64" {
count = var.arch == "arm64" ? 1 : 0
most_recent = true
owners = ["075585003325"]
filter {
name = "architecture"
values = ["arm64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["Flatcar-${local.channel}-*"]
}
}

View File

@ -25,11 +25,13 @@ systemd:
Description=Kubelet
Requires=docker.service
After=docker.service
Requires=coreos-metadata.service
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -43,15 +45,15 @@ systemd:
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
-v /sys/fs/cgroup:/sys/fs/cgroup \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/docker:/var/lib/docker \
-v /var/lib/containerd:/var/lib/containerd \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
@ -60,20 +62,25 @@ systemd:
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--cgroup-driver=systemd \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
%{~ endfor ~}
%{~ for taint in split(",", node_taints) ~}
--register-with-taints=${taint} \
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--provider-id=aws:///$${COREOS_EC2_AVAILABILITY_ZONE}/$${COREOS_EC2_INSTANCE_ID} \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStart=docker logs -f kubelet
@ -89,7 +96,7 @@ systemd:
[Unit]
Description=Delete Kubernetes node on shutdown
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true

View File

@ -36,25 +36,25 @@ variable "instance_type" {
variable "os_image" {
type = string
description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
default = "flatcar-stable"
validation {
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha", "flatcar-edge"], var.os_image)
error_message = "The os_image must be flatcar-stable, flatcar-beta, flatcar-alpha, or flatcar-edge."
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
}
}
variable "disk_size" {
type = number
description = "Size of the EBS volume in GB"
default = 40
default = 30
}
variable "disk_type" {
type = string
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
default = "gp3"
}
variable "disk_iops" {
@ -113,3 +113,22 @@ variable "node_labels" {
description = "List of initial node labels"
default = []
}
variable "node_taints" {
type = list(string)
description = "List of initial node taints"
default = []
}
# unofficial, undocumented, unsupported
variable "arch" {
type = string
description = "Container architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = var.arch == "amd64" || var.arch == "arm64"
error_message = "The arch must be amd64 or arm64."
}
}

View File

@ -1,14 +1,14 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.13.0, < 0.15.0"
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
aws = ">= 2.23, <= 4.0"
template = "~> 2.1"
aws = ">= 2.23, <= 5.0"
template = "~> 2.2"
ct = {
source = "poseidon/ct"
version = "~> 0.6"
version = "~> 0.9"
}
}
}

View File

@ -33,13 +33,11 @@ resource "aws_autoscaling_group" "workers" {
# used. Disable wait to avoid issues and align with other clouds.
wait_for_capacity_timeout = "0"
tags = [
{
tag {
key = "Name"
value = "${var.name}-worker"
propagate_at_launch = true
},
]
}
}
# Worker template
@ -85,8 +83,8 @@ data "template_file" "worker-config" {
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
cgroup_driver = local.channel == "edge" ? "systemd" : "cgroupfs"
node_labels = join(",", var.node_labels)
node_taints = join(",", var.node_taints)
}
}

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.20.2 (upstream)
* Kubernetes v1.24.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot priority](https://typhoon.psdn.io/fedora-coreos/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=b3bf2ecbbe0addab2ed473148589c3454d5b366f"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=f325be50417e27dde6599a293409cb7d068cda0c"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -18,8 +18,6 @@ module "bootstrap" {
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
# Fedora CoreOS
trusted_certs_dir = "/etc/pki/tls/certs"
daemonset_tolerations = var.daemonset_tolerations
}

View File

@ -1,6 +1,6 @@
---
variant: fcos
version: 1.1.0
version: 1.4.0
systemd:
units:
- name: etcd-member.service
@ -12,7 +12,7 @@ systemd:
Wants=network-online.target network.target
After=network-online.target
[Service]
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.4
Type=exec
ExecStartPre=/bin/mkdir -p /var/lib/etcd
ExecStartPre=-/usr/bin/podman rm etcd
@ -29,8 +29,10 @@ systemd:
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
- name: docker.service
- name: containerd.service
enabled: true
- name: docker.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
@ -51,8 +53,8 @@ systemd:
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -63,14 +65,17 @@ systemd:
--privileged \
--pid host \
--network host \
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
--volume /etc/machine-id:/etc/machine-id:ro \
--volume /usr/lib/os-release:/etc/os-release:ro \
--volume /lib/modules:/lib/modules:ro \
--volume /run:/run \
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
--volume /sys/fs/cgroup:/sys/fs/cgroup \
--volume /etc/selinux:/etc/selinux \
--volume /sys/fs/selinux:/sys/fs/selinux \
--volume /var/lib/calico:/var/lib/calico:ro \
--volume /var/lib/docker:/var/lib/docker \
--volume /var/lib/containerd:/var/lib/containerd \
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
@ -82,17 +87,18 @@ systemd:
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--enforce-node-allocatable=pods \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -118,7 +124,7 @@ systemd:
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
quay.io/poseidon/kubelet:v1.20.2
quay.io/poseidon/kubelet:v1.24.1
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -213,6 +219,26 @@ storage:
ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt
ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key
ETCD_PEER_CLIENT_CERT_AUTH=true
- path: /etc/fedora-coreos/iptables-legacy.stamp
- path: /etc/containerd/config.toml
overwrite: true
contents:
inline: |
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
subreaper = true
oom_score = -999
[grpc]
address = "/run/containerd/containerd.sock"
uid = 0
gid = 0
[plugins."io.containerd.grpc.v1.cri"]
enable_selinux = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
passwd:
users:
- name: core

View File

@ -53,8 +53,6 @@ resource "azurerm_lb" "cluster" {
}
resource "azurerm_lb_rule" "apiserver" {
resource_group_name = azurerm_resource_group.cluster.name
name = "apiserver"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "apiserver"
@ -62,13 +60,11 @@ resource "azurerm_lb_rule" "apiserver" {
protocol = "Tcp"
frontend_port = 6443
backend_port = 6443
backend_address_pool_id = azurerm_lb_backend_address_pool.controller.id
backend_address_pool_ids = [azurerm_lb_backend_address_pool.controller.id]
probe_id = azurerm_lb_probe.apiserver.id
}
resource "azurerm_lb_rule" "ingress-http" {
resource_group_name = azurerm_resource_group.cluster.name
name = "ingress-http"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress"
@ -77,13 +73,11 @@ resource "azurerm_lb_rule" "ingress-http" {
protocol = "Tcp"
frontend_port = 80
backend_port = 80
backend_address_pool_id = azurerm_lb_backend_address_pool.worker.id
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker.id]
probe_id = azurerm_lb_probe.ingress.id
}
resource "azurerm_lb_rule" "ingress-https" {
resource_group_name = azurerm_resource_group.cluster.name
name = "ingress-https"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress"
@ -92,14 +86,12 @@ resource "azurerm_lb_rule" "ingress-https" {
protocol = "Tcp"
frontend_port = 443
backend_port = 443
backend_address_pool_id = azurerm_lb_backend_address_pool.worker.id
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker.id]
probe_id = azurerm_lb_probe.ingress.id
}
# Worker outbound TCP/UDP SNAT
resource "azurerm_lb_outbound_rule" "worker-outbound" {
resource_group_name = azurerm_resource_group.cluster.name
name = "worker"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration {
@ -112,16 +104,12 @@ resource "azurerm_lb_outbound_rule" "worker-outbound" {
# Address pool of controllers
resource "azurerm_lb_backend_address_pool" "controller" {
resource_group_name = azurerm_resource_group.cluster.name
name = "controller"
loadbalancer_id = azurerm_lb.cluster.id
}
# Address pool of workers
resource "azurerm_lb_backend_address_pool" "worker" {
resource_group_name = azurerm_resource_group.cluster.name
name = "worker"
loadbalancer_id = azurerm_lb.cluster.id
}
@ -130,8 +118,6 @@ resource "azurerm_lb_backend_address_pool" "worker" {
# TCP health check for apiserver
resource "azurerm_lb_probe" "apiserver" {
resource_group_name = azurerm_resource_group.cluster.name
name = "apiserver"
loadbalancer_id = azurerm_lb.cluster.id
protocol = "Tcp"
@ -145,8 +131,6 @@ resource "azurerm_lb_probe" "apiserver" {
# HTTP health check for ingress
resource "azurerm_lb_probe" "ingress" {
resource_group_name = azurerm_resource_group.cluster.name
name = "ingress"
loadbalancer_id = azurerm_lb.cluster.id
protocol = "Http"

View File

@ -41,4 +41,3 @@ resource "azurerm_subnet_network_security_group_association" "worker" {
subnet_id = azurerm_subnet.worker.id
network_security_group_id = azurerm_network_security_group.worker.id
}

View File

@ -43,9 +43,9 @@ output "worker_security_group_name" {
value = azurerm_network_security_group.worker.name
}
output "worker_address_prefix" {
description = "Worker network subnet CIDR address (for source/destination)"
value = azurerm_subnet.worker.address_prefix
output "worker_address_prefixes" {
description = "Worker network subnet CIDR addresses (for source/destination)"
value = azurerm_subnet.worker.address_prefixes
}
# Outputs for custom load balancing

View File

@ -18,8 +18,8 @@ resource "azurerm_network_security_rule" "controller-icmp" {
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-ssh" {
@ -34,7 +34,7 @@ resource "azurerm_network_security_rule" "controller-ssh" {
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "*"
destination_address_prefix = azurerm_subnet.controller.address_prefix
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-etcd" {
@ -48,8 +48,8 @@ resource "azurerm_network_security_rule" "controller-etcd" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2379-2380"
source_address_prefix = azurerm_subnet.controller.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Allow Prometheus to scrape etcd metrics
@ -64,8 +64,8 @@ resource "azurerm_network_security_rule" "controller-etcd-metrics" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2381"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Allow Prometheus to scrape kube-proxy metrics
@ -80,8 +80,8 @@ resource "azurerm_network_security_rule" "controller-kube-proxy" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10249"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Allow Prometheus to scrape kube-scheduler and kube-controller-manager metrics
@ -95,9 +95,9 @@ resource "azurerm_network_security_rule" "controller-kube-metrics" {
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10251-10252"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
destination_port_range = "10257-10259"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-apiserver" {
@ -112,7 +112,7 @@ resource "azurerm_network_security_rule" "controller-apiserver" {
source_port_range = "*"
destination_port_range = "6443"
source_address_prefix = "*"
destination_address_prefix = azurerm_subnet.controller.address_prefix
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-cilium-health" {
@ -127,8 +127,8 @@ resource "azurerm_network_security_rule" "controller-cilium-health" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-vxlan" {
@ -142,8 +142,8 @@ resource "azurerm_network_security_rule" "controller-vxlan" {
protocol = "Udp"
source_port_range = "*"
destination_port_range = "4789"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-linux-vxlan" {
@ -157,8 +157,8 @@ resource "azurerm_network_security_rule" "controller-linux-vxlan" {
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Allow Prometheus to scrape node-exporter daemonset
@ -173,8 +173,8 @@ resource "azurerm_network_security_rule" "controller-node-exporter" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Allow apiserver to access kubelet's for exec, log, port-forward
@ -191,8 +191,8 @@ resource "azurerm_network_security_rule" "controller-kubelet" {
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound
@ -248,8 +248,8 @@ resource "azurerm_network_security_rule" "worker-icmp" {
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-ssh" {
@ -263,8 +263,8 @@ resource "azurerm_network_security_rule" "worker-ssh" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = azurerm_subnet.controller.address_prefix
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-http" {
@ -279,7 +279,7 @@ resource "azurerm_network_security_rule" "worker-http" {
source_port_range = "*"
destination_port_range = "80"
source_address_prefix = "*"
destination_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-https" {
@ -294,7 +294,7 @@ resource "azurerm_network_security_rule" "worker-https" {
source_port_range = "*"
destination_port_range = "443"
source_address_prefix = "*"
destination_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-cilium-health" {
@ -309,8 +309,8 @@ resource "azurerm_network_security_rule" "worker-cilium-health" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-vxlan" {
@ -324,8 +324,8 @@ resource "azurerm_network_security_rule" "worker-vxlan" {
protocol = "Udp"
source_port_range = "*"
destination_port_range = "4789"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-linux-vxlan" {
@ -339,8 +339,8 @@ resource "azurerm_network_security_rule" "worker-linux-vxlan" {
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
# Allow Prometheus to scrape node-exporter daemonset
@ -355,8 +355,8 @@ resource "azurerm_network_security_rule" "worker-node-exporter" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
# Allow Prometheus to scrape kube-proxy
@ -371,8 +371,8 @@ resource "azurerm_network_security_rule" "worker-kube-proxy" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10249"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
# Allow apiserver to access kubelet's for exec, log, port-forward
@ -389,8 +389,8 @@ resource "azurerm_network_security_rule" "worker-kubelet" {
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound

View File

@ -25,7 +25,7 @@ resource "null_resource" "copy-controller-secrets" {
provisioner "file" {
content = join("\n", local.assets_bundle)
destination = "$HOME/assets"
destination = "/home/core/assets"
}
provisioner "remote-exec" {

View File

@ -54,7 +54,7 @@ variable "os_image" {
variable "disk_size" {
type = number
description = "Size of the disk in GB"
default = 40
default = 30
}
variable "worker_priority" {
@ -65,13 +65,13 @@ variable "worker_priority" {
variable "controller_snippets" {
type = list(string)
description = "Controller Fedora CoreOS Config snippets"
description = "Controller Butane snippets"
default = []
}
variable "worker_snippets" {
type = list(string)
description = "Worker Fedora CoreOS Config snippets"
description = "Worker Butane snippets"
default = []
}
@ -84,8 +84,8 @@ variable "ssh_authorized_key" {
variable "networking" {
type = string
description = "Choice of networking provider (flannel or calico)"
default = "calico"
description = "Choice of networking provider (flannel, calico, or cilium)"
default = "cilium"
}
variable "host_cidr" {
@ -117,8 +117,8 @@ variable "enable_reporting" {
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
default = false
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
variable "worker_node_labels" {
@ -135,3 +135,8 @@ variable "cluster_domain_suffix" {
default = "cluster.local"
}
variable "daemonset_tolerations" {
type = list(string)
description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
default = []
}

View File

@ -1,15 +1,15 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.13.0, < 0.15.0"
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
azurerm = "~> 2.8"
template = "~> 2.1"
null = "~> 2.1"
azurerm = ">= 2.8, < 4.0"
template = "~> 2.2"
null = ">= 2.1"
ct = {
source = "poseidon/ct"
version = "~> 0.6"
version = "~> 0.9"
}
}
}

View File

@ -1,10 +1,12 @@
---
variant: fcos
version: 1.1.0
version: 1.4.0
systemd:
units:
- name: docker.service
- name: containerd.service
enabled: true
- name: docker.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
@ -24,8 +26,8 @@ systemd:
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -36,14 +38,17 @@ systemd:
--privileged \
--pid host \
--network host \
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
--volume /usr/lib/os-release:/etc/os-release:ro \
--volume /etc/machine-id:/etc/machine-id:ro \
--volume /lib/modules:/lib/modules:ro \
--volume /run:/run \
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
--volume /sys/fs/cgroup:/sys/fs/cgroup \
--volume /etc/selinux:/etc/selinux \
--volume /sys/fs/selinux:/sys/fs/selinux \
--volume /var/lib/calico:/var/lib/calico:ro \
--volume /var/lib/docker:/var/lib/docker \
--volume /var/lib/containerd:/var/lib/containerd \
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
@ -55,20 +60,24 @@ systemd:
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--enforce-node-allocatable=pods \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
%{~ endfor ~}
%{~ for taint in split(",", node_taints) ~}
--register-with-taints=${taint} \
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
@ -83,7 +92,7 @@ systemd:
[Unit]
Description=Delete Kubernetes node on shutdown
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
@ -122,10 +131,29 @@ storage:
DefaultCPUAccounting=yes
DefaultMemoryAccounting=yes
DefaultBlockIOAccounting=yes
- path: /etc/fedora-coreos/iptables-legacy.stamp
- path: /etc/containerd/config.toml
overwrite: true
contents:
inline: |
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
subreaper = true
oom_score = -999
[grpc]
address = "/run/containerd/containerd.sock"
uid = 0
gid = 0
[plugins."io.containerd.grpc.v1.cri"]
enable_selinux = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
passwd:
users:
- name: core
ssh_authorized_keys:
- ${ssh_authorized_key}

View File

@ -57,7 +57,7 @@ variable "priority" {
variable "snippets" {
type = list(string)
description = "Fedora CoreOS Config snippets"
description = "Butane snippets"
default = []
}
@ -88,6 +88,12 @@ variable "node_labels" {
default = []
}
variable "node_taints" {
type = list(string)
description = "List of initial node taints"
default = []
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {

View File

@ -1,14 +1,14 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.13.0, < 0.15.0"
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
azurerm = "~> 2.8"
template = "~> 2.1"
azurerm = ">= 2.8, < 4.0"
template = "~> 2.2"
ct = {
source = "poseidon/ct"
version = "~> 0.6"
version = "~> 0.9"
}
}
}

View File

@ -87,6 +87,7 @@ data "template_file" "worker-config" {
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
node_taints = join(",", var.node_taints)
}
}

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.20.2 (upstream)
* Kubernetes v1.24.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/flatcar-linux/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=b3bf2ecbbe0addab2ed473148589c3454d5b366f"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=f325be50417e27dde6599a293409cb7d068cda0c"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -18,5 +18,6 @@ module "bootstrap" {
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
daemonset_tolerations = var.daemonset_tolerations
}

View File

@ -10,7 +10,7 @@ systemd:
Requires=docker.service
After=docker.service
[Service]
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.4
ExecStartPre=/usr/bin/docker run -d \
--name etcd \
--network host \
@ -55,9 +55,8 @@ systemd:
After=docker.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -68,15 +67,15 @@ systemd:
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
-v /sys/fs/cgroup:/sys/fs/cgroup \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/docker:/var/lib/docker \
-v /var/lib/containerd:/var/lib/containerd \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
@ -85,17 +84,18 @@ systemd:
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--cgroup-driver=systemd \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -117,7 +117,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStart=/usr/bin/docker run \
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
-v /opt/bootstrap/assets:/assets:ro \

View File

@ -150,7 +150,6 @@ data "template_file" "controller-configs" {
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = join(",", data.template_file.etcds.*.rendered)
cgroup_driver = local.channel == "edge" ? "systemd" : "cgroupfs"
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)

View File

@ -53,8 +53,6 @@ resource "azurerm_lb" "cluster" {
}
resource "azurerm_lb_rule" "apiserver" {
resource_group_name = azurerm_resource_group.cluster.name
name = "apiserver"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "apiserver"
@ -62,13 +60,11 @@ resource "azurerm_lb_rule" "apiserver" {
protocol = "Tcp"
frontend_port = 6443
backend_port = 6443
backend_address_pool_id = azurerm_lb_backend_address_pool.controller.id
backend_address_pool_ids = [azurerm_lb_backend_address_pool.controller.id]
probe_id = azurerm_lb_probe.apiserver.id
}
resource "azurerm_lb_rule" "ingress-http" {
resource_group_name = azurerm_resource_group.cluster.name
name = "ingress-http"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress"
@ -77,13 +73,11 @@ resource "azurerm_lb_rule" "ingress-http" {
protocol = "Tcp"
frontend_port = 80
backend_port = 80
backend_address_pool_id = azurerm_lb_backend_address_pool.worker.id
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker.id]
probe_id = azurerm_lb_probe.ingress.id
}
resource "azurerm_lb_rule" "ingress-https" {
resource_group_name = azurerm_resource_group.cluster.name
name = "ingress-https"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress"
@ -92,14 +86,12 @@ resource "azurerm_lb_rule" "ingress-https" {
protocol = "Tcp"
frontend_port = 443
backend_port = 443
backend_address_pool_id = azurerm_lb_backend_address_pool.worker.id
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker.id]
probe_id = azurerm_lb_probe.ingress.id
}
# Worker outbound TCP/UDP SNAT
resource "azurerm_lb_outbound_rule" "worker-outbound" {
resource_group_name = azurerm_resource_group.cluster.name
name = "worker"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration {
@ -112,16 +104,12 @@ resource "azurerm_lb_outbound_rule" "worker-outbound" {
# Address pool of controllers
resource "azurerm_lb_backend_address_pool" "controller" {
resource_group_name = azurerm_resource_group.cluster.name
name = "controller"
loadbalancer_id = azurerm_lb.cluster.id
}
# Address pool of workers
resource "azurerm_lb_backend_address_pool" "worker" {
resource_group_name = azurerm_resource_group.cluster.name
name = "worker"
loadbalancer_id = azurerm_lb.cluster.id
}
@ -130,8 +118,6 @@ resource "azurerm_lb_backend_address_pool" "worker" {
# TCP health check for apiserver
resource "azurerm_lb_probe" "apiserver" {
resource_group_name = azurerm_resource_group.cluster.name
name = "apiserver"
loadbalancer_id = azurerm_lb.cluster.id
protocol = "Tcp"
@ -145,8 +131,6 @@ resource "azurerm_lb_probe" "apiserver" {
# HTTP health check for ingress
resource "azurerm_lb_probe" "ingress" {
resource_group_name = azurerm_resource_group.cluster.name
name = "ingress"
loadbalancer_id = azurerm_lb.cluster.id
protocol = "Http"

View File

@ -41,4 +41,3 @@ resource "azurerm_subnet_network_security_group_association" "worker" {
subnet_id = azurerm_subnet.worker.id
network_security_group_id = azurerm_network_security_group.worker.id
}

View File

@ -43,9 +43,9 @@ output "worker_security_group_name" {
value = azurerm_network_security_group.worker.name
}
output "worker_address_prefix" {
description = "Worker network subnet CIDR address (for source/destination)"
value = azurerm_subnet.worker.address_prefix
output "worker_address_prefixes" {
description = "Worker network subnet CIDR addresses (for source/destination)"
value = azurerm_subnet.worker.address_prefixes
}
# Outputs for custom load balancing

View File

@ -18,8 +18,8 @@ resource "azurerm_network_security_rule" "controller-icmp" {
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-ssh" {
@ -34,7 +34,7 @@ resource "azurerm_network_security_rule" "controller-ssh" {
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "*"
destination_address_prefix = azurerm_subnet.controller.address_prefix
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-etcd" {
@ -48,8 +48,8 @@ resource "azurerm_network_security_rule" "controller-etcd" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2379-2380"
source_address_prefix = azurerm_subnet.controller.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Allow Prometheus to scrape etcd metrics
@ -64,8 +64,8 @@ resource "azurerm_network_security_rule" "controller-etcd-metrics" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2381"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Allow Prometheus to scrape kube-proxy metrics
@ -80,8 +80,8 @@ resource "azurerm_network_security_rule" "controller-kube-proxy" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10249"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Allow Prometheus to scrape kube-scheduler and kube-controller-manager metrics
@ -95,9 +95,9 @@ resource "azurerm_network_security_rule" "controller-kube-metrics" {
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10251-10252"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
destination_port_range = "10257-10259"
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-apiserver" {
@ -112,7 +112,7 @@ resource "azurerm_network_security_rule" "controller-apiserver" {
source_port_range = "*"
destination_port_range = "6443"
source_address_prefix = "*"
destination_address_prefix = azurerm_subnet.controller.address_prefix
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-cilium-health" {
@ -127,8 +127,8 @@ resource "azurerm_network_security_rule" "controller-cilium-health" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-vxlan" {
@ -142,8 +142,8 @@ resource "azurerm_network_security_rule" "controller-vxlan" {
protocol = "Udp"
source_port_range = "*"
destination_port_range = "4789"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
resource "azurerm_network_security_rule" "controller-linux-vxlan" {
@ -157,8 +157,8 @@ resource "azurerm_network_security_rule" "controller-linux-vxlan" {
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Allow Prometheus to scrape node-exporter daemonset
@ -173,8 +173,8 @@ resource "azurerm_network_security_rule" "controller-node-exporter" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Allow apiserver to access kubelet's for exec, log, port-forward
@ -191,8 +191,8 @@ resource "azurerm_network_security_rule" "controller-kubelet" {
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.controller.address_prefixes
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound
@ -248,8 +248,8 @@ resource "azurerm_network_security_rule" "worker-icmp" {
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-ssh" {
@ -263,8 +263,8 @@ resource "azurerm_network_security_rule" "worker-ssh" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = azurerm_subnet.controller.address_prefix
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = azurerm_subnet.controller.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-http" {
@ -279,7 +279,7 @@ resource "azurerm_network_security_rule" "worker-http" {
source_port_range = "*"
destination_port_range = "80"
source_address_prefix = "*"
destination_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-https" {
@ -294,7 +294,7 @@ resource "azurerm_network_security_rule" "worker-https" {
source_port_range = "*"
destination_port_range = "443"
source_address_prefix = "*"
destination_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-cilium-health" {
@ -309,8 +309,8 @@ resource "azurerm_network_security_rule" "worker-cilium-health" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-vxlan" {
@ -324,8 +324,8 @@ resource "azurerm_network_security_rule" "worker-vxlan" {
protocol = "Udp"
source_port_range = "*"
destination_port_range = "4789"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
resource "azurerm_network_security_rule" "worker-linux-vxlan" {
@ -339,8 +339,8 @@ resource "azurerm_network_security_rule" "worker-linux-vxlan" {
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
# Allow Prometheus to scrape node-exporter daemonset
@ -355,8 +355,8 @@ resource "azurerm_network_security_rule" "worker-node-exporter" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
# Allow Prometheus to scrape kube-proxy
@ -371,8 +371,8 @@ resource "azurerm_network_security_rule" "worker-kube-proxy" {
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10249"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = azurerm_subnet.worker.address_prefixes
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
# Allow apiserver to access kubelet's for exec, log, port-forward
@ -389,8 +389,8 @@ resource "azurerm_network_security_rule" "worker-kubelet" {
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
source_address_prefixes = concat(azurerm_subnet.controller.address_prefixes, azurerm_subnet.worker.address_prefixes)
destination_address_prefixes = azurerm_subnet.worker.address_prefixes
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound

View File

@ -25,7 +25,7 @@ resource "null_resource" "copy-controller-secrets" {
provisioner "file" {
content = join("\n", local.assets_bundle)
destination = "$HOME/assets"
destination = "/home/core/assets"
}
provisioner "remote-exec" {

View File

@ -48,19 +48,19 @@ variable "worker_type" {
variable "os_image" {
type = string
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
default = "flatcar-stable"
validation {
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha", "flatcar-edge"], var.os_image)
error_message = "The os_image must be flatcar-stable, flatcar-beta, flatcar-alpha, or flatcar-edge."
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
}
}
variable "disk_size" {
type = number
description = "Size of the disk in GB"
default = 40
default = 30
}
variable "worker_priority" {
@ -90,8 +90,8 @@ variable "ssh_authorized_key" {
variable "networking" {
type = string
description = "Choice of networking provider (flannel or calico)"
default = "calico"
description = "Choice of networking provider (flannel, calico, or cilium)"
default = "cilium"
}
variable "host_cidr" {
@ -123,8 +123,8 @@ variable "enable_reporting" {
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
default = false
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
variable "worker_node_labels" {
@ -141,3 +141,8 @@ variable "cluster_domain_suffix" {
default = "cluster.local"
}
variable "daemonset_tolerations" {
type = list(string)
description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
default = []
}

View File

@ -1,15 +1,15 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.13.0, < 0.15.0"
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
azurerm = "~> 2.8"
template = "~> 2.1"
null = "~> 2.1"
azurerm = ">= 2.8, < 4.0"
template = "~> 2.2"
null = ">= 2.1"
ct = {
source = "poseidon/ct"
version = "~> 0.6"
version = "~> 0.9"
}
}
}

View File

@ -27,9 +27,8 @@ systemd:
After=docker.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -43,15 +42,15 @@ systemd:
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
-v /sys/fs/cgroup:/sys/fs/cgroup \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/docker:/var/lib/docker \
-v /var/lib/containerd:/var/lib/containerd \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
@ -60,20 +59,24 @@ systemd:
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--cgroup-driver=systemd \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
%{~ endfor ~}
%{~ for taint in split(",", node_taints) ~}
--register-with-taints=${taint} \
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStart=docker logs -f kubelet
@ -89,7 +92,7 @@ systemd:
[Unit]
Description=Delete Kubernetes node on shutdown
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true

View File

@ -46,12 +46,12 @@ variable "vm_type" {
variable "os_image" {
type = string
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
default = "flatcar-stable"
validation {
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha", "flatcar-edge"], var.os_image)
error_message = "The os_image must be flatcar-stable, flatcar-beta, flatcar-alpha, or flatcar-edge."
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
}
}
@ -94,6 +94,12 @@ variable "node_labels" {
default = []
}
variable "node_taints" {
type = list(string)
description = "List of initial node taints"
default = []
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {

View File

@ -1,14 +1,14 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.13.0, < 0.15.0"
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
azurerm = "~> 2.8"
template = "~> 2.1"
azurerm = ">= 2.8, < 4.0"
template = "~> 2.2"
ct = {
source = "poseidon/ct"
version = "~> 0.6"
version = "~> 0.9"
}
}
}

View File

@ -104,8 +104,8 @@ data "template_file" "worker-config" {
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
cgroup_driver = local.channel == "edge" ? "systemd" : "cgroupfs"
node_labels = join(",", var.node_labels)
node_taints = join(",", var.node_taints)
}
}

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.20.2 (upstream)
* Kubernetes v1.24.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=b3bf2ecbbe0addab2ed473148589c3454d5b366f"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=f325be50417e27dde6599a293409cb7d068cda0c"
cluster_name = var.cluster_name
api_servers = [var.k8s_domain_name]
@ -13,8 +13,6 @@ module "bootstrap" {
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
trusted_certs_dir = "/etc/pki/tls/certs"
}

View File

@ -1,6 +1,6 @@
---
variant: fcos
version: 1.1.0
version: 1.4.0
systemd:
units:
- name: etcd-member.service
@ -12,7 +12,7 @@ systemd:
Wants=network-online.target network.target
After=network-online.target
[Service]
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.4
Type=exec
ExecStartPre=/bin/mkdir -p /var/lib/etcd
ExecStartPre=-/usr/bin/podman rm etcd
@ -29,8 +29,10 @@ systemd:
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
- name: docker.service
- name: containerd.service
enabled: true
- name: docker.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
@ -50,8 +52,8 @@ systemd:
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -62,14 +64,17 @@ systemd:
--privileged \
--pid host \
--network host \
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
--volume /usr/lib/os-release:/etc/os-release:ro \
--volume /etc/machine-id:/etc/machine-id:ro \
--volume /lib/modules:/lib/modules:ro \
--volume /run:/run \
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
--volume /sys/fs/cgroup:/sys/fs/cgroup \
--volume /etc/selinux:/etc/selinux \
--volume /sys/fs/selinux:/sys/fs/selinux \
--volume /var/lib/calico:/var/lib/calico:ro \
--volume /var/lib/docker:/var/lib/docker \
--volume /var/lib/containerd:/var/lib/containerd \
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
@ -81,18 +86,19 @@ systemd:
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--enforce-node-allocatable=pods \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--hostname-override=${domain_name} \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -120,6 +126,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStartPre=-/usr/bin/podman rm bootstrap
ExecStart=/usr/bin/podman run --name bootstrap \
--network host \
@ -127,7 +134,7 @@ systemd:
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
quay.io/poseidon/kubelet:v1.20.2
$${KUBELET_IMAGE}
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -222,6 +229,26 @@ storage:
ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt
ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key
ETCD_PEER_CLIENT_CERT_AUTH=true
- path: /etc/fedora-coreos/iptables-legacy.stamp
- path: /etc/containerd/config.toml
overwrite: true
contents:
inline: |
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
subreaper = true
oom_score = -999
[grpc]
address = "/run/containerd/containerd.sock"
uid = 0
gid = 0
[plugins."io.containerd.grpc.v1.cri"]
enable_selinux = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
passwd:
users:
- name: core

View File

@ -1,10 +1,12 @@
---
variant: fcos
version: 1.1.0
version: 1.4.0
systemd:
units:
- name: docker.service
- name: containerd.service
enabled: true
- name: docker.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
@ -23,8 +25,8 @@ systemd:
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -35,14 +37,17 @@ systemd:
--privileged \
--pid host \
--network host \
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
--volume /usr/lib/os-release:/etc/os-release:ro \
--volume /etc/machine-id:/etc/machine-id:ro \
--volume /lib/modules:/lib/modules:ro \
--volume /run:/run \
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
--volume /sys/fs/cgroup:/sys/fs/cgroup \
--volume /etc/selinux:/etc/selinux \
--volume /sys/fs/selinux:/sys/fs/selinux \
--volume /var/lib/calico:/var/lib/calico:ro \
--volume /var/lib/docker:/var/lib/docker \
--volume /var/lib/containerd:/var/lib/containerd \
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
@ -54,15 +59,15 @@ systemd:
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--enforce-node-allocatable=pods \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--hostname-override=${domain_name} \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in compact(split(",", node_labels)) ~}
--node-labels=${label} \
@ -72,6 +77,7 @@ systemd:
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
@ -121,6 +127,26 @@ storage:
DefaultCPUAccounting=yes
DefaultMemoryAccounting=yes
DefaultBlockIOAccounting=yes
- path: /etc/fedora-coreos/iptables-legacy.stamp
- path: /etc/containerd/config.toml
overwrite: true
contents:
inline: |
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
subreaper = true
oom_score = -999
[grpc]
address = "/run/containerd/containerd.sock"
uid = 0
gid = 0
[plugins."io.containerd.grpc.v1.cri"]
enable_selinux = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
passwd:
users:
- name: core

View File

@ -28,17 +28,18 @@ resource "null_resource" "copy-controller-secrets" {
provisioner "file" {
content = module.bootstrap.kubeconfig-kubelet
destination = "$HOME/kubeconfig"
destination = "/home/core/kubeconfig"
}
provisioner "file" {
content = join("\n", local.assets_bundle)
destination = "$HOME/assets"
destination = "/home/core/assets"
}
provisioner "remote-exec" {
inline = [
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo touch /etc/kubernetes",
"sudo /opt/bootstrap/layout",
]
}
@ -64,12 +65,13 @@ resource "null_resource" "copy-worker-secrets" {
provisioner "file" {
content = module.bootstrap.kubeconfig-kubelet
destination = "$HOME/kubeconfig"
destination = "/home/core/kubeconfig"
}
provisioner "remote-exec" {
inline = [
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo touch /etc/kubernetes",
]
}
}

View File

@ -57,7 +57,7 @@ EOD
variable "snippets" {
type = map(list(string))
description = "Map from machine names to lists of Fedora CoreOS Config snippets"
description = "Map from machine names to lists of Butane snippets"
default = {}
}
@ -87,8 +87,8 @@ variable "ssh_authorized_key" {
variable "networking" {
type = string
description = "Choice of networking provider (flannel or calico)"
default = "calico"
description = "Choice of networking provider (flannel, calico, or cilium)"
default = "cilium"
}
variable "network_mtu" {
@ -146,8 +146,8 @@ variable "enable_reporting" {
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
default = false
description = "Enable the Kubernetes Aggregation Layer"
default = true
}
# unofficial, undocumented, unsupported

View File

@ -1,19 +1,19 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.13.0, < 0.15.0"
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
template = "~> 2.1"
null = "~> 2.1"
template = "~> 2.2"
null = ">= 2.1"
ct = {
source = "poseidon/ct"
version = "~> 0.6"
version = "~> 0.9"
}
matchbox = {
source = "poseidon/matchbox"
version = "~> 0.4.1"
version = "~> 0.5.0"
}
}
}

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.20.2 (upstream)
* Kubernetes v1.24.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=b3bf2ecbbe0addab2ed473148589c3454d5b366f"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=f325be50417e27dde6599a293409cb7d068cda0c"
cluster_name = var.cluster_name
api_servers = [var.k8s_domain_name]

View File

@ -10,7 +10,7 @@ systemd:
Requires=docker.service
After=docker.service
[Service]
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.4
ExecStartPre=/usr/bin/docker run -d \
--name etcd \
--network host \
@ -63,9 +63,8 @@ systemd:
After=docker.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -76,15 +75,15 @@ systemd:
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
-v /sys/fs/cgroup:/sys/fs/cgroup \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/docker:/var/lib/docker \
-v /var/lib/containerd:/var/lib/containerd \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
@ -93,18 +92,19 @@ systemd:
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--cgroup-driver=systemd \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--hostname-override=${domain_name} \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -126,7 +126,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStart=/usr/bin/docker run \
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
-v /opt/bootstrap/assets:/assets:ro \

View File

@ -35,9 +35,8 @@ systemd:
After=docker.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.24.1
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -51,15 +50,15 @@ systemd:
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
-v /sys/fs/cgroup:/sys/fs/cgroup \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/docker:/var/lib/docker \
-v /var/lib/containerd:/var/lib/containerd \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
@ -68,15 +67,15 @@ systemd:
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--cgroup-driver=systemd \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--hostname-override=${domain_name} \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in compact(split(",", node_labels)) ~}
--node-labels=${label} \
@ -86,6 +85,7 @@ systemd:
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--resolv-conf=/run/systemd/resolve/resolv.conf \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStart=docker logs -f kubelet

Some files were not shown because too many files have changed in this diff Show More